Monday, March 10, 2014

Data Vs. Document Vs. Content

Remember letters?  Typeset documents on official-looking letterhead?  When we communicated primarily via letters, no one wondered what to call the media transmitting information.  It was a Document, plain and simple.  Then came the Internet and websites and eyeballs, and suddenly it was all about Content.  Keeping your content fresh, managing your content, re-using content.  Now it is all about the Data – Big Data, of course.  So, what’s the difference?  Data vs. content vs. document – is there a difference?  In theory, not much, but in practice, yes there is.

Let’s start with Documents.  Documents can be physical or virtual, but they typically have a defined start and end, often delineated by page.  Documents have a specific purpose: they were created by someone, for someone, and they are meant to convey information.  Documents carry with them an air of significance, importance and validity.  That’s why we have phrases like “legal documents,” “financial documents” and “immigration documents.”  Good examples of documents:  your tax form, your birth certificate, a receipt from a purchase, your boarding pass.

Content is amorphous.  Though it too can be physical or virtual, it is generally thought of as purely electronic.  Content may or may not have a specific purpose.  It may be written by someone, or sometimes auto-generated.  Content is often not meant to stand on its own, but rather to be a supporting player.  Content can be ephemeral, biased and taken out of context.  Because of this, content is not always trusted and carries less validity than documents.  Good examples of content:  blog entries, news, chapters in a book.

Data is virtual.  It is reported, stored or derived from other systems and carries with it a factual and scientific nature.  Data is meant to be bias-free and exist for measurement or tracking purposes.  Good examples of data:  your height and weight, stock prices, bank account balances.

To call information data is to expand on the original intent of what we understand data to be.  However, because our information today is generated and stored electronically, it feels like data, and we (or savvy marketers) have started calling it data.  Thus stored information has become data, with all the attached concepts typically assigned to data (factual, bias-free, etc.).  Data, therefore, feels trustworthy and valid, which makes a strong case for managing its exposure.

For more information on the difference between Data and Records, see my article, “When is Data A Record?”, in this month's ARMA newsletter (pages 23-25).