As the information tide relentlessly rises, many organizations simply see an IT problem, to be fixed with a purely IT solution: more storage capacity, more tools, or both.  But merely adding more storage is a reaction, not a strategy.  And adding technology tools without the right governance rules invariably makes things worse, not better.

This is not a criticism of your IT team.  Instead, the problem lies in a misunderstanding of the fundamental challenge.  Just as you shouldn’t bring a knife to a gun fight, you shouldn’t merely bring more storage capacity and IT tools-without-rules to your fight to regain control over your organization’s information.  What’s needed is governance.

More Storage is Not the Answer

If the accelerating, worldwide growth of data were a movie, it would star Vin Diesel: Fast & Furious.  It’s hard to wrap one’s head around the magnitude and velocity.  Try this for context: the total content of all catalogued books in the Library of Congress has been estimated variously at 10 to 15 terabytes of data.  IDC’s Digital Universe Study pegged the world’s 2013 data volume at 4.4 zettabytes (4.4 billion terabytes), and forecast that data volume will double every two years, reaching 44 zettabytes by 2020, a tenfold increase.  In case your head hasn’t exploded yet, IDC now forecasts that the world’s 2025 data volume will reach 180 zettabytes.  Apparently 1,000 zettabytes is a yottabyte, and as of yet there is no officially recognized International System of Units name for 1,000 of those (I propose “Lottabyte”).
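To put those units side by side, here is a minimal Python sketch of the conversion.  It assumes decimal (SI) prefixes and takes the upper end of the Library of Congress estimate as its yardstick; the function names are illustrative only.

```python
# Decimal (SI) prefixes: 1 terabyte = 10**12 bytes, 1 zettabyte = 10**21 bytes.
TB_PER_ZB = 10**21 // 10**12  # one billion terabytes per zettabyte

# Assumption: upper end of the Library of Congress estimate quoted above.
LIBRARY_OF_CONGRESS_TB = 15


def zettabytes_to_terabytes(zb: float) -> float:
    """Convert a volume in zettabytes to terabytes."""
    return zb * TB_PER_ZB


def libraries_of_congress(zb: float) -> float:
    """Express a volume in zettabytes as multiples of the Library of Congress estimate."""
    return zettabytes_to_terabytes(zb) / LIBRARY_OF_CONGRESS_TB


for label, zb in [("baseline", 4.4), ("2020 forecast", 44), ("2025 forecast", 180)]:
    print(f"{label}: {zettabytes_to_terabytes(zb):,.0f} TB, "
          f"about {libraries_of_congress(zb):,.0f} Libraries of Congress")
```

At 15 terabytes per library, the 180-zettabyte forecast works out to roughly 12 billion Libraries of Congress.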

Why the dizzying growth?  Internet use is certainly a contributor (a lot can happen there each minute).  But it is the Internet of Things, combined with the Industrial Internet, that will increasingly generate gobsmacking quantities of device and machine data.

Let’s home in on the reality faced by individual organizations.  Unstructured data (documents, spreadsheets, presentations, audio and video files, email, and the like) can make up 80% to 90% of total enterprise data.  Unstructured data is often largely uncontrolled, scattered across network drives, users’ computers, and the organization’s electronic content management (ECM), collaboration, and e-communication systems.

Veritas’ Data Genomics Project produced an interesting 2016 study that analyzed tens of billions of unstructured data files, with over 8,000 file extensions, at Fortune 500 companies.  Key finding?  Storage capacity grows each year, but so does data volume: the number of unstructured data files grew 39% year over year.  Just as a bigger closet or garage at home results in the accumulation of more stuff, when businesses add larger on-premises or cloud repositories without governance controls, it inevitably leads to larger data volumes.  More storage simply enables more data hoarding.
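To see what 39% a year does when left alone, here is a small Python sketch of the compounding.  It assumes the rate holds steady from year to year, which the study does not promise; the function names are illustrative.

```python
import math

ANNUAL_FILE_GROWTH = 0.39  # the Veritas year-over-year figure quoted above


def growth_multiple(rate: float, years: int) -> float:
    """How many times larger the file count is after `years` of compound growth."""
    return (1 + rate) ** years


def doubling_time_years(rate: float) -> float:
    """Years needed for the file count to double at a constant annual rate."""
    return math.log(2) / math.log(1 + rate)


for years in (1, 3, 5):
    multiple = growth_multiple(ANNUAL_FILE_GROWTH, years)
    print(f"after {years} year(s): {multiple:.2f}x the files")
print(f"doubling time: {doubling_time_years(ANNUAL_FILE_GROWTH):.1f} years")
```

Under that assumption the file count doubles in a little over two years and is roughly five times larger after five, which is why a storage purchase sized for today's volume fills up so quickly.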

Tools Without Rules are No Help Either

Total spending on data management hardware, software, and services is forecast to nearly double from 2015 to 2020.  That’s a lot of tools, encompassing operational databases, analytic databases, reporting and analytics, data management, performance management, event/stream processing, distributed data grid/cache, Hadoop, and search-based data platforms and analytics.  But these tools need rules, with governance strategies yielding procurement and deployment requirements, and ultimately implementation rules.  Without right-sized rules, the tools can’t do the job.

For example, a robust data crawler can find data types across the enterprise’s repositories … if it’s properly told what to look for.  Common analytic tools can rather easily identify files untouched for a period of years … but does that indicate lack of value?  The least frequently accessed documents at home are usually one’s will and life insurance policy.  And big data/predictive analytic tools can be invaluable in attacking legacy troves of unstructured data … but what rules will guide the machine learning?  The key to solving the volume problem is understanding which data are valuable.  And to be actionably clear about value, the organization needs clarity about information governance.
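The will-and-insurance-policy point can be made concrete with a short Python sketch: the same "untouched for years" scan, run with governance rules that tell it what must be kept.  Everything here is hypothetical, including the RetentionRule class, the classify function, and the keyword matching on file names, which stands in for whatever classification a real governance policy would define.  Last-modified time is used as a rough proxy for "untouched".

```python
import time
from dataclasses import dataclass
from pathlib import Path

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


@dataclass(frozen=True)
class RetentionRule:
    """Hypothetical governance rule: keep files whose name contains `keyword`."""
    keyword: str
    reason: str


def classify(root, stale_after_years, rules, now=None):
    """Sort files under `root` into active, retain (old but protected), and review."""
    now = time.time() if now is None else now
    cutoff = now - stale_after_years * SECONDS_PER_YEAR
    report = {"active": [], "retain": [], "review": []}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.stat().st_mtime >= cutoff:
            # Modified recently enough that age alone says nothing.
            report["active"].append(path.name)
        elif any(rule.keyword in path.name.lower() for rule in rules):
            # Old, but a governance rule says it has lasting value.
            report["retain"].append(path.name)
        else:
            # Old and covered by no rule: a candidate for human review.
            report["review"].append(path.name)
    return report
```

With an empty rule list, every old file lands in "review", the will included.  Add a rule such as RetentionRule("will", "estate document") and the scan separates what is merely old from what is old and valuable.  The tool is the same in both runs; only the rules changed.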

AIIM’s 2016 State of the Information Management Industry Report asked respondents to what extent their information management and electronic content management system decisions are driven by a set of agreed and supported Information Governance policies.  The results?

  • Only 18% of the respondents report having agreed-upon Information Governance policies and making their decisions on that basis.
  • 29% do not have a set of agreed and supported IG policies.
  • 37% are formulating IG policies to help with their decisions.
  • 15% report that they have agreed IG policies, but that those policies do not drive their decisions.

Wow.  The movement toward governance rules for information is encouraging, but really?  Some 82% of respondents (everyone outside that first 18%) acknowledge that data management decisions are made every day at their organizations without an agreed-upon governance strategy driving them?

Tools need rules, grounded in a coherent information governance strategy for the organization.  Otherwise, we’re just kidding ourselves about information value, compliance, cost, and risk.