“If your clients don’t have a records management system, they may as well take their money out into the parking lot and set it on fire.”

– U.S. District Court Magistrate Judge John Facciola (now retired, and missed)

We all know that ediscovery is expensive, and research bears this out. The 2012 Rand study, Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery, found that median costs for collection, processing, and review are $17,507 per gigabyte (roughly 3,500 documents or 10,000 e-mails). The math is not pretty: a case involving 482 GB of source data could exceed $8 million in ediscovery costs.
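The arithmetic behind that figure is easy to check. The sketch below is only a back-of-the-envelope estimate: it assumes costs scale linearly with data volume at the Rand study's median per-gigabyte rate, which real cases will not follow exactly, and the function name is purely illustrative.

```python
# Median cost per gigabyte for collection, processing, and review,
# as reported in the 2012 Rand study cited above.
MEDIAN_COST_PER_GB = 17_507


def estimate_ediscovery_cost(source_gb, cost_per_gb=MEDIAN_COST_PER_GB):
    """Hypothetical helper: assumes cost scales linearly with volume."""
    return source_gb * cost_per_gb


print(f"${estimate_ediscovery_cost(482):,}")  # prints $8,438,374
```

Under the same linear assumption, every gigabyte kept out of the funnel removes about $17,507 from the estimate.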

And on top of that are preservation costs. The 2014 Preservation Costs Survey demonstrated that large companies incur significant fixed costs for preservation (for in-house ediscovery personnel and also for procurement and maintenance of legal hold management and data preservation technology systems), averaging $2.5 million annually. More significant is the cost of employee time lost in complying with legal holds. While companies with up to 10,000 employees incur an average time cost of over $428,000 per year, costs for the largest companies exceed $38 million per year.

There is indeed great complexity in how to cost-effectively process huge amounts of data through the ediscovery funnel. Tighter management of ediscovery processes is important, and technology-assisted review (TAR) continues to be a promising alternative to traditional review, with significant cost-savings potential.

But as we ponder how to cut costs, let’s not forget to use Occam’s razor:

What if the primary driver of ediscovery cost is not found in the complexity of how we manage the collection-processing-review-production funnel, but is instead the breathtaking volumes of data we cram into the ediscovery funnel in the first place?

Sure, overpreservation is a contributor to ediscovery cost. For example, Microsoft disclosed in testimony to the Judicial Conference’s Advisory Committee on Civil Rules (in advance of the recent FRCP amendments) that it preserved nearly 700 times the data it eventually produced in litigation.

But there’s more going on here than the Legal Department’s understandable reaction to persistent uncertainty in spoliation standards and exposures (FRCP amendments notwithstanding). Simply put, we retain way too much data, without any legal requirement or business need.  And once the preservation duty arises, the ediscovery volume damage is already done.

The damage arises most commonly with data sources that are largely uncontrolled. Participants in the Preservation Costs Survey reported that, regardless of company size, their most difficult and burdensome data sources for preservation are email and hard drives, followed by legacy data.  The common denominator?  A lack of effective governance for unstructured business data, which allows rampant overretention, which in turn, unsurprisingly, results in overpreservation.  And on it goes.

As observed in the 2015 Annual Report of the Information Governance Initiative:

Our inattention to information is a creeping disaster in an age where we can no longer feel the weight of information, but we completely rely on it for our success and even our survival. Silicon Valley is an enabler, with business models that allow—and even depend upon—our atavistic fear of throwing things away by making information storage (and its costs) all but invisible.

Whether visible or not, the costs are there. And while a price will be paid later, during litigation, the deal was struck long before, when we kept information far beyond any compliance or business need.  We should stop doing that to ourselves.