
5 Inefficiencies of eDiscovery Processing

How do you make eDiscovery Processing more efficient and less costly?

In eDiscovery, inefficient data processing can occur at multiple stages, throwing some hefty wrenches into your case as it travels through the rest of the eDiscovery workflow. If your processes do not follow industry best practices, or if your workflows are not well-defined, defensible, and repeatable, opposing counsel may challenge those efforts, further delaying or derailing your case and raising your overall costs. 

eDiscovery data processing (a.k.a. ESI processing, where ESI stands for electronically stored information) involves the ingestion, processing, and often initial analysis and culling of data. The goal of processing is to normalize, reduce, and prepare that ESI for the review process (be that a linear review or an analytics-driven review) that determines relevancy and responsiveness. Processing is the step where the sausage is made: the raw data collected goes through the grinder, and the clearly extraneous or unusable data gets removed. The result is a subset of data in need of further analysis and review.

There are several ways to avoid the most common data processing landmines. We’ve narrowed our list down to the top five inefficiencies that, if avoided, can make eDiscovery data processing more efficient, more cost-effective, and more useful as you move on to the later parts of the eDiscovery process.

Inefficiency #1: Not having the right conversations up front

Before the first byte of data is processed, discuss with your vendor the following:

  • What software is your vendor using, and what are its capabilities and known limitations? For instance, not all ESI processing engines can proficiently process non-Windows-based data (e.g., Mac or Linux data), so if your data crosses platforms, make sure you address that up front.
  • How recent is the version of the processing software they are using? There will always be some lag between new releases and implementations, but make sure your vendor isn’t using a system that is long overdue for an update. This could impact everything from how it handles modern data types to the software’s security.
  • Is your vendor’s processing system securely implemented? From encryption at rest to network protections and other security measures, it’s important that your vendor’s processing system is secure.
  • Can the vendor process the more complex file types like audio/video files, transcripts, online chat software, text messages, collaboration platforms and the like that have become increasingly pervasive in our remote work world? Not all systems are alike, and it may be appropriate to utilize supplemental systems or plugins at times. Similar to how review tools like Relativity have plugins to handle video reviewing and searching more efficiently, certain data types might need custom handling during processing, too. In short, talk about which of these more modern data types are in play and how to handle them most efficiently and effectively.
  • If using keywords, work with your vendor to understand and negotiate keyword searches. Keyword searches become rather complex when you start to consider wildcards, proximity, and other advanced search techniques. Different processing systems may have different capabilities that can impact that process as well.
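To see why keyword negotiations get complicated, here is a minimal Python sketch of a wildcard-plus-proximity search. It is an illustration only, not how any particular processing engine works: the function names (`tokenize`, `positions`, `within`) are hypothetical, and real engines differ in how they tokenize text, treat stop words, and count the distance between terms.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def positions(tokens, pattern):
    """Return the indexes of tokens that match a wildcard pattern such as 'terminat*'."""
    pattern = pattern.lower()
    return [i for i, tok in enumerate(tokens) if fnmatchcase(tok, pattern)]


def within(text, left, right, distance):
    """True if a token matching `left` sits within `distance` words of one matching `right`."""
    tokens = tokenize(text)
    left_hits = positions(tokens, left)
    right_hits = positions(tokens, right)
    return any(abs(i - j) <= distance for i in left_hits for j in right_hits)


# Hypothetical sample text, for illustration only.
sample = "The contract was terminated early by the supplier."
print(within(sample, "contract", "terminat*", 3))  # True: the terms are two words apart
print(within(sample, "contract", "supplier", 3))   # False: the terms are six words apart
```

Even in this toy version, the result depends on choices such as what counts as a word and whether distance is measured in either direction. Those are exactly the details to confirm with your vendor before agreeing to search terms.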

Speaking of data types, here are some things you should do at the point of data collection to help make ESI processing more efficient and effective:

  • When working to identify the appropriate data sources for collection, make sure you note the details of any systems other than email and traditional document repositories. For example, if you learn that the organization uses Microsoft Teams, get into the details of how they use it. Is it mainly used for just chats, or are there complex SharePoint sites enabled or other application plugins in use? Understanding exactly how the organization uses such systems will help you better prepare for the types of data you might need to collect and eventually process.
  • Understand if proprietary systems or data types exist, as that data may be more difficult to process. Legacy data sources, customized databases and applications, and more unique data sources could contain data that needs special handling. For example, a custom document repository or custom database may require special handling to ensure that the data is both usable and complete.
  • For customized databases, determine the types of data, how best to collect that data, and your vendor’s experience in doing so. You may find that it is much more efficient and effective to customize reporting or use other custom processes to handle that data differently than other more routine data types and sources.
  • Talk about the amount of time it will take to process any such custom or complex data. Attorneys often have to make representations to opposing counsel and the court about when they can produce data; identifying these more complex data sources early will help ensure transparency on how those complexities might affect timelines.

A few additional suggestions to help prioritize your process:

  • Leverage the understanding you gain from the above to prioritize which of those data sources are most important to collect and review first. If, for example, the case is a contract dispute, then the contracts database (or wherever the organization stores its contracts) would be a natural first place to start, maybe even before email.
  • That said, make sure you look at the complexity of the collection and processing in prioritizing the steps. Even if a particular system is incredibly relevant, if the data will require customized handling, it may be better to concentrate first on more readily accessible ESI that can be more quickly collected, processed and reviewed. That will help get the gears moving – that ESI can be worked on while the more complex data sources are handled. If you’re waiting, you’re wasting time. It’s not a serial process – you can have several pieces moving in parallel. Balancing the priorities with the time requirements will help your overall efforts keep moving forward efficiently. 

Inefficiency #2: Using multiple vendors across all your cases

A multi-vendor approach to eDiscovery leads to unnecessarily increased risks and costs. This approach will prevent you from being able to recycle and leverage your data, processes, knowledge and more across other existing and future matters.

It’s never a good idea to reinvent the wheel with each new project. And it’s especially not a good idea to constantly bid out each new project to a host of disparate vendors, especially if your organization deals with frequent litigation. Using multiple vendors actively decentralizes your data and overly complicates your entire process. Such decentralization is not only inefficient and costly, it’s a bad data security practice as well.

If a decentralized approach with multiple vendors is how you tackle your eDiscovery needs, remember that each vendor has its own methods, processes, and pricing models, further complicating your ability to control your eDiscovery. With each vendor operating in its own silo, your business will experience more overall disruption as that wheel is continuously and needlessly reinvented. And without centralizing these efforts, you cannot leverage the all-important but hard-to-quantify institutional knowledge that brings immense efficiency and accuracy to your process.

Inefficiency #3: Not reusing processed data for serial custodians

The term “serial custodians” refers to employees (or other relevant individuals or systems) who are often continuously on litigation hold and whose data tends to get repeatedly collected for each new matter. Common examples are executives or managers who routinely are identified as custodians in various matters. If your organization has serial custodians, it’s essential that you reuse their data to the greatest extent possible. Failure to do so creates significant redundancy and inefficiency.

This ties into the centralization of vendors mentioned above. A competent vendor, one that routinely handles the “serial custodian” concept, should create a centralized repository where all your data lives, regardless of matter. Of course, data can be segregated by matter, but with the right approach and process, a centralized repository means that you can leverage your spend and efforts across cases. When a new case arises, you can quickly, easily, and inexpensively pull data from the ESI already processed for those custodians.

Sure, you may need to supplement that data with newer data or data from other resources not relevant to prior cases. But at least with respect to the data you already paid to have collected and processed, you needn’t pay for that again (and again…and again). By centralizing these efforts, you gain consistency in your processing, searching and filtering efforts as well, which bolsters the overall defensibility of your process.
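One common way to recognize data that has already been processed is to compare content hashes. The Python sketch below shows the idea under simplifying assumptions: the function names and the sample file names are hypothetical, files are compared byte for byte, and a production repository would typically also track custodian, matter, and processing settings.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a file's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def split_new_collection(collected, processed_hashes):
    """Split a new collection into files already processed and files still to process.

    collected: dict mapping file name -> file bytes (hypothetical input shape).
    processed_hashes: set of SHA-256 hex digests from earlier matters.
    Returns (reusable, to_process) as lists of file names.
    """
    reusable, to_process = [], []
    for name, data in collected.items():
        if sha256_of(data) in processed_hashes:
            reusable.append(name)      # identical content was processed before
        else:
            to_process.append(name)    # new or changed content
    return reusable, to_process


# Hypothetical example: one file was processed in a prior matter, one is new.
already_processed = {sha256_of(b"Q1 board minutes")}
new_collection = {
    "minutes.docx": b"Q1 board minutes",
    "new_memo.docx": b"Draft memo",
}
reusable, to_process = split_new_collection(new_collection, already_processed)
print(reusable)
print(to_process)
```

Only the files in `to_process` would need to go through the processing engine again; the rest can be pulled from the existing repository.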

Added bonus: If any of that data was reviewed in a prior matter, you may be able to reuse certain aspects of that work product as well (especially privilege review, redactions, and even some types of issue coding)!

Inefficiency #4: Not having a plan for encrypted or protected data

Not having a plan in place for handling encrypted or otherwise protected ESI can be surprisingly inefficient. If you’ve yet to encounter protected data in an eDiscovery matter, you soon will. Device encryption and data encryption are two of the most crucial security steps companies must take to protect their data, so it’s no surprise to encounter password-protected and encrypted files during the processing stage.

Make sure that your eDiscovery processes, software, and tools have the necessary features to identify, record, and report on these protected files. In some cases, this will require manual intervention to decrypt files or remove password protection before file ingestion. Strategic custodian questionnaires and coordination with the IT department early on can help to identify and deal with these troublesome files. Depending on the results, you may even identify data encryption issues to address and handle prior to any processing steps.
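As a small illustration of what “identify and report” can look like, the Python sketch below lists the encrypted entries in a ZIP archive using only the standard library. It relies on the ZIP format's general-purpose flag, where bit 0 marks an encrypted entry. The function name is hypothetical, and the check covers ZIP archives only; other protected formats (PDFs, Office documents, PSTs, disk images) each need their own detection.

```python
import io
import zipfile

ENCRYPTED_FLAG = 0x1  # bit 0 of the ZIP general-purpose flag marks an encrypted entry


def encrypted_members(zip_bytes: bytes):
    """Return the names of entries in a ZIP archive that are flagged as encrypted."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if info.flag_bits & ENCRYPTED_FLAG
        ]


# Build a small unencrypted archive in memory (hypothetical content) and check it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("notes.txt", "nothing secret here")
print(encrypted_members(buffer.getvalue()))  # [] because no entry is flagged as encrypted
```

A report built this way gives you a list of files to raise with custodians or IT before ingestion, rather than discovering them as processing failures later.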

Inefficiency #5: Not having a plan for exception files

When files are being processed, and an exception issue is encountered by the processing system (e.g., a file is corrupted, encrypted, or password-protected), those files get marked as an “exception” and added to an exception list. If there are too many exceptions, or if there are certain types of significant files with exceptions (an entire PST file, for instance), then you will need to investigate those issues further, and where appropriate, take steps to resolve the exception issue.

Make sure you know what your vendor believes constitutes an exception. Ask them how they handle exceptions generally, whether there are any particular file types (such as PST files) where exceptions are always addressed and resolved, and what other steps they take to resolve such exceptions. Not all exceptions need handling, but it’s important to understand how exceptions will be handled and adjust the process from the outset. The last thing you want to find out is that there were exceptions (like a PST full of 50,000 emails) that no one addressed until late in the process – or worse – after discovery closed.
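To make that conversation concrete, here is a minimal Python sketch of an exception report. The input format (a list of path and reason pairs), the function name, and the `ALWAYS_RESOLVE` list are all assumptions made for illustration; your vendor's exception log and resolution policy will look different.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical policy: container files that should always be investigated,
# because one failed container can hide thousands of documents.
ALWAYS_RESOLVE = {".pst", ".ost", ".zip"}


def summarize_exceptions(exceptions):
    """Summarize an exception list.

    exceptions: list of (path, reason) tuples (hypothetical log format).
    Returns (counts, must_resolve): a Counter of exceptions per reason and
    the paths whose file type is on the ALWAYS_RESOLVE list.
    """
    counts = Counter(reason for _, reason in exceptions)
    must_resolve = [
        path
        for path, _ in exceptions
        if PurePosixPath(path).suffix.lower() in ALWAYS_RESOLVE
    ]
    return counts, must_resolve


# Hypothetical exception log, for illustration only.
log = [
    ("custodian_a/mail/archive.pst", "corrupted"),
    ("custodian_a/docs/budget.xlsx", "password-protected"),
    ("custodian_b/docs/plan.docx", "password-protected"),
]
counts, must_resolve = summarize_exceptions(log)
print(counts.most_common())
print(must_resolve)
```

Reviewing a summary like this at the start of processing, rather than at the end, is what keeps a failed PST from surfacing after discovery has closed.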


ESI processing may seem simple on the surface, but there is a highly sophisticated technical and business process under the hood. There are careful considerations to contemplate before allowing your corporate data to traverse a vendor’s eDiscovery processing engine. eDiscovery processing is complicated, especially in a world where data types and applications are growing at astronomical speeds. To do eDiscovery data processing correctly and accurately is not an easy feat unless your vendor has the experience and the mature, tested methods and technology to handle it.

ESI processing is an incredibly important but often overlooked part of the overall workflow, so using the right eDiscovery service provider is critical to success. BIA has been helping clients with efficient, defensible, and cost-effective data processing and eDiscovery services for two decades. For help identifying and getting rid of the inefficiencies in your data processing workflow, or anywhere in your eDiscovery process, we invite you to reach out today.