
TAR’d and Feathered

TAR and attorney document review

A straightforward guide to best practices, illustrated by two recent cases.

Technology-assisted review (TAR) is here to stay. It has made review more efficient, and it can significantly lower legal spend. Of course, as with any tool, it can become a problem if it is not wielded properly.

Here are two recent cases involving TAR – one where it was used correctly, and one where it was not.

TAR Case 1

The documents: 115,645 documents to be reviewed, after the initial culling of a larger data set.

The challenge: 1 attorney, 6 weeks.

The technology: There are two different technology approaches to TAR. One (TAR 1.0) involves reviewing a sample set of documents, feeding the results back into the machine, reviewing a second set, putting those results back into the machine, etc. This iterative statistical sampling process is done until the machine is trained on what documents are relevant. The technology can then categorize the entire data population based on the characteristics of the trained subset.

The newer TAR technology (TAR 2.0) actively learns as reviewers tag and code documents. So, there is no training of the system, per se; instead, it learns as it goes, and it updates the ranking of the population based on the most recent documents coded.  
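The "learns as it goes" loop can be sketched in a few lines of Python. This is a toy illustration only, not how any particular TAR product works: the word-count scoring, the stopping rule, the function names and the sample documents are all assumptions made up for the example.

```python
from collections import Counter


def score(doc_words, weights):
    """Sum the learned weights of the words in a document."""
    return sum(weights[w] for w in doc_words)


def continuous_active_learning(docs, review, seed_ids, batch_size=2):
    """Toy TAR 2.0 loop: review a batch, update the model, re-rank, repeat.

    docs     -- dict of doc_id -> set of words (hypothetical toy data)
    review   -- callable doc_id -> bool, standing in for the attorney's coding
    seed_ids -- the documents the reviewer codes first
    """
    weights = Counter()  # word -> (times seen relevant - times seen non-relevant)
    coded = {}           # doc_id -> reviewer decision
    batch = list(seed_ids)
    while batch:
        found_relevant = False
        for doc_id in batch:
            relevant = review(doc_id)
            coded[doc_id] = relevant
            found_relevant = found_relevant or relevant
            # Every coding decision immediately updates the model.
            for word in docs[doc_id]:
                weights[word] += 1 if relevant else -1
        if not found_relevant:
            break  # assumed stopping rule: a whole batch with nothing relevant
        # Re-rank the uncoded population using the latest decisions.
        remaining = [d for d in docs if d not in coded]
        remaining.sort(key=lambda d: score(docs[d], weights), reverse=True)
        batch = remaining[:batch_size]
    return coded


# Hypothetical eight-document population; documents 1, 3 and 4 are relevant.
docs = {
    1: {"merger", "price", "board"},
    2: {"lunch", "menu"},
    3: {"merger", "vote"},
    4: {"price", "board", "memo"},
    5: {"holiday", "party"},
    6: {"lunch", "party"},
    7: {"menu", "holiday"},
    8: {"party", "menu"},
}
relevant_ids = {1, 3, 4}
coded = continuous_active_learning(docs, lambda d: d in relevant_ids, seed_ids=[1, 2])
found = {d for d, is_relevant in coded.items() if is_relevant}
```

In this toy run the loop surfaces all three relevant documents and stops after six of the eight have been coded, because the ranking pushes likely-relevant documents to the front of the queue. Real systems use far more robust models and statistically grounded stopping criteria.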

In this case, TAR 1.0 was used.

The result: To train the TAR system, the attorney reviewed 4,444 of the 115,645 documents. To ensure the training was effective, the results were compared to a random control set, which is a standard way of measuring both precision and recall during the review. There was an approximate 50% richness rate, meaning about half the documents in the population were responsive or relevant to the matter.

Through the sample rounds, the attorney was able to train the system to reach 88% precision, meaning that 88% of the documents the system flagged as responsive actually were responsive, and 80% recall, meaning that the system found 80% of the responsive documents. The numbers held even though, during the review, the attorney decided to change how some documents were tagged, which required retraining the system.
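For readers who want to see how those rates are derived from a control set, here is a minimal Python sketch. The control set size and the four counts are hypothetical: the article does not report them, and they were chosen only so that the formulas reproduce the 88% precision, 80% recall and roughly 50% richness described above.

```python
def control_set_metrics(tp, fp, fn, tn):
    """Compute review metrics from control-set counts.

    tp -- flagged responsive by the system, and responsive per the attorney
    fp -- flagged responsive by the system, but not responsive
    fn -- not flagged by the system, but actually responsive
    tn -- not flagged by the system, and not responsive
    """
    total = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp),     # how accurate the flagged set is
        "recall": tp / (tp + fn),        # how complete the flagged set is
        "richness": (tp + fn) / total,   # share of responsive documents overall
    }


# Hypothetical 550-document control set (counts are NOT from the case).
metrics = control_set_metrics(tp=220, fp=30, fn=55, tn=245)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Because the control set is drawn at random, the rates measured on it serve as estimates for the full population.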

Precision and recall figures were shared with opposing counsel, who reciprocated by providing their figures. This is not a required step in the process, but when both sides share their processes and results, it builds trust in both the process and the technology.

The costs: With an average review speed of 75 documents per hour, the attorney spent approximately 60 hours reviewing the documents. At $400 per hour, that resulted in approximately $24,000 in review costs. Support time for the review involved approximately 20 hours at $300 per hour, for an additional $6,000, plus the TAR technology cost of about $3,000. In total, the cost for the entire document review process incurred by our client was around $33,000.

Comparatively, if the documents had been reviewed by a traditional contract attorney review team, the cost would have been much higher. Assuming a review speed of 50 documents per hour, billing at $60 per hour, the cost would be around $138,000, not including setup time, overtime or time for an associate or partner to oversee and finalize the review. Including the additional associate time, traditional contract attorney review may have cost as much as $150,000.
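The arithmetic behind both estimates can be checked with a short Python sketch. The document counts, review speeds and rates come from the figures above; the function and variable names are just illustrative.

```python
def review_cost(documents, docs_per_hour, hourly_rate):
    """Return (hours, cost) for reviewing a set of documents."""
    hours = documents / docs_per_hour
    return hours, hours * hourly_rate


# TAR approach: the attorney reviews only the 4,444 training documents.
tar_hours, tar_review_cost = review_cost(4_444, docs_per_hour=75, hourly_rate=400)
support_cost = 20 * 300      # 20 support hours at $300 per hour
technology_cost = 3_000      # approximate TAR technology cost
tar_total = tar_review_cost + support_cost + technology_cost

# Traditional approach: contract attorneys review all 115,645 documents.
manual_hours, manual_cost = review_cost(115_645, docs_per_hour=50, hourly_rate=60)

print(f"TAR: {tar_hours:.1f} attorney hours, ${tar_total:,.0f} total")
print(f"Contract review: {manual_hours:,.1f} hours, ${manual_cost:,.0f}")
```

Without rounding, the TAR total comes to about $32,700 (59.25 attorney hours) and the contract review estimate to about $138,800 (roughly 2,313 hours), in line with the rounded figures quoted above.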

Thus, based on the figures above, our client saved between roughly $105,000 and $117,000 by employing a technology-assisted review approach.

TAR Case 2

The documents: A large data set was culled to 30,000 documents using keyword terms that had been agreed upon by both parties. Opposing counsel handled the review and, despite a very high perceived richness rate, implying that most of the 30,000 documents were responsive, initially produced only 12,000 documents. After repeated requests for more information about the non-produced files, opposing counsel supplemented the production several months later with an additional 5,000 documents.

The challenge: Lack of communication between the parties translated to lack of trust in the process.

The technology: This case employed active learning technology, or TAR 2.0, which uses machine learning to accelerate review and increase productivity. The dilemma was that opposing counsel failed to provide notice that a TAR process would be used on their data. While attorneys are not required to disclose the use of TAR technology, in this case, it caused confusion and skepticism.

The result: Based on the initial culling, the plaintiff had expected to see more of the 30,000 documents. The assumption had been that a review team would remove any non-relevant documents and produce the rest. When only 12,000 were provided – without any explanation of how or why the data was reduced or why some files were marked as non-responsive – the plaintiff wasn’t sure if the files produced were accurate to the request for production or if some were incorrectly withheld.

It wasn’t until the parties were in the courtroom that the plaintiff learned that a TAR process had been used. It had not been discussed beforehand, nor was it something that the plaintiff had agreed to. Even the judge was surprised at the use of TAR, since there was a relatively small number of documents to review.

A request was made to the opposing party to validate their TAR process. After reviewing the metrics, the validation was considered acceptable; however, there was suspicion that the TAR process had not been properly trained. The initial culling involved highly distinctive search terms, including atypical name spellings, and it seemed to show that almost all of the 30,000 documents were responsive and should have been produced. A motion to compel was filed to review some of the data that opposing counsel had marked as non-responsive.

It also came out that the defense had withheld what they deemed to be potentially privileged documents without notice, though they later produced the data – approximately 5,000 documents – when questioned, albeit after the production deadline.

The costs: The plaintiff took advantage of BIA’s eDiscovery Consulting and Advisory Services, and our expert team was able to jump in and help educate the plaintiff on best practices for using TAR, how to discuss it in court, and other tactics that helped defuse the situation.

Regardless, the total costs added up quickly. Because of the lack of communication between the parties, both clients were required to spend additional money retaining advisors to attend meet and confers; to create and respond to the motions to compel; to test the system and to engage tech support; as well as to gain knowledge about what the process should be, how to discuss it competently and what could have been done differently.

For this matter, because TAR wasn’t deployed successfully, a traditional manual linear review by contract attorneys would have been a more cost-effective and defensible approach, and it would have yielded more confidence in the results.

Using TAR doesn’t have to mean getting covered in sludge. An ESI order could explain what the process will be, and it should be agreed upon by both parties – before it’s filed with the court. When both parties are transparent about the TAR process and cooperate with each other, as in Case 1, there are identifiable savings. When parties aren’t transparent, the result is unnecessary additional expense and second-guessing, as in Case 2.

The industry doesn’t expect 100% accuracy and 100% recall, nor should it. Studies, like this one that examined documents from the TREC 2009 Legal Track, have shown that even data reviewed by humans carries a certain amount of inconsistency. With many vendors moving to continuous active learning technologies, or TAR 2.0, the TAR process is validated differently. (This is because validation for TAR 1.0 demonstrates that the system was reasonably trained, in contrast with continuous active learning, where the system is constantly being trained until the metrics indicate sufficient completion.) However, anyone using a TAR process should always be prepared to provide validation of the process when questioned. Those who don’t cooperate will incur increased costs with “discovery on discovery.”

Ideally, clients want to have “the feather” when deploying TAR solutions – meaning they want the technology to allow for “lighter” costs and smaller legal teams with subject-matter experts who understand how the technology operates. They want their cases to move as swiftly and freely as possible. But the feather performs well only until it gets stuck in the mud, at which point the cost of the review may surpass the cost of a traditional document review approach.

Contact us to learn more about TAR and which method best fits your legal needs.