Do you really need that full forensic image, or will a targeted data collection get the job done?
Full forensic imaging and targeted data collections may sound similar and even get tossed around interchangeably at times, but they are very different operations. We see countless hours and millions of dollars wasted when clients choose full forensic imaging but a targeted data collection is all they really need. Understanding some important differences between the two data collection types will help you make the right choice for your case and your wallet.
What is a full forensic image?
A full forensic image is a bit-for-bit copy of an entire data storage device, such as a hard disk drive, solid state drive, or USB thumb drive. A forensic image captures not only the “live” data but also deleted or potentially recoverable data. Depending on how the image is created, this may also include unallocated and free space. Think of it like copying a notebook, but instead of just copying the pages with writing, you capture the covers, the binding, the empty pages, the inside covers: absolutely everything. Sounds like a waste of time and effort, right? In most instances, full forensic imaging is just that.
While full forensic imaging might be useful in certain circumstances (more on that below), in most civil litigation it is the definition of overkill and a poster child for over-preservation. Using full forensic imaging instead of targeted collections can increase your collected data size by a factor of 5, 10, or more. As a general rule, every step of the Electronic Discovery Reference Model (EDRM) costs more when you collect more data. Thus, by over-collecting you’re increasing the costs of every other step in the EDRM.
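To make "bit-for-bit" concrete, here is a minimal Python sketch of the idea: read a source from the first byte to the last, write every byte to an image file, and compare hashes to confirm the copy is identical. The function names, the chunk size, and the use of SHA-256 are assumptions made for illustration and do not come from any particular forensic tool. Real imaging is done with dedicated forensic software and hardware, so treat this only as a picture of why an image ends up as large as the device itself.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time (arbitrary choice for this sketch)


def image_source(source_path, image_path, chunk_size=CHUNK_SIZE):
    """Copy every byte of source_path to image_path.

    Returns (bytes_copied, sha256_hex_of_source). Nothing is skipped:
    empty regions are copied exactly like regions that hold files.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of source reached
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()


def sha256_of(path, chunk_size=CHUNK_SIZE):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling image_source on a source and then comparing its returned hash with sha256_of for the image file shows whether the two match byte for byte. Note that the copy is as large as the source no matter how little of it holds user files, which is where the extra data volume comes from.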
When is a full forensic image needed?
Full forensic imaging is for when you need to know exactly what a user did on a given device, and how, when and where they did it, or when you suspect that an individual actively hid or attempted to destroy data. In short, it is most useful in a forensic investigation, that is, when you’re trying not only to preserve data but also to examine a user’s behavior or establish their intent.
A common example is an IP theft case. Sure, you’ll want to see all user data, emails, etc. on the individual’s devices. But you’ll also want to see what else the user did on the computer, such as:
- What other resources (network shares, collaboration systems or other such resources) were accessed?
- What devices were connected to the computer? (Did the user connect a USB drive to steal intellectual property?)
- What files were accessed, copied, or transferred using a given device (even if those files no longer—or never did—reside on that device)?
- What data did the user delete—or attempt to delete?
In short, you need a full forensic image if you’re investigating exactly what an individual did, reconstructing their daily activities going back days, weeks, months, or even years. However, if all you need is to collect data for review and production in response to a document request or subpoena, you almost never need that depth of investigation. A targeted data collection is a much better (faster and cheaper) tool for that job.
What is a targeted data collection?
A targeted data collection is precisely that—a collection of specific, targeted information related to a case. On a custodian’s computer, that generally means their email and user-created files like Word documents, spreadsheets, PDF files, etc. It specifically excludes system files, log files, program files, and other items rarely needed in litigation.
For example, data required for a case might get pulled from specific folders on a computer, a flash drive, a network share, or a specific date range of emails and files. A targeted data collection only collects active (non-deleted) files that you see in those folders or directories. It does not collect any deleted information or forensic artifacts, though it can include data from a “trash” bin in your email or on your computer (that data isn’t really deleted yet, after all).
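As a rough illustration of the selection logic described above, the following Python sketch walks a list of folders and keeps only active files of chosen types whose last-modified time falls inside a date range. The function name, the extension list, and the use of modification time as the date criterion are all assumptions made for this example. It covers loose files only (not mailboxes), and it only builds the list of files; it does not copy anything or preserve metadata the way a defensible collection tool would.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical set of "user-created" file types; adjust per matter.
USER_FILE_TYPES = {".docx", ".xlsx", ".pdf", ".pptx", ".msg"}


def select_files(folders, start, end, extensions=USER_FILE_TYPES):
    """Return active files under `folders` that match type and date criteria.

    Only files visible in the directory tree are considered, so deleted
    data and forensic artifacts are never part of the result.
    """
    selected = []
    for folder in folders:
        for path in Path(folder).rglob("*"):
            if not path.is_file():
                continue  # skip directories
            if path.suffix.lower() not in extensions:
                continue  # skip system, program, and other excluded file types
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            if start <= modified <= end:
                selected.append(path)
    return sorted(selected)
```

For instance, passing a custodian's documents folder with a start of datetime(2023, 1, 1) and an end of datetime(2023, 12, 31) would list only the matching documents last modified in 2023, leaving everything else on the drive untouched.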
When is targeted data collection appropriate?
Targeted data collections have become increasingly prevalent in modern eDiscovery, and they are now the norm for many companies, law firms and service providers. Unless you suspect attempted data destruction, or it’s a criminal matter where you need answers that only a forensic investigation can provide (as discussed above), targeted data collections will meet your data preservation and collection needs most of the time.
Given the unbridled, exponential growth of data, controlling how much data you collect is the single most important thing you can do to control costs. Targeted data collections are the easiest and most defensible way to avoid collecting unnecessary data, and thus your greatest weapon for controlling costs across the entire workflow.
Simply put, targeted data collection is appropriate in nearly every civil litigation or regulatory matter and should be your default, with full forensic imaging done only when clear necessity is demonstrated.
How do I choose between a full forensic image and a targeted data collection?
The type of case determines the type of collection needed. A true forensic investigation—where you’re not just trying to locate standard user data, but truly investigating everything an individual did on or with a computer or other device—usually requires a full forensic image. If that’s not the case, and all you need is the data, then a targeted data collection should suffice.
Why does that choice matter?
Time and money! We often hear: “Can’t I just image the whole computer and go back to cherry-pick later?” You certainly can, but over-collecting costs you much more time and saddles you with the significantly increased cost of sifting through mountains of irrelevant data that you will never need. Sure, maybe you can eliminate a lot of the excess relatively quickly, but that process still costs time and money, so why even set yourself up for that kind of waste?
Think about it in terms of paper files for litigation in days of yore. If you wanted to collect documents for a case, you wouldn’t walk into a custodian’s office and grab every piece of paper in every filing cabinet, every folder in every Bankers box, every framed document hanging on the wall, every Post-It note in the wastebasket, and every strip of paper in the shredder. Unless you were the FBI conducting a raid, that type of collection approach is clearly unnecessary and arguably absurd. Instead, you would work with the custodian to identify potentially responsive material and be highly selective about which paper documents to collect, review, and produce for discovery.
The same goes in our digital world of litigation. In the vast majority of civil litigations, it is not necessary to perform full forensic imaging of a custodian’s entire hard drive. Unless there is a compelling reason (e.g., potential criminality or suspected improper deletion of files), you are better off doing a targeted data collection of a custodian’s active data.
What about mobile devices?
You actually cannot image many mobile devices. Apple has had encryption in place for the last decade that makes it impossible to fully image iPhones. (When dealing with iPhones, our forensic team calls it a “preservation” as opposed to an “image”.)
It is possible to image parts of an Android device, like the SD card. There are specific techniques that allow imaging of other parts of Android phones, but these require rooting the device (the Android counterpart of jailbreaking) and other tactics, many of which are used only by law enforcement. (Here is a deeper dive into remote cell phone collections.)
So, even with mobile devices, targeted collections should be your default workflow, with full forensic imaging reserved for a limited number of high-need cases.
Ask the Data Collection Experts.
Still not sure which collection type is right for your case? BIA can help. For two decades, our Digital Forensics team has been helping clients identify, collect, and preserve data for litigation. Reach out to us about your next project, whether you need a full forensic image, a targeted data collection, or help choosing between the two.