A checksum is a mathematical calculation or algorithm used to compute a hash value, which serves as an effectively unique identifier for electronically stored information (ESI). The checksum is calculated so that any change or error introduced into the original file during transmission or other data processing can be detected by recomputing the value and comparing it with the original. Checksums are designed so that even a small change in the input file produces a different hash value, or digest. Checksums are used throughout computing, and each one is designed for the specific application in which it is used.
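As a rough illustration of that sensitivity to change, the sketch below hashes two byte strings that differ by a single character. It assumes Python's standard hashlib module and SHA-1; the helper name digest and the sample text are hypothetical and chosen only for the example.

```python
import hashlib


def digest(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of data under the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


# Two inputs that differ by one character ("dog" vs. "cog").
original = b"The quick brown fox jumps over the lazy dog"
altered = b"The quick brown fox jumps over the lazy cog"

original_digest = digest(original)
altered_digest = digest(altered)

print(original_digest)
print(altered_digest)
print("match" if original_digest == altered_digest else "changed")  # changed
```

Recomputing the digest of a received file and comparing it with the digest recorded before transfer is the same comparison applied to whole files.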
How are Checksums related to eDiscovery?
In eDiscovery, several checksums, including MD5 and SHA-1, have become standard for identifying duplicate documents such as spreadsheets, word processing files, and emails. These differ from the checksums used in file transfer processes, such as CRC (cyclic redundancy check).
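A minimal sketch of hash-based duplicate identification follows, assuming the document contents are already available as bytes. The function name find_duplicates, the file names, and the contents are hypothetical; real eDiscovery platforms may normalize files or hash selected fields (for email, for example) before comparing. The last lines compute a CRC-32 with the standard zlib module for contrast: it is a 32-bit error-detection code, much shorter than a 128-bit MD5 or 160-bit SHA-1 digest.

```python
import hashlib
import zlib
from collections import defaultdict


def find_duplicates(documents: dict[str, bytes], algorithm: str = "sha1") -> list[list[str]]:
    """Group document names by content digest and return groups with more than one member."""
    groups = defaultdict(list)
    for name, content in documents.items():
        groups[hashlib.new(algorithm, content).hexdigest()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical collection: two byte-identical spreadsheets and one memo.
documents = {
    "budget_v1.xlsx": b"Q1 totals: 100, 200, 300",
    "budget_copy.xlsx": b"Q1 totals: 100, 200, 300",
    "memo.docx": b"Please review the attached budget.",
}

duplicates = find_duplicates(documents)
print(duplicates)  # [['budget_copy.xlsx', 'budget_v1.xlsx']]

# For contrast: a CRC-32 value is an unsigned 32-bit integer.
crc = zlib.crc32(documents["memo.docx"])
print(f"{crc:08x}")
```

Grouping by digest means each document is hashed once, and files with identical content land in the same group regardless of file name.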