Navigating eDiscovery and EDD Processing Techniques
For the uninitiated, setting off into the dark waters of Electronic Data Discovery (EDD) can be an intimidating proposition. EDD processing involves the conversion of potentially relevant electronically stored material into a format that makes it searchable, user-friendly and ultimately producible to requesting parties. The types of data processed through eDiscovery differ depending on the case, but most commonly include email stores, office documents and image files. While every case and its data present unique challenges, adhering to a consistent framework for transforming raw source data into usable work product is essential to success.
It all starts with developing the right game plan. Most matters involving the need for EDD services start with a “meet and confer” where the parties discuss the matter at hand, identify key custodians of pertinent data, discuss search terms, etc. From there, relevant custodians are contacted and their data is identified and collected. Potential spoliation of data is a major concern and all data collection activities must be legally defensible. While simple collection procedures are generally acceptable, always find out beforehand whether forensic collection is required and whether detailed chain-of-custody documentation must be maintained to attest to the integrity of the data.
Once the data has been collected, the initial stage of EDD processing consists of loading it into EDD software to create a workable database for your collection. Three components of the loading process are critical to successful processing: file parsing, metadata extraction and text extraction. File parsing is the method by which individual documents are recognized by type and separated for individual review within your database. During parsing, attachments to emails and embedded files are extracted from their host documents. Each file also receives a “hash” value, a digital fingerprint computed from its contents, which can be used later in the process for de-duplication.
Metadata extraction pulls relevant data such as date and time stamps, author information, email conversation IDs etc. into a searchable database. Finally, text extraction pulls searchable text from the files.
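To make the three loading components concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not how any particular EDD product works: the function name `process_email` and the record layout are hypothetical, and hashing the raw message bytes is a simplification of how a real tool might fingerprint an email.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def sha256_hex(data: bytes) -> str:
    """Return a content-based fingerprint for a blob of bytes."""
    return hashlib.sha256(data).hexdigest()


def process_email(raw: bytes) -> dict:
    """Parse one raw email into a record (hypothetical layout)."""
    msg = message_from_bytes(raw, policy=policy.default)

    # File parsing: separate attachments from their host document.
    attachments = []
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        attachments.append({
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "hash": sha256_hex(payload),
        })

    # Text extraction: pull the plain-text body, if there is one.
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""

    # Metadata extraction: collect a few header fields for searching.
    metadata = {
        "from": str(msg.get("From", "")),
        "to": str(msg.get("To", "")),
        "date": str(msg.get("Date", "")),
        "subject": str(msg.get("Subject", "")),
    }

    return {
        "hash": sha256_hex(raw),  # simplification: hash of the raw bytes
        "metadata": metadata,
        "text": text,
        "attachments": attachments,
    }


# Build a small sample message to run through the sketch.
sample = EmailMessage()
sample["From"] = "a@example.com"
sample["To"] = "b@example.com"
sample["Subject"] = "Q3 report"
sample.set_content("See attached.")
sample.add_attachment(b"hello", maintype="application",
                      subtype="octet-stream", filename="report.bin")

record = process_email(sample.as_bytes())
```

The resulting `record` holds one hash for the message, one per attachment, the searchable header fields and the body text, which is roughly the shape of data the later steps operate on.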
With these three pieces of the EDD puzzle in place, several secondary steps at this juncture can significantly simplify and speed your review. DeNISTing identifies known system and application files, typically by comparing hash values against the reference list of known software maintained by the National Institute of Standards and Technology (NIST), and removes them from your data set. De-duplication uses the “hash” values created during the load process to systematically identify and remove unwanted duplicates.
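Both of these steps come down to comparing hash values, which the following Python sketch illustrates. The file contents, the `known_system_hashes` set and the function names are made up for the example; a real workflow would load known hashes from a published reference list rather than compute them inline.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return a content-based fingerprint for a blob of bytes."""
    return hashlib.sha256(data).hexdigest()


def denist(docs, known_system_hashes):
    """Drop documents whose hash appears in a set of known system files."""
    return [doc for doc in docs if doc["hash"] not in known_system_hashes]


def deduplicate(docs):
    """Keep the first document seen for each hash, preserving order."""
    seen = set()
    unique = []
    for doc in docs:
        if doc["hash"] in seen:
            continue
        seen.add(doc["hash"])
        unique.append(doc)
    return unique


# Hypothetical collection: two identical reports, one system file, one memo.
files = {
    "custodian_a/report.docx": b"quarterly report",
    "custodian_b/report_copy.docx": b"quarterly report",
    "custodian_a/notepad.exe": b"fake system binary",
    "custodian_b/memo.txt": b"lunch on friday",
}
docs = [{"path": path, "hash": sha256_hex(content)}
        for path, content in files.items()]

# Assumption: stands in for a reference list of known software hashes.
known_system_hashes = {sha256_hex(b"fake system binary")}

remaining = deduplicate(denist(docs, known_system_hashes))
```

Here `remaining` keeps one copy of the report plus the memo. Which copy of a duplicate to keep (for example, per custodian or across the whole collection) is a policy decision this sketch does not address.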
Filtering applies criteria such as pertinent date ranges and the exclusion of non-relevant file types. Searching against the text derived from the text extraction process narrows the field even further. By employing all of these data mining strategies made possible through eDiscovery services, some data sets have been reduced by over 90%, substantially reducing image processing needs, review time and, ultimately, costs.
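As a rough illustration of filtering and searching, the Python sketch below applies a date range, a file-type exclusion and a keyword search to a tiny, made-up document list. The field names, the excluded extensions and the case-insensitive substring match are assumptions for the example; real review platforms offer far richer query syntax.

```python
from datetime import date


def filter_docs(docs, start, end, excluded_extensions):
    """Keep documents dated within [start, end] and not of an excluded type."""
    kept = []
    for doc in docs:
        name = doc["name"]
        ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if start <= doc["date"] <= end and ext not in excluded_extensions:
            kept.append(doc)
    return kept


def search(docs, terms):
    """Keep documents whose extracted text contains any term (case-insensitive)."""
    lowered = [term.lower() for term in terms]
    return [doc for doc in docs
            if any(term in doc["text"].lower() for term in lowered)]


# Hypothetical records, as they might look after text extraction.
docs = [
    {"name": "contract.docx", "date": date(2019, 3, 14), "text": "Draft merger agreement"},
    {"name": "logo.png", "date": date(2019, 4, 1), "text": ""},
    {"name": "old_memo.txt", "date": date(2015, 1, 5), "text": "Merger rumors"},
    {"name": "lunch.msg", "date": date(2019, 5, 2), "text": "Pizza on Friday?"},
]

in_scope = filter_docs(docs, date(2019, 1, 1), date(2019, 12, 31), {"png", "jpg"})
hits = search(in_scope, ["merger"])

# Fraction of the original set removed before review.
reduction = 1 - len(hits) / len(docs)
```

In this toy set, one of four documents survives both steps. Running the cheap filters before the text search keeps the search working on a smaller set.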
At this point, what has been identified as potentially relevant data can be hosted and reviewed natively for responsiveness. Selected files may be converted to image form so they may be redacted for privileged material. As most productions in litigation matters are ultimately delivered in image or paper form, native files are converted to image files and assigned Bates numbers before being turned over to the opposing party.
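Bates numbering itself is simple sequential labeling, as the short Python sketch below shows. The prefix `ABC`, the seven-digit width and the one-number-per-page scheme are assumptions for illustration; the actual format is whatever the parties or the court specify.

```python
def assign_bates(docs, prefix, start=1, width=7):
    """Assign a contiguous Bates range to each (name, page_count) pair."""
    assigned = []
    next_number = start
    for name, page_count in docs:
        if page_count < 1:
            raise ValueError(f"{name}: page_count must be at least 1")
        first = next_number
        last = next_number + page_count - 1
        assigned.append({
            "name": name,
            "begin": f"{prefix}{first:0{width}d}",
            "end": f"{prefix}{last:0{width}d}",
        })
        next_number = last + 1  # the next document continues the sequence
    return assigned


# Hypothetical production: a three-page contract and a one-page memo.
production = assign_bates([("contract.tif", 3), ("memo.tif", 1)], prefix="ABC")
```

Each document gets a begin and end number, and the sequence never restarts, so any produced page can be cited unambiguously.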
The long voyage from raw data to production is at an end. eDiscovery needn’t be a harrowing adventure provided you remember a few simple rules.
- Formulate a Plan – Identify key players, and map out a path to success
- Know Your Data – only process and review what is necessary
- Go Methodically – re-processing data is costly and error-prone. Measure twice, cut once