Wednesday, August 5, 2009

Doc Review Metamorphosis

Most people today communicate via email far more than they do via telephone (which in turn people use far more than letter-writing). It wasn’t always this way, of course, but most people do not realize that email has been around in one form or another since 1984! It took effectively 25 years for email to come to dominate as the preferred communication method, particularly for business.

How long will the Doc Review industry take to evolve to true non-linear[1] activity? Probably less time than you think. For comparison, the transition from letters to phone dominance took roughly 60 years (1900-1960). The transition from phone to email took 25. And now, as we evolve from relatively static email to far more dynamic Twitter, texting and live data/video feeds, the transition will be shorter still, generally expected to take 5-7 years. As a society we are becoming faster at adapting to change. The most sophisticated review teams are just now experimenting with non-linear review. The rest will be there within 3-4 years.

Already we see signs of the evolution to non-linear document review. Most people are at least aware of the simplest form of non-linear review: near duplicate detection. When near and exact duplicates are grouped together into “Dupe Groups,” the very first level of non-linear review is taking place. With Near Dupe, savvy reviewers look at a group of documents together, rather than one by one. Documents are usually grouped by content or attribute similarity, with a group “captain” embodying either the fullest set of content or a logical start and end point for the logic chain that ties the documents together.
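To make the Dupe Group idea concrete, here is a minimal sketch of one way such grouping could work. It is an illustration only, not how any particular review platform does it: the word-shingle comparison, the Jaccard similarity measure, the 0.5 threshold, and the function names (`shingles`, `jaccard`, `dupe_groups`) are all assumptions chosen for the example. The captain is simply the longest document in each group, standing in for "the fullest set of content."

```python
from itertools import combinations


def shingles(text, k=3):
    """Break a document into overlapping runs of k words (assumed comparison unit)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Share of shingles two documents have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dupe_groups(docs, threshold=0.5):
    """docs maps doc id -> text. Returns a sorted list of (captain_id, member_ids).

    The threshold is a hypothetical cutoff; real tools tune this value.
    """
    parent = {doc_id: doc_id for doc_id in docs}

    def find(x):
        # Follow links up to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    for a, b in combinations(docs, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for doc_id in docs:
        groups.setdefault(find(doc_id), []).append(doc_id)

    result = []
    for members in groups.values():
        # Captain: the longest document, as a stand-in for "fullest content".
        captain = max(members, key=lambda d: (len(docs[d]), d))
        result.append((captain, sorted(members)))
    return sorted(result)


docs = {
    "A": "the quarterly report is attached for your review",
    "B": "the quarterly report is attached for your review and comment",
    "C": "lunch on friday at noon works for me",
}
groups = dupe_groups(docs)
```

With these three sample documents, A and B land in one group with B as captain, and C sits alone. A reviewer would then code the A/B group in a single pass instead of opening each document separately.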

A similar technique is called Email Thread Grouping (ETG), where pieces (“stringlets”) of email conversation threads scattered across pockets of document storage are brought together logically. Because typical ESI collection involves documents from several custodian sources, who often have important communication relationships with one another, the incidence rate of disassociated email conversations is extremely high. ETG brings together groups of documents from Inboxes, Outboxes, folders, different email storage systems and even different collection sites! By grouping the conversations together from all custodial sources, the review has taken a second step toward non-linear review.
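As a rough illustration of how stringlets from different custodians might be stitched back together, the sketch below follows each message's reply link up to the message that started the conversation. It assumes every collected message carries a message ID and, for replies, the ID of the message it answers (email headers such as Message-ID and In-Reply-To carry this information). The dictionary layout and the `thread_groups` function are hypothetical, and real ETG tools may well rely on other signals, such as subject lines or quoted text.

```python
from collections import defaultdict


def thread_groups(messages):
    """Group messages into conversations, regardless of which custodian held them.

    Each message is a dict with 'message_id', 'in_reply_to' (or None) and
    'custodian'. This layout is an assumption made for the example.
    Returns {root_message_id: sorted list of (message_id, custodian)}.
    """
    parent_of = {}
    for m in messages:
        mid = m["message_id"]
        # The same message may be collected from several custodians;
        # keep whichever copy tells us what it was replying to.
        if parent_of.get(mid) is None:
            parent_of[mid] = m.get("in_reply_to")

    def root(mid):
        seen = set()  # guards against malformed, circular reply chains
        while parent_of.get(mid) is not None and mid not in seen:
            seen.add(mid)
            mid = parent_of[mid]
        return mid

    threads = defaultdict(set)
    for m in messages:
        threads[root(m["message_id"])].add((m["message_id"], m["custodian"]))
    return {r: sorted(members) for r, members in threads.items()}


messages = [
    {"message_id": "m1", "in_reply_to": None, "custodian": "alice"},
    {"message_id": "m2", "in_reply_to": "m1", "custodian": "bob"},
    {"message_id": "m3", "in_reply_to": "m2", "custodian": "alice"},
    {"message_id": "x1", "in_reply_to": None, "custodian": "carol"},
]
threads = thread_groups(messages)
```

Here the three messages held by two different custodians collapse into a single conversation rooted at m1, while the unrelated message x1 stays on its own. If the first message of a conversation was never collected, the replies still group together under its ID.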

Finally, the courts and the industry are waking up. It’s not going to be a “doc-by-doc” world much longer. Will you be ready? Where can you turn for smart, unbiased information?

The Electronic Discovery Institute is a non-profit organization dedicated to resolving electronic discovery challenges by conducting studies of litigation processes that incorporate modern technologies. They recently conducted a survey that broaches the subject of electronic document deduplication, the beginnings of non-linear review. Have a look and let us know what you think.


[1] Non-linear review is the concept of reviewing documents in bulk fashion, rather than one by one in sequential order.