We at Valora are occasionally called upon to evaluate the results of traditional document review (read: manual, doc-by-doc review efforts) as a sort of post-project audit. We use the usual precision and recall metrics: recall tells us how many of the documents that should have been tagged actually were, and precision tells us how many of the tags that were applied are correct. While both measures are extremely important, the truth is it’s really all about the recall. Let me explain.
If you are supposed to mark 1,000 documents as privileged, and your team only tags 200, does their precision on those 200 really matter? In layman’s terms: Do you really care how well those 200 docs scored on accuracy when the reviewers missed 800 documents in the first place?
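To put numbers on that question, here is a minimal Python sketch using the 1,000 and 200 figures from the example. The function name `precision_recall` and the two "how many tags were correct" scenarios (200 and 150) are assumptions made purely for illustration; they do not come from any real review.

```python
def precision_recall(true_positives, tagged, relevant):
    """Return (precision, recall) computed from raw document counts."""
    # Precision: share of the applied tags that are correct.
    precision = true_positives / tagged if tagged else 0.0
    # Recall: share of the documents that should be tagged that actually were.
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


# Figures from the example: 1,000 documents should be privileged, 200 were tagged.
RELEVANT = 1000
TAGGED = 200

# Best case (assumed): every one of the 200 tags is correct.
best_precision, best_recall = precision_recall(200, TAGGED, RELEVANT)

# Weaker case (assumed): only 150 of the 200 tags are correct.
weak_precision, weak_recall = precision_recall(150, TAGGED, RELEVANT)

print(f"best case: precision={best_precision:.0%}, recall={best_recall:.0%}")
print(f"weak case: precision={weak_precision:.0%}, recall={weak_recall:.0%}")
```

Even with perfect precision, recall cannot rise above 20% here, because 800 of the 1,000 privileged documents were never tagged.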
This is not to suggest that there isn’t a crucial role for precision scoring. There is! But not if you blow it on recall first. In the example above, if the review team had indeed found 1,000 documents and marked them privileged, you would surely wish to score the accuracy of those markings. But only if the recall fell within +/- 20% of the intended target. Well below (or above) this mark, precision is useless information. So, until we as an industry can demonstrate reliable recall on a regular basis, precision will just have to wait its turn.
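One way to read the 20% rule is as a simple gate that runs before any precision scoring. The Python sketch below assumes the rule compares the number of documents the team tagged against the number that should have been tagged; that reading, the function name `precision_worth_scoring`, and the sample counts are all illustrative assumptions, not a description of an actual audit procedure.

```python
def precision_worth_scoring(tagged_count, expected_count, tolerance=0.20):
    """Return True when the tagged count is within +/- tolerance of the target.

    Assumption: the target is the number of documents that should have been
    tagged, and the tolerance is a fraction of that target.
    """
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    # Relative distance between what was tagged and what should have been.
    deviation = abs(tagged_count - expected_count) / expected_count
    return deviation <= tolerance


# The 1,000-document example, with hypothetical tagged counts.
for tagged in (200, 950, 1000, 1500):
    verdict = "score precision" if precision_worth_scoring(tagged, 1000) else "fix recall first"
    print(f"{tagged:>5} tagged of 1,000 expected: {verdict}")
```

Under these assumptions, the team that tagged only 200 documents fails the gate, as does a team that tagged 1,500, while 950 and 1,000 both pass and move on to precision scoring.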