The use of analytics in eDiscovery is gaining momentum and gradually becoming a mainstay of the industry. Analytics allows you to investigate a data collection, separate the wheat from the chaff, and zero in on highly relevant data. The ultimate goal is to build the facts of a case quickly and cost-effectively by excluding irrelevant information, reducing the size of document collections, and grouping documents by relevance, issue, or similarity. Courts recognize that analytics tools save time and significantly reduce the cost of eDiscovery. Standard key features include:
Concept Clusters – Documents, emails, and other files are analyzed based on their text, and algorithms group them by conceptual similarity. Documents are grouped together when they are conceptually similar, even if the words they use are different.
Concept Search – Text is searched based on the association of words rather than on exact keywords. The results of a concept search are conceptually related to the text selected or the example document submitted, and the most relevant documents are returned first. Concept search is an extremely powerful tool for quickly learning about a topic and for uncovering relevant or underlying relationships that a keyword search cannot find.
Near Dupe – Deduplication removes documents that are 100% duplicative, but what happens when related documents are only 99% similar, or when there are various versions of the same document? Near Dupe (ND) detection identifies similar documents and groups them together. In near dupe detection, one document is selected as the “pivot” document of the group and the others in the group are then compared to it.
Email Threading – Also known as “Conversation Threading,” email threading identifies related subsets of emails, that is, emails that belong to the same thread, such as the replies to a message. Email threading gives you the option to suppress duplicates during review.
Predictive Coding (TAR) – In its simplest terms, predictive coding allows you to train the computer to find the documents you are looking for, such as documents responsive to a document request. Reviewers code a sample set of documents and submit them to the analytics tool, which then goes through the remaining documents, finds the ones that are conceptually similar, and categorizes or codes them accordingly. With predictive coding, you can reduce your review set of documents by as much as 90%.
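To make the pivot idea behind Near Dupe detection more concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular eDiscovery product works: commercial tools do not generally publish their algorithms, so this example substitutes a simple, well-known technique (Jaccard similarity over three-word "shingles"). The function names, the 0.8 threshold, and the choice of the longest remaining document as the pivot are all assumptions made for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) found in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap / size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_dupes(docs, threshold=0.8, k=3):
    """Group documents around pivot documents.

    docs maps a document ID to its text. Assumption for this sketch:
    the longest document not yet grouped becomes the next pivot, and
    every other ungrouped document is compared to that pivot only.
    """
    sets = {doc_id: shingles(text, k) for doc_id, text in docs.items()}
    remaining = sorted(docs, key=lambda d: len(docs[d]), reverse=True)
    groups = []
    while remaining:
        pivot = remaining.pop(0)
        members = [(pivot, 1.0)]  # the pivot is 100% similar to itself
        unmatched = []
        for doc_id in remaining:
            score = jaccard(sets[pivot], sets[doc_id])
            if score >= threshold:
                members.append((doc_id, score))
            else:
                unmatched.append(doc_id)
        remaining = unmatched
        groups.append({"pivot": pivot, "members": members})
    return groups


if __name__ == "__main__":
    sample = {
        "draft_v1": "the quarterly report shows revenue grew ten percent over last year in all regions",
        "draft_v2": "the quarterly report shows revenue grew ten percent over last year in all areas",
        "invite": "please bring snacks to the office party on friday afternoon",
    }
    for group in group_near_dupes(sample):
        print(group["pivot"], [(d, round(s, 2)) for d, s in group["members"]])
```

In this sketch the two report drafts land in one group, with the longer draft as the pivot, while the unrelated invitation forms a group of its own. Lowering the threshold pulls looser variations into a group; raising it keeps only the closest versions together.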