Text Mining

Text Mining for Clementine is an add-on product that uses linguistic methods to extract key concepts from text based on context. This information can be combined with existing structured data, such as demographics, and applied to modeling using Clementine's full suite of data mining tools to yield better and more focused decisions. A separate license is required.
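One way to picture combining extracted concepts with structured data is as a set of true/false fields added to each record. The sketch below is only an illustration in plain Python, not how Clementine performs the merge. The function add_concept_flags, the "id" key, and the "concept_" field prefix are all hypothetical names chosen for this example.

```python
def add_concept_flags(records, concepts_by_id, vocabulary):
    """Return copies of the records with one boolean field per concept.

    records        -- list of dicts holding structured data, each with an "id" key
    concepts_by_id -- dict mapping a record id to the set of concepts found in its text
    vocabulary     -- list of concepts to turn into fields
    """
    merged = []
    for record in records:
        # Records with no associated text simply get no concepts.
        present = concepts_by_id.get(record["id"], set())
        row = dict(record)  # copy so the input records are left unchanged
        for concept in vocabulary:
            row["concept_" + concept] = concept in present
        merged.append(row)
    return merged


# Hypothetical demographic records and concepts extracted from customer comments.
customers = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
extracted = {1: {"billing", "refund"}}
rows = add_concept_flags(customers, extracted, ["billing", "refund", "delivery"])
```

Each resulting row carries both the original demographic fields and the concept flags, so it can be passed to a modeling step like any other structured record.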

Text Mining for Clementine uses the technology underlying LexiQuest Mine to process unstructured data. Based on 24 years of research in computational linguistics, LexiQuest Mine uses Natural Language Processing (NLP) to automate the process of reading documents to uncover their content. For example, in a 30-page publication, only two or three sentences may be relevant for your purposes. Instead of reading the entire 30 pages, you can automatically discover the main concepts. In addition, the underlying proprietary lexicon dictionaries allow the automatic classification of these concepts. As a result, you can quickly determine the relevance of the information to your needs.

A system that incorporates NLP can intelligently extract terms, including compound phrases. Moreover, knowledge of the underlying language allows classification of terms into related groups, such as products, organizations, or people, using the meaning and context of the text.
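The idea of extracting compound phrases and assigning them to types can be illustrated with a toy dictionary lookup. The sketch below is an assumption-laden simplification, not the LexiQuest Mine algorithm: it uses a tiny hypothetical lexicon and a longest-match rule, and it ignores the grammatical and contextual analysis that a real linguistic system applies. The names LEXICON and extract_concepts are invented for this example.

```python
import re

# Hypothetical toy lexicon mapping a term to its type.
# Real linguistic dictionaries are far larger and context-aware.
LEXICON = {
    "acme corp": "Organization",
    "jane smith": "Person",
    "laser printer": "Product",
    "printer": "Product",
}

# Longest phrase in the lexicon, measured in words.
MAX_WORDS = max(len(term.split()) for term in LEXICON)


def extract_concepts(text):
    """Return (term, type) pairs, preferring the longest matching phrase."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    found = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in LEXICON:
                found.append((candidate, LEXICON[candidate]))
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no phrase starts at this word
    return found


concepts = extract_concepts("Jane Smith of Acme Corp returned the laser printer.")
```

Because longer phrases are tried first, "laser printer" is extracted as a single compound term rather than as the shorter entry "printer".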

Linguistic systems are knowledge-sensitive: the more information their dictionaries contain, the higher the quality of the results. Modifying the dictionary content, for example by adding synonym definitions, can simplify the resulting information. Refining the dictionaries is often an iterative process and is necessary for accurate concept retrieval. Custom dictionaries for specific domains, such as CRM and genomics, are also included.
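To see how synonym definitions can simplify results, consider mapping several surface forms to one preferred term before counting. The sketch below assumes a simple one-to-one synonym table and is not the product's dictionary format. SYNONYMS, normalize, and concept_counts are hypothetical names used only for illustration.

```python
from collections import Counter

# Hypothetical synonym definitions: variant term -> preferred term.
SYNONYMS = {
    "laptop": "notebook computer",
    "notebook": "notebook computer",
    "notebook pc": "notebook computer",
}


def normalize(concepts, synonyms=SYNONYMS):
    """Lowercase each concept and replace known variants with the preferred term."""
    return [synonyms.get(c.lower(), c.lower()) for c in concepts]


def concept_counts(concepts, synonyms=SYNONYMS):
    """Count concepts after synonym normalization."""
    return Counter(normalize(concepts, synonyms))


counts = concept_counts(["Laptop", "notebook", "Notebook PC", "printer"])
```

Without the synonym table, the same input would yield four separate concepts. With it, three variants collapse into a single entry, which is the kind of simplification that each round of dictionary refinement aims for.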