SBIR-STTR Award

Detecting Misleading Information
Award last edited on: 3/1/2007

Sponsored Program
SBIR
Awarding Agency
DOD : DARPA
Total Award Amount
$98,757
Award Phase
1
Solicitation Topic Code
SB021-008
Principal Investigator
Yves Schabes

Company Information

Teragram Corporation

10 Fawcett Street
Cambridge, MA 02138
   (617) 576-6800
   info@teragram.com
   www.teragram.com
Location: Single
Congr. District: 05
County: Middlesex

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2002
Phase I Amount
$98,757
The purpose of this Phase I SBIR project is to demonstrate the feasibility of a complete workbench for detecting misleading information in vast amounts of open-source material, including large volumes of text found on the web. The workbench will demonstrate the ability to detect information that is inaccurate, implausible or inconsistent, especially when the inaccuracy is intentional. The novelty of the approach consists of applying numerical fraud detection techniques to linguistic and semantic features automatically extracted from unstructured text using the state-of-the-art, scalable linguistic technologies designed by Teragram Corporation. These include concept and event extraction, parsing, categorization and semantic interpretation technologies, all of which have been deployed by Teragram customers at Internet scale. Techniques for detecting intentionally misleading information in textual documents from open sources such as the Internet are useful not only for military intelligence but also for commercial applications. Military intelligence needs to identify potential threats from groups and individuals who communicate or publish information via open sources. Knowledge extraction applications that use information found on the Internet all assume that it is accurate and free of incorrect or misleading content. Such applications include, among many others, corporate data collection, corporate intelligence gathering, question-answering systems, search engines that include meta-tags stated on web pages as part of their ranking mechanism, alert systems, and web sites that collect resumes and job openings from the Internet. Since those applications assume that the information collected is correct, there is a critical need to validate the quality and soundness of information gathered from public sources.

Keywords:
Natural Language Processing, Text Categorization, Information Warfare, Pattern Detection, Entity Extraction, Semantic Analysis, Fraud Detection

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----