DARPA 2000 Improving Recall in Domain Independent Information

Improving Recall in Domain Independent Information
Award last edited on: 4/2/2008

Awarding Agency

DOD : DARPA

Total Award Amount

$834,847

Award Phase

Solicitation Topic Code

SB001-012

Principal Investigator

Svetlana Sheremetyeva

Onyx Consulting Inc

1010 Edgewood Road Suite 107
Edgewood, MD 21040

(410) 252-8969

N/A

Location: Single
Congr. District: 02
County: Harford

Phase I

Contract Number: ----------
Start Date: ---- Completed: ----

Phase I year

2000

Phase I Amount

$89,698

This project is devoted to enhancing the recall of general-purpose domain-independent information retrieval systems. Its unique contribution is the incorporation of four different sources of knowledge for evaluating the match of a particular document to a query: the broad-coverage lists of proper names ("onomastica"); the knowledge of the syntax of the text in the documents; the knowledge of the ontological-semantic properties of words in the text; and knowledge to help resolve problems with anaphoric reference as well as metonymy and other tropes in the input text. These individual sources have been researched in academia and are available to Onyx Consulting for integration and incorporation in a working proof-of-concept system.

Phase II

Contract Number: ----------
Start Date: ---- Completed: ----

Phase II year

2001

Phase II Amount

$745,149

This project develops an information extraction system that demonstrates higher levels of recall than current systems without jeopardizing precision. Our recall-enhancing algorithms use more linguistic and world knowledge than most current systems. Four crucial avenues of work will lead to the improvement of recall: disambiguation of input text terms through ontological semantic processing; processing reference; processing non-literal language; and assigning semantic features to new, unattested word and phrase occurrences. All of these activities rely on a unique battery of resources and processes developed by or available to Onyx. These include an ontological world model, a fact database, a comprehensive NLP lexicon of English, and an onomasticon, or lexicon of proper names. In addition, we use special routines for resolving reference, processing non-literal language through controlled constraint relaxation, and treating unattested inputs using expectations recorded in the ontology, the fact database, and special orthographic, morphological, and syntactic rules. Architecturally, we will combine the approaches and processes described above in a single system. Unlike most current systems, ours will be geared not only to information extraction for a given set of templates (and therefore, typically, working in a single domain) but will also include facilities for modifying templates and defining new templates for new types of questions and, orthogonally, new domains. Thus, our product will be the first general-purpose, configurable information extraction system, one that will work in multiple domains and with multiple text genres. Additional resources and linguistic expertise for this project are supplied by consultants at New Mexico State University's Computing Research Laboratory, a premier academic R&D institution.

Keywords:
Text Extraction, Onomastica, Semantics, Anaphoric Reference, Natural Language Processing, Synta
