SBIR-STTR Award

Resolving Biological Entity References (Text/Databases)
Award last edited on: 2/6/09

Sponsored Program
SBIR
Awarding Agency
NIH : NCRR
Total Award Amount
$1,146,005
Award Phase
2
Solicitation Topic Code
-----

Principal Investigator
Frederick B Baldwin

Company Information

Alias-i Inc (AKA: Baldwin Language Technologies)

181 North 11th Street Suite 401
Brooklyn, NY 11211
   (718) 290-9170
   breck@alias-i.com
   www.alias-i.com
Location: Single
Congr. District: 07
County: Kings

Phase I

Contract Number: 1R43RR020259-01
Start Date: 00/00/00    Completed: 00/00/00
Phase I year
2004
Phase I Amount
$199,156
In the broadest terms, the goal of the proposed work is to make it easier for researchers to apply robust, scalable, entity-centered, heterogeneous data access to the biomedical literature. 'Entity-centered' means that information is indexed by the entity it refers to, irrespective of what a surface mention looks like in any given data source. For example, there is a gene in FlyBase that is referred to in text by synonyms as diverse as "Foil" and "Mel(3)10", by generic nominal referring expressions like "the gene", and by pronouns like "it", and that also has a FlyBase database id of CG5490. The Phase I proposal breaks down into two major efforts. First, the existing LingPipe suite of linguistic processing tools will be extended to meet the challenges of bioinformatics, resulting in LingPipe-Bio. This will be distributed to the research and entrepreneurial community as a suite of tools under dual open source/commercial licensing. Second, it is proposed to adapt a current interface for entity-centered data access (ThreatTracker, for intelligence analysts) into BioTracker, based on the needs of biomedical researchers.
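
As a rough illustration of entity-centered indexing (not the LingPipe-Bio implementation, which this record does not describe), the minimal Python sketch below maps surface mentions to one canonical id and indexes documents by that id. The alias table, the function name build_entity_index, and the sample sentences are assumptions made for illustration; only the synonyms and the FlyBase id come from the abstract. Simple substring matching stands in for real entity recognition, and resolving nominals such as "the gene" or pronouns such as "it" would require coreference resolution, which is not attempted here.

```python
from collections import defaultdict

# Hypothetical alias table: surface forms quoted in the abstract,
# all mapped to the FlyBase id the abstract gives for the gene.
ALIASES = {
    "foil": "CG5490",
    "mel(3)10": "CG5490",
    "cg5490": "CG5490",
}


def build_entity_index(documents):
    """Map each canonical entity id to the set of document ids mentioning it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        lowered = text.lower()
        for alias, entity_id in ALIASES.items():
            # Naive case-insensitive substring match; a real system would
            # use a trained named-entity recognizer instead.
            if alias in lowered:
                index[entity_id].add(doc_id)
    return index


# Made-up placeholder sentences, for illustration only.
DOCS = {
    "doc1": "A sentence that mentions Foil.",
    "doc2": "A sentence that mentions Mel(3)10.",
    "doc3": "A sentence that mentions no gene name.",
}

INDEX = build_entity_index(DOCS)
print(sorted(INDEX["CG5490"]))  # prints ['doc1', 'doc2']
```

Looking up the single id CG5490 returns both documents even though they use different synonyms, which is the behavior the abstract calls entity-centered access.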

Thesaurus Terms:
bioinformatics, computer program /software, computer system design /evaluation, indexing, information retrieval, nomenclature information system, publication

Phase II

Contract Number: 5R43RR020259-02
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
2005
(last award dollars: 2008)
Phase II Amount
$946,849

In the broadest terms, the goal of the proposed work is to make it easier for researchers to apply robust, scalable, entity-centered, heterogeneous data access to the biomedical literature. 'Entity-centered' means that information is indexed by the entity it refers to, irrespective of what a surface mention looks like in any given data source. For example, there is a gene in FlyBase that is referred to in text by synonyms as diverse as "Foil" and "Mel(3)10", by generic nominal referring expressions like "the gene", and by pronouns like "it", and that also has a FlyBase database id of CG5490 [Morgan et al. 2002]. The Phase I proposal breaks down into two major efforts. First, the existing LingPipe suite of linguistic processing tools will be extended to meet the challenges of bioinformatics, resulting in LingPipe-Bio. This will be distributed to the research and entrepreneurial community as a suite of tools under dual open source/commercial licensing. Second, it is proposed to adapt a current interface for entity-centered data access (ThreatTracker, for intelligence analysts) into BioTracker, based on the needs of biomedical researchers.

Thesaurus Terms:
bioinformatics, computer program /software, computer system design /evaluation, indexing, information retrieval, nomenclature information system, publication