SBIR-STTR Award

A Scalable Multilingual Scientific MetaSearch
Award last edited on: 1/17/2014

Sponsored Program
SBIR
Awarding Agency
DOE
Total Award Amount
$99,520
Award Phase
1
Solicitation Topic Code
-----

Principal Investigator
Emmanuel Roche

Company Information

Teragram Corporation

10 Fawcett Street
Cambridge, MA 02138
   (617) 576-6800
   info@teragram.com
   www.teragram.com
Location: Single
Congr. District: 05
County: Middlesex

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2007
Phase I Amount
$99,520
This project will develop a large-scale, multilingual Metasearch application with the following properties: (1) cross-lingual search, namely the ability to issue a query in English and find documents in another target language; (2) the ability to narrow queries in the target language to reduce ambiguity; (3) on-the-fly translation of search results, giving a faster and more intuitive way to identify relevant documents; (4) on-the-fly bilingual keyword clustering based on the search results; and (5) the ability to find English documents closely related to a particular document in the target language (possibly a document by the same author). Phase I will be restricted to the English-Chinese language pair, assuming an English-speaking user who is trying to access Chinese publications and evaluate their relevance. A wide variety of existing technologies and language resources will be used, including text classification (English and Chinese), concept and entity extraction (English and Chinese), Chinese segmentation, a bilingual English-Chinese dictionary, large-scale scientific taxonomies for English and Chinese, and generic Metasearch tools (such as search result parsing modules).
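As an illustration of properties (1) and (2), the sketch below shows one simple way a bilingual dictionary could drive cross-lingual query expansion and narrowing. It is a minimal sketch and not the awardee's method: the function name `translate_query`, the `choices` parameter, and the two-entry toy dictionary are all assumptions made for this example.

```python
from itertools import product

# Hypothetical toy dictionary (an assumption for illustration only);
# a real system would rely on a full English-Chinese lexicon.
TOY_DICTIONARY = {
    "cell": ["细胞", "电池"],  # biological cell, battery
    "membrane": ["膜"],
}


def translate_query(query, dictionary, choices=None):
    """Expand an English query into candidate target-language queries.

    `choices` maps an English term to the single translation the user
    picked, which narrows the expansion and reduces ambiguity.
    """
    choices = choices or {}
    options = []
    for term in query.lower().split():
        if term in choices:
            # The user has disambiguated this term.
            options.append([choices[term]])
        else:
            # Terms missing from the dictionary pass through untranslated.
            options.append(dictionary.get(term, [term]))
    # One candidate query per combination of translations.
    return [" ".join(combo) for combo in product(*options)]


candidates = translate_query("cell membrane", TOY_DICTIONARY)
narrowed = translate_query("cell membrane", TOY_DICTIONARY,
                           choices={"cell": "细胞"})
```

Here `candidates` holds one query per sense of "cell", while `narrowed` keeps only the biological sense. Each candidate could then be sent to the underlying search engines, with the result lists merged by the Metasearch layer.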

Commercial Applications and Other Benefits as described by the awardee:
Although information has become more available to the public through the increasing use of the internet and internet search engines, information published in foreign countries and in foreign languages cannot be easily accessed. Yet, as science becomes more global, the number of non-English scientific publications has increased significantly. A metasearch engine that provides an easy way to access and understand materials in many languages should drastically increase the knowledge available to scientists and information seekers.

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----