A Scalable Multilingual Scientific MetaSearch
Award last edited on: 1/17/2014

Sponsored Program
Awarding Agency
Total Award Amount
Award Phase
Solicitation Topic Code

Principal Investigator
Emmanuel Roche

Company Information

Teragram Corporation

10 Fawcett Street
Cambridge, MA 02138
(617) 576-6800
Location: Single
Congr. District: 05
County: Middlesex

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
Phase I Amount
This project will develop a large-scale, multilingual metasearch application with the following properties: (1) cross-lingual search, namely the ability to issue a query in English and find documents in another target language; (2) the ability to narrow queries in the target language to reduce ambiguity; (3) on-the-fly translation of search results, giving a faster and more intuitive way to identify relevant documents; (4) on-the-fly bilingual keyword clustering based on the search results; and (5) the ability to find English documents closely related to a particular document in the target language (possibly a document by the same author). Phase I will be restricted to the English-Chinese language pair and assumes an English-speaking user who wants to access Chinese publications and evaluate their relevance. A wide variety of existing technologies and language resources will be used, including text classification (English and Chinese), concept and entity extraction (English and Chinese), Chinese segmentation, a bilingual English-Chinese dictionary, large-scale scientific taxonomies for English and Chinese, and generic metasearch tools (such as search result parsing modules).
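
As a rough illustration of properties (1) and (2) and of the generic metasearch step, the sketch below shows one way dictionary-based query translation, target-language narrowing, and result merging could fit together. It is a minimal sketch under stated assumptions, not the awardee's implementation: the toy dictionary, the function names (translate_query, narrow, merge_results), and the round-robin merge strategy are all hypothetical and do not come from the award record.

```python
from itertools import product

# Hypothetical toy English-Chinese dictionary (an assumption for illustration);
# a real system would rely on a full bilingual lexicon and Chinese segmentation.
BILINGUAL_DICT = {
    "protein": ["蛋白质"],
    "folding": ["折叠", "褶皱"],
}


def translate_query(query, dictionary):
    """Return every candidate target-language query for an English query."""
    options = []
    for term in query.lower().split():
        # Terms missing from the dictionary pass through unchanged.
        options.append(dictionary.get(term, [term]))
    # One candidate per combination of per-term translations.
    return [" ".join(choice) for choice in product(*options)]


def narrow(candidates, required_term):
    """Keep only candidates containing a target-language term picked by the user."""
    return [c for c in candidates if required_term in c]


def merge_results(result_lists):
    """Interleave ranked URL lists from several engines, dropping duplicates."""
    merged, seen = [], set()
    longest = max((len(r) for r in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged


candidates = translate_query("protein folding", BILINGUAL_DICT)
narrowed = narrow(candidates, "折叠")
merged = merge_results([["a", "b"], ["b", "c", "d"]])
```

Here the ambiguous term "folding" yields two candidate Chinese queries, and choosing the intended sense in the target language leaves one. Round-robin interleaving is only one simple way to combine engine rankings; score-based fusion would be an equally plausible choice.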

Commercial Applications and Other Benefits as described by the awardee:
Although the internet and internet search engines have made information more available to the public, information published in other countries and in other languages still cannot be easily accessed. Yet, as science becomes more global, the number of non-English scientific publications has increased significantly. A metasearch engine that provides an easy way to access and understand materials in many languages should drastically increase the knowledge available to scientists and information seekers.

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
Phase II Amount