Rapid Development Techniques for Spoken Language Translation
Award last edited on: 6/22/2012

Sponsored Program
Awarding Agency
Total Award Amount
Award Phase
Solicitation Topic Code
Principal Investigator
Wei Wang

Company Information

Language Weaver Inc (AKA: SDL Language Weaver)

6060 Center Drive Suite 150
Los Angeles, CA 90045
(310) 437-7300
Location: Multiple
Congr. District: 36
County: Los Angeles

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
Phase I Amount
We propose to build on recent work that introduces syntactic processing into statistical machine translation models, developing algorithms and techniques that improve performance for a given amount of training data and reduce the amount of training data required to reach a given performance level. The projected advances will enable translation capability for a much wider variety of languages, subject domains, and application areas than is currently feasible with data-intensive statistical approaches.

Statistical Machine Translation, Syntax-Based Machine Translation, Bilingual Data, Dictionary

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
Phase II Amount
The Phase I project demonstrated that syntactic information and bilingual dictionaries can yield high-quality translation systems in the Statistical Machine Translation (SMT) paradigm with limited bilingual corpus data for Chinese-English. Phase II extends this work with techniques that will enable similar success for morphologically complex languages and for a genuinely "resource poor" language, Urdu. This work seeks to overcome a key commercial limitation of SMT: the requirement for large-scale data resources. Results from this project will apply immediately to the many circumstances where only limited bilingual data is available: specialized domains, spoken translation applications, and the many resource-poor languages that are important for military, intelligence, and humanitarian operations.

Statistical Machine Translation, Chinese, Urdu, Arabic, Morphology, Syntax