SBIR-STTR Award

Automatic Language Detection Using Fast Wordspotting
Award last edited on: 6/23/2005

Sponsored Program
SBIR
Awarding Agency
NSF
Total Award Amount
$99,999
Award Phase
1
Solicitation Topic Code
-----

Principal Investigator
Jon Arrowood

Company Information

Nexidia Inc (AKA: Fast-Talk Communications Inc)

3565 Piedmont Road Building 2 Suite 400
Atlanta, GA 30305
   (404) 495-7220
   barnold@nexidia.com
   www.nexidia.com
Location: Multiple
Congr. District: 05
County: Fulton

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2004
Phase I Amount
$99,999
This Small Business Innovation Research Phase I project will perform the research and development necessary to integrate additional information gathered from an existing phonetic wordspotting technology into a language and dialect identification system. The research objective is to use Nexidia's existing wordspotting technology to improve a state-of-the-art language identification system. Wordspotting is a technique in which a word or phrase is searched for in audio; the result is a set of timestamps where the word or phrase might have occurred, along with a confidence score for each timestamp.
Current state-of-the-art language identification systems are based on Gaussian Mixture Models and phoneme statistics of each candidate language. They cannot use full speech recognition for computational reasons. Wordspotting, however, is lightweight, needing only a fraction of a CPU. If a list of several thousand common words and phrases is generated, it is very likely that an item from this list will be spoken in any speech more than a few seconds long. This project therefore proposes to begin with a state-of-the-art language identification system and augment it with such a search for each candidate language. The expected result is a language identification system capable of outperforming current state-of-the-art systems.
The ability to automatically classify which language is being spoken in a segment of speech would be a highly desirable feature in many speech communications systems. Because the proposed method extends state-of-the-art systems, the current state of the art serves as the performance baseline, and it is probable that the proposed research will result in better classification accuracy than is currently reported in the literature. If better accuracy is achieved, the proposed structure could become a standard.
Further, no commercial product is available at this time to perform language classification; existing systems are all in the research lab and have not been commercialized. Were the proposed research to be even moderately successful, a new class of commercial offering would emerge. Possible applications include routing, monitoring, and quality assurance in call centers; data mining and intelligence applications; and enabling the proper speech recognition system. Call centers could automatically route incoming calls to appropriate customer service representatives (CSRs), and surveillance operations could add filtering criteria to their intercepted records. Integrating this feature with the original functionality of fast phonetic keyword spotting would greatly enhance data-mining capability.
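The abstract describes augmenting a baseline language identification score with wordspotting results (timestamps plus confidence scores) for each candidate language, but it does not say how the two are combined. The following is a minimal sketch of one possible combination, assuming a simple weighted sum; it is not Nexidia's method. All names here (`Hit`, `wordspot_evidence`, `identify_language`), the confidence threshold, and the fusion weight are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    """One putative wordspotting result (hypothetical structure)."""
    term: str          # word or phrase that was searched for
    time_s: float      # timestamp where it might have occurred
    confidence: float  # spotter confidence, assumed to lie in [0, 1]


def wordspot_evidence(hits: List[Hit], duration_s: float,
                      threshold: float = 0.5) -> float:
    """Sum the confidences of hits at or above `threshold`, per second of audio.

    Normalizing by duration is an assumption made here so that long and
    short recordings are comparable.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    kept = sum(h.confidence for h in hits if h.confidence >= threshold)
    return kept / duration_s


def identify_language(baseline_scores: Dict[str, float],
                      hits_by_language: Dict[str, List[Hit]],
                      duration_s: float,
                      weight: float = 1.0) -> str:
    """Return the language with the highest fused score.

    `baseline_scores` stands in for the output of an existing language
    identification system (higher is better). The linear fusion rule
    below is an assumption, not something stated in the abstract.
    """
    fused = {}
    for lang, base in baseline_scores.items():
        evidence = wordspot_evidence(hits_by_language.get(lang, []), duration_s)
        fused[lang] = base + weight * evidence
    return max(fused, key=fused.get)


# Illustrative, made-up inputs: the baseline slightly prefers "es",
# but confident hits from the "en" word list tip the decision.
baseline = {"en": -1.0, "es": -0.9}
hits = {
    "en": [Hit("thank you", 1.2, 0.9), Hit("please", 3.4, 0.8)],
    "es": [Hit("gracias", 2.0, 0.3)],
}
decision = identify_language(baseline, hits, duration_s=5.0)
```

In a real system the weight and threshold would presumably be tuned on held-out data, and the baseline scores and spotter confidences would need to be calibrated to comparable scales before being added.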

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----