SBIR-STTR Award

Voice Transformation and Detection
Award last edited on: 4/7/2010

Sponsored Program
SBIR
Awarding Agency
DOD : AF
Total Award Amount
$841,352
Award Phase
2
Solicitation Topic Code
AF071-087
Principal Investigator
Sooha P Lee

Company Information

Advanced Media Research Inc (AKA: AMR)

422 Executive Drive
Princeton, NJ 08540
   (609) 430-0900
   info@amrnd.com
   www.amrnd.com
Location: Single
Congr. District: 12
County: Mercer

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2007
Phase I Amount
$99,895
Voice transformation alters one person's voice so that it sounds as if it came from another speaker. This can be done by mapping the voice quality and speaking style of the source speaker to those of the target speaker. In Phase I, we will investigate state-of-the-art technologies based on the source-filter model. For vocal tract modeling and mapping, we will test the linear prediction model and the harmonic noise model. For excitation modeling and mapping, we will consider the Liljencrants-Fant (LF) model and sinusoidal models. For speaking style mapping, we will examine the feasibility of various intonation and speaking rate mapping methods, including statistical models such as classification and regression trees (CART) and multiplicative or sum-of-products models. The transformation results will be evaluated by human listeners as well as by automatic speaker identification algorithms. We will also investigate methods for detecting when voice transformation software has been used. The Phase I final report will include the recommended mapping algorithms and preliminary speech samples transformed with those algorithms. It will also contain requirements and specifications for the voice transformation system to be implemented during Phase II. Finally, it will describe potential risk factors that may affect performance.
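As a rough illustration of the linear prediction model named above for vocal tract modeling, the sketch below estimates all-pole filter coefficients for one speech frame using the autocorrelation method and the Levinson-Durbin recursion. It is not taken from the award record: the function names `autocorrelation` and `lpc`, the choice of the autocorrelation method, and the synthetic test frame are all assumptions made for illustration.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate linear prediction coefficients (hypothetical helper).

    Uses the Levinson-Durbin recursion and returns (a, err), where
    a = [1, a1, ..., ap] defines the prediction-error filter
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p, and err is the residual energy.
    The vocal tract is then approximated by the all-pole filter 1/A(z).
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


# Synthetic frame (an assumption, not real speech): the impulse response
# of the one-pole filter 1 / (1 - 0.5*z^-1).
frame = [0.5 ** n for n in range(50)]
coeffs, residual = lpc(frame, order=2)
print(coeffs, residual)
```

For this synthetic frame the recovered `a1` should be close to -0.5 and `a2` close to zero, matching the single pole used to generate it. A voice conversion system would go further and map such coefficients (or a derived representation) from the source speaker to the target speaker, which this sketch does not attempt.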

Keywords:
Voice Transformation, Voice Conversion, Voice Mimicry, Prosody, Speaking Style, Vocal Tract, Excitation, Source-Filter Model

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
2008
Phase II Amount
$741,457
Voice transformation alters one person’s voice so that the source speaker’s words sound as if they were spoken by the target speaker. In Phase I, we studied various algorithms for voice transformation and identified feasible methods for glottal source and vocal tract conversion, laying a foundation on which this Phase II project can be implemented. The objective of this Phase II proposal is threefold: to develop state-of-the-art voice transformation technologies, to develop a voice transformation platform (VTP), and to develop a prototype of a real-time voice transformation system (RVTS). First, we will develop the algorithms that we identified in Phase I as candidate technologies for Phase II. The VTP is a software application in which users can perform the entire process of voice transformation, including data collection from speakers, automatic training of the conversion functions for vocal tract and glottal source, and finally real-time voice transformation. The RVTS is a hardware system that transforms an input voice into a pre-determined target speaker’s voice. We are confident that the software and hardware developed by the end of Phase II will be important elements of future voice transformation systems.

Keywords:
Voice Transformation, Voice Conversion, Speaking Style, Vocal Tract, Glottal Excitation, Pitch Frequency, Speaker Identity