Navy 2008 Variable Speed Speech Synthesis

Variable Speed Speech Synthesis
Award last edited on: 11/14/2018

Awarding Agency

DOD : Navy

Total Award Amount

$895,598

Award Phase

Solicitation Topic Code

N08-149

Principal Investigator

Minkyu Lee

Advanced Media Research Inc (AKA: AMR)

422 Executive Drive
Princeton, NJ 08540

(609) 430-0900

info@amrnd.com

www.amrnd.com

Location: Single
Congr. District: 12
County: Mercer

Phase I

Contract Number: N68335-08-C-0429
Start Date: 8/12/2008
Completed: 1/20/2010

Phase I year

2008

Phase I Amount

$145,798

The objective of this proposal is to demonstrate the feasibility of developing variable speed speech synthesis technology. We plan to use open source TTS systems because they often provide the flexibility and interoperability that are essential for research-oriented work. To modify speaking speed, we plan to focus on time-domain time-scale modification algorithms, which provide good quality with less computational complexity than other approaches such as sinusoidal models or vocoder-based approaches. We will test time-domain methods including SOLA, PSOLA, and WSOLA. We will apply a linear scaling factor, which modifies the duration regardless of whether the speech segment is a silence, a transient, or a sustained vowel. We will also apply different scaling factors to different parts of speech segments. During the optional six months, we will focus on creating multiple voices by modifying the voice type, gender, dialect (accent), and perceived emotion of the speech. Based on source-filter models, we will investigate algorithms for modifying source and filter characteristics, from which many different voices can be generated.
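As a rough illustration of the uniform (linear) scaling described above, the sketch below implements plain windowed overlap-add (OLA), the simplest member of the time-domain family. It is not the awardee's implementation. SOLA and WSOLA additionally search for a well-aligned frame position before overlapping, and PSOLA places frames at pitch marks; none of that is shown here. The function names, the `speed` parameter, and the frame length are assumptions made for this example.

```python
import math


def hann(n):
    # Periodic Hann window; copies shifted by n/2 sum to a constant 1.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def ola_time_stretch(samples, speed, frame_len=256):
    """Plain overlap-add time-scale modification (illustrative only).

    speed > 1 shortens the signal (faster speech), speed < 1 lengthens it.
    The same factor is applied to every frame, whether it holds silence,
    a transient, or a sustained vowel.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    hop_out = frame_len // 2        # synthesis hop: 50% overlap
    hop_in = hop_out * speed        # analysis hop: may be fractional
    window = hann(frame_len)
    n_frames = max(1, int((len(samples) - frame_len) / hop_in) + 1)
    out_len = (n_frames - 1) * hop_out + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len          # accumulated window weight per sample
    for k in range(n_frames):
        start_in = int(round(k * hop_in))
        start_out = k * hop_out
        for i in range(frame_len):
            j = start_in + i
            s = samples[j] if j < len(samples) else 0.0  # zero-pad the tail
            out[start_out + i] += window[i] * s
            norm[start_out + i] += window[i]
    # Divide by the accumulated weight so the gain stays flat at the edges.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


# Hypothetical test signal: half a second of a 200 Hz tone at 16 kHz.
tone = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(8000)]
faster = ola_time_stretch(tone, 2.0)   # roughly half as long
slower = ola_time_stretch(tone, 0.5)   # roughly twice as long
```

Plain OLA ignores waveform alignment between successive frames, so voiced speech can pick up audible phase artifacts. That limitation is the motivation for the similarity search in SOLA and WSOLA. Applying different scaling factors to different segments would amount to varying `speed` from frame to frame.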

Benefit:
The time-scale modification technology is expected to have substantial commercial value. Transforming a speech or audio signal to an alternative time scale can be a useful digital audio effect. It can be used for fast browsing of speech material in digital libraries and distance learning, fast or slow playback on telephone answering machines and dictaphones, accelerated aural reading for the blind, and editing audio/visual recordings to fit allocated timeslots in the radio and television industry. The ability to change the voice characteristics of TTS speech will enable new applications in various fields in addition to generating multiple voices. It will be an innovative technology for businesses in virtual world environments and in the children's toy, web-based application software, on-line gaming, on-line service and entertainment, movie, and animation (cartoon) industries.

Keywords:
Text-to-Speech, Speaking Speed, Voice Conversion, Variable Speed, Multiple Voices

Phase II

Contract Number: N61339-10-C-0037
Start Date: 9/28/2010
Completed: 9/28/2012

Phase II year

2010

Phase II Amount

$749,800

The main objective of this project is to develop technologies for variable speaking speed in speech synthesis, or Text-to-Speech (TTS). A TTS system converts written text into spoken language. There are numerous commercial TTS systems; however, most do not allow speed control of the output speech. In simulation-based virtual training systems, TTS is used to generate the voices of the virtual role-players. The ability to control the speaking speed of TTS output without sacrificing intelligibility will support a range of fast-paced operational and training scenarios for Aviation Training Systems. Phase I focused on a feasibility study of variable speed speech synthesis technology. This proposal continues that effort into Phase II, where the main goal is to develop a working prototype that enables realistic, adjustable speed control of synthetic speech that remains intelligible enough to support a range of fast-paced operational and training scenarios. An additional goal of Phase II is to provide the capability of mimicking the radio voice of military personnel, as well as the capability of generating multiple voices from a single TTS system.

Keywords:
Voice Conversion, Voice Transformation, Speech Synthesis, Variable Speed, Time Scale Modification, Text-To-Speech
