Introduction

CHATR synthesises speech without signal processing, by re-sequencing carefully selected phone-sized segments from a pre-recorded speech corpus. Because novel utterances are assembled from the original recordings, they faithfully reproduce the voice characteristics and speaking style of the original speaker.
The process involves creating an index of phones and their prosodic characteristics for each utterance in the corpus. The re-sequencing synthesiser need not produce any sound itself; it merely determines an optimal sequence for random-access replay of the original speech, giving the best approximation to a desired utterance from the segments available in a given speech corpus. The synthesis method is independent of language and speaker, but requires a sufficiently large source database that represents a balanced sample of the language.
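As a rough illustration of the kind of index described above, the following Python sketch groups labelled segments by phone and returns a replay plan rather than a waveform. The names `Segment`, `build_index` and `replay_plan`, and the choice of pitch and power as stored features, are assumptions made for this example; they are not taken from CHATR itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One phone-sized stretch of a recorded utterance (hypothetical layout)."""
    utterance: str  # identifier of the source recording
    phone: str      # phone label
    start: float    # start time in seconds
    end: float      # end time in seconds
    pitch: float    # mean F0 in Hz (assumed prosodic feature)
    power: float    # mean power (assumed prosodic feature)

    @property
    def duration(self) -> float:
        return self.end - self.start


def build_index(segments):
    """Group segments by phone label so candidates can be looked up directly."""
    index = defaultdict(list)
    for seg in segments:
        index[seg.phone].append(seg)
    return dict(index)


def replay_plan(sequence):
    """Describe the output as (utterance, start, end) triples for random-access replay."""
    return [(seg.utterance, seg.start, seg.end) for seg in sequence]
```

The point of the sketch is that the result of synthesis is a list of pointers into the original recordings; producing sound is left to whatever plays those stretches back.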
To find the optimal sequence of segments for concatenation, the synthesiser selects from amongst candidates in the database using a weighted combination of their acoustic and prosodic features, maximising continuity between segments while at the same time minimising the distance of each segment from its prosodic target. Optimal performance is achieved by under-specification of prosody, so that only key points in the utterance have targets and the remainder are considered prosodically neutral. In conjunction with loose selection of units from a continuous-speech corpus, prosodic under-specification maximises the number of candidate segments and uses the redundancy of information in natural speech to reduce or eliminate distortions in the output synthesis.
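The selection described above can be sketched as a shortest-path search over candidate segments. The Python example below is a minimal illustration under stated assumptions, not CHATR's actual implementation: the `Unit` and `Target` types, the weights, the cost formulas and their normalising constants are all hypothetical, and the dynamic-programming (Viterbi-style) search is a standard technique assumed here because the text does not say how the optimum is found. Prosodic under-specification is modelled by leaving a target field as `None`, in which case it contributes no cost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Unit:
    phone: str
    pitch: float     # Hz
    duration: float  # seconds
    source_pos: int  # position in the corpus; pos + 1 is the next recorded segment


@dataclass(frozen=True)
class Target:
    phone: str
    pitch: Optional[float] = None     # None means prosodically neutral
    duration: Optional[float] = None  # None means prosodically neutral


# Illustrative weights only; the source does not give values.
W_PITCH, W_DUR, W_JOIN = 1.0, 1.0, 1.0


def target_cost(target, unit):
    """Distance of a candidate from its prosodic target (zero where unspecified)."""
    cost = 0.0
    if target.pitch is not None:
        cost += W_PITCH * abs(target.pitch - unit.pitch) / 100.0
    if target.duration is not None:
        cost += W_DUR * abs(target.duration - unit.duration) / 0.1
    return cost


def join_cost(left, right):
    """Discontinuity between neighbours; free if they were adjacent in the recording."""
    if right.source_pos == left.source_pos + 1:
        return 0.0
    return W_JOIN * (1.0 + abs(left.pitch - right.pitch) / 100.0)


def select_units(targets, corpus):
    """Return the candidate sequence with the lowest total target plus join cost."""
    if not targets:
        return []
    candidates = [[u for u in corpus if u.phone == t.phone] for t in targets]
    if any(not c for c in candidates):
        raise ValueError("corpus has no candidate for at least one target phone")

    # best[i][j] = (cheapest cost of any path ending in candidates[i][j], backpointer)
    best = [[(target_cost(targets[0], u), -1) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for unit in candidates[i]:
            prev_cost, prev_j = min(
                (best[i - 1][k][0] + join_cost(prev, unit), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((prev_cost + target_cost(targets[i], unit), prev_j))
        best.append(row)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path
```

In this sketch, fewer specified targets mean fewer candidates are penalised, so the search is freer to follow stretches that were contiguous in the original recording, where the join cost is zero.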

Further info

You can find out more about CHATR by subscribing to its mailing list. Send a message to majordomo@itl.atr.co.jp with just the two words `subscribe chatr-ml' in the body of the message.

CHATR (and this WWW site) is maintained by

Nick Campbell

Interpreting Telecommunications Research Laboratories,
Advanced Telecommunications Research Institute,
Hikari-dai 2-2, Seika-cho, Kyoto 619-02 Japan
Tel: 07749 5 1377, Fax: 07749 5 1308.
nick@itl.atr.co.jp

This page last updated 23rd Feb '97.

Related Papers:

Overview in Japanese

Corpus-based speech synthesis

High definition Speech Synthesis

Syllable-level units for synthesis

Higher-level unit selection

(C) Copyright ATR Interpreting Telecommunications Research Labs 1997