CHATR database example

CHATR speech database example

   text: "It sounds just like me"



Example 1: focus on `me': synthesised version


 
Example 2: focus on `like': synthesised version



Here is the debug trace from CHATR for that sentence:

(it shows the waveform segments used to make the utterance,
 and the contexts of each sound)

 the sounds are written in computer-readable phonetic notation:
 `ih' is the sound /i/ as in `it'
 `aw' is the sound /au/ as in `house'
  the two numbers after each sound are its duration in the source recording
  and the duration predicted for the new utterance (both in milliseconds)
  and the lines starting /dept2/workk22/etc are the actual waveform files,
  showing the start time and duration of the waveform segment that we use.

  the waveform samples include more context than is used in the final synthesis
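
For readers who want to work with the trace programmatically, here is a minimal
Python sketch that reads one unit entry of the form
(filename start duration num_units (Seg_name source_dur target_dur) ...).
It is not part of CHATR: the names `Segment', `UnitEntry' and `parse_entry'
are hypothetical, and the parser assumes the layout seen in this trace (one
segment per line, with the context columns after a `;').

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    name: str       # phone label, e.g. "ih"
    source_ms: int  # duration of the unit in the database recording
    target_ms: int  # duration predicted for the new utterance


@dataclass
class UnitEntry:
    filename: str
    start: int      # start time as printed in the trace
    duration: int   # duration of the whole waveform section
    segments: list


# Assumed layout: ("file.wav" start duration num_units
HEADER = re.compile(r'\("([^"]+)"\s+(\d+)\s+(\d+)\s+(\d+)')
# Assumed layout: ( name source_dur target_dur)
SEGMENT = re.compile(r'\(\s*([a-z]+)\s+(\d+)\s+(\d+)\s*\)')


def parse_entry(text):
    """Parse one unit entry from the trace (hypothetical helper)."""
    head = HEADER.search(text)
    if head is None:
        raise ValueError("no unit header found")
    filename, start, duration, num_units = head.groups()
    # Drop everything after the first ';' so the context columns are ignored.
    lines = text[head.end():].splitlines()
    body = "\n".join(line.split(";", 1)[0] for line in lines)
    segments = [Segment(n, int(s), int(t)) for n, s, t in SEGMENT.findall(body)]
    if len(segments) != int(num_units):
        raise ValueError("segment count does not match num_units")
    return UnitEntry(filename, int(start), int(duration), segments)


example = '''("US015.wav" 16370 87 2
   ( ih   50   51)       ;;  # IH t        ; ae t # IH t s ax
   (  t   37   87)       ;;  ih T  s        ; t # ih T s ax z
   )'''

entry = parse_entry(example)
print(entry.filename, [seg.name for seg in entry.segments])
print(sum(seg.source_ms for seg in entry.segments), entry.duration)
```

For this entry the source durations (50 and 37) add up to the section duration
of 87; in the other entries of the trace the two agree to within a millisecond.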

chatr> (Save UnitLabels+ '-)
; Unit Stream plus
; (filename start duration num_units
;    (Seg_name source_dur target_dur) 
;    ...) 
; ... 
(Utterance Unit
(
("US015.wav" 16370 87 2
   ( ih   50   51)       ;;  # IH t        ; ae t # IH t s ax
   (  t   37   87)       ;;  ih T  s        ; t # ih T s ax z
   )  - from the words "It's a ... " 



("US065.wav" 31562 80 1
   (  s   80   82)       ;;  t S ay        ; eh s t S ay d #
   )  - from the words "West Side" 



("US089.wav" 10916 106 1
   ( aw  106   62)       ;; s AW th        ; ax # s AW th ax m
   )  - from the words "South America"
  


("US031.wav" 31544 129 3
   (  n   56   23)       ;;  aw N d        ; n p aw N d z oh
   (  d   30   27)       ;;  n D z        ; p aw n D z oh v
   (  z   42   15)       ;;  d Z oh        ; aw n d Z oh v m
   )  - from the words "pounds of"



("US020.wav" 14305 78 1
   ( jh   78   82)       ;; z JH ah        ; # ih z JH ah s t
   ) - from the words "it's just"



("US125.wav" 5164 197 2
   ( ah  126  123)       ;; jh AH s        ; oh m jh AH s t ih
   (  s   71   71)       ;;  ah S t        ; m jh ah S t ih s
   ) - from the words "from just"



("US074.wav" 5114 125 2
   (  t   76   60)       ;;   s T l        ; t ih s T l ae n
   (  l   49   39)       ;;  t L ae        ; ih s t L ae n d
   ) - from the words "latest land"



("US025.wav" 5612 60 1
   ( ay   61   90)       ;;  l AY k        ; ng z l AY k dh ae
   ) - from the words "things like that"



("US046.wav" 37543 13
   (  k  135  173)       ;;  ay K l        ; ey s ay K l z aa
   ) - from the word "cycles" 



("US006.wav" 13510 70 1
   (  m   70   75)       ;;  k M ow        ; l ay k M ow s t
   ) - from the words "like most"



("US007.wav" 2513 327 1
   ( iy  327  310)       ;;  m IY #        ; t uw m IY # ay s
   ) - from the words "to me."


))


This unit list gives a sequence of small waveform sections:




and the final waveform is produced by joining the parts used from each:
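
As a rough illustration of the joining step, the Python sketch below cuts a
section out of each source recording and places the pieces end to end. It is
only a sketch under stated assumptions: start times are taken to be in
milliseconds like the durations, the sample rate is a toy value, the
recordings are synthetic lists of numbers rather than real waveform files, and
the pieces are simply concatenated. How CHATR itself treats the joins is not
shown in the trace, and the helper names `cut' and `join' are hypothetical.

```python
def cut(samples, start_ms, duration_ms, rate_hz):
    """Return the samples covering [start_ms, start_ms + duration_ms)."""
    first = start_ms * rate_hz // 1000
    count = duration_ms * rate_hz // 1000
    return samples[first:first + count]


def join(pieces):
    """Concatenate the pieces in order, with no smoothing at the joins."""
    wave = []
    for piece in pieces:
        wave.extend(piece)
    return wave


RATE_HZ = 1000  # assumption: toy rate of one sample per millisecond

# Synthetic stand-ins for two database recordings (not real speech data).
recording_a = list(range(0, 500))
recording_b = list(range(1000, 1500))

# (recording, start_ms, duration_ms); the values are illustrative only.
units = [
    (recording_a, 100, 87),
    (recording_b, 200, 80),
]

wave = join(cut(rec, start, dur, RATE_HZ) for rec, start, dur in units)
print(len(wave))
```

With these toy values the joined waveform has 87 + 80 samples, one per
millisecond of the selected sections.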





(with special thanks to Andy & Kris of the New York Times)

(C) Copyright ATR Interpreting Telecommunications Research Labs 1997