Programmers and engineers of all disciplines and nationalities love
their TLAs; speech synthesis is no different. We hope to have
covered all those used in this manual, and perhaps a few more. If
you find any we missed (or got wrong!), please let us know for future
versions.
ACL-DCI
-
Association for Computational Linguistics - Data Collection
Initiative.
ACPA
-
Audio Capture and Playback Adapter.
ANN
-
Artificial Neural Network.
ASCII
-
American Standard Code for Information Interchange.
ASR
-
Automatic Speech Recognition.
ATR
-
Advanced Telecommunications Research.
http://www.atr.co.jp/
BEEP
-
British English Example Pronunciation.
btosps
-
Binary TO Signal Processing System.
car
-
Not an acronym. `Lisp' expression which refers to (and selects) the
first item of a list held in a variable. See also `cdr'.
CAT
-
Not an acronym. CATegory of a word. HLP tags used to categorize
words into Nouns, Verbs, Prepositions, etc. See NP, VP and PP.
cdr
-
Not an acronym. `Lisp' expression which refers to (and selects) the
items of a list held in a variable, less the first item. See also
`car'.
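
The `car' and `cdr' entries above can be illustrated with a minimal
Python sketch. The function names simply mirror the Lisp ones, and
Python lists stand in for Lisp lists; this is an analogy, not how any
Lisp implements them. Lisp dialects differ on what `car' of an empty
list returns, whereas this sketch raises an IndexError.

```python
def car(items):
    """Return the first item of a non-empty list."""
    return items[0]


def cdr(items):
    """Return a new list holding everything except the first item."""
    return items[1:]


words = ["the", "cat", "sat"]
print(car(words))       # the
print(cdr(words))       # ['cat', 'sat']
print(car(cdr(words)))  # cat
```

Chaining the two, as in the last line, is the usual way to reach
items further along a list.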
CELP
-
Code-book Excited Linear Prediction.
CEPLMA
-
CEpstral Resynthesis using a Logarithmic Moving Average filter.
CHATR
-
Collective Hacks from the Advanced Telecommunications Research
laboratories. Well you did ask...
http://www.itl.atr.co.jp/chatr/
CMU
-
Carnegie Mellon University.
http://www.cs.cmu.edu/People/air/consortium/description.html
CSTR
-
Centre for Speech Technology Research. A department of Edinburgh
University, UK.
http://www.cstr.ed.ac.uk/
CSLU
-
Center for Spoken Language Understanding. A department of Oregon
Graduate Institute of Science and Technology, USA.
http://www.cse.ogi.edu/CSLU/
CVS
-
Concurrent Versions System. CVS is a front end to the RCS revision
control system. It extends the notion of revision control from a
collection of files in a single directory, to a hierarchical
collection of directories consisting of revision controlled files.
These directories and files can be combined together to form a
software release. CVS provides the functions necessary to manage
these software releases and to control the concurrent editing of
source files among multiple software developers. CVS keeps a single
copy of the master sources. This copy is called the source
`repository'; it contains all the information to permit extracting
previous software releases at any time based on either a symbolic
revision tag, or a date in the past.
darpa
-
Defense Advanced Research Projects Agency. The central
research and development organization of the US Department of
Defense (DoD).
http://www.darpa.mil/
DTW
-
Dynamic Time Warping.
EGG
-
Electro-Glottal Graph. Device for measuring throat movement caused
by speaking.
EMACS
-
Editor MACroS. A Macro-based editor and complete computing task
environment.
ESPS
-
Entropic Signal Processing System.
FSF
-
Free Software Foundation.
http://www.gnu.ai.mit.edu/fsf/
HLCB
-
High Low Continuation Boundary. Tags used to mark intonation on
syllables.
HLP
-
High Level Phrasing. Method of tagging speech with prosodic
information.
HMM
-
Hidden Markov Model.
Holmes
-
John Holmes, one of the founders of speech synthesis.
HTK
-
Hidden (Markov model) Tool Kit. A product of Entropic Research
Laboratory, Inc.
http://www.entropic.com
IFT
-
Illocutionary Force Type. Strength or emphasis put on a phrase.
Speech act information: the meaning you want to convey above and
beyond just the words spoken. As an example, the English phrase `I
understand' can mean `Thank you for informing me (I'm happy)', `Now
I know what you intend (I'm not happy)' or even `I heard what
you said but haven't a clue what you mean', depending on how and
when it is said. That is IFT at work. The simplest case is the
difference between a question and a statement using the same
words.
IntoneStream
-
Series of symbols representing the intonation required on an
utterance. Attached to the WordStream.
IPA
-
International Phonetic Association. Representative organization for
phoneticians.
http://www.arts.gla.ac.uk/IPA/ipa.html
JToBI
-
Japanese Tones and Break Indices.
jtts
-
Japanese Text-To-Speech.
LDC
-
Linguistic Data Consortium. A group established to broaden the
collection and distribution of speech and natural language
databases for the purposes of research and technology development
in automatic speech recognition, natural language processing and
other areas where large amounts of linguistic data are needed.
http://www.ri.cmu.edu/comp.speech/Section1/Data/ldc.html
LFG
-
Lexical Functional Grammar.
LISP
-
LISt Processing language. A programming language originally developed
for Artificial Intelligence (AI) but now used mainly in the speech
synthesis field.
LMA
-
Logarithmic Moving Average. Mathematical reference to a method used
in audio filtering. See CEPLMA.
LPC
-
Linear Predictive Coding.
LTS
-
Letter To Sound.
LVQ
-
Learning Vector Quantization.
M-ACPA
-
Multimedia - Audio Capture Playback Adapter.
MARSEC
-
MAchine-Readable Spoken English Corpus.
MFCC
-
Mel Frequency Cepstral Coefficients.
mtts
-
Multi-lingual Text-To-Speech.
mrpa
-
Machine Readable Phonetic Alphabet.
Mu-law
-
Not an acronym. Pronounced `mew-LAW'; the `Mu' is the Greek letter
`Mu'. An 8-bit compression code for audio signals, including
speech. It is widely used in the telecommunications field because
it improves the signal-to-noise ratio without increasing the amount
of data. It is a companding technique: it carries more information
about the smaller signals than the larger. Sometimes appears in
documents written as `ULAW'.
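
The companding idea can be sketched with the continuous Mu-law
formula. This is an illustration only: the function names are
hypothetical, samples are assumed to be normalized to the range -1.0
to 1.0, and mu = 255 is assumed as the usual value for 8-bit
telephony. Real telephone codecs use a piecewise-linear approximation
of this curve rather than the formula itself.

```python
import math

MU = 255  # assumed value, as commonly used for 8-bit Mu-law


def mu_compress(x, mu=MU):
    """Compress a sample in [-1.0, 1.0]; small values are stretched."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Invert mu_compress, recovering the original sample."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


for sample in (0.01, 0.1, 0.5, 1.0):
    compressed = mu_compress(sample)
    print(f"{sample:5.2f} -> {compressed:.4f} -> {mu_expand(compressed):.4f}")
```

A quiet sample such as 0.01 is mapped to more than a fifth of the
output range, so it keeps far more of the available 8-bit levels than
it would under a linear code.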
MULE
-
MUlti Language Editor. Extended part of EMACS.
NFS
-
Network File System. A distributed file system that provides
transparent access to files residing on remote disks. Developed at
Sun Microsystems in the 1980s.
NIST
-
(American) National Institute of Standards and Technology.
NLP
-
Natural Language Processing.
NN
-
Neural Network.
NP
-
Noun Phrase. HLP tag used to denote an input word as a Noun.
nus
-
Non-Uniform (unit) Selection.
nuuph
-
Not an acronym. The `nuu' is the Greek letter `Nu'. Japanese
phoneme set.
NUUCEP
-
Not an acronym. The `NUU' is the Greek letter `Nu'. NUUtalk
CEPstral synthesis routines.
OAPD
-
Oxford Acoustic Phonetic Database. Contains data on vowel-consonant
and consonant-vowel combinations in both stressed and unstressed
locations.
PhoneStream
-
Series of symbols representing the phonemes of an utterance.
Attached to the WordStream.
PhonoWord
-
Type of input accepted by CHATR. Allows specification of
prosodic phrases and intonation features. Utterance is tagged
with four letters (D=Discourse, S=Sentence, C=Clause and P=Phrase)
to specify phrase levels, and other letters (e.g. H and L) to
indicate emphasis and accent.
PP
-
Prepositional Phrase. HLP tag used to denote an input word as a
Preposition.
PphraseStream
-
Series of symbols representing the prosodic phrases of an utterance.
Attached to the WordStream.
PSOLA
-
Pitch Synchronous Over-Lap and Add. Algorithm to independently modify
the fundamental frequency and duration of a speech signal. Used
during concatenation of selected units from a finite speech database
such that minimal prosodic damage occurs due to target/selected unit
mismatch.
RCS
-
Revision Control System. A system that keeps track of different
versions of files. While one person is editing a source, no
other developer may do so, so all sources are read-only by
default. When a developer checks out a file, they may change it,
but no other developer may check it out at the same time. When
the developer has finished, they check the file in, allowing
others to check it out.
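
The check-out and check-in rule described above can be modelled with
a toy Python class. This is a sketch of the locking idea only, not of
how RCS is implemented; the class, its methods and the file and
developer names are all hypothetical.

```python
class LockedFile:
    """Toy model of a source file that one developer at a time may edit."""

    def __init__(self, name):
        self.name = name
        self.locked_by = None  # None means the file is read-only for all

    def check_out(self, developer):
        """Take the lock, or fail if someone else already holds it."""
        if self.locked_by is not None:
            raise RuntimeError(
                f"{self.name} is already checked out by {self.locked_by}")
        self.locked_by = developer

    def check_in(self, developer):
        """Release the lock; only the current holder may do so."""
        if self.locked_by != developer:
            raise RuntimeError(
                f"{developer} does not hold the lock on {self.name}")
        self.locked_by = None


source = LockedFile("synth.c")    # hypothetical file name
source.check_out("alice")
try:
    source.check_out("bob")       # refused while alice holds the lock
except RuntimeError as err:
    print(err)
source.check_in("alice")
source.check_out("bob")           # succeeds once the file is checked in
```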
RFC
-
Rise Fall Continuation. A now dated method of tagging
phoneme-sized segments with duration and frequency values.
SegStream
-
Series of symbols representing the segments of an utterance.
Attached to the WordStream.
SGML
-
Standard Generalized Markup Language.
SphraseStream
-
Series of symbols representing the syntactic phrases of an utterance.
Attached to the WordStream.
Stream
-
One of a sequence of cells containing symbols generated and/or
interpreted by CHATR and linked to an utterance (and other
streams). Causes changes in the timing, intonation and prosody of
the synthesized output.
SylStream
-
Series of symbols representing the syllables of an utterance.
Attached to the WordStream.
TIMIT
-
A large speech corpus from TI (Texas Instruments) and MIT
(Massachusetts Institute of Technology).
TLA
-
Three (or sometimes more or less) Letter Acronym. Initials
represent a well (or often un)-known title or description.
ToBI
-
Tones and Break Indices.
tts
-
Text-To-Speech.
ULAW
-
Not an acronym. Pronounced `mew-LAW' - the `U' is actually the
Greek letter `Mu'. An 8-Bit compression code for audio signals
including speech. It is widely used in the telecommunications
field because it improves the signal-to-noise ratio without
increasing the amount of data. It is a companding technique.
That means it carries more information about the smaller signals
than the larger. Sometimes appears in documents written as `Mu-law'
utterance
-
A series of words you wish CHATR to synthesize as speech.
Basically the input to CHATR, in whichever form it may take.
VP
-
Verb Phrase. HLP tag used to denote an input word as a Verb.
VQ
-
Vector Quantization.
WordStream
-
Series of words to be `spoken' by CHATR, derived from the
utterance.
XMG
-
X Multi-Graph. A graphics display program written at CSTR, Edinburgh
University, UK.
http://www.cstr.ed.ac.uk/
XWAVES
-
Not an acronym. A graphics display program from Entropic Research
Laboratory, Inc.
http://www.entropic.com