The CHATR system can be communicated with in a number of ways. It can be used interactively via a command line interface, in batch mode, via a pipe, or in server mode, handling multiple requests from a network. Another possible mode of communication would be as a sub-module of some larger system, in which CHATR C functions are called directly by linking your executable with `libchatr.a'. Such an interface has not yet been fully determined; it can be done now, but it is not as clean as it should be.
The four currently available interaction modes are: interactive, pipe, batch and server. In each of these cases there are two distinct interpretation modes: command mode or text-to-speech (tts) mode. The default is interactive, command mode.
When interactive, a prompt is given and the user can type either commands to the command line interpreter, or text to be spoken in tts mode. The command line interpreter is based on the GNU readline library and hence allows command line editing and history (as in all good shells). The edit commands are EMACS-like. CHATR offers command completion, argument completion, variable name completion, and filename completion. TAB is the default completion key.
To exit interactive mode type either end-of-file (typically ctrl-D) or the letter `q' and return. If the variable chatr_confirm_exit is set to non-nil, confirmation is asked for before CHATR exits.
Commands may go over onto other lines. A secondary prompt is given when a command is incomplete (no closing bracket(s)). At any time the interrupt character (typically ctrl-C) will interrupt whatever CHATR is doing and return to the top level.
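The secondary-prompt behaviour described above can be illustrated with a small Python sketch. It is not CHATR's implementation: the function name is hypothetical, and the handling of double-quoted strings (brackets inside a string are ignored, no escape sequences) is an assumption made only for illustration.

```python
def is_complete(command: str) -> bool:
    """Return True when every opening bracket in `command` has been closed.

    Assumption: brackets inside double-quoted strings do not count, and
    strings contain no escaped quotes.
    """
    depth = 0
    in_string = False
    for ch in command:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
    # Complete when no bracket is left open and no string is unterminated.
    return depth <= 0 and not in_string
```

A reader loop built on this would keep appending input lines, showing a secondary prompt, until `is_complete` returns true.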
In pipe mode CHATR will read commands (or text in tts mode) from `standard in' without prompting. No command line editing is available. End of stream is signaled by end of file. This mode is designed for use in communicating with other programs that generate CHATR commands or text to be spoken. It is not intended to be used for typed (keyboard) input.
In batch mode CHATR does not read commands (or text) from its input channel. Only files specified on the command line are read and processed. This is designed for long jobs where user interaction is not required.
CHATR has a server mode based on a BSD socket. CHATR may be run as a server on a known host and process commands or text received over the network. This version can have many preloaded databases, and particular files can be loaded before server mode is initiated. This allows faster speech synthesis than if the user had to start a new version of CHATR each time--client programs do not need to wait for requested synthesizers to become available. See section Using Server Mode, for a full explanation and example.
In server mode, CHATR first processes its command line arguments (thus allowing configuration), then loops waiting for connections on a known socket. By default this is port number 2234, but the user may select another by setting the variable chatr_server_portnum.
When a connection occurs, a new version of CHATR is forked, giving the client a version of CHATR with much information already preloaded.
One audio output mode that is specifically designed for server use is the socket mode. This allows synthesized audio output to be sent back to the client machine. When a client first connects, it should identify a socket that is waiting for (ulaw 8k) data. This is done by the Audio command. For security reasons that command may not be available to arbitrary network client programs, so a single high level function is provided. Once connected, a client program should send something like
(Output_To `as71.itl.atr.co.jp' 4444)
Note the apostrophes. The machine name may be either a name or a number (e.g. IP address `133.186.36.171'). There should be a server on that machine waiting to receive connections on the specified socket number. The audio output may then be received.
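As a rough illustration of a client, the Python sketch below opens a TCP connection to a server and sends command text. The function names are hypothetical, and the wire format (one command per line, ASCII, newline-terminated) is an assumption; the text above specifies only the default port 2234 and the form of the Output_To command.

```python
import socket

DEFAULT_PORT = 2234  # default server port stated in the text


def build_payload(commands):
    """Join commands into one byte string, one command per line (assumed framing)."""
    return "".join(cmd + "\n" for cmd in commands).encode("ascii")


def send_commands(host, commands, port=DEFAULT_PORT, timeout=5.0):
    """Connect to a server at host:port and send the commands.

    Returns the number of bytes sent. No reply is read in this sketch.
    """
    payload = build_payload(commands)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)
```

A client would typically pass the Output_To command first, copied verbatim from the example above, followed by whatever safe commands the server permits.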
Next, a few safe commands may be given. CHATR offers a number of commands which would allow clients access to the server's file system, so the server will want to restrict which commands are available to arbitrary clients. Therefore, for security reasons, the server side may set the variable chatr_secure_functions to a list of functions which a client is allowed to call. These functions may call other functions, but in that case the server is responsible for their content and can ensure that they are safe. This may have to become more strict as system security becomes a bigger issue.
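The idea of a list of permitted functions can be sketched in Python as follows. This is an illustration only, not how CHATR performs the check: both function names are hypothetical, and the sketch inspects only the top-level command name, not any nested forms in the arguments.

```python
def command_name(command: str) -> str:
    """Return the first symbol of a bracketed command, or '' if there is none."""
    body = command.strip()
    if body.startswith("("):
        body = body[1:]
    parts = body.replace(")", " ").split()
    return parts[0] if parts else ""


def is_permitted(command: str, secure_functions) -> bool:
    """True when the command's top-level function is in the permitted set."""
    return command_name(command) in secure_functions
```

For example, with a permitted set containing only Output_To and tts, a client request such as a load command would be refused.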
Basically, the commands available are the Output_To and tts commands. Speaker selection should be available too. The exact availability depends on how the server is started, but this does offer some control over the services being offered.
The program chatr_pipe offers a very simple example of a CHATR client program. It opens a connection to a server and sends down some initialization commands, then reads text from `standard in' and sends it to the server to be synthesized. chatr_pipe also starts a server to receive the synthesized waveform and writes the ulaw 8k data directly to `standard out'.
Commands can then take the form
echo hello world | chatr_pipe -h as71 >/dev/audio
In any of the four interaction modes the input data may be interpreted in one of two modes.
In command mode everything given to CHATR is treated as a CHATR command from CHATR's Lisp-like command language. Commands are of the general form
(command_name arg~1 arg~2 ... arg~n)
Commands start with an opening bracket which may seem awkward for those not used to Lisp. Single (non-bracketed) atoms are treated as variables and their value returned. There is one special command which does not require parentheses: `q', which quits the system. See section CHATR Command Language, for more details of the language.
In tts mode, everything that is given to CHATR is treated as text to be rendered as speech. There are a number of options available depending on what form the text is presented in. Some exist to deal with Japanese written in romaji or Kanji/Kana. See section Making CHATR Speak, for some examples. Files containing mixed English/Japanese can also be used as input. The appropriate language synthesis system will be selected. See section Multi-lingual Text Processing, for further details.
A simple method is provided for embedding commands within the text, allowing more control for the user without having to use the Lisp-based command interface. Basically, key words may be defined which denote CHATR commands. These typically select different speakers etc. In the default system, all these commands start with the character `@', but users may define any symbol to be a command. Embedded commands are defined in the variable tts_esc in the library file `$CHATR_ROOT/lib/data/tts.ch'. They include speaker selection commands such as @f2b, @wnc600, @MHT etc.
When started, unless -q is specified, CHATR will first load the user's `.chatrrc' file. Then, if files have been specified as arguments, each will be loaded and evaluated in turn. In interactive or pipe mode, CHATR will then read from `standard in'.
The command line syntax is
chatr options file~0 file~1 ...
Where options are
-i or --interactive
-p or --pipe
-b or --batch
--server
--libdir Library-Directory-Name
This allows a version of CHATR to run with a different library directory than the one named at compile time.
-h or --help
-q
-v or --version
-tts
-jtts
-mtts
See the tts_esc variable definition in the file `$CHATR_ROOT/lib/data/tts.ch' for a list of defined `include' commands.
After every option is processed, remaining arguments are treated as filenames and loaded as CHATR command files. However, if the file name starts with a left parenthesis, it is treated as a CHATR command and interpreted as such(2). In tts mode the files are treated as text and spoken. Commands (in non-tts mode) may be specified on the command line. For example, suppose we have a file called `tests.ch' with many commands in it and wish to run the function test2 after loading that file. This may be done in batch mode on the command line by entering
chatr -b tests.ch "(test2)"
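The rule for the remaining arguments (a leading left parenthesis marks a command, anything else is a file name) can be sketched in Python. The function name is hypothetical, and the sketch assumes the options have already been removed from the argument list.

```python
def classify_arguments(args):
    """Label each non-option argument as a 'command' or a 'file'.

    An argument starting with a left parenthesis is treated as a command,
    following the rule described in the text; everything else is a file name.
    """
    return [("command" if arg.startswith("(") else "file", arg) for arg in args]
```

Applied to the two arguments of the example above, this labels `tests.ch` as a file and `(test2)` as a command, preserving their order.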
The command language for CHATR is basically a small Lisp system, though it is in fact more like `Scheme'. As has already been shown, commands can be called directly. In addition to top level commands that call CHATR internal C functions, the input language also supports variables and functions. It also supports the basic library functions one would expect from a tiny Lisp interpreter.
Variables may be set using the set command. This is more like the Lisp setq command (or Scheme set!), as it does not evaluate the first argument. Variables are usually used to hold utterances, though they can hold any value, including lists.
Function definitions have the following syntax
(define name arg-list command~0 ...)
If the first command in a function is a string, it is treated as a document string and returned by various help functions.
For those interested, CHATR is dynamically scoped. A variable name will evaluate to its most locally named occurrence, and set will set its most locally named variable. That is, argument names may be the same as global variables, but you will always refer to the argument while within that function. Also, because it is dynamically scoped, you may refer to argument names in the scope of the function caller. Anonymous functions (cf. lambda functions) are available via the function named function.
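Dynamic scoping as described above can be modelled in Python with a stack of binding frames. This is an illustrative model, not CHATR's interpreter; the class and function names are hypothetical.

```python
class DynamicEnv:
    """A stack of frames; lookups search from the newest frame to the oldest."""

    def __init__(self, **global_bindings):
        self.frames = [dict(global_bindings)]

    def lookup(self, name):
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        raise NameError(name)

    def set(self, name, value):
        # Set the most local existing binding; otherwise create a global one.
        for frame in reversed(self.frames):
            if name in frame:
                frame[name] = value
                return
        self.frames[0][name] = value

    def call(self, params, args, body):
        # Bind the arguments in a new frame for the duration of the call.
        self.frames.append(dict(zip(params, args)))
        try:
            return body(self)
        finally:
            self.frames.pop()


def inner(env):
    # No binding of its own: sees the caller's argument named x.
    return env.lookup("x")


def outer(env):
    return env.call([], [], inner)
```

Calling `outer` with an argument named `x` makes `inner` see that argument rather than a global `x`, which is the behaviour the paragraph above describes; once the call returns, the global binding is visible again.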
Basic standard flow of control, logical operators and equal are included. Two looping functions are available: for and mapc. A simple so-called `naive reverse' of lists can be defined within CHATR as
(define append (a b)
  (if (not a)
      b
      (cons (car a) (append (cdr a) b))))
(define reverse (a)
  (if (not a)
      a
      (append (reverse (cdr a)) (cons (car a) nil))))
A complete list of functions can be obtained by typing Help. Also see section CHATR Commands.
Another example, more appropriate to CHATR, shows how the command language can be used to make CHATR synthesize a number of utterances and save them in a particular directory. For example
(set outdir "/tmp/")
(set examples '("MHT_0001" "MHT_0002" "MHT_0003"))
(speaker_MHT)
(define synth_and_save (name)
  (set utt1 (test_seg name))
  (Save Wave utt1 (strcat outdir name)))
(mapc synth_and_save examples)
The function test_seg is defined for every database. It loads a segmental description of the given utterance, excludes that utterance from the database and then synthesizes it from the remaining units. The function mapc applies its first argument (a function) to each member of the list given as its second argument.
The function test_txt is defined for many (though not all) databases and offers a textual representation of an utterance. It loads the description, excludes that example from the database, and synthesizes from text.
Another option is test_pf, which loads a detailed structure representation of an utterance, excludes it from the database and synthesizes it. However, only a few databases have this defined.
There is currently no comprehensive automatic garbage collection in this Lisp. However, utterances (typically the biggest structures) do have reference counts, allowing them to be garbage collected. It is possible to free data by hand via a free function, but this is difficult to do safely. Later versions will require a real garbage collector. However, for normal use, the small amount of garbage generated will not be a problem. Note that the tts function generates no garbage. It has been used for hours at a time (it could even be days if you are willing to listen).
CHATR offers what may appear to be a bewildering set of commands. Functions may be built in to CHATR, i.e. the function names are directly linked to functions written in C. Also, many user functions have already been defined in Lisp (some are loaded automatically at startup time) to make CHATR easier to use.
This section gives a brief summary of some of the more often used commands. Each definition is followed by a few examples.
set
(set radiodir "$CHATR_ROOT/usr/home/BU-RADIO/f2b/")
(set test_utts '("f3ast01p1" "f3ast01p3"))
(set utt1 (Utterance Text "A simple test"))
load
(load "cep_setup.ch")
(load "f2b_stats.ch")
(load "HLP_coded_announcement")
load_library
Searches for the named file in the directories listed in load_path. This works in the same way as EMACS library access. The standard library includes a number of useful files.
(load_library "xwaves.ch") - sets up for Xwaves
(load_library "f2b_dur_nnet.ch") - use f2b durations
Utterance
utt_hook
.
(set utt1 (Utterance Text "This is a simple utterance."))
Synth
(Synth utt1)
Say
Say does just that.
(Say utt1)
Say (Synth
(Say (Synth (Utterance Text "This is a simple utterance.")))
Still too many commands to remember/type? Try the following SayText function!
SayText
(SayText "This is an even simpler utterance.")
Save
Save takes three arguments: a type (e.g. Wave, Segments, UnitLabels, F0 or other), an utterance, and a filename. If the filename is `"-"', the output is sent to `standard out', useful for all types except of course waveforms.
(Save Wave utt1 "ex1")
(Save UnitLabels utt1 "-")
Waveforms are saved in the format specified by the Wave_FileType variable. Possible values are RAW, NIST, ULAW or ESPS. If using ESPS, the command btosps (convert binary to sps) must be in the user's path.
tts
If the argument is "-", it will read from `standard in'. It will then say everything typed at the prompt. Note that this is performed sentence-by-sentence, so a sentence must be completed before anything happens. Sentences are terminated by a full stop, question mark, exclamation mark, or blank line. This input mode is terminated by an empty sentence, best accomplished by finishing a sentence and then entering a single full stop on a new line.
(tts "war_and_peace")
or
(tts "~/RMAIL")
or
(tts "-")
Hello, my name is Albert Einstein.
Can you tell me how this system works?
I would be interested to know.
.
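The sentence-by-sentence behaviour can be approximated with the Python sketch below. It is a naive illustration, not CHATR's text processing. The function name is hypothetical, and two choices are assumptions: a terminator only ends a sentence when followed by whitespace or the end of a line, and a blank line with nothing pending is ignored rather than treated as an empty sentence.

```python
TERMINATORS = ".?!"


def split_sentences(lines):
    """Collect sentences from input lines until an empty sentence is seen."""
    sentences, current = [], ""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line ends any pending sentence.
            if current.strip():
                sentences.append(current.strip())
            current = ""
            continue
        for i, ch in enumerate(line):
            current += ch
            at_boundary = i + 1 == len(line) or line[i + 1].isspace()
            if ch in TERMINATORS and at_boundary:
                text = current.strip()
                current = ""
                if text.strip(TERMINATORS) == "":
                    return sentences  # empty sentence: input is finished
                sentences.append(text)
        current += " "  # a line break inside a sentence acts as a space
    if current.strip():
        sentences.append(current.strip())
    return sentences
```

A sentence may span several input lines; nothing is emitted for it until its terminator arrives, which mirrors the note that a sentence must be completed before anything happens.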
speaker_<ref number>
speaker_f2b
speaker_wnc600
speaker_MHT
speaker_fmp559
(speaker_f2b)
(Say (Synth (Utterance Text "And here is the news")))
(speaker_wnc600)
(Say (Synth (Utterance Text "But it is not the BBC")))
Especially in a research or development environment, there may well be several versions of CHATR available at any one time. It is of course important to be able to determine which version is running and which are available, so functions have been included to achieve this.
There are several ways to determine which version of CHATR will be started by default or is currently running.
To determine which version of CHATR will be started by the standard chatr command, at the shell prompt type
chatr --version
or if in a hurry type
chatr -v
CHATR will print the version number and then exit back to the shell.
To determine which version of CHATR is presently running, issue the command
chatr_version
CHATR will send the version number currently running to `standard out'. Note that this is one of the few commands that do not need parentheses.
It may be desirable to start a non-current version of CHATR, perhaps to compare the performance of a superseded module against its replacement, or to make use of a new module in a not yet proven release. Either way the method is the same: append the version number required to the usual chatr shell command with a hyphen. As an example, assume the current CHATR version is 0.92. To start the previous version, enter the command
chatr-0.91
To start the not yet released version 0.93, enter the command
chatr-0.93
Of course these names may differ depending on the conventions adopted by the system administrator. However, the above works in the current ATR-ITL environment.
CHATR has a notion of a library directory or directories. The value of the CHATR variable load-path is a list of directory names. By default it contains the name of the CHATR library directory set at installation time. The CHATR library directory contains a number of CHATR command files you will find useful: phoneme set definitions, duration models, intonation statistics, etc. The load-path variable is used in the same way GNU EMACS uses its load-path variable. Certain CHATR commands (particularly load_library) search for file names relative to the directories listed in load-path. The given file name is appended to each directory in load-path in turn and the file is searched for. The first occurrence is used. You may set load-path to include the path names of your own private CHATR libraries too. You may also set the initial run time value of load-path to point to a library directory other than the one set at compile time: the command line option --libdir specifies an alternative initial library directory.
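The search described above (try each directory in order and use the first occurrence) can be sketched in Python. The function name is hypothetical and this is not CHATR's own code.

```python
import os


def find_in_load_path(filename, load_path):
    """Return the first existing file named `filename` under `load_path`.

    `load_path` is an ordered list of directory names; None is returned
    when no directory contains the file.
    """
    for directory in load_path:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Because the first match wins, placing a private library directory ahead of the installed one lets a private copy of a file take precedence over the standard copy.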
To make the use of CHATR more convenient, an EMACS interface is provided. It allows buffers (and regions) to be selected and spoken by CHATR. The interface is menu driven and hence requires EMACS version 19 or later (or MULE-2.1 or later). Japanese is also supported in normal EMACS but of course it will not be displayed properly.
The interface falls into two categories, a menu driven tts mode and an EMACS mode for editing CHATR source files. The second of these is basically a modification of lisp-mode but with a few CHATR specific functions and facilities defined. The menu driven tts mode allows users to select regions (or whole buffers) of text (may be mixed Japanese/English), and have them rendered as speech. Crude control is also available so that various voices may be selected and commands to CHATR explicitly given. Note you must have set up your audio device in your `.chatrrc' file before this interface will work.
To use the EMACS interface, first you must ensure that the file `chatr.el' is in your EMACS load-path, and that the appropriate EMACS Lisp CHATR functions are loaded. To do this, add the following lines to the end of your `.emacs' file in your home directory
;;; Add chatr.el to your load-path
(setq load-path (cons "$CHATR_ROOT/lib/etc" load-path))
;;; Add chatr-menu to top menu-bar
(autoload 'chatr-minor-mode "chatr" "Menu for using chatr." t)
;;; Switch chatr-menu on always
(chatr-minor-mode 1)
;;; run chatr as inferior process
(autoload 'run-chatr "chatr" "CHATR as inferior process." t)
More advanced users may also wish to add the following which defines a CHATR mode for editing CHATR source files.
;;; Lispish mode for editing CHATR command files
(autoload 'chatr-mode "chatr" "Mode for editing chatr files." t)
(setq auto-mode-alist
      (append '(("\\.utt$" . chatr-mode)
                ("\\.chatrrc$" . chatr-mode)
                ("\\.tts$" . chatr-mode)
                ("\\.ch$" . chatr-mode))
              auto-mode-alist))
A further variable you may wish to set identifies the CHATR binary. By default this is set to chatr (which should be in your path). If you wish to use a more recent version of CHATR (that is, a less stable one, but one that will offer more options and possibly better synthesis), also add the following line to your `.emacs' file
;;; To get the latest *dangerously exciting* version of CHATR
(setq chatr-program-name "chatr-alpha")
It is also possible to talk to CHATR as an EMACS inferior process. In fact that is how the chatr-minor-mode (menu mode) works. You may start CHATR with the EMACS command run-chatr. This will start CHATR in a buffer called `*chatr*'.