In addition to command-level use, users of CHATR will also wish to add their own modules to the system (e.g. new synthesis methods, alternative intonation modules, etc.). This section explains how this may be done. Note that although CHATR is flexible, it cannot be flexible enough for all users.
This section also explains many low-level details of the system.
CHATR has been specifically designed with the view that unknown people will introduce unknown modules into the system: new intonation modules, duration modules, waveform synthesizers, or indeed many other additions that the original designers did not even consider. What has been developed is a system in which it is easy to declare, define and call new modules that can fully access an utterance's internal structure and modify it in a desired way.
We wish to allow developers as free a choice as possible, but there are a number of simple rules which will make your life and ours a lot simpler if they are followed.
The following should be considered when writing new code or integrating existing code into CHATR (which may be harder).
Free everything you `malloc'.
When freeing the contents of a stream, use the free function defined for that stream. CHATR may run for thousands of utterances, so even one byte unfreed is one too many. If possible you should use the CHATR provided xfree and xalloc functions for memory management. If all memory allocations go through the same functions, there is a better chance of tracing memory leaks.
Never call `exit'.
Use list_error to exit, which will tidy things up and continue with CHATR's execution. See section Dealing with Errors, for more information. If possible, try to free any free-able memory before calling list_error.
Never `printf' things to the screen.
For warnings use P_Warning, for error messages P_Error, for debug messages P_Debug, and for general messages use P_Message.
Never put absolute path-names in code.
Always use CHATR functions to access CHATR structures.
If a CHATR function exists to do a task, use it.
Try not to add unnecessary functions to the name space.
Never use fixed-size arrays.
Avoid machine-dependent functions.
CHATR is by no means complete and definitely requires further development. If it is difficult to write a particular module, it may be due to CHATR's architecture. Discuss the problem with others to see if an architectural change is required.
As with all large development systems, the ultimate authority lies in the source. That's what actually gets run. This document will never be as up to date as the source. It may well document what was supposed to happen rather than what actually happens. Look at the source for the answer. Also the source is useful to see how other people have tried to do things--you can copy tips and parts of code by looking at similar modules.
The source of CHATR is currently kept under RCS (Revision Control System). RCS is a system that keeps track of different versions of files. However, the primary reason why we use RCS is to ensure that two people cannot edit the same file at the same time.
The recommended use of the source tree is to create a private directory tree of the system, and then build symbolic links in each directory to the RCS files in the CHATR source. See section Installing the System, for more information. GNU make will automatically check-out files from the library as required. (Be careful: other `makes' may not do this).
There are three core structures in CHATR: List, Stream and Utterance. This section describes their use, actual structure, and accessor functions and macros.
List structures are a direct implementation of the list structures that appear in many languages and most explicitly in Lisp and Scheme. Lists offer a generic method for dealing with complex structures. They are an ideal tool for dealing with structured ASCII data. Most of CHATR's non-binary data are described in list structures. A basic list cell can be one of two major types: cons or atom. An atom may consist of a string, a number, an object (a stream or utterance) or a function (either a C function or a user defined function). A cons cell consists of two sub list cells which for arcane historical reasons are called the car and the cdr. (In other systems these may be called first and rest. Most Lisp programmers (and many other people) are much more familiar with the terms car and cdr, so if new terms must be learned, it is better to learn the terms that many other people use, even if they are obscure words.)
Lists can be read (or printed) in an ASCII form which uses parentheses. Internally they are represented by linked structures.
The car of a list is the first item in the list, while the cdr is the remainder of the list. To illustrate this, given the following list
(a b c d)
The car is `a' while the cdr is `(b c d)'.
C functions are defined for testing the type of a cons cell (whose C type is List).
Many C functions are defined for lists: printing, reading, length, reversal, appending of two lists, etc. Their prototypes are given in `$CHATR_ROOT/src/include/list.h'. Users should always use the functions provided for accessing these lists; the internal structure may change but the accessor functions will still work.
Some of the most basic functions are
List mkatom(char *a)
STRVAL(List a)
List cons(List a, List b)
List car(List a)
List cdr(List a)
int list_length(List a)
list_nth(int n,List a)
In addition a number of reading and writing functions are provided for dealing with s-expressions.
LSTREAM *lopen(char *fname,char *mode)
Like fopen but for files containing Lisp expressions. Three related functions exist: lopen_pipe, which given a pipe creates an LSTREAM from it; lopen_stream, which given an existing FILE creates an LSTREAM; and lopen_stdin, which creates an LSTREAM for the interactive `standard in'.
lclose(LSTREAM *fd)
List list_read(LSTREAM *fd)
char *print_cc(List a)
char *pprint(List a)
The following small sample program reads in all the s-expressions in a file and prints the first thing in each list followed by the length of the list it appears in.
    #include <stdio.h>
    #include <list.h>

    int main(int argc, char **argv)
    {
        List a;
        LSTREAM *fd;

        fd = lopen("testfile","r");
        /* read one s-expression at a time until end of file */
        while ((a = list_read(fd)) != LIST_EOF)
            printf("%s: %d\n",STRVAL(car(a)),list_length(a));
        lclose(fd);
        return 0;
    }
Streams contain the real contents of an utterance. An utterance will consist of a number of streams, each with a name. Example stream types are `Word', `Segment', and `Unit'. Each stream consists of a doubly-linked list of cells. The contents of a cell may be a phoneme, a segment, or a word etc. A stream cell of a particular type will have a number of related functions as declared in the file arch/table.c:stream_tab[]. A large number of accessor and manipulation functions exist. These are declared in `include/table.h'.
The contents of a stream cell will be a pointer to a user defined structure. By convention the structure name and stream name are the same.
The following accessor functions exist
Stream new_stream_cell(char *type)
Stream SC_type(Stream cell)
Stream SC_next(Stream cell)
SNIL is defined as the end of the stream.
Stream SC_previous(Stream cell)
SNIL is defined as the end of the stream.
(struct Type*)SC(Stream cell,Type)
SC automatically does the casting, so usage can be as follows
    SC(phone,Phoneme)->name
List sc_relation(char *type,Stream cell)
Returns a list of the cells of the given type which are related to this cell. Note there are macros for many standard types in `table.h'.
void sc_set_relation(char *type,Stream cell,List newvalue)
A stream cell may be deleted with the function delete_stream_cell. This function requires the cell and the whole utterance, as deleting a cell requires that all other pointers to that cell are removed.
New streams may be added to the system by adding a declaration to stream_tab in the file `arch/table.c'. A stream requires a name (a string of characters), a delete stream function (typically sc_delete_stream), and functions to make and free the contents of a cell. The name of the stream should be the same as the name of the structure of its contents. An additional two fields have been added, `load' and `dump' functions, which translate the contents of the cell into a Lisp expression (or from a Lisp expression into the internal form). This allows the X windows utterance inspector program to graphically display the contents of an utterance (as well as certain other functions to use the utterance contents uniformly).
For example, a new stream called "Ninput" could be declared in `table.c' as
    {"Ninput", sc_delete_stream, free_ninput, make_ninput, sc_print_ninput, sc_load_ninput},
Do not forget to give prototypes for these functions in `table.c'. The structure itself (which should not be included in `table.c') can be defined in some other `.h' file, as in
    struct Ninput {     /* Simplest high level romaji input */
        char *text;
    };
Then the make and free functions themselves are of the form
    void *make_ninput(void)
    {
        struct Ninput *ninput = xalloc(1,struct Ninput);

        ninput->text = NULL;   /* always initialize strings and */
                               /* other fields explicitly       */
        return (void *)ninput;
    }

    void free_ninput(void *contents)
    {
        struct Ninput *ninput = (struct Ninput *)contents;

        xfree(ninput->text);
        xfree(ninput);
        return;
    }
Note the use of void is so that the contents of a stream may actually be of any type.
Stream cells may be linked to other stream cells through relations. A function is provided to make those links automatically, based on the types of the stream cells. Two cells may be linked using
link_stream_cells(word_cell,syl_cell);
Other functions are also available for adding individual stream cells to streams in an utterance, or removing them if required.
A few common utilities are offered for regularly used functions. Following a relation actually requires several function calls, so macros for common access functions are defined in `table.c'. The utilities are
Rsyl1(Stream s)
Rseg1(Stream s)
Rword1(Stream s)
To access the actual contents of a stream cell, use the macro SC. As an example, to access the text field of the Ninput cell described above use
    SC(s,Ninput)->text
The contents will depend on the type of cell. The following small example looks through all words and prints specific information about the syllables they contain
    #include "list.h"
    #include "interface.h"   /* for print functions */
    #include "table.h"
    #include "word.h"
    #include "syllable.h"

    void demo(Utterance utt)
    {
        Stream w;
        List syls,s;

        for (w=utt_stream("Word",utt); w != SNIL; w=SC_next(w))
        {
            syls=Rsyl(w);    /* Get list of related syllables */
            P_Message("Word %s:\n",SC(w,Word)->text);
            P_Message(" num of syls %d\n",list_length(syls));
            for (s=syls; s != NIL; s=cdr(s))   /* for each syllable */
            {
                P_Message(" syls: %s\n",SC(STREAMVAL(car(s)),Syl)->text);
                if (SC(STREAMVAL(car(s)),Syl)->lex_stress == TRUE)
                    P_Message(" stressed\n");
                else
                    P_Message(" unstressed\n");
            }
        }
        return;
    }
An utterance contains a number of streams. The number and type of these streams is determined at utterance creation time (via the Lisp level Utterance function which in turn is the C function new_utterance). The basic argument to new_utterance is an arbitrary List structure which is whatever input was given. The returned form is an utterance structure which should be accessed only through the provided interface.
The following utterance access functions exist
Stream utt_stream(char *type, Utterance utt)
void utt_set_stream(char *type, Stream cell, Utterance utt)
The Lisp system allows the definition of functions and setting of variables but variable settings in Lisp are useless unless they can be accessed in C. A number of functions aid the interfacing of the two worlds.
For example, suppose we wish to find the value of the variable test_dir. To find the value of a Lisp variable in C use the function list_str_eval, as in
    List l_test_dir;
    l_test_dir = list_str_eval("test_dir",NULL);
The second argument to list_str_eval() is an error message to be printed if the variable is unset. If no error message is supplied (i.e. the second argument is NULL) and the variable is found to be unset, then NIL (an empty List) is returned and no error message is generated. If an error message is specified and the variable is found to be unset, the function calls list_error() to process the error and hence does not return.
The above only gives a `List' structure (atom or list) in return. To access its internals another function is required. The major types are
    char *STRVAL(List c);
    Utterance UTTVAL(List c);
    Stream STREAMVAL(List c);
    int list_num(List c);
    float list_float(List c);
These functions and macros may call list_error() if given inappropriate arguments. You should check that things are atomic (using atomp()) and of the appropriate type (streamp(), numberp(), etc.) if necessary.
Most simple atoms in CHATR are treated as strings, even though they may consist of digits--though true numbers, floats, and realstrings can be created. Both the functions list_num() and list_float() will return an int or float even if their given argument is a string (or realstring), if it can be given as a valid argument to the C functions atoi() and atof().
As many modules require a number of external parameters, a few extra functions have been added to aid this. The general recommendation for parameters for a module is that a single Lisp variable is set with a list of pairs (in Lisp terms called an assoc-list) defining values for each of the parameters. For example, a typical setting for the ToBI intonation module's parameters is
    (set tobi_params
         '((pitch_accents H* !H* L* L+H* L*+H)
           (phrase_accents H- L-)
           (boundary_tones H-H% L-H% L-L% H-L%)
           (topval 45.0)
           (baseval 25.0)
           (refval 100.0)))
Thus in the ToBI module the external parameters may be obtained by accessing one variable and then the individual parts, using predefined functions for parameter accessing. The parameter setting functions (defined in file `$CHATR_ROOT/src/phrase/futils.c') take three arguments: an association list, a parameter name and a default value. A number of parameter setting functions are defined, one for each major type: number, float, string, list, etc. Thus our parameter initialization in the ToBI module would be
    List params;

    params = list_str_eval("tobi_params",NULL);
    tobi_pitch_accents = param_get_list(params,"pitch_accents",NIL);
    tobi_phrase_accents = param_get_list(params,"phrase_accents",NIL);
    tobi_boundary_tones = param_get_list(params,"boundary_tones",NIL);
    tobi_topval = param_get_float(params,"topval",50.0);
    tobi_baseval = param_get_float(params,"baseval",50.0);
    tobi_refval = param_get_float(params,"refval",120.0);
For completeness, documentation for any parameter variables should be given in the table in the file `$CHATR_ROOT/src/chatr/chatr_vars.c'. A documentation string may be associated with a variable. This string is available in on-line help, and it will also automatically appear in the user manual.
All Makefiles in CHATR refer to all files in that directory. It is important that all files are mentioned in a Makefile so that CHATR can automatically check in, check out, compile, and back up the files that are part of the system. If a new file is to be added to a directory, edit the Makefile and add the new file name to the appropriate line. For `.c' files add it to SRCS. For `.h' files add it to H (add the variable if not already there, and add it to the FILES variable). For other files, add it to FILES (or some other appropriate list). For example, the Makefile for the `$CHATR_ROOT/src/lex/' directory looks like
    # Makefile for synthesizer: lexicon module
    TOP = ../..
    DIRNAME = src/lex
    SRCS = lexicon.c complex.c lextree.c word.c reduce.c
    OBJS = $(SRCS:.c=.o)
    FILES = $(SRCS) Makefile
    ALL = .chatrlib
    include $(TOP)/src/include/default.make.rules
To add a new file, for example `oaldce.c', change the `SRCS' line to
SRCS = lexicon.c complex.c lextree.c word.c reduce.c oaldce.c
To incorporate a new directory, add the name to the list of directories in the `Makefile' in the parent directory. The easiest way is to copy the `Makefile' from a sibling directory and edit it. Remember to redefine the variable DIRNAME! Finally, the directory name must be added to the file `$CHATR_ROOT/utils/mkchatrdirs'.
This section describes how to add a new (completed) module to CHATR. See section Developing New Modules for CHATR, for more detailed aspects of adding new methods to CHATR.
Basically there are three things you must do in order for a new module to be accessible within CHATR: declare, define and call.
A detailed example is given here showing how to add a module that reduces vowels to schwas in de-accented words. Note this is only illustrative, and not intended to be a complete implementation of such a function. Later examples will show other changes to the system.
This new module, reduce_module, will act on utterances and be a conventional utterance module. It will take an utterance at some suitable stage of processing and modify the phonemes in words where it is decided they should be schwa'd.
The first stage is to declare the new module. The file `$CHATR_ROOT/src/chatr/utt_modules.c' should be used for this. It contains a table of utterance modules called com2umfunc. Each entry identifies a module using the following five fields
Lisp Name
C function name
    void reduce_module(Utterance utt);
Such a declaration must appear in this file.
Requires List
Provides List
Documentation string
Thus our entry for our module would be
    {"Vowel_Reduce",reduce_module,NIL,NIL,
     "Reduces vowels to schwas in de-stressed function words."},
Once declared we can define our module. We may wish to build it in a new directory. See section Adding a New Directory, for information on adding a directory to the CHATR structure. Alternatively we may add it to an existing directory or file. Here we will simply include it in the `lex/' directory in a new file. See section Adding a New File, for information on how to update the Makefiles such that a new file may properly become part of the system.
In our new file `reduce.c' we can write our module. We will almost certainly need the following `includes'
    #include <stdio.h>
    #include "alloc.h"      /* basic alloc/free and string functions */
    #include "list.h"       /* List access function */
    #include "table.h"      /* utterance and stream access functions */
    #include "word.h"       /* word cell structure */
    #include "syllable.h"   /* syllable cell structure */
    #include "phoneme.h"    /* phoneme cell structure */
The main function basically goes through each word and checks to see if it is de-stressed. If so, an attempt is made to change the vowel to a schwa.
    void reduce_module(Utterance utt)
    {
        Stream word;

        for (word=utt_stream("Word",utt); word != SNIL; word=SC_next(word))
            if (destressed(word) == TRUE)
                make_schwa(word);
    }
The third and final stage is to call our new module. Normally in CHATR, modules are called through the Lisp function Synth (C function chatr/chatr.c:chatr_synth()). It is possible to add a call there to reduce_module if desired. However, for testing and experimentation purposes, it is possible to call reduce_module through Lisp. We can define a new synthesis function in Lisp as follows. (This offers the same functionality as the existing function Synth for HLP type utterances.)
    (define new_synth (utt)
      ;; Explicit flow of control for synthesis in Lisp
      (Input utt)
      (HLP utt)
      (Word utt)
      (Phonology utt)
      (Intonation utt)
      (Duration utt)
      (Int_target utt)
      (Rfc_Module utt)
      (Synthesis utt))
This function calls the appropriate modules for synthesis. The function Input deserves some explanation: it loads the appropriate streams from the input form given to the function Utterance. This function should always be called at the start of such a synthesis Lisp function. The final function Synthesis does the low level waveform synthesis (by whichever method is currently selected). The middle functions are the interesting ones. We can add our new module and define new_synth as
    (define new_synth (utt)
      ;; Explicit flow of control for synthesis in Lisp
      (Input utt)
      (HLP utt)
      (Word utt)
      (Phonology utt)
      (Vowel_Reduce utt)
      (Intonation utt)
      (Duration utt)
      (Int_target utt)
      (Rfc_Module utt)
      (Synthesis utt))
We use the Lisp name for our function as defined in our utterance module table entry. This position for the call may not be optimal; perhaps it should be within the word module itself. The word module could be defined in Lisp and Vowel_Reduce added to it.
Now we can use the above function instead of Synth and get synthesis to use the new module. For example if utt1 is an HLP type utterance we can use
(Say (new_synth utt1))
A new Lisp command can be added in a similar way to new utterance methods as described above. In `chatr/commands.c' add a new entry for your function. The table com2func defines the commands. The fields are
Lisp name
C function name
    List <name>(List args)
The function is given a list as an argument. The car of that list is the name of the function being called while the cdr is the list of arguments given.
Lambda/Nlambda
If 'L', all arguments are evaluated before the function is called; if 'N', arguments are not evaluated (but the function itself may evaluate the arguments if desired).
Arguments completion state
See the com_args table, also in `chatr/commands.c'. It defines what completions are available for arguments. A value of -1 denotes unknown argument completion, but of course variable, command and file name completion will still occur.
Document string
For functions which simply act on single utterance objects, it may be more appropriate to define them as utterance modules. These are functions of the form
void um_func(Utterance utt);
These may be defined in a table similar to standard functions as in com2func. Utterance modules are defined in the table com2umfunc in `$CHATR_ROOT/src/chatr/utt_modules.c'.
A new synthesis method could be introduced as a completely new utterance module. However, as there are already a number of methods, probably the easiest way is to add to them.
First, choose a name and write the module.
Secondly, in `$CHATR_ROOT/src/chatr/commands.c', the documentation structure for the command Parameter should be updated, along with the argument completion table above it.
The next stage is to allow your new synthesis method to be called. This is done from the function synth/synthesis.c:synthesis(). Add a condition in the obvious way. The new synthesis module should take an utterance as an argument and return a P_Wave structure. This structure is defined in the file `$CHATR_ROOT/src/include/wave.h'.
A dummy example synthesis method is given in the file `$CHATR_ROOT/src/synth/dummy.c'. It shows the basic declaration and how to access the most obvious parts of the utterance structure.
There is already a choice of options in various places in CHATR, and it is much easier to add new modules at these points.
A new module must of course be defined before it can be selected. As an example, if introducing a new duration module, it must be added to duration_module() in the file `$CHATR_ROOT/src/duration/duration.c'. A name should be chosen that can be called as an argument to the command Parameter Duration_Method. The module selection test in the first part of duration_module() should be extended and the new module added in the directory `$CHATR_ROOT/src/duration/'.
Another example, adding a new intonation module, is a little more difficult. The basic selection is in file `$CHATR_ROOT/src/intonation/intonation.c'. Three new functions are required as follows
intone_module().
int_target_module() after duration values have been calculated.
make_F0() in `$CHATR_ROOT/src/intonation/make_f0.c'.
These functions may not do anything in some intonation theories, but all should be defined if it is wished to fit neatly into the current system.
The file `$CHATR_ROOT/src/ruc/ruc.c' should be edited to include new signal processing and concatenation techniques. Again, a particular function is selected by setting a value using the Parameter command.
It is not acceptable for modules to call exit(). CHATR should continue running even if an error occurs. An error system is included in CHATR that allows modules to abandon their execution but still allow CHATR to continue. If you find an error condition call
    list_error(On_Error_Tag);
This returns to a higher level (via a setjmp/longjmp) and allows CHATR to continue. It is wise to tidy up any allocated memory, or any modification made to an utterance, before calling this function. It is also wise to close any files opened in this module before calling list_error().
If you wish to catch errors within your module which occur in a function below, then you can add a catch for errors allowing you to tidy up before calling list_error() or allowing you to continue. For example file/list.c:load_lfile() loads and evaluates Lisp commands in a file. If an error occurs while executing the commands in that file, we wish to close that file before we return to top level with an error message. This is done by adding a new return point within the function so that if an error occurs there is a chance to close the file. The general format of such a guard is
    List_Error_Tag local_tag;
    FILE * volatile fd = NULL;

    local_tag = On_Error_Tag;
    if (list_onerror(On_Error_Tag))
    {   /* Gets here only if error occurs in else clause */
        /* do tidy up */
        if (fd != NULL)
            fclose(fd);
        reset_error(On_Error_Tag,local_tag);
        /* Send error to higher level */
        list_error(On_Error_Tag);
    }
    else
    {   /* normal calls */
        fd = fopen(...);
        reset_error(On_Error_Tag,local_tag);
        return /* whatever */;
    }
Note that due to the implementational semantics of setjmp/longjmp, care must be taken when accessing variables during execution of the error condition. Optimization strategies used may result in the values of local variables not being restored properly. This can cause serious problems when trying to tidy up before passing the error further back. To deal with this, all variables used within the error case should be declared volatile. The above example shows how such a declaration could be used with a pointer. This also should be done with care--check existing code for further examples.