

Tools

Various general purpose subsystems exist within CHATR that allow the creation of new modules and the parameterization of existing ones.

Feature Functions

Feature functions are used to obtain detailed information from an utterance. A feature function examines the specified stream cell and returns a value in the form of a character string. The mechanism is designed so vectors of parameters can be both easily and uniformly extracted.

A large number of feature functions have already been defined and can be found in file `$CHATR_ROOT/src/chatr/feats.c'. To obtain the names and a short description, issue the command

     (Feats_Out)

When used as a Lisp function, this command takes three or four arguments: an utterance, a stream name, a list of features to extract and, optionally, a filename to write them to.

It is possible to get sets of vectors for items in a stream within an utterance. For example, suppose the vowel type, lexical stress, accent and duration of all syllables are required. Use the command

     (Feats_Out utt1 'Syl '(syl_vowel stress Syl_tobi_accent Syl_dur))

A list of values will be returned for each syllable in `utt1'.

This function is especially useful in dumping features from PhonoForm utterances. See section PhonoForm Utterance Types, for a full description.

A further use would be for extracting data to use in an external training system or possibly one of the training sub-systems within CHATR. See section Linear Regression, for details.

Note that some of these functions are very specific to particular utterance types and not all are general purpose. However, most have already been used within the system by the various decision tree, duration and intonation systems that use linear regression and neural nets.

If you are writing new functions they should be added to the table in the file `feats.c'. Care should be taken that they do not leak memory. This is not usually a problem, but if numbers are to be returned they must be converted to strings, and those newly allocated strings would otherwise never be freed. In this rare case, copy the technique used in the function syl_dur (or any of the other ???_dur functions defined in `feats.c'), where the new string is added to a list of strings to be freed later. Note that returning numbers is actually quite rare--in most situations quantized versions are quite adequate. At the risk of sounding like a lecture, it must be stressed that memory leaks can cause a lot of trouble and efforts should be made to avoid them in all instances.

Linear Regression

Linear regression is used in a number of different places within CHATR, such as intonation and duration modelling, as well as weight training for unit selection. It is also available through a Lisp interface so models can be trained within CHATR. Full experiments would normally be done away from CHATR in something like S-Plus or MATLAB, but once a model is decided upon it is easier to train within CHATR, since the data is much more accessible inside the system than from outside.

Two Lisp functions provide the interface, one for training and one for testing. Data for training and testing can be obtained using `feature functions'. See section Feature Functions, for details. See section Training a ToBI-Based F0 Prediction Model, for an example of model building using linear regression.

First, collect vectors into a file. A vector consists of a value (to be predicted) and a set of feature values. These should be put in a file, one vector per line, with parentheses at the start and end, so that each vector is readable in one Lisp read.
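For example, a vector file for predicting syllable duration from three features might look like this (the values here are purely illustrative):

```
(0.210 1 H* 3)
(0.140 0 NONE 1)
(0.185 1 L* 2)
```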

Given such a vector file and a list of feature names (including a name for the first item in a vector list), use the command

     (Linear_Regression_Detail vector_file features_names '2)

The results consist of a list of details about various aspects of the linear regression model built. The third parameter controls the amount of information that is returned by the call. The following values are valid

0
Returns a list of floats, the intercept plus the weights for each feature.
1
Returns a list of items: a list of pairs consisting of feature name plus weight; the intercept; the percentage variance described by the model; the correlation; and the correlation of each individual feature with the dependent variable.
2
As type 1 plus: the weight times standard deviation for each feature (shows their relative contributions); a list of dropped features, i.e. those with weight 0; the contribution of each feature; the stepwise contribution giving the order (actually an order) of importance of each feature.

It is recommended that the results from linear regression be checked to determine which features are contributing the most, least or nothing at all (i.e. are fully predictable from other features).

A function called Linear_Regression_Predict allows the testing of models, typically against a test set. Like Linear_Regression_Detail described above, it requires a vector file in the same format: each vector consists of a value to be predicted (the dependent variable) followed by the features used to predict it, with parentheses at the start and end. A `Model' is also required, consisting of a list of floats, the first being the intercept and the rest the weights for each feature (i.e. as returned by a level 0 linear regression, or as extracted from the more detailed results). An optional third argument to Linear_Regression_Predict specifies a file to which predicted values are written, so external tests may be carried out. The correlation and statistics for the mean error are printed to the screen.
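Applying such a level 0 model is just a weighted sum. The following sketch (our own helper, not CHATR's internal lr_predict) shows the arithmetic:

```c
/* Sketch of applying a level 0 linear regression model: a list of
   floats where the first is the intercept and the rest are the
   weights for each feature value. */
double lr_apply(const double *model, const double *feats, int n_feats)
{
    double pred = model[0];          /* intercept */
    int i;

    for (i = 0; i < n_feats; i++)
        pred += model[i + 1] * feats[i];
    return pred;
}
```

For a model (1.0 2.0 0.5) and feature values 3.0 and 4.0, the prediction is 1.0 + 2.0*3.0 + 0.5*4.0 = 9.0.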

A representation of a linear regression model structure is provided internally (with conversion functions from Lisp) for use within modules.

Note when using any of these functions in your own C code you must include the line

     #include "lr.h"

The requirements are: a Lisp list of pairs of feature name plus weight, plus a pair named Intercept. An optional third value may appear in the (misnamed) `pair'; this should be a feature map name.

Feature maps are defined in the variable feature_maps and consist of a name and a list of values. If a feature returns a value in a feature map's list then its value is treated as 1, otherwise it is assumed to be 0. This is designed to deal efficiently with category-valued features. An example best illustrates this. Suppose we have a feature that returns the ToBI accent label on a syllable. Labels like H* and L* are of course not valid numbers and hence cannot be used directly in a linear regression model. However, if we convert this feature with a feature map we can give it to the LR model. Assume we have the following feature map

     (set feature_maps
       '((tobi_accent_0 H*)
         (tobi_accent_1 !H*)
         (tobi_accent_2 L*)
         (tobi_accent_3 L+H* L+!H* H+!H* L*+!H L*+H)))

In our linear regression model we can then specify pairs as

     ...
     (tobi_accent 13.3743 tobi_accent_0)
     (tobi_accent 12.5658 tobi_accent_1)
     (tobi_accent -17.0276 tobi_accent_2)
     (tobi_accent 5.6093 tobi_accent_3)
     ...
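The effect is that each mapped pair behaves as a 0/1 indicator input. The following sketch (hypothetical names; the real conversion happens inside make_lr_model) shows the arithmetic: for an H* accent only the tobi_accent_0 pair fires, contributing its weight of 13.3743.

```c
#include <string.h>

/* Hypothetical sketch of how a feature map turns a category value
   into 0/1 inputs.  Each model pair (tobi_accent WEIGHT MAPNAME)
   contributes its weight only when the feature's value appears in
   that map's list of values. */

typedef struct {
    double weight;
    const char **map;   /* NULL-terminated list of values in the map */
} MappedPair;

static int in_map(const char *value, const char **map)
{
    int i;

    for (i = 0; map[i] != NULL; i++)
        if (strcmp(value, map[i]) == 0)
            return 1;
    return 0;
}

/* Sum the contributions of a set of mapped pairs for one value. */
double mapped_contribution(const char *value, const MappedPair *pairs, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; i++)
        if (in_map(value, pairs[i].map))
            sum += pairs[i].weight;   /* this input is 1 */
    return sum;                        /* all other inputs are 0 */
}
```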

Using C we can convert such a Lisp representation of a linear regression model to a more efficient internal one and then use it. This short example illustrates such a use

     List lisp_lr_model;
     LR_Model lr_model;
     Stream s;

     lisp_lr_model = list_str_eval("lr_model","no LR model set");
     lr_model = make_lr_model(lisp_lr_model);

     for (s=utt_stream("Segment",utt); s != SNIL; s=SC_next(s))
            SC(s,Segment)->duration = lr_predict(s,lr_model);

Neural Nets

CHATR includes a subsystem for training, testing and using neural nets. Although it offers some flexibility, it was designed specifically to implement Campbell's duration theory, so the basic structure is fixed; however, the number of inputs, hidden nodes and outputs can be chosen.

Each net consists of an input layer, a hidden layer and an output layer. All input nodes are connected to all hidden nodes and all hidden nodes are connected to all output nodes. The nets are used to produce a single value between 0 and 1. This we understand is a slightly unusual use of neural nets.

Input values must be single digits between 0 and 9. If all training data inputs are 0 or 1, the inputs are treated as binary and re-scaled. The actual inputs (and outputs) are re-scaled to be greater than 0 but less than 1; however, the scaling factors are internal to the implementation, so users need not worry about them. The outputs are linearly scaled from the examples in the training set by the function

     O_net = (O_actual - Min_out + epsilon) / (Max_out - Min_out + epsilon)

Some other mapping function may prove more appropriate.
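The scaling and its inverse can be sketched as follows (the function names and the inverse are ours; CHATR performs this scaling internally):

```c
/* Sketch of the linear output scaling above: map an actual output
   value into (0,1] for the net, and back again. */

double nn_scale_out(double actual, double min_out, double max_out,
                    double epsilon)
{
    return (actual - min_out + epsilon) / (max_out - min_out + epsilon);
}

/* Inverse mapping: recover an actual output from a net output. */
double nn_unscale_out(double net, double min_out, double max_out,
                      double epsilon)
{
    return net * (max_out - min_out + epsilon) + min_out - epsilon;
}
```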

A training set should consist of a list of pairs of inputs (a string of digits) and an output value (a float). For example

     7710000063600011000110026401    70.0
     7101010636300111001110264411    270.0
     1010116363301111011112644410    220.0
     0104103633311111111116444401    190.0
     1041016333011110111114444412    400.0
     0410113330011100111104444221    310.0
     4100113300011000111014442412    230.0
     1000103000310000110104424821    140.0
     0000000003300000101004248811    190.0
     0001000033000001010002488210    150.0

The training set is best stored in a single file. The training function takes the following arguments

     (NN_Train PAIRS-LIST|IN-FILE OUT-FILE ITERATIONS)

The first argument may be a Lisp list of input/output pairs, or the name of a file that contains such a (non-bracketed) list. (The non-file option was really only added for very simple tests and debugging of the system.) The OUT-FILE argument names a file where a Lisp representation of the neural net will be saved at the end of training or at check points. That representation is suitable as input to the other neural network functions. Finally, ITERATIONS is the number of times this training session should repeat.

Other parameters for the training may be specified in the variable nn_params. If set, this should consist of a list of pairs, each pair being a parameter name and a value. The supported parameters are

n_hidden
A number specifying the number of hidden nodes this net should have.
check_pt
A number specifying the number of iterations between check points.
check_pt_action
A list of actions to do at a check point. There are three actions available
save
Save the current net in the output file.
error
Print the current mean error.
list
Print one cycle of input/output pairs (probably too much for most users).
check_pt_func
A Lisp function to be evaluated at check points. This can be useful for doing arbitrary tests. It is recommended to set it to run NN_Test on the training and test sets.
start_net
A Lisp representation of a net as saved in OUT-FILE at the end of a training session or at a check point. This net is used as the starting point, so training sessions can continue after being stopped. The net representation includes the number of iterations it has already executed. The ITERATIONS argument to NN_Train is then the number of additional iterations required this time, not a total that includes the iterations already made with the starting net.

A typical training session may be started as in the following example

     (define ccc ()
       (NN_Load (load "rad_sdur9"))
       (print (NN_Test "rad_sdur5.netdata"))
       (print (NN_Test "rad_sdur5.test")))

     (set nn_params '((n_hidden 20) (check_pt 50) 
                      (check_pt_action save error)
                      (check_pt_func ccc)))

     (NN_Train "rad_sdur5.netdata" "rad_sdur9" '40000)

When continuing an interrupted training session, a typical restart command-set is

     (define ccc ()
       (NN_Load (load "rad_sdur9"))
       (print (NN_Test "rad_sdur5.netdata"))
       (print (NN_Test "rad_sdur5.test")))

     (set nn_params '((n_hidden 20) (check_pt 50) 
                      (check_pt_action save error)
                      (check_pt_func ccc)))
     (set nn_params (cons (list 'start_net (load "rad_sdur8"))
                    nn_params))

     (NN_Train "rad_sdur5.netdata" "rad_sdur9" '40000)

The function NN_Load takes one argument, a Lisp representation of a net as saved by a training session. It stores it as the current net, though this notion of current net is only used in testing and training and not when these nets are actually used in the duration model.

The function NN_Test takes as an argument a list of input/output pairs or a file name containing such a list (i.e. the same form as the first argument to NN_Train). It tests that list with respect to the current net and returns a list of three values: the mean error, the RMS error and the standard deviation of the error.

Decision Trees

Decision trees, sometimes referred to as discrimination trees, are used by a number of sub-systems within CHATR. All are of the format

     dtree:           condition-node | leaf-node 
     condition-node:  (condition true-node false-node)
     true-node:       dtree
     false-node:      dtree
     condition:       (featname binary-operator value) |
                      (featname in value value value ...)
     binary-operator: < > = <= >= 
     value:           integer, real or string
     leaf-node:       list whose car is atomic

featname should be a feature function as defined in `src/feats.c'. The actual leaf node may be anything: a probability distribution, a particular class, a linear regression model, or whatever is required. Trees are applied with respect to a particular stream cell. The feature named in a condition is called for that stream cell and the result tested against the condition. If the condition is true, the function recurses and applies the true node to the stream cell; if it fails, it recurses and applies the false node to the stream cell. When a leaf node is reached it is returned. As an example, a tree for predicting reduction on syllables is

     (set reduce_tree '
        ((foot in 8 9 5 7 6  ) 
            (  0.9361 0.007192 0.05675 unreduced)
            ((accented = 0)
             ((pbi < 3.5) 
                 (  0.2895 0.6295 0.081 reduced)
                 ((bi < 0.5) 
                     (  0.3242 0.4341 0.2418 reduced)
                     (  0.647 0.3406 0.01244 unreduced)))
             ((pbi < 0.5) 
                 (  0.1538 0.7692 0.07692 reduced)
                 (  0.8138 0.1517 0.03448 unreduced)))))
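A single condition test can be sketched as follows (hypothetical code; the real traversal is performed by dt_decide, used in the example below). Walking the tree above with pbi = 3.0 on an unaccented, non-footed syllable, for instance, the test (pbi < 3.5) succeeds and the `reduced' leaf is returned.

```c
#include <string.h>

/* Hypothetical sketch of evaluating one dtree condition: the feature
   value for the stream cell is tested with a binary operator against
   the threshold value from the condition node. */
int dt_condition(double featval, const char *op, double value)
{
    if (strcmp(op, "<") == 0)  return featval < value;
    if (strcmp(op, ">") == 0)  return featval > value;
    if (strcmp(op, "<=") == 0) return featval <= value;
    if (strcmp(op, ">=") == 0) return featval >= value;
    if (strcmp(op, "=") == 0)  return featval == value;
    return 0;  /* unknown operator: treat as false */
}
```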

It is not difficult to use a decision tree within C code. Note the decision is always made with respect to a Stream cell, in this case a syllable. The feature functions named should be appropriate for the Stream cell type given. For example

     #include "disctree.h"

     tree = list_str_eval("reduce_tree","No reduce tree");
     for (s=utt_stream("Syl",utt); s != SNIL; s=SC_next(s))
        {
        class = dt_decide(s,tree);  /* returns leaf node */
        if (list_sequal("reduced",list_last(class)))
           reduce_syl(s);
        }

Decision trees may be created externally by any CART-like system, converted to the above Lisp format and then easily used within CHATR.

