Screenshot of the Voice Training GUI for Vicor's RIDS Extraction Subsystem

The GUI is written in Perl/Tk, and communicates with a backend server written in C++. The server is linked with the voice recognition toolkit libraries provided by Entropic, Inc.

The tool lets a user improve that user's Hidden Markov Models (HMMs) with recently or newly recorded utterances drawn from the main application grammar.

The user can train from phrases that were captured and saved during the normal course of the day's work, from a set of phrases generated for them, or from both.

In the former case (shown), each phrase has been saved as a wave file and given a relative score based on the recognition confidence over all tokens in the phrase.
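
As a rough illustration of how such a relative score could be derived, the sketch below averages per-token confidences and scales the result against the best phrase in the list. The averaging rule, the scaling, and the names `phrase_score` and `relative_scores` are assumptions made for this example; the actual formula used by the subsystem is not described here.

```python
from statistics import mean


def phrase_score(token_confidences):
    """Average the per-token confidences; an empty phrase scores 0.0.

    Assumption: the arithmetic mean stands in for whatever
    aggregation the real system applies.
    """
    if not token_confidences:
        return 0.0
    return mean(token_confidences)


def relative_scores(phrases):
    """Scale each phrase's average so the best phrase in the list gets 1.0.

    `phrases` maps a hypothetical wave-file name to its token confidences.
    """
    raw = {name: phrase_score(conf) for name, conf in phrases.items()}
    best = max(raw.values(), default=0.0)
    if best == 0.0:
        return {name: 0.0 for name in raw}
    return {name: score / best for name, score in raw.items()}
```

Scoring relative to the best phrase, rather than on an absolute scale, is one way to make the weakest recordings stand out as candidates for re-recording.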

The user can sort the list based on any column heading, and can select a subset of phrases by dragging the mouse over them.
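
The sort and drag-select behaviour can be sketched independently of the toolkit. The example below is written in Python rather than the GUI's Perl/Tk, and the column names and helper functions are hypothetical; it shows only the list logic, not the widget code.

```python
def sort_by_column(rows, column, descending=False):
    """Return the rows ordered by the value under one column heading."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


def select_range(rows, first, last):
    """Return the rows covered by a drag between two row indices.

    The drag may run upward or downward, so the indices are ordered first.
    """
    lo, hi = sorted((first, last))
    return rows[lo:hi + 1]


# Hypothetical phrase list with assumed column names.
phrases = [
    {"phrase": "left upper lobe", "score": 0.91},
    {"phrase": "no acute disease", "score": 0.42},
    {"phrase": "heart size normal", "score": 0.67},
]

by_score = sort_by_column(phrases, "score")
selected = select_range(by_score, 1, 0)  # drag upward over the two lowest-scoring rows
```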

The user can 'review' any phrase by clicking to hear a playback of the wave file.

The user may also record or re-record any phrase, or record all selected phrases in a continuous prompt mode.

When satisfied with the phrases, the user commits them for use in creating the HMM transform, which can be treated as either a 'first-time' or a cumulative transform.

[Reproduced with permission from Vicor Inc. 2004]