VoiceXML Review - Columns

Volume 1, Issue 3 - Mar. 2001

VoiceXML in the Large

By Rob Marchand

(Continued from Part 1)

Abstraction of Components

Abstraction of application components is an important software development technique. In the case of VoiceXML, much of the abstraction can be delegated to the server-side implementation.

Once you start developing VoiceXML applications, however, you will soon find that there are certain dialogs that you need over and over again: for example, the confirmation of input, or the collection of a specific piece of information such as a phone number. Vendors in the speech industry have invested a great deal of effort in designing the 'call flow', grammars, and prompts for these dialogs, so that they are very good at accurately gathering the information in which you're interested. Wouldn't it be great to be able to reuse that experience? Well, with the VoiceXML <subdialog> tag, you can.

Subdialogs allow you to encapsulate a reusable dialog, and reuse it in many applications. A subdialog provides the following features:

A subdialog can be 'built-in', defined in the VoiceXML page, or referenced as a server-side component;
A subdialog is parameterized (using the <param> tag), allowing the tailoring of the subdialog in very powerful ways;
Invocation of a subdialog can include handling of events in the same manner as a field (and in fact is used like a field in a form);
A subdialog returns a set of variables to the calling document (using the <return> tag).
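
As a rough illustration of the features listed above, here is a minimal sketch of a confirmation subdialog defined in the same page as its caller. It is written against VoiceXML 2.0 syntax, which is an assumption, as are the form names (order, confirm), the variable names (quantity, heard, confirmed, check), and the prompts. It also assumes the platform supports the built-in "number" and "boolean" field types.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Calling dialog (hypothetical): collects a quantity, then confirms it. -->
  <form id="order">
    <field name="quantity" type="number">
      <prompt>How many tickets would you like?</prompt>
    </field>

    <!-- The subdialog is used like a field: it has a name and a <filled> handler. -->
    <subdialog name="check" src="#confirm">
      <!-- Passes the value in; "heard" must match a <var> declared in the subdialog's form. -->
      <param name="heard" expr="quantity"/>
      <filled>
        <!-- check.confirmed is the variable handed back by <return>. -->
        <if cond="check.confirmed">
          <prompt>Thank you.</prompt>
        <else/>
          <!-- Clear both items so the form asks for the quantity again. -->
          <clear namelist="quantity check"/>
        </if>
      </filled>
    </subdialog>
  </form>

  <!-- Reusable confirmation dialog (hypothetical). -->
  <form id="confirm">
    <!-- Receives the value supplied by <param> in the caller. -->
    <var name="heard"/>
    <field name="confirmed" type="boolean">
      <prompt>I heard <value expr="heard"/>. Is that correct?</prompt>
      <filled>
        <!-- Returns the yes/no answer to the calling dialog. -->
        <return namelist="confirmed"/>
      </filled>
    </field>
  </form>

</vxml>
```

Pointing src at a separate document or a server-side URI instead of "#confirm" would let the same confirmation dialog be shared across applications without changing the calling form.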

Samples of useful subdialogs include:

Credit card number collection;
Confirmation of a previous input;
Collection of a date (including support for different inputs, like "yesterday", "next Monday", or "July 25, 2001").

Subdialogs can be used to build much larger user interactions, while avoiding common mistakes and taking advantage of hard-won experience.

Application Partitioning

One of the most common questions asked by new users is "how do I partition my application between the client and server?" Our answer to this is usually "pick the right tool for the right job." You have the following tools available to you:

VoiceXML, controlling real-time speech, DTMF, audio, and TTS interaction with the user;
ECMAScript (part of the VoiceXML specification), providing a general 'workhorse' component for doing little jobs that don't require a trip to the server; and
The application server component, which provides integration with database and legacy systems, and links to data required by the application.

VoiceXML is best used to control the collection of information from the caller, validation of that information if possible, and playback of information to the user.

ECMAScript is best used to support the VoiceXML part, whether by massaging data formats, doing simple calculations, or handling similar small tasks. It gives the application author a mechanism to avoid a round-trip to the server for simple processing requirements. It also melds nicely with the variable and object model of the interpreter itself, so that, for example, field variables and ECMAScript variables are interchangeable.
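
To make this concrete, the following is a small sketch of a simple calculation done in ECMAScript inside the page rather than on the server. It assumes VoiceXML 2.0 syntax and a platform that supports the built-in "number" field type. The function name orderTotal, the variable names, and the unit price are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Hypothetical helper: computes an order total without a trip to the server. -->
  <script>
    <![CDATA[
      function orderTotal(quantity, unitPrice) {
        // Number() makes the conversion explicit, since a field value may arrive as a string.
        return Number(quantity) * unitPrice;
      }
    ]]>
  </script>

  <form id="tickets">
    <!-- Illustrative price; a real application might fill this in on the server. -->
    <var name="unitPrice" expr="12"/>

    <field name="quantity" type="number">
      <prompt>How many tickets would you like?</prompt>
      <filled>
        <!-- The field variable is passed straight into the ECMAScript function. -->
        <var name="total" expr="orderTotal(quantity, unitPrice)"/>
        <prompt>That comes to <value expr="total"/> dollars.</prompt>
      </filled>
    </field>
  </form>

</vxml>
```

The point of the sketch is that quantity is an ordinary field variable, yet it can be used in an ECMAScript expression like any other variable.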

The application server component provides your VoiceXML application with access to personalized or dynamic information. By filling a VoiceXML page with information that is relevant to the caller, or based on current events and information, you can build a more compelling application. This is exactly the model that has moved the web from being a 'read-only' experience to a much more personalized and useful medium.

You can think of this partitioning in the same way as typical n-tier web application architectures. Many of the same issues are relevant, with the exception that the interaction with the speech user has more stringent response requirements than the typical web application.

So the short answer to the question on how to partition an application is 'it depends', but if you follow the guidelines above, you'll be off to a good start.

What's Next?

Next month, we're going to get back into the more interesting stuff: coding! We'll show you more about how to build typical VoiceXML applications, using the concepts we've learned so far. In the meantime, if you haven't already, you might want to have a look at these resources provided by the VoiceXML Forum:

A number of VoiceXML Forum Members provide access to developer sites and tool kits that will allow you to try out VoiceXML for yourself. A few of these are:
