VoiceXML Review - Columns

Volume 6, Issue 1 - Jan/Feb 2006

First Words

Welcome to “First Words” – the VoiceXML Review’s column to teach you about VoiceXML and how you can use it. We hope you enjoy the lesson.

Wrapping up VoiceXML 2.1

This month, we’re just going to do a quick review of our articles on VoiceXML 2.1, published over the last year and a half. Since we started discussing VoiceXML 2.1, the specification has advanced from a Last Call Working Draft to a Candidate Recommendation. Two steps remain (Proposed Recommendation and Recommendation), and vendor implementations have started to appear. The VoiceXML Forum Conformance Committee is also beginning work on a conformance program, so that users can rely upon the behavior of a vendor implementation.

The Candidate Recommendation is available at:

http://www.w3.org/TR/2005/CR-voicexml21-20050613/

The W3C Voice Browser Working Group has prepared an Implementation Report Plan for VoiceXML 2.1:

http://www.w3.org/Voice/2005/vxml21-ir/voicexml21-irp.html

The Implementation Report is used to demonstrate that the specification can be implemented in practice. However, it is also used as the starting point for conformance testing – so if you’re interested in what conformance testing will address, have a look at the Implementation Report for some hints.

One Last Look

The new features proposed for VoiceXML 2.1 were based on feedback from application developers and VoiceXML platform developers. The features making up VoiceXML 2.1 include:

Referencing Grammars Dynamically – Generation of a grammar URI reference with an expression;
Referencing Scripts Dynamically – Generation of a script URI reference with an expression;
Using <mark> to detect barge-in during prompt playback – Placement of ‘bookmarks’ within a prompt stream to identify where a barge-in has occurred;
Using <data> to fetch XML without requiring a dialog transition – Retrieval of XML data, and construction of a related DOM object, without requiring a transition to another VoiceXML page;
Concatenating prompts dynamically using <foreach> – Building of prompt sequences dynamically using ECMAScript;
Recording user utterances while attempting recognition – Access to the actual caller utterance, for use in the user interface or for submission to the application server;
Adding namelist to <disconnect> – The ability to pass information back to the VoiceXML platform environment (for example, if the application wishes to pass results to a CCXML session related to this call);
Adding type to <transfer> – Support for additional transfer flexibility (in particular, a supervised transfer), among other capabilities.
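
As a quick reminder of how several of these features look on the page, here is a minimal sketch rather than a tested application. The grammar path, the variable names, and the telephone number are all hypothetical, and we are assuming the attribute and property names as they appear in the Candidate Recommendation (srcexpr, recordutterance, markname, type, namelist). Check your platform's documentation before relying on any of them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">

  <!-- Hypothetical value used to build the grammar URI below -->
  <var name="region" expr="'east'"/>

  <!-- Ask the platform to keep the caller's utterance during recognition -->
  <property name="recordutterance" value="true"/>

  <form id="main">
    <field name="city">
      <!-- Grammar URI computed at run time from an ECMAScript expression -->
      <grammar type="application/srgs+xml"
               srcexpr="'grammars/' + region + '_cities.grxml'"/>
      <prompt>
        <mark name="intro"/> Welcome to the travel line.
        <mark name="ask"/> Which city would you like?
      </prompt>
      <filled>
        <!-- markname holds the last mark executed before the caller spoke -->
        <log>Last mark reached: <value expr="city$.markname"/></log>
        <!-- Play the recorded utterance back to the caller -->
        <prompt>You said <audio expr="city$.recording"/></prompt>
      </filled>
    </field>

    <!-- Supervised transfer; the destination number is made up -->
    <transfer name="agent" type="consultation" dest="tel:+15555550100"/>

    <block>
      <!-- Reached only if the transfer did not connect:
           hand the result back to the platform on the way out -->
      <disconnect namelist="city"/>
    </block>
  </form>
</vxml>
```

A <script> element accepts srcexpr in the same way as <grammar>, so a script URI can be built from an expression too.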

We’ve gone over all of these features, and given you a peek at the specification in the process. We encourage you to have a look at the real specification – it is well written, and includes a number of useful examples (which, uh, may look somewhat familiar to the faithful readers of this column).

Here is a complete list of the articles in this series:

An Overview of VoiceXML 2.1:
http://www.voicexmlreview.org/Apr2004/columns/apr2004_first_words.html
Using ‘expr’ with Grammars and Scripts:
http://www.voicexmlreview.org/Sep2004/columns/sep2004_first_words.html
Accessing User Utterances, and using ‘namelist’ with <disconnect/>:
http://www.voicexmlreview.org/Nov2004/columns/nov2004_first_words.html
Using the <mark/> tag to place bookmarks within a prompt stream:
http://www.voicexmlreview.org/Feb2005/columns/Feb2005_first_words.html
Using <foreach/> to build dynamic prompt sequences:
http://www.voicexmlreview.org/Apr2005/columns/Apr2005_first_words.html
Adding ‘type’ to <transfer/>:
http://www.voicexmlreview.org/Jun2005/columns/Jun2005_first_words.html
Using the <data/> tag to fetch XML information:
http://www.voicexmlreview.org/Aug2005/columns/Aug2005_first_words.html

Why VoiceXML 2.1 Matters

There are a number of benefits to the availability of VoiceXML 2.1:

Clearer separation between Presentation and Business Rules: A number of the new features, including ‘expr’ on <script/> and <grammar/>, <data/>, and <foreach/>, allow the construction of VoiceXML pages with less reliance on application server logic to accomplish a task; these pages are much more likely to be ‘static’ and can contain the logic to perform personalization using ECMAScript and XML data access through the DOM;
Features supporting better user interfaces: Bookmarks identify where in a prompt a barge-in occurred, and access to the caller’s actual utterance lets an application use it in the dialog or submit it to the application server;
Better Telephony Integration: A few small steps, but important nonetheless. Adding the ability to return data when a call is disconnected allows a platform to integrate more cleanly into IP and CTI environments. And support for ‘type’ on <transfer/> provides a way to manage the many different types of call transfer that are encountered in the real world. Both also provide initial ways to support CCXML interaction in a standards-based way;
Field-Proven Features: All of the features in VoiceXML 2.1 had been implemented as extensions to VoiceXML 2.0 by at least two vendors – this demonstrated a need in the field for such features. The criteria for selection of features for VoiceXML 2.1 ensured backwards compatibility, while allowing the standard to evolve. One of the great benefits of VoiceXML is that it has not evolved in a vacuum. It is widely supported by industry, and is widely deployed with many different kinds of applications.
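
To make the first benefit concrete, here is a rough sketch of a ‘static’ page that fetches its own data and builds its prompt without help from server-side page generation. The URL, the shape of the XML it returns, and the variable names are assumptions made up for this illustration. The DOM calls are limited to basic read-only ones (documentElement, getElementsByTagName, item, firstChild, data).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="flights">
    <!-- Hypothetical endpoint, assumed to return something like:
         <flights><flight>Boston</flight><flight>Denver</flight></flights> -->
    <data name="doc" src="http://example.com/flights.xml"/>

    <!-- Array that will hold the text of each <flight> element -->
    <var name="destinations" expr="new Array()"/>

    <block>
      <script><![CDATA[
        // Walk the fetched document through the read-only DOM
        var nodes = doc.documentElement.getElementsByTagName('flight');
        for (var i = 0; i < nodes.length; i++) {
          destinations.push(nodes.item(i).firstChild.data);
        }
      ]]></script>

      <prompt>
        Today we fly to:
        <!-- One prompt fragment is queued per array element -->
        <foreach item="destination" array="destinations">
          <value expr="destination"/> <break time="300ms"/>
        </foreach>
      </prompt>
    </block>
  </form>
</vxml>
```

The page itself never changes; only the XML behind the <data/> fetch does, which is what keeps the presentation separate from the business rules.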

If you are a VoiceXML developer, VoiceXML 2.1 is well worth a look.

Summary

There isn’t too much more we can say about VoiceXML 2.1. The specification should continue to advance along the standards track until published as a full W3C Recommendation in a few months. Be sure to try out the new features in your applications, and keep them in mind when doing your VoiceXML designs. See you next issue!
