Volume 4, Issue 2 - March/April 2004
 
   
   

 


Beyond VoiceXML 2.0

By Jim Larson and Scott McGlashan

The W3C Voice Browser Working Group (VBWG) reviewed over 700 change requests for VoiceXML 1.0. After careful deliberation, many of these were adopted, resulting in VoiceXML 2.0, which became a W3C Recommendation in March 2004. The VBWG is now working on two efforts to make VoiceXML even better:

VoiceXML 2.1 contains features already implemented by speech platform vendors. These include (a) dynamic references to grammars and scripts, (b) a <data> element to fetch XML data structures, and (c) facilities to record utterances during recognition. VoiceXML 2.1 is completely backward compatible with VoiceXML 2.0: all VoiceXML 2.0 applications will run without modification under VoiceXML 2.1. For more details, see http://www.w3.org/TR/2004/WD-voicexml21-20040323/
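As a rough illustration of features (a) through (c), a VoiceXML 2.1 document might look like the sketch below. It assumes the srcexpr attribute, the recordutterance property, and the recording shadow variable as described in the 2.1 draft. The file names (cities.grxml, helpers.js, forecast.xml) and the variable names are hypothetical and exist only for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">

  <!-- Hypothetical variables holding resource URIs chosen at run time -->
  <var name="grammarUri" expr="'cities.grxml'"/>
  <var name="scriptUri" expr="'helpers.js'"/>

  <!-- (a) Dynamic reference to a script: the URI comes from an expression -->
  <script srcexpr="scriptUri"/>

  <!-- (c) Ask the platform to keep the caller's audio captured during recognition -->
  <property name="recordutterance" value="true"/>

  <form id="weather">
    <field name="city">
      <prompt>Which city would you like the forecast for?</prompt>
      <!-- (a) Dynamic reference to a grammar: the URI comes from an expression -->
      <grammar srcexpr="grammarUri" type="application/srgs+xml"/>
    </field>

    <block>
      <!-- (b) Fetch XML data without leaving the current document;
           the value of city is sent along with the request -->
      <data name="forecast" src="forecast.xml" namelist="city"/>

      <!-- (c) The recorded utterance is exposed as a shadow variable of the field -->
      <var name="utterance" expr="city$.recording"/>

      <prompt>Here is the forecast for <value expr="city"/>.</prompt>
    </block>
  </form>
</vxml>
```

Because all three features are additions, an existing VoiceXML 2.0 document that uses none of them would be unaffected, which is consistent with the backward compatibility described above.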

V3 is the code name for the follow-on to VoiceXML 2.0. Requirements for V3 are derived from deferred VoiceXML 2.0 change requests, from the SALT 1.0 and XHTML + Voice 1.0 (X+V) contributions to W3C, and from the Multimodal Interaction and other W3C Working Groups. V3 will support:

  • Modularization -- a set of modules with common external interfaces. This enables dialog designers to mix and match voice with other modes of input, including keyboard and pen. For example, a V3 module for speech recognition might be embedded into XHTML, enabling a graphical web page to accept speech input from the user.
  • Extensibility -- to extend the power of dialog management. VoiceXML 2.0 already supports both system-directed and mixed-initiative dialogs. V3 will allow for plan-based or rule-based dialog definition by enabling the dialog author to define new dialog strategies based on low-level components.
  • Low-level control of media resources -- including speech recognition, speech synthesis, and audio replay. Using these resources, application developers will be able to specify their own control structures. This enables a procedural style of dialog specification, similar to that used by SALT developers, in addition to the declarative style of VoiceXML 2.0 enabled by the Form Interpretation Algorithm.

The Voice Browser Working Group will continue to coordinate with the Multimodal Interaction Working Group to guarantee the compatibility of V3 with the multimodal languages.




Copyright © 2001-2004 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).