VoiceXML Review - Columns

Volume 5, Issue 3 - May/June 2005

First Words

Welcome to “First Words” – the VoiceXML Review’s column to teach you about VoiceXML and how you can use it. We hope you enjoy the lesson.

VoiceXML 2.1

In this lesson, we’re going to continue investigating VoiceXML 2.1.

You may recall that as VoiceXML platform vendors and application developers deployed VoiceXML applications more widely, they began to identify potential extensions to the language. The result of this experience is a collection of field-proven features that are candidates for addition to the VoiceXML language. These features are being proposed as part of VoiceXML 2.1.

As of this writing, VoiceXML 2.1 has just advanced to the Candidate Recommendation stage. Here is a pointer:

http://www.w3.org/TR/2005/CR-voicexml21-20050613/

Note: if you’re reading this article after VoiceXML 2.1 has been finalized and published as a full Recommendation, you should spend a few minutes tracking down the final specification rather than this link, as the specification may have undergone minor changes.

The new features proposed for VoiceXML 2.1 are based on feedback from application developers and VoiceXML platform developers. The features we’ve covered already include:

Referencing Grammars Dynamically – Generation of a grammar URI reference with an expression;
Referencing Scripts Dynamically – Generation of a script URI reference with an expression;
Recording user utterances while attempting recognition – Access to the actual caller utterance, for use in the user interface or for submission to the application server;
Adding namelist to <disconnect> – The ability to pass information back to the VoiceXML platform environment (for example, if the application wishes to pass results to a CCXML session related to the call);
Using <mark> to detect barge-in during prompt playback – Placement of ‘bookmarks’ within a prompt stream to identify where a barge-in has occurred;
Concatenating Prompts Dynamically using <foreach>.

Here are the links to the previous articles in this series:

http://www.voicexmlreview.org/Sep2004/columns/sep2004_first_words.html
http://www.voicexmlreview.org/Nov2004/columns/nov2004_first_words.html
http://www.voicexmlreview.org/Feb2005/columns/Feb2005_first_words.html

This issue, we’re going to look at:

Adding type to <transfer> - Support for additional transfer flexibility (in particular, a supervised transfer), among other capabilities.

Adding ‘Type’ to <transfer>

VoiceXML 2.0 defined the <transfer> tag, to allow a VoiceXML platform to transfer a call to another destination. We talked about <transfer> a long time ago:

http://www.voicexmlreview.org/oct2001/columns/oct2001_first_words.html

Have a look there, as well as at the VoiceXML 2.0 Recommendation (http://www.w3.org/TR/2004/REC-voicexml20-20040316/) to review the capabilities of the <transfer> tag.

VoiceXML 2.1 adds a single attribute to the <transfer> tag – ‘type’ – which can have one of three values:

bridge
blind
consultation

The ‘type’ attribute replaces the functionality of the ‘bridge’ attribute used in VoiceXML 2.0. Either ‘type’ or ‘bridge’ can be used on a given <transfer>, but not both. When ‘type’ is set to ‘bridge’, the VoiceXML platform must behave exactly as it would in VoiceXML 2.0 with the ‘bridge’ attribute set to ‘true’. When ‘type’ is set to ‘blind’, the platform must behave exactly as it would in VoiceXML 2.0 with the ‘bridge’ attribute set to ‘false’. So these first two values simply provide, through the ‘type’ attribute, the same behavior that was previously supported by the ‘bridge’ attribute.
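As a minimal sketch of this equivalence, the two forms below request the same bridged transfer, first with the VoiceXML 2.0 ‘bridge’ attribute and then with the VoiceXML 2.1 ‘type’ attribute. The form ids, the form item name ‘mycall’, and the telephone number are placeholders invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">

  <!-- VoiceXML 2.0 style: bridge="true" requests a bridged transfer. -->
  <form id="old_style">
    <transfer name="mycall" dest="tel:+15555550100" bridge="true"/>
  </form>

  <!-- VoiceXML 2.1 style: type="bridge" requests the same behavior. -->
  <form id="new_style">
    <transfer name="mycall" dest="tel:+15555550100" type="bridge"/>
  </form>

</vxml>
```

In the same way, type="blind" would stand in for bridge="false". A single <transfer> should carry only one of the two attributes; our reading of the Candidate Recommendation is that specifying both is treated as an error.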

Where things start to get interesting is when we examine the third option - ‘consultation’.

VoiceXML 2.0 blended a few transfer-related concepts. The first concept is whether a call remains ‘bridged’ (sometimes called hairpinning or tromboning) through the VoiceXML platform once the call has been transferred. When the call remains bridged, the VoiceXML platform remains involved in control of the call. For example, if the person who was the target of the transfer hangs up, the original caller can then continue to interact with the VoiceXML platform.

The second concept is that of a ‘blind’ or ‘supervised’ transfer. In the case of a truly blind transfer, the transfer takes place regardless of whether the call is answered, busy, etc., and the original caller will (typically) hear the results of their call directly, for example, a busy signal. In the case of a supervised transfer, the VoiceXML platform will monitor the progress of the call setup, and should the transfer be unsuccessful, the VoiceXML platform can then continue to interact with the caller.

In terms of these concepts, when set to ‘false’, the VoiceXML 2.0 ‘bridge’ attribute triggers behavior like an unbridged, blind transfer. When set to ‘true’, the ‘bridge’ attribute triggers behavior like a bridged, supervised transfer.

It is also useful to be able to perform an unbridged, supervised transfer. In this case, the VoiceXML platform remains involved until it is certain that the call has been successfully connected. If the transfer attempt fails, the application may then choose to provide other capabilities to the caller, for example, to try to transfer to another destination. Once the call is successfully transferred, the VoiceXML platform is no longer involved in the call. This behavior combines features of the ‘true’ and ‘false’ values for the ‘bridge’ attribute in VoiceXML 2.0, and it is what the ‘consultation’ value provides.

The diagram below, gleefully borrowed from the VoiceXML 2.1 Candidate Recommendation and originally drawn by Ken Rehor (of World of VoiceXML fame, see http://www.kenrehor.com/voicexml/), shows the way this works.

The VoiceXML implementation platform is not part of the audio connection between the caller and callee after a successfully established consultation transfer.

When the call is successfully established, the call is no longer bridged through the VoiceXML platform, allowing the resources to be used for other purposes. When the transfer is unsuccessful, the VoiceXML platform remains connected to the original caller, and can take action appropriate to the application.
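The following is a sketch of how an application might use a consultation transfer to fall back to a second destination. It assumes, following our reading of the Candidate Recommendation, that a failed attempt fills the form item variable with an outcome such as ‘busy’ or ‘noanswer’, and that a successful attempt throws connection.disconnect.transfer instead. The form id, the item names ‘primary’ and ‘backup’, the telephone numbers, and the prompt wording are all invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">

  <!-- On success the platform is out of the call, so simply end the session. -->
  <catch event="connection.disconnect.transfer">
    <exit/>
  </catch>

  <form id="reach_agent">

    <!-- First attempt: supervised, unbridged transfer to the primary number. -->
    <transfer name="primary" dest="tel:+15555550100" type="consultation">
      <prompt>Please hold while we connect you.</prompt>
      <filled>
        <!-- Assumed to run only when the attempt failed; the caller is still connected. -->
        <if cond="primary == 'busy'">
          <prompt>That line is busy. Trying another number.</prompt>
        <else/>
          <prompt>We could not reach that number. Trying another number.</prompt>
        </if>
      </filled>
    </transfer>

    <!-- Second attempt: selected next because the first item has been filled. -->
    <transfer name="backup" dest="tel:+15555550199" type="consultation">
      <filled>
        <prompt>Sorry, no one is available right now.</prompt>
      </filled>
    </transfer>

  </form>
</vxml>
```

Because the platform keeps the caller until the far end is connected, the second <transfer> gets its chance only when the first one fails. With type="blind" the application would have no such opportunity.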

As always, there are a wide variety of error and status conditions that can be generated when attempting a transfer. We recommend that you have a look at the VoiceXML 2.1 Candidate Recommendation to understand those conditions.
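As a hedged illustration, the sketch below catches two kinds of error event around a consultation transfer. The event names reflect our reading of the VoiceXML 2.0 Recommendation and the 2.1 Candidate Recommendation and should be checked against the specification and your platform's documentation. The form id, item name, telephone number, and prompts are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="guarded_transfer">
    <transfer name="mycall" dest="tel:+15555550100" type="consultation">

      <!-- The platform does not offer consultation transfer at all. -->
      <catch event="error.unsupported.transfer.consultation">
        <prompt>Transfers are not available on this system.</prompt>
        <exit/>
      </catch>

      <!-- Prefix match: handles any event whose name begins with error.connection. -->
      <catch event="error.connection">
        <prompt>We are unable to place that call.</prompt>
        <exit/>
      </catch>

      <!-- Status outcomes (not errors) arrive in the form item variable. -->
      <filled>
        <prompt>The call could not be completed.</prompt>
      </filled>

    </transfer>
  </form>
</vxml>
```

Keeping the error handlers separate from <filled> mirrors the distinction between error conditions, which are thrown as events, and status conditions, which are reported through the form item variable.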

There are many other interesting ways to transfer a call. For example, you might want to run a dialog with the called party prior to completing the transfer. A number of VoiceXML platforms include this as an extension to VoiceXML, but the best way to provide this kind of call control is using something like CCXML (see http://www.w3.org/TR/2005/WD-ccxml-20050629/), which has just been updated as a second Last Call Working Draft. We encourage you to have a look at this document as well, as it provides a standards-based mechanism for managing call control, with capabilities far in excess of those provided by VoiceXML itself.

Summary

Here is the direct link to the description of these extensions to the <transfer> tag:

http://www.w3.org/TR/2005/CR-voicexml21-20050613/#sec-transfer

VoiceXML 2.1 proposes some useful additional features for VoiceXML 2.0, based on real-world deployment experience.

There remains one further feature for us to investigate:

Using <data> to fetch XML without requiring a dialog transition – Retrieval of XML data, and construction of a related DOM object, without requiring a transition to another VoiceXML page.

This is perhaps one of the most powerful additions in VoiceXML 2.1. In the current issue of the VoiceXML Review, Brad Porter has also written a detailed article on how the <data> tag can be used to support development of AJAX applications in VoiceXML.

As always, if you have questions or topic suggestions for VoiceXML 2.0 or 2.1, drop us a line!
