Volume 1, Issue 9 - October 2001
   
   
 

Telephony Support in VoiceXML

By Rob Marchand

Welcome to First Words, VoiceXML Review's column that teaches you about VoiceXML and how you can use it. We hope you enjoy the lesson.

Last month we raved about caching and prefetching in VoiceXML. This month, we're going to look at the <transfer> tag and show how VoiceXML applications can use it to place outbound phone calls.

VoiceXML can be used in many different environments, including Voice over IP (VoIP) and the desktop. However, it was originally built to work with the public switched telephone network (PSTN), so it is hardly surprising that it includes a few telephony-related features.

Of course, whenever a call takes place, you have access to telephony-related information about the call. This information includes the dialed number (session.telephone.dnis), the calling number (session.telephone.ani), if available, and possibly additional network information (user-to-user information, or UUI, as session.telephone.uui, and information digits as session.telephone.iidigits). This telephony-related information allows you to tailor your application based on who is calling whom, where the call is made from, and so on. UUI can be used as required by the application, and the information digits describe the kind of line the call originated from.
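
To make this concrete, here is a minimal sketch of a greeting that branches on the calling number. It assumes that the platform leaves session.telephone.ani undefined when the network does not deliver the number; some gateways may use an empty string instead, so check your provider's documentation. The prompt wording is purely illustrative.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <block>
      <!-- Assumption: ani is undefined when the calling number is unavailable -->
      <if cond="session.telephone.ani == undefined">
        <prompt>Welcome. We could not determine your telephone number.</prompt>
      <else/>
        <prompt>Welcome. You are calling from
          <value expr="session.telephone.ani"/>
        </prompt>
      </if>
    </block>
  </form>
</vxml>
```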

The major telephony feature of VoiceXML 1.0 is the <transfer> tag. This tag places an outbound call from the platform and 'bridges' the inbound call (the one associated with your application) to the outbound call. As you may recall from an earlier column when we seemed preoccupied with pizza, use of the <transfer> tag is quite straightforward. Here is a simple example:

<?xml version="1.0"?>
<vxml version="1.0">

<form>

    <block>
        I will now attempt to transfer you to the operator. One moment please.
    </block>

    <transfer name="transfer_result" dest="phone://19995551212">
        <!-- <filled> holds executable content directly; a nested <block> is not allowed here -->
        <filled>
            <prompt>The results of the call are
                <value expr="transfer_result" />
            </prompt>
        </filled>
    </transfer>

</form>

</vxml>

In this example, we attempt to connect the caller to our operator, so that they can pursue some transaction that requires human support. Here are the interesting bits in this example:

  • <transfer> is a form item, that is, a child of <form>. It has many of the same features as a regular <field> element;
  • The <transfer> field variable gets a result; more on this below;
  • <transfer> accepts a telephone number that sort of looks like a URI.

The format of the telephone number depends on what your VoiceXML gateway is connected to. If it is a long-distance trunk, it may not need the '1' preceding the number. If you have local seven-digit dialing, then you may only need seven digits. Or, if you're hooked up to a PBX, you may only need three to five digits to transfer to an internal extension. You may also be able to specify an extension as part of the telephone number, typically written as "phone://19995551212x245". Some platforms also allow conventional telephone number punctuation, for example "phone://1(999)555-1212x245". Finally, if your provider supports VoIP, you can add hostnames, IP addresses, and Session Initiation Protocol (SIP) identifiers to the mix. Check your gateway provider's documentation for details.

The format of the telephone number will be changing somewhat in VoiceXML 2.0, to support a richer selection of destinations.

The <transfer> tag has the following attributes:

  • name - the field variable name. This can be used to get the results of the <transfer> attempt;
  • expr - an ECMAScript expression with which to initialize the field variable;
  • cond - The usual VoiceXML condition expression, which acts as a guard condition for the <transfer>;
  • dest - The telephone number to call, formatted as 'phone://digits';
  • destexpr - An ECMAScript expression that will produce a valid telephone number string, as for dest;
  • bridge - A Boolean value determining whether the transfer is 'supervised' or 'unsupervised'. A value of 'true' requests a supervised (bridged) transfer, and 'false' requests an unsupervised (blind) one;
  • connecttimeout - How long to wait for the call to be connected, as a VoiceXML duration (i.e., either milliseconds or seconds, specified with the suffix 'ms' or 's' respectively);
  • maxtime - The maximum time for which the call can be connected. Specified as a VoiceXML duration.
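
Here is a sketch that puts several of these attributes together. The extension variable and its value are hypothetical, standing in for something your application would look up or collect from the caller, and the timeout values are arbitrary. Since bridge is a Boolean attribute, it is written here as "true".

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <!-- Hypothetical extension; a real application would look this up or ask for it -->
  <var name="extension" expr="'245'"/>
  <form>
    <!-- destexpr builds the number at run time; give up after 20 seconds
         without a connection, and limit the call to five minutes -->
    <transfer name="call_result"
              destexpr="'phone://19995551212x' + extension"
              bridge="true"
              connecttimeout="20s"
              maxtime="300s">
      <prompt>Connecting you now.</prompt>
      <filled>
        <prompt>The call ended with the result
          <value expr="call_result"/>
        </prompt>
      </filled>
    </transfer>
  </form>
</vxml>
```

Whether the 'x245' extension syntax is honored depends on your gateway, as noted above.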

I can hear the questions already: what's the difference between a supervised and unsupervised transfer? An unsupervised or 'blind' transfer is basically the last thing the VoiceXML platform will do with the call. One example might be a voice activated dialing service, where the caller speaks the name or number to call, and is then connected using the <transfer> tag. When the call is successfully connected, the VoiceXML platform will throw a telephone.disconnect.transfer event to allow the application to finalize processing. When either end disconnects, the call is complete. The original caller does not interact further with the VoiceXML gateway. Note that you may still occupy ports on the platform in this situation, until the call is complete, depending upon the capability of the VoiceXML gateway.
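
A blind transfer might look something like the following sketch. The destination number is a placeholder, and the handler does nothing except exit; in a real application this is where any final bookkeeping would go. Remember that the caller can no longer hear prompts once this event arrives.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <!-- bridge="false" requests an unsupervised (blind) transfer -->
    <transfer name="dial" dest="phone://19995551212" bridge="false">
      <prompt>Connecting your call. Goodbye.</prompt>
    </transfer>
    <!-- Thrown once the call is connected; the caller has left the application -->
    <catch event="telephone.disconnect.transfer">
      <!-- Final processing would go here -->
      <exit/>
    </catch>
  </form>
</vxml>
```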

In the case of a supervised transfer, the caller can return to their VoiceXML session. This can happen in a number of ways:

  • If the called person disconnects, the caller is returned to the VoiceXML page that was executing, and processing continues in a normal manner, with the <transfer> tag results available to the form;

  • If the <transfer> tag contains a <dtmf> tag, then the caller can use elements in the DTMF grammar to end the call, and return to the VoiceXML application;

  • If the <transfer> tag contains a <grammar> tag, then the caller may be able to use the elements in the speech grammar as 'hotwords' to disconnect the call and return to the VoiceXML application (this feature is optional).

Note that <transfer> is modal! So grammars defined outside the <transfer> tag itself are not in scope.
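
As a sketch of the DTMF case, the following supervised transfer lets the caller press 9 to drop the outbound leg and come back to the application. The grammar type and its inline syntax are assumptions, since VoiceXML 1.0 leaves grammar formats to the platform; substitute whatever your gateway expects.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <transfer name="operator_call" dest="phone://19995551212" bridge="true">
      <prompt>
        Transferring you to the operator.
        Press 9 at any time to come back to this menu.
      </prompt>
      <!-- Active only during the transfer, because <transfer> is modal.
           Grammar type and syntax are assumptions; use your platform's format. -->
      <dtmf type="application/x-jsgf"> 9 </dtmf>
      <filled>
        <prompt>Welcome back. The transfer result was
          <value expr="operator_call"/>
        </prompt>
      </filled>
    </transfer>
  </form>
</vxml>
```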

While the call is ongoing, it will typically occupy two ports on the platform: one for the inbound 'leg' of the call, and another for the outbound leg.

So you can use the appropriate type of transfer to build your application. The underlying platform may be able to do certain optimizations if you select a blind transfer. It may be possible to release the call to the telephony switch, freeing two ports on the platform. Or it may be possible to release speech recognition resources that are no longer required for the application, as it will not return.

When the caller disconnects within a <transfer>, the VoiceXML platform will throw a telephone.disconnect.hangup event, as you would expect. If the callee hangs up, or the call cannot be completed, the transfer will 'fill' the form item variable. The possible results include:

  • busy - The destination refused to accept the call;
  • network_busy - Some intermediate network refused the call;
  • near_end_disconnect - The call completed normally, and was terminated by the caller;
  • far_end_disconnect - The call completed normally, and was terminated by the callee;
  • network_disconnect - The call was disconnected by the telephone network.
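
These results can be tested inside <filled> like any other field value. The sketch below handles a few of them and lumps the rest together; the prompt wording is illustrative only. The caller hanging up is handled separately, since it arrives as an event rather than as a result value.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <transfer name="transfer_result" dest="phone://19995551212" bridge="true">
      <filled>
        <if cond="transfer_result == 'busy'">
          <prompt>The operator's line is busy. Please try again later.</prompt>
        <elseif cond="transfer_result == 'network_busy'"/>
          <prompt>The network is busy right now. Please try again later.</prompt>
        <elseif cond="transfer_result == 'far_end_disconnect'"/>
          <prompt>The operator has hung up. Returning you to the application.</prompt>
        <else/>
          <prompt>The call has ended.</prompt>
        </if>
      </filled>
    </transfer>
    <!-- The caller hanging up is an event, not a result value -->
    <catch event="telephone.disconnect.hangup">
      <exit/>
    </catch>
  </form>
</vxml>
```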


Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).