What is VoiceXML? (Continued from Part 1)
Figure 2 shows the relationship between a traditional Web application and a voice-enabled Web application.
Figure 2: Relationship Between a Traditional Web Application and a Voice-Enabled Web Application
A Basic Menu Example
To see how VoiceXML works, let's start with a very simple example of a basic menu. Following the architecture shown in Figure 1, a caller dials the telephone number of this simple voice portal. The call is routed to the VoiceXML telephony server. The appropriate VoiceXML page (in this case menu.vxml) is fetched via HTTP from the application (web) server, and interpretation begins.
Example 1: menu.vxml

 1  <?xml version="1.0"?>
 2  <vxml version="1.0">
 3
 4    <menu>
 5      <prompt> Choose from <enumerate/></prompt>
 6
 7      <choice next="sports.vxml"> sports </choice>
 8      <choice next="weather.vxml"> weather </choice>
 9      <choice next="news.vxml"> news </choice>
10    </menu>
11
12  </vxml>
The first line of Example 1 is the XML declaration, which identifies the document as W3C XML version 1.0. Line 2 opens the top-level <vxml> element, which contains the document's dialogs (either <menu> or <form> elements); its version attribute indicates compliance with VoiceXML version 1.0. Lines 4 through 10 contain a menu consisting of a prompt and three choices. The VoiceXML interpreter uses the contents of the <choice> elements to tell the ASR (automatic speech recognition) engine what to listen for, in this case the words sports, weather, or news. The same content is also used to construct the prompt when the <enumerate> element is included. A speech synthesis engine then renders the prompt text as audio.
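To make the role of <enumerate/> concrete, here is a sketch of the same menu with the prompt written out by hand. It assumes the interpreter's default enumeration simply lists the choice text in document order, which matches the interaction shown in this article; the spoken prompt should then be the same, while the words to listen for still come from the <choice> elements.

```xml
<?xml version="1.0"?>
<vxml version="1.0">

  <!-- Sketch: Example 1 with the prompt spelled out instead of
       being generated by <enumerate/>. -->
  <menu>
    <prompt> Choose from sports, weather, news. </prompt>

    <choice next="sports.vxml"> sports </choice>
    <choice next="weather.vxml"> weather </choice>
    <choice next="news.vxml"> news </choice>
  </menu>

</vxml>
```

A hand-written prompt has to be edited whenever a choice is added or removed, which is the maintenance cost that <enumerate/> avoids.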
The user interaction would be as follows:
Computer: Choose from sports, weather, news.
Human: Sports.
The VoiceXML interpreter then fetches the file sports.vxml
and the process continues.
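The contents of sports.vxml are not shown in this article. Purely as an illustration, a hypothetical version might be a one-step form that speaks a message and then sends the caller back to the menu. The message wording and the return to menu.vxml are assumptions, not part of the original application.

```xml
<?xml version="1.0"?>
<vxml version="1.0">

  <!-- Hypothetical sports.vxml: contents assumed for illustration only. -->
  <form>
    <block>
      You have reached the sports page.
      <!-- Hand control back to the menu page. -->
      <goto next="menu.vxml"/>
    </block>
  </form>

</vxml>
```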
But what if the user asked for help, didn't say something
appropriate, or said nothing at all? VoiceXML has language
elements that allow a dialog designer to handle these
circumstances. Here's the same menu example embellished
to handle "unexpected" responses:
Example 2: menu.vxml (embellished)

<?xml version="1.0"?>
<vxml version="1.0">

  <menu>
    <prompt> Choose from <enumerate/></prompt>

    <choice next="sports.vxml"> sports </choice>
    <choice next="weather.vxml"> weather </choice>
    <choice next="news.vxml"> news </choice>

    <!-- Played when the caller asks for help. -->
    <help>
      If you would like sports scores, say sports.
      For local weather reports, say weather, or
      for the latest news, say news.
    </help>

    <!-- Played when the caller says nothing. -->
    <noinput>You must say something.</noinput>

    <!-- Played when the caller says something the menu does not recognize. -->
    <nomatch>Please speak clearly and try again.</nomatch>

  </menu>

</vxml>
The user interaction might be:
Computer: Choose from sports, weather, news.
Human: (user says nothing)
Computer: You must say something. Choose from sports, weather, news.
Human: Tbilisi
Computer: Please speak clearly and try again. Choose from sports, weather, news.
Human: Help
Computer: If you would like sports scores, say sports. For local weather reports, say weather, or for the latest news, say news.
Human: Sports
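Event handlers can also vary with how often an event has occurred. The sketch below is an optional variation on Example 2, not part of the original application: it uses the count attribute on <noinput> so that a second silence gets a different message. The wording of the second message is an assumption chosen for illustration.

```xml
<?xml version="1.0"?>
<vxml version="1.0">

  <menu>
    <prompt> Choose from <enumerate/></prompt>

    <choice next="sports.vxml"> sports </choice>
    <choice next="weather.vxml"> weather </choice>
    <choice next="news.vxml"> news </choice>

    <help>
      If you would like sports scores, say sports.
      For local weather reports, say weather, or
      for the latest news, say news.
    </help>

    <!-- First silence: short reminder (count defaults to 1). -->
    <noinput count="1">You must say something.</noinput>

    <!-- Second and later silences: point the caller to help.
         Wording is illustrative only. -->
    <noinput count="2">
      I still did not hear anything. For a description
      of each option, say help.
    </noinput>

    <nomatch>Please speak clearly and try again.</nomatch>
  </menu>

</vxml>
```

The interpreter picks the handler with the highest count that does not exceed the number of times the event has occurred, so the count="2" handler also covers the third and later silences.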
Summary
VoiceXML is a powerful yet simple language for building voice dialogs. It leverages web architecture, tools, and technology to enable innovative new telephone applications. Thanks to the standardization efforts of the VoiceXML Forum and the W3C, it is gaining widespread adoption, especially by the 350-plus members of the VoiceXML Forum. New language features in the recently published draft of VoiceXML 2.0, and new call control features currently under development, promise an even richer voice-enabled Web.
Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).