Volume 1, Issue 3 - Mar. 2001
   
   
 

Your First Pizza

By Rob Marchand

Welcome to First Words, VoiceXML Review's column that teaches you about VoiceXML and how you can use it. We hope you enjoy the lesson.

In our January column, we looked at some simple VoiceXML pages that demonstrated a few of the capabilities of VoiceXML. In our February and March columns, we talked about document structure and support for building modular applications. This month we're going to build and evolve a simple application to apply some of the things we've learned.

Here at the VoiceXML Review, we've decided to branch out a bit: we're starting an on-line pizza ordering service. This will give us a little extra pocket change, and should come in handy during those late nights in the lab. We'll spend the next few issues extending the functionality of our pizza ordering business, which will show more and more of the features of VoiceXML.

We're going to start with a VoiceXML page that will let us order a number of a particular kind of item. In this instance, we're going to accept orders for pizzas, drinks, salads, and wings.

Let's start with a walkthrough of this simple VoiceXML page.

<?xml version="1.0"?>

<vxml version="1.0">
<meta name="MAINTAINER" content="rob@voicegenie.com"/>
<meta name="APPLICATION" content="Pizza Application"/>
<property name="caching" value="safe"/>


<form>
  <block>
    <prompt>
      Welcome to the VoiceXML Review pizza franchise
    </prompt>
  </block>

  <field name="orderItem">

    <grammar>
      pizza | drinks | salad | wings
    </grammar>

    <prompt>
      What would you like to order?
      We have pizza, drinks, salad or wings.
    </prompt>

  </field>

  <field name="orderCount" type="number">

    <prompt>
      How many <value expr="orderItem"/> would you like?
    </prompt>

  </field>

  <block>
    One moment while I add
      <value expr="orderCount"/>
      <value expr="orderItem"/>
    to your order.

    <submit next="/cgi-bin/pizzaCart.pl" namelist="orderItem orderCount"/>
  </block>
</form>
</vxml>
 

 

If you've been following along at home, you should have a pretty good idea of what this program will do. This is a VoiceXML document that will:

  • Welcome the caller to our new business endeavor, using text-to-speech (TTS);
  • Prompt the caller to tell us what they would like to order, using speech recognition (ASR);
  • Prompt the caller to tell us how many items they would like to order, using ASR;
  • Tell them what they've ordered (using TTS); and
  • Submit the order to a server-side processing program to handle the order.

The astute reader will notice that we've made no attempt to:

  • Determine where to deliver the order;
  • Determine who will pay for the order;
  • Define exactly what the mysterious 'pizzaCart.pl' will do.

These are all important things, of course, particularly if we hope to retire on our pizza profits. Don't worry, we'll address some of these issues in future columns.

Some other interesting points about the above example are as follows:

  • We're using the value tag to insert the caller's response into prompts. The value tag evaluates an ECMAScript expression and returns the resulting value. In this case, we only reference the field variables that are collected in the form. Note that VoiceXML variables are exactly equivalent to ECMAScript variables, which is handy when you want to do client-side processing using ECMAScript functions and scripts.

  • We're using an 'in-line' grammar to collect the order item from the user. In this example, we're allowing the caller to select among several items (the grammar can be read as allowing the caller to say 'pizza' or 'drinks' or 'salad' or 'wings'). Grammars can also be external, referenced by a URI. We'll stick to relatively straightforward grammars in this column, as this is an area where vendors tend to diverge. VoiceXML 1.0 does not specify a required grammar format; a future version of the language being defined by the W3C will do so, which will improve application portability. You can read more about the new Speech Grammar Specification Language in Andrew Hunt's feature article this month.

  • We're using a 'built in' grammar to collect the number of items from the user. By declaring the 'type' of a field, we can take advantage of a built-in grammar that has already been defined for us.

  • We're taking advantage of lots of defaults: for example, the handling of exceptional input conditions, such as the caller saying something we can't recognize, or the caller not saying anything at all.
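
To make the first two points a little more concrete, here is a minimal sketch of the same form using an external grammar and a slightly richer value expression. Treat it as an illustration under assumptions rather than part of our application: the file name orderItems.gram is hypothetical, and the format of the grammar it contains depends on your platform.

```xml
<?xml version="1.0"?>

<vxml version="1.0">
<form>
  <field name="orderItem">
    <!-- External grammar referenced by URI instead of written in-line.
         "orderItems.gram" is a hypothetical file; its format is
         platform-specific in VoiceXML 1.0. -->
    <grammar src="orderItems.gram"/>

    <prompt>
      What would you like to order?
    </prompt>
  </field>

  <field name="orderCount" type="number">
    <prompt>
      How many <value expr="orderItem"/> would you like?
    </prompt>
  </field>

  <block>
    <!-- expr takes any ECMAScript expression, not only a bare
         variable name; here we concatenate the two field variables. -->
    <prompt>
      You asked for <value expr="orderCount + ' ' + orderItem"/>.
    </prompt>
  </block>
</form>
</vxml>
```

Because the field variables are ordinary ECMAScript variables, the expression in the last prompt can combine them like any other strings.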

Making It Better

VoiceXML lets us define which actions to launch when certain events occur. Some of the events you'll deal with in almost every application will include noinput and nomatch. Events are thrown and caught. Let's suppose that we want to help the user out if they have trouble. For each field, we'll define event handlers that will provide guidance to the user. For the first field in our example:

<field name="orderItem">


      <grammar>
        pizza | drinks | salad | wings
      </grammar>

      <prompt>
        What would you like to order?
        We have pizza, drinks, salad or wings.
      </prompt>

      <noinput>
        Say pizza, drinks, salad or wings.
      </noinput>

      <nomatch>
        You can say pizza, drinks, salad or wings.
      </nomatch>

      <help>
        <reprompt/>
      </help>

</field> 


While collecting input, the VoiceXML interpreter will generate certain events that we can catch and handle. If the caller says something that we can't understand, the built-in nomatch event will be thrown. In this example, we're catching it within the field, and reiterating what we'd like them to say. If the caller doesn't say anything, the built-in noinput event will be thrown. If the caller asks for help, the built-in help event will be thrown. In this case, we're taking advantage of the reprompt tag, which tells the interpreter that the user should be reprompted with the field prompt when we next collect input.

We can add the same kind of information to the second field. In this case, we've enhanced those messages by referring to the item that the caller has requested (pizza, salad, etc.). (See the complete example at the end of this article for the details of changes to the second field.)
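
As a rough sketch of what that might look like for the second field, the handlers below reuse the value tag to mention the item the caller picked. The wording of the messages is our own illustration here and may differ from the complete example.

```xml
<field name="orderCount" type="number">

      <prompt>
        How many <value expr="orderItem"/> would you like?
      </prompt>

      <!-- Caller said nothing: restate what we need, naming the item. -->
      <noinput>
        Please say the number of <value expr="orderItem"/> you would like.
      </noinput>

      <!-- Caller said something the built-in number grammar did not match. -->
      <nomatch>
        Sorry, I was expecting a number.
        How many <value expr="orderItem"/> would you like?
      </nomatch>

      <!-- Caller asked for help: play the field prompt again. -->
      <help>
        <reprompt/>
      </help>

</field>
```

Since orderItem has already been filled by the time this field is visited, its value is available to every handler in the field.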

In Example 2, we're also taking advantage of some syntactic shorthand that VoiceXML provides for us. Events are typically caught using a construct such as:
<catch event="someevent"> …event handling code… </catch>
However, in the case of a number of the built in events, like help, noinput, and nomatch, we can use the shorthand version shown in Example 2.
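
For comparison, here is how the handlers for the first field could be written in the longer catch form. This is a sketch of the equivalence, not a change to the application; the behavior should be the same as the shorthand version.

```xml
<field name="orderItem">

      <grammar>
        pizza | drinks | salad | wings
      </grammar>

      <prompt>
        What would you like to order?
        We have pizza, drinks, salad or wings.
      </prompt>

      <!-- Longhand for <noinput> ... </noinput> -->
      <catch event="noinput">
        Say pizza, drinks, salad or wings.
      </catch>

      <!-- Longhand for <nomatch> ... </nomatch> -->
      <catch event="nomatch">
        You can say pizza, drinks, salad or wings.
      </catch>

      <!-- Longhand for <help> ... </help> -->
      <catch event="help">
        <reprompt/>
      </catch>

</field>
```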


 

Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).