City CarShare Reservation System: A VoiceXML Case Study
City CarShare is a San Francisco nonprofit company whose
mission is to reduce the number of cars in the city
through shared ownership. Members can reserve a car
through the website or by calling the office. Because
the organization has limited resources and must serve its
members around the clock, a VoiceXML application is
a natural fit. Using BeVocal Café,
indigo egg recently completed this reservation system
for City CarShare.
VoiceXML applications offer instant access to information
and automated services from any telephone. Before VoiceXML,
some of these services were available through costly
human-operated call centers or cumbersome DTMF touch-tone
menu trees implemented on proprietary IVR platforms.
VoiceXML reduces many of these problems and adds the
convenience of using voice to complete the task at hand.
Many of the VoiceXML applications being developed on
BeVocal's platform (http://cafe.bevocal.com) fall into
one of these general classes:
- Targeted applications most useful when traveling or away
from the office: sales force automation and customer
relationship management applications.
- Cost reduction and improved customer service: making
reservations and checking availability without waiting on
hold, and getting further information such as driving
directions to business locations.
- Employee productivity improvements achieved by streamlining
work flow and task completion: expense approvals and
quality assurance applications.
City CarShare falls squarely into the second category. Its
executives have great expertise in their own domain
but, as with most companies, have less of an understanding
of what a voice application can do. Indigo egg's first
task with City CarShare was to give them a realistic
idea of VoiceXML's capabilities. Together we decided
upon an application that would accept, list, and cancel
reservations, and, for callers who are not members,
play an informative message about City CarShare. In
this article we will explore some of the usability and
technical issues that arose during the development of
this application.
Usability
Speech is the most natural form of communication, and people
have high but often unconscious expectations associated
with the medium. All successful application deployments
therefore demand a high degree of usability. The two
main factors that affect the usability of VoiceXML applications
are dialog design and speech recognition accuracy, which
depends on making sure the application listens for utterances
that people will actually say. Both of these concerns must
be addressed in order to provide an application that
people will want to use.
The first decision to be made is the characterization of
the application. Talking on the phone is carrying on
a conversation, and people have many subconscious expectations
about a conversation's form and content. Creating an
actual personality for the application allows callers
to anthropomorphize it, which brings several benefits.
First, the application is sticky: the caller feels friendly
towards the system and is more likely to call again.
Second, the caller is more forgiving of recognition
'mistakes'. Perhaps most importantly, the caller places
greater trust in the application and is correspondingly
more willing to cooperate with it. For City CarShare,
indigo egg chose a friendly, informal college student
character to blend with their 'green' aesthetic.
The next usability hurdle arose in porting their authentication
procedure to voice. The passwords that City CarShare
had been issuing to its members were alphanumeric
rather than solely numeric. This allows passwords such as
'PAss+w0rd', which mix upper- and lower-case letters
with numerals and punctuation. Even if such a password
could be pronounced, the recognition engine could not
make out its spelling. To solve this problem, indigo egg
recommended that City CarShare issue four-digit PINs
instead of passwords. After we explained the problems
inherent in translating written text to spoken language,
City CarShare agreed. This decision was a difficult
one, as it affected all of City CarShare's members, and
it illustrates the kind of tradeoff that must often
be made. (A full discussion of alphanumeric passwords
in ASR requires an article of its own.)
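To make the appeal of numeric PINs concrete, here is a minimal sketch of how a recognized utterance might be reduced to a four-digit PIN. It is written in Python purely for illustration and is not taken from the City CarShare application, which was built in VoiceXML. The word list, the function name normalize_pin, and the choice to reject any out-of-vocabulary word are all assumptions.

```python
import re

# Hypothetical mapping from recognized digit words to numerals; a real
# recognizer's grammar would define which words are actually accepted.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

PIN_LENGTH = 4  # the four-digit PIN described in the article


def normalize_pin(utterance):
    """Return the PIN as a digit string, or None if it is not a valid PIN."""
    digits = []
    # Split into alphabetic words and single digits; other characters
    # (spaces, punctuation) are ignored.
    for token in re.findall(r"[a-z]+|[0-9]", utterance.lower()):
        if token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
        else:
            return None  # a word outside the digit vocabulary: reject
    pin = "".join(digits)
    return pin if len(pin) == PIN_LENGTH else None
```

Because the vocabulary is limited to a handful of digit words, the mapping from speech to characters is unambiguous, which is not the case for a mixed-case password with punctuation.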
Dialog Design
Conversations contain subtle but important cues, called discourse
markers, which convey meta-level information about the
conversation. These are usually small words or choices
of phrasing that may seem meaningless when written but
play an important part in keeping two speakers in sync.
A voice application dialog should use discourse markers
in appropriate places to keep the caller grounded in
the application and increase comfort and usability.
For example, it is easy to write a dialog such as the
following:
Application: What location would you like to pick up the car from?
Caller: Downtown.
Application: What day would you like to pick up the car?
Caller: Next Thursday.
Application: What time would you like to pick up the car?
Caller: Noon.
This dialog is unnatural and discontinuous. Indigo egg's
dialogs use pronouns extensively to refer to the car
under discussion, to create and reinforce continuity.
When it has recognized the response, the application
briefly repeats the caller's answer with a confirmatory
discourse marker such as 'okay', 'got it', or 'mm-hmm',
so that the caller knows the application has heard them.
These discourse markers are chosen randomly for a more
natural-sounding response. Compare the following:
Application: What location do you want to pick up the car from?
Caller: Downtown.
Application: Downtown, got it. And what day do you need it?
Caller: Next Thursday.
Application: Okay, Thursday, March 12th. At what time?
Caller: Noon.
Because this conversation works with the caller's subconscious
expectations rather than ignoring them, it is much more
comfortable to engage in. The first prompt deserves
further discussion. It would be more natural to say,
"Where do you want to pick up the car from?"
but this is not the best choice in a voice application.
The word 'where' leaves too much latitude in the caller's
answer, inviting responses that may not be included
in the grammar. The need for a more directed request
for the location makes discourse markers all the more
important in striking a balance between natural and
directed prompts.
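The confirmation pattern used in the second dialog (echo the caller's answer with a randomly chosen discourse marker, then ask the next question) can be sketched as follows. This is a Python illustration only, not the VoiceXML from the actual application. The function name confirm_and_ask and the coin flip that decides whether the marker comes before or after the echoed answer are assumptions made for the sketch.

```python
import random

# Markers quoted in the article; which ones a real application uses
# is a design decision.
DISCOURSE_MARKERS = ["okay", "got it", "mm-hmm"]


def confirm_and_ask(answer, next_question, rng=random):
    """Echo the caller's answer with a random marker, then ask the next question."""
    marker = rng.choice(DISCOURSE_MARKERS)
    # Vary the marker's position, as in the sample dialog:
    # "Downtown, got it." versus "Okay, Thursday, March 12th."
    if rng.random() < 0.5:
        echo = f"{answer}, {marker}."
    else:
        echo = f"{marker.capitalize()}, {answer}."
    return f"{echo} {next_question}"


if __name__ == "__main__":
    print(confirm_and_ask("Downtown", "And what day do you need it?"))
    print(confirm_and_ask("Thursday, March 12th", "At what time?"))
```

Passing a seeded random.Random instance as rng makes the choice of marker repeatable, which is convenient when testing prompts.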
In general, some important considerations in dialog design
include:
- Create a consistent character that engages the listener in
an appropriate style for a particular application.
- Dialogs describe conversations that are very different from
written text. Writing is generally more formal and
impersonal than speech.
- Use discourse markers in prompts to help ground the caller;
work with their subconscious expectations of how conversations
happen rather than against them.
Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry
Standards and Technology Organization (IEEE-ISTO).