Some Thoughts on Speech Grammar
In this monthly column, an industry expert will answer
common questions about VoiceXML and related technologies.
Readers are encouraged to submit questions about VoiceXML,
including development, voice-user interface design,
and speech technology in general, or how VoiceXML is
being used commercially in the marketplace. If you have
a question about VoiceXML, e-mail it to speak.and.listen@voicexmlreview.org
and be sure to read future issues of VoiceXML Review
for the answer.
Q:
How do I use data from a Microsoft Access database in
a VoiceXML application?
A:
To use data from any DBMS in a VoiceXML application,
you'll need to extract the data, format it in a syntax
that a VoiceXML interpreter can parse and execute, and
place it in a location from which the interpreter can
fetch it. You have a number of options, including the following:
- Periodically export the data from the DBMS into VoiceXML
or JavaScript or some intermediary format that can be
further transformed, and place the exported data on a
Web server accessible to the VoiceXML interpreter.
- Use an API to extract and format the data directly
from the DBMS on demand. The VoiceXML interpreter makes
an HTTP request to a server-side script
that in turn fetches the data and formats it for use
in your voice application.
When deciding between these two options, consider the following:
- How frequently does the data used by your voice
application change?
- Is it vital that users of your voice application
have access to the most up-to-date information?
- Does your DBMS scale to handle the additional demand
from users of your voice application?
- Are you prepared to secure the data in your DBMS
from hackers?
If your answers to these questions are "Infrequently",
"No", "No", and "No", then
the first option is probably good enough.
For the purposes of this article, I'll assume that's the
case. In a future column, I'll tackle the second option.
Some sample data
To help put the solution into perspective, let's define
two schemas. The first describes a simple employee table:
Field Name | Type | Description |
emp_id | AutoNumber | The employee's unique id (primary key) |
ssn | Text (9) | Social Security Number (also unique) |
fname | Text (50) | first name |
lname | Text (50) | last name |
phone | Text (15) | telephone number |
dept_id | Integer | foreign key into the department table |
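
As a rough sketch of the first option (periodic export), the script below reads rows shaped like the employee table and renders them as a static VoiceXML document. It is an illustration under assumptions, not code from this column: it uses Python's built-in sqlite3 module as a stand-in for Access (a real export would more likely go through ODBC or Access's own export tools), and the function name export_employees_vxml, the form id "directory", and the sample row are all hypothetical.

```python
import sqlite3
from xml.sax.saxutils import escape


def export_employees_vxml(conn):
    """Render every employee row as a prompt in one static VoiceXML form."""
    # The SSN is deliberately not selected; the prompts do not need it.
    rows = conn.execute(
        "SELECT fname, lname, phone FROM employee ORDER BY lname, fname"
    ).fetchall()
    lines = [
        '<?xml version="1.0"?>',
        '<vxml version="1.0">',
        '  <form id="directory">',
        "    <block>",
    ]
    for fname, lname, phone in rows:
        # Escape database text so characters like & or < cannot break the XML.
        text = escape(f"{fname} {lname}, telephone {phone}.")
        lines.append(f"      <prompt>{text}</prompt>")
    lines += ["    </block>", "  </form>", "</vxml>"]
    return "\n".join(lines) + "\n"


# Stand-in database that mirrors the employee schema (assumption: SQLite
# types approximate the Access types listed in the table).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE employee (
           emp_id  INTEGER PRIMARY KEY AUTOINCREMENT,
           ssn     TEXT UNIQUE,
           fname   TEXT,
           lname   TEXT,
           phone   TEXT,
           dept_id INTEGER
       )"""
)
# Made-up sample row, for illustration only.
conn.execute(
    "INSERT INTO employee (ssn, fname, lname, phone, dept_id) "
    "VALUES (?, ?, ?, ?, ?)",
    ("000000000", "Pat", "Smith", "5550100", 1),
)

vxml = export_employees_vxml(conn)
print(vxml)
```

In practice a script like this would be run on a schedule, with its output written to a file on the Web server that the VoiceXML interpreter fetches from.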
4) If there are features that you want to use that aren't
quite perfectly cross-platform compatible today, what
will it really cost you in development time to make
the necessary changes should you choose to switch?
Remember, millions of people made the decision to write slightly
different versions of their Web sites for IE vs. Navigator
to optimize performance on both. In my opinion, VoiceXML
is already far superior to HTML in this regard (e.g., fewer
inconsistencies across implementations), but you
have to make your own decision specific to your business
objectives and needs.
Q:
Given the current state of speech recognition technology,
when writing speech grammars for VoiceXML apps, is it
best to write small, compact grammars with a very narrow
set of possible utterances, or is it better to write
larger, wide-open grammars?
A: It's most important to write your grammars
to closely match what your callers are actually saying.
Having too much coverage (too many phrases in the
grammar, especially ones that are confusable with one
another) is just as bad as having too little (missing
many things that your callers regularly say).
Optimizing this balance through a combination of great
grammar design and great UI design that carefully guides
callers to "say the right things" without
frustrating them is the fine art of voice application
design.
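
One way to make this balance measurable is to compare a grammar's phrase list with transcriptions of what callers actually said. The sketch below is an illustration, not a recognizer: it treats a grammar as a flat set of phrases and uses exact matching, and the function name grammar_fit, the sample phrases, and the sample utterances are all made up.

```python
def grammar_fit(grammar_phrases, caller_utterances):
    """Compare a phrase list against transcribed caller utterances.

    Returns (coverage, unused): coverage is the fraction of utterances
    found in the grammar, and unused is the sorted list of grammar
    phrases that no caller said.
    """
    grammar = {p.strip().lower() for p in grammar_phrases}
    said = [u.strip().lower() for u in caller_utterances]
    if not said:
        raise ValueError("need at least one transcribed utterance")
    in_grammar = sum(1 for u in said if u in grammar)
    unused = sorted(grammar - set(said))
    return in_grammar / len(said), unused


# Hypothetical grammar and transcriptions for a department menu.
grammar = ["sales", "support", "billing", "sails department", "operator"]
utterances = ["sales", "support", "uh sales please", "billing", "sales"]

coverage, unused = grammar_fit(grammar, utterances)
print(f"coverage: {coverage:.0%}")
print("never said:", unused)
```

Low coverage points to phrases that callers say but the grammar lacks. A long list of unused phrases points to candidates for pruning, particularly any that sound like phrases callers do use.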
Copyright © 2001-2002 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE
Industry Standards and Technology Organization (IEEE-ISTO).