Object-Oriented VoiceXML (Continued from Part 1)
Standard VoiceXML Tags
In the first layer, our Java classes correspond directly
to the standard VoiceXML tags: for example, VField and
VBlock for <field></field> and <block></block>, or
VProperty and VMeta for <property/> and <meta/>. We use
these simple classes inside the more complex classes
defined in the other layers.
VProperty
and VMeta, as "unpaired" tags, without other
tags inside them, derive from a base class we call VoiceXMLTag.
VoiceXMLTag carries functionality that all tags share,
whether paired or unpaired.
VField
and VBlock, as tags used in pairs, with other tags in
between, derive from a base class we call VoiceXMLTagPair.
VoiceXMLTagPair derives from VoiceXMLTag, picking up
functionality common to all VoiceXML tags, and adds
functionality for enclosing other VoiceXML tags inside,
recursively.
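The first-layer hierarchy described above might look roughly like this in Java. This is a minimal sketch, not the SpeechBrowser source: the class names come from the article, but the constructors, the attr() and add() helpers, and the use of asString() on individual tags are assumptions made for illustration. Attribute values are not XML-escaped here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Base class: behaviour shared by every tag, paired or unpaired.
class VoiceXMLTag {
    protected final String name;
    protected final Map<String, String> attributes = new LinkedHashMap<>();

    VoiceXMLTag(String name) { this.name = name; }

    // Hypothetical helper: sets one attribute and allows chaining.
    VoiceXMLTag attr(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    // Renders ' key="value"' pairs in insertion order (no escaping in this sketch).
    protected String attributesAsString() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        return sb.toString();
    }

    // Unpaired form: <name attr="value"/>
    String asString() {
        return "<" + name + attributesAsString() + "/>";
    }
}

// Adds the ability to enclose other tags, recursively.
class VoiceXMLTagPair extends VoiceXMLTag {
    private final List<VoiceXMLTag> children = new ArrayList<>();

    VoiceXMLTagPair(String name) { super(name); }

    // Hypothetical helper: appends a child tag and allows chaining.
    VoiceXMLTagPair add(VoiceXMLTag child) {
        children.add(child);
        return this;
    }

    // Paired form: <name ...>children</name>; each child renders itself.
    @Override
    String asString() {
        StringBuilder sb = new StringBuilder("<" + name + attributesAsString() + ">");
        for (VoiceXMLTag child : children) {
            sb.append(child.asString());
        }
        return sb.append("</").append(name).append(">").toString();
    }
}

class VProperty extends VoiceXMLTag {
    VProperty(String propName, String value) {
        super("property");
        attr("name", propName).attr("value", value);
    }
}

class VField extends VoiceXMLTagPair {
    VField(String fieldName) {
        super("field");
        attr("name", fieldName);
    }
}

class VBlock extends VoiceXMLTagPair {
    VBlock() { super("block"); }
}
```

Under these assumptions, a VoiceXMLTagPair("form") holding a VProperty("timeout", "5s") and a VField("city") renders as one nested string, because each pair asks its children to render themselves.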
Finally, in this first layer we specialize the tags for
their most commonly reused purposes. From VBlock, for
example, we derive VTransferBlock, a block used to
transfer the caller to another number. Such a block uses
a <transfer></transfer> tag in one form or another, so
VTransferBlock carries an instance of VTransfer. That
VTransfer instance, in turn, can be configured in many
different ways.
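A specialized block that carries a configurable VTransfer could be sketched as follows. The class names are the article's, but everything else is assumed: the VBlock here is a minimal stand-in, the bridge() and connectTimeout() setters are hypothetical, and the destination number is a placeholder. Because <transfer> is a form item in VoiceXML, this sketch emits it immediately after the announcement block rather than inside it; the article does not show the actual layout.

```java
// Hypothetical stand-in for the article's VTransfer: wraps one <transfer> form item.
class VTransfer {
    private final String name;
    private final String dest;
    private boolean bridge = false;
    private String connectTimeout; // null means "omit the attribute"

    VTransfer(String name, String dest) {
        this.name = name;
        this.dest = dest;
    }

    VTransfer bridge(boolean value) {
        this.bridge = value;
        return this;
    }

    VTransfer connectTimeout(String value) {
        this.connectTimeout = value;
        return this;
    }

    String asString() {
        StringBuilder sb = new StringBuilder("<transfer name=\"" + name + "\""
                + " dest=\"" + dest + "\" bridge=\"" + bridge + "\"");
        if (connectTimeout != null) {
            sb.append(" connecttimeout=\"").append(connectTimeout).append('"');
        }
        return sb.append("/>").toString();
    }
}

// Minimal stand-in for VBlock: a <block> holding plain prompt text.
class VBlock {
    protected final String content;

    VBlock(String content) { this.content = content; }

    String asString() {
        return "<block>" + content + "</block>";
    }
}

// The specialized block: an announcement followed by its configured transfer.
class VTransferBlock extends VBlock {
    private final VTransfer transfer;

    VTransferBlock(String announcement, VTransfer transfer) {
        super(announcement);
        this.transfer = transfer;
    }

    @Override
    String asString() {
        return super.asString() + transfer.asString();
    }
}
```

The point of the design is that the transfer details live in one object: a caller builds the VTransfer once (bridged or not, with or without a timeout) and hands it to the block.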
Vendor-Specific Data Types
A second layer of the Java hierarchy uses the standard
VoiceXML tags of the first layer to invoke data types
provided by vendors and their third-party users, such as
Nuance Speech Objects and Speechworks Dialog Modules.
Data Types of Our Own Devising
We
get the biggest payoff in the third layer of the hierarchy,
where we derive our own complex data types. The resulting
VoiceXML runs anywhere.
One example is VLayerList: a navigable list, or tree of
lists, that we use again and again in turnkey products,
whether for lists of Frequently Asked Questions, lists of
ingredients and steps for cooking (as first suggested by
blind people who still love the kitchen), or lists of
nearly any kind.
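The article does not say how VLayerList is built, so the following is only one plausible shape for a navigable tree of lists, assuming each layer becomes a VoiceXML <menu>, each sublist becomes another menu, and each leaf becomes a small <form> that speaks its text and returns to its menu. The methods addItem() and addSublist() and the id scheme are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a list whose entries are either text leaves or sublists.
class VLayerList {
    private static final class Entry {
        final String label;
        final String text;        // non-null for a leaf
        final VLayerList sublist; // non-null for a nested list

        Entry(String label, String text, VLayerList sublist) {
            this.label = label;
            this.text = text;
            this.sublist = sublist;
        }
    }

    private final String id;
    private final String title;
    private final List<Entry> entries = new ArrayList<>();

    VLayerList(String id, String title) {
        this.id = id;
        this.title = title;
    }

    VLayerList addItem(String label, String text) {
        entries.add(new Entry(label, text, null));
        return this;
    }

    VLayerList addSublist(String label, VLayerList sublist) {
        entries.add(new Entry(label, null, sublist));
        return this;
    }

    // Emits this layer's <menu>, followed by the forms and menus its choices point to.
    String asString() {
        StringBuilder menu = new StringBuilder();
        StringBuilder targets = new StringBuilder();
        menu.append("<menu id=\"").append(id).append("\"><prompt>")
            .append(title).append(" <enumerate/></prompt>");
        for (int i = 0; i < entries.size(); i++) {
            Entry e = entries.get(i);
            String target = (e.sublist != null) ? e.sublist.id : id + "_" + i;
            menu.append("<choice next=\"#").append(target).append("\">")
                .append(e.label).append("</choice>");
            if (e.sublist != null) {
                targets.append(e.sublist.asString()); // recurse into the next layer
            } else {
                targets.append("<form id=\"").append(target).append("\"><block>")
                       .append(e.text)
                       .append("<goto next=\"#").append(id).append("\"/></block></form>");
            }
        }
        return menu.append("</menu>").append(targets).toString();
    }
}
```

A recipe, for instance, could be a top-level list with an "ingredients" leaf and a "steps" sublist whose own entries are the individual steps.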
A VoiceXML Equivalent of a Class or Object
Finally,
we wondered, how do you reuse VoiceXML without cut and
paste? When you build in VoiceXML, where do you accumulate
your team's expertise? Where do you build a library
of your own best practices and proven techniques? Your
own data types for rapid (drop-in, plug-in) reuse? How
do you package your own involved exchanges with the
caller, and pull them off the shelf the next time you
need them?
How
do you reuse an advanced VoiceXML construct as you would
a Java or C++ class or object? What could qualify as
the VoiceXML equivalent? A reusable VoiceXML part, from
which you could derive further parts, in a hierarchy?
One approach is, of course, the VoiceXML <subdialog>
tag. A subdialog can invoke and enclose (if not encapsulate)
others, in something of a hierarchy. But this takes
us only so far.
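For comparison, a subdialog call is itself easy to generate from Java. The sketch below is illustrative only: VSubdialog is a hypothetical class name in the style of the article's other classes, and the file name and parameter are made up. It emits a <subdialog> element with <param> children, which is how VoiceXML passes values into the invoked dialog.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical wrapper for a <subdialog> call with <param> children.
class VSubdialog {
    private final String name;
    private final String src;
    private final Map<String, String> params = new LinkedHashMap<>();

    VSubdialog(String name, String src) {
        this.name = name;
        this.src = src;
    }

    // Each parameter is passed as an ECMAScript expression via the expr attribute.
    VSubdialog param(String paramName, String expr) {
        params.put(paramName, expr);
        return this;
    }

    String asString() {
        StringBuilder sb = new StringBuilder(
                "<subdialog name=\"" + name + "\" src=\"" + src + "\">");
        for (Map.Entry<String, String> p : params.entrySet()) {
            sb.append("<param name=\"").append(p.getKey())
              .append("\" expr=\"").append(p.getValue()).append("\"/>");
        }
        return sb.append("</subdialog>").toString();
    }
}
```

Everything the subdialog produces stays in one contiguous place in the calling document, which is part of why it only goes so far as a unit of reuse.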
VoiceXMLComposite Classes
At
SpeechBrowser we devised another answer we call VoiceXMLComposite.
We might have called it VoiceXMLObject or VoiceXMLClass
instead. We derive new VoiceXML classes from a base
class in Java called VoiceXMLComposite.
Take, for example, our two classes VRetryRecourse and VDBCapture.
Most of the VoiceXML we generate (75% or more of the
lines) either handles exceptions during the call (confirm,
retry, live operator, and other recourses) or provides
database access during and after a successful call.
VRetryRecourse supplies any given VoiceXML application
with the former, VDBCapture with the latter. VDBCapture
captures the results of a phone call to a backend database
table, whatever the sequence of questions we've asked
the caller, and whatever data types we've collected.
How
do VoiceXMLComposite classes differ from the classes
we discussed earlier, such as VField and VBlock? VoiceXMLComposite
classes reuse many of those earlier classes. But classes
such as VField and VBlock generate contiguous lines
in a VoiceXML application. VoiceXMLComposite classes
do not.
You
can mark off the first and last tag generated by VField
or VBlock inside the VoiceXML application. An instance
of VDBCapture or VRetryRecourse generates lines in and
for many separate parts of the VoiceXML application
it serves, including the root document and the several
documents (pages) that share the same root.
Functionality
that appears together in a VoiceXMLComposite class,
as data elements and methods of that class, will be
distributed throughout the generated VoiceXML. Some
of its data elements generate <var> tags in the
root document or elsewhere. Some of its methods generate
JavaScript and <block> and <field> tags
across the application.
Subclasses
of VoiceXMLComposite deliver their VoiceXML to various
parts of the VoiceXML application using a VoiceXMLComposite.asString()
method with parameters such as:
protected static final int kForRootDoc = 3;
protected static final int kForRootDocForm = 2;
protected static final int kForDoc = 1;
protected static final int kForDocForm = 0;
Subclasses
further extend this list of parameters as well, for
delivering their generated VoiceXML to larger and more
complex applications. For tracing and debugging, VoiceXMLComposite
classes can be asked to mark (with comments) all the
lines and tags they generate across a large VoiceXML
application.
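The delivery mechanism described above can be sketched as follows. Only the class name VoiceXMLComposite, the asString() method, and the four constants come from the article. The meaning assigned to each constant, the generate() hook, the setTracing() switch, the extra kForOperatorDoc destination, and the VRetrySketch subclass with its retry-counter fragments are all assumptions made for illustration; they are not the real VRetryRecourse.

```java
// Sketch of a composite whose output is requested one destination at a time.
abstract class VoiceXMLComposite {
    protected static final int kForRootDoc = 3;     // assumed: top level of the root document
    protected static final int kForRootDocForm = 2; // assumed: inside a form of the root document
    protected static final int kForDoc = 1;         // assumed: top level of each leaf document
    protected static final int kForDocForm = 0;     // assumed: inside a form of a leaf document

    private boolean tracing = false;

    // Hypothetical switch: when on, generated fragments are wrapped in marker comments.
    void setTracing(boolean on) { tracing = on; }

    // Subclasses return the fragment for one destination, or "" if they have none.
    protected abstract String generate(int target);

    String asString(int target) {
        String body = generate(target);
        if (!tracing || body.isEmpty()) {
            return body;
        }
        String tag = getClass().getSimpleName() + " target=" + target;
        return "<!-- begin " + tag + " -->" + body + "<!-- end " + tag + " -->";
    }
}

// Hypothetical subclass: a retry counter spread over several parts of the application.
class VRetrySketch extends VoiceXMLComposite {
    // A subclass extending the list of destinations (name and value assumed).
    protected static final int kForOperatorDoc = 4;

    @Override
    protected String generate(int target) {
        switch (target) {
            case kForRootDoc:
                return "<var name=\"retryCount\" expr=\"0\"/>";
            case kForDoc:
                return "<catch event=\"nomatch noinput\">"
                     + "<assign name=\"retryCount\" expr=\"retryCount + 1\"/>"
                     + "<reprompt/></catch>";
            case kForDocForm:
                return "<block><assign name=\"retryCount\" expr=\"0\"/></block>";
            case kForOperatorDoc:
                return "<form id=\"operator\"><block>Please hold for an operator.</block></form>";
            default:
                return ""; // nothing to contribute to this destination
        }
    }
}
```

A generator assembling the application would call asString() once per destination and splice each fragment into place. With tracing on, every fragment is bracketed by comments naming the class and destination, so the scattered output of a single composite can be found again in the finished documents.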
Using
VoiceXMLComposite, in a single place we work on some
of the most demanding professional features of any VoiceXML
application: its exception handling and its database
backend. From that one place, all the applications we
generate, and all the many sections in each application,
pick up the improvements.
So What's It All For?
At
SpeechBrowser we want to bring Automatic Speech Recognition
to a mass market. We want to reach offices of any size
and budget: agencies, foundations, non-profits, small
businesses. We want to reach beyond the trading giants
that dominate today. We want development prices to start
at $10,000 or less.
We
want to reach the blind and vision-impaired and disabled
communities. We want to see the day when more of us
reach the internet by phone than by keyboard, mouse,
or computer screen.
For that we need all the best that the software industry
has learned in four decades. We need XML and VoiceXML
for their universal availability, and we need object-oriented
languages. For us, that means VoiceXML generated from
hierarchies of reusable parts, with assembly-line efficiency.
Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).