Opera 8 Ships One Million Browsers with X+V Multimodal Technology
By Igor Jablokov
Opera Software ASA (http://www.opera.com) recently announced
that version 8.0 of its browser received over one million downloads
within four days of release. The Norwegian software
vendor has created a fast and standards-compliant
Web experience.
While this news is certainly commendable for any product introduction,
rivaling even Mozilla’s Firefox, it is also a milestone
for the multimodal and voice standards community.
Opera has included a feature that
could usher in an age of human-computer interaction
predicted long ago by many a science fiction writer.
The Windows version of this browser now has an option that enables
voice interaction. This functionality is provided
by the IBM® Multimodal Runtime Environment, which connects
the Opera Browser to IBM Embedded ViaVoice® (the same
technology currently shipping in certain auto navigation
systems). Not only can users control the entire browser
interface by voice (e.g., by saying “browser go home” or
“browser fullscreen”), but they can also run applications
written in the XHTML+Voice (X+V for short) markup language.
The X+V language permits developers to write and deploy
multimodal Web applications, which allow users to interact
through sight, sound and speech. This language was co-authored
by IBM, Motorola and Opera and is under consideration by the
W3C standards body.
While modern-day VoiceXML applications require specialized
skills, X+V applications are different in that they more
closely resemble standard Web applications. This breaks
the current speech development paradigm and can allow
the large body of Web developers to simply add voice interaction
to existing Web applications. For instance, one of
IBM’s customers augmented an existing enterprise-focused
application and moved it into production within a
month. It did so without prior experience in developing
X+V or speech applications in general.
Imagine the potential use cases for this type of interaction.
In any environment where “hands-free” is not
just a buzzword but a necessity, such as in healthcare,
warehousing or enterprise applications, the value
of this system becomes obvious. Doctors can ask for patient
status by name or get alerted to changes in medical conditions
using the natural-sounding voice output (CTTS) that is included
with the browser. In warehouses, companies can increase
worker productivity by having the system communicate new
orders to employees while leaving their hands free to fulfill
the order. Also consider insurance adjusters speaking into
complex forms and recording accident information while
focused on investigating a scene.
But multimodal is not just for workplace activities. With
the explosion of media content available to consumers, this
interface is a natural fit with entertainment devices, such
as set-top boxes and digital video recorders. Instead of
hunting through several menus, users could find a song by
simply speaking the title into their remotes and play
it instantly. Or think of how often you use the browser on
your mobile phone; would you use it more if you could simply
speak into it, asking it for the latest news or weather?
Service providers could then use this same framework to generate
additional revenues by reducing the distance between their
customers and licensed content, such as ring tones, or by selling
value-added services such as multimodal email and calendaring.
Opera 8 for Windows offers users a gateway into the multimodal
experience. IBM looks forward to developers’ creativity
in leveraging these standards-based technologies to
augment existing Web applications for increased
end-user productivity.
More information on the X+V specification is available here:
http://www.voicexml.org/specs/multimodal/x+v/12/
More information on IBM’s multimodal implementations
and toolkits is available here: http://www.ibm.com/pvc/multimodal
Igor Jablokov is Program Director of IBM Software Group’s
Multimodal & Voice Portal Initiatives and currently serves
as a VoiceXML Forum Director. He can be reached at jablokov@us.ibm.com.
IBM and ViaVoice are trademarks or registered trademarks
of International Business Machines Corporation in the United
States, other countries or both.
Windows is a trademark of Microsoft Corporation in the United
States, other countries or both.
Other company, product and service names may be trademarks
or service marks of others.
Actual results may vary from any performance data contained
herein. Users of this document should verify the applicable
data for their specific environment.