Volume 2, Issue 3 - April/May 2002

VoiceXML: A Publishing Standard for Accessibility

By Brian Ty Graham

The emergence of VoiceXML as an open and widely supported standard induced a paradigm shift in the voice application and IVR market by empowering web developers to write voice applications efficiently. VoiceXML is having a profound effect not only on IVR and traditional telephony applications but also on extending enterprise and consumer applications so that they can be accessed by voice. In addition, VoiceXML is quietly enabling the disabled to access applications, services, content, and information in a usable manner. Using VoiceXML as a document format for publishing is an effective, inexpensive, and expansive way to meet accessibility requirements. The VoiceXML Document Format (VDF) is a self-contained, W3C-compliant, static VoiceXML file that is delivered through Voice Web Services's VANGARD™ authoring tool. In general, publishers may transcode any type of publication, such as books, magazines, and news articles, into VDF files that can then be accessed from any phone or computing device.

In comparison to traditional means of accessibility publishing, using VoiceXML as an accessible publishing format not only increases flexibility and usability of the content, but also decreases costs substantially. The ubiquitous nature of the telephone and hands-free speech access provides a highly compelling reason to use VoiceXML for broadcasting textual content to the disabled community and for mobile voice access of electronic documents.

Long documents, complex tables of contents, and the limitations of text-to-speech technology no longer restrict content providers from considering new alternatives for accessibility publishing. The convenience of natural, intuitive voice commands, coupled with the ability to search and navigate within documents, positions VoiceXML as the standard document format for accessibility to the visually impaired and the disabled.
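As a rough illustration of what transcoding a publication into a static, navigable VoiceXML file could look like, the Python sketch below turns a list of paragraphs into a VoiceXML 2.0 document with one form per paragraph. It is not the VDF layout or the output of the VANGARD tool, neither of which is described here. The function name `paragraphs_to_vxml`, the `sec1`, `sec2`, ... form ids, and the spoken commands "next", "previous", and "repeat" are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"    # VoiceXML 2.0 namespace
COMMANDS = ("next", "previous", "repeat")  # hypothetical spoken commands


def paragraphs_to_vxml(paragraphs):
    """Return a static VoiceXML document with one <form> per paragraph."""
    root = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    last = len(paragraphs)
    for i, text in enumerate(paragraphs, start=1):
        form = ET.SubElement(root, "form", {"id": f"sec{i}"})
        # The field speaks the paragraph, then listens for a command.
        field = ET.SubElement(form, "field", {"name": "cmd"})
        ET.SubElement(field, "prompt").text = text
        for word in COMMANDS:
            ET.SubElement(field, "option").text = word
        # Clamp navigation at the first and last paragraph.
        targets = {
            "next": min(i + 1, last),
            "previous": max(i - 1, 1),
            "repeat": i,
        }
        filled = ET.SubElement(field, "filled")
        branch = ET.SubElement(filled, "if", {"cond": "cmd == 'next'"})
        ET.SubElement(branch, "goto", {"next": f"#sec{targets['next']}"})
        ET.SubElement(branch, "elseif", {"cond": "cmd == 'previous'"})
        ET.SubElement(branch, "goto", {"next": f"#sec{targets['previous']}"})
        ET.SubElement(branch, "else")
        ET.SubElement(branch, "goto", {"next": f"#sec{targets['repeat']}"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(paragraphs_to_vxml(["First paragraph.", "Second paragraph."]))
```

Because every jump target is a form inside the same file, the generated document needs no server-side logic once it is written out, which is the sense in which such a file is static and self-contained. A real publication would also need search, spelling, and finer-grained navigation, which this sketch does not attempt.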


Voice-activated telephony deployments are extremely effective in most of the scenarios to which they apply. However, in the past, providing speech access to text via the telephone usually meant inconveniencing the caller with dialogs that prevented control of the playback of the content itself. While text-to-speech (TTS) technology helped voice the written word to the user, the user interfaces for presenting text audibly were at a considerable disadvantage compared to those for presenting text visually.

In a graphical environment, sighted users have many helpful features that aid in fully representing the textual content they choose. However, in a voice-only environment, most text-driven voice applications fail to provide equivalents of the features that are so prevalent and useful in a visual medium, such as page scrolling, visual cues, word spelling, and copying and pasting.

Since developers using VoiceXML rely mostly on TTS technology to convey content to the user, they have also had to rely on the TTS software to provide the usability for playing back the information. Consequently, Voice User Interfaces (VUIs) designed for presenting text information to the disabled community lacked the flexibility to handle large amounts of text conveniently and elegantly. The VUI either required users to press buttons to arrive at predefined positions within the content or forced them to listen to the entire piece without any internal navigation at all. Moreover, neither the VUIs nor the TTS technology allowed for useful options such as copying and pasting text or spelling words through one accessible document format.

Copyright © 2001-2002 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).