Volume 1, Issue 6 - June 2001

Answers to Your Questions About VoiceXML

By Jonathan Engelsma

In this monthly column, an industry expert will answer common questions about VoiceXML and related technologies. Readers are encouraged to submit questions about VoiceXML, including development, voice-user interface design, and speech technology in general, or how VoiceXML is being used commercially in the marketplace. If you have a question about VoiceXML, e-mail it to speak.and.listen@voicexmlreview.org and be sure to read future issues of VoiceXML Review for the answer.

This month we examine more questions from our readers.

Q: I am in the process of learning VoiceXML and I feel a little bit overwhelmed. I have limited skills in HTML and XML, as well as Dreamweaver. I am learning this new language with minimum resources and no mentor to help me. Do you have any suggestions, advice, or resources where someone like me can gain knowledge and become proficient in VoiceXML?

A: One approach, of course, is to simply start writing code with the VoiceXML 1.0 specification at your side for reference. However, depending on your level of programming skill and experience, that could prove to be a rather hit-or-miss endeavor, particularly if you don't have a proper understanding of the big picture. To provide some background and an appreciation for the language, the VoiceXML Forum has made available a variety of educational resources to get you started.

To begin, I would point you to the First Words column written by Rob Marchand and published monthly in this e-zine. First Words is a series of short articles that describe various aspects of the language using a running example. Each installment includes annotated code examples and diagrams that illustrate the concepts being introduced. You can read previous installments of the column in the VoiceXML Review Archives.

Other resources available to those interested in learning VoiceXML include the three on-line Tutorials published on the VoiceXML Forum's website. The Forum has received a lot of positive feedback from individuals who have worked their way through the tutorials since they were published a few months ago, and I would highly recommend them to you.

In addition to reading tutorials, articles, and the language specification, it is critical that you get some hands-on experience writing your own applications. To do this, you will need access to a VoiceXML interpreter. A variety of implementations are freely available, and they come in one of two flavors: on-line studio-based tools and VoiceXML simulators that run on your PC desktop. The on-line studio-based approach consists of a website that allows you to author your VoiceXML application on-line, or to register the URI of a VoiceXML application you are hosting on your own web server. You are then given a telephone number you can call to execute your application. The simulator approach typically involves a VoiceXML simulator that you download and run on your local PC desktop. The simulator may actually perform speech recognition and text-to-speech synthesis, using your sound card and microphone for input and output, or it may simulate speech input and output by having you type text and displaying text on the screen. A number of VoiceXML Forum member companies provide one or both of these types of development tools. Links to a few of the more popular tools are listed in the FAQs published on the Forum's website.
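
Whichever flavor of tool you choose, your first application can be very small. The sketch below is offered only as an illustration, not as part of any particular tool: it embeds a minimal "Hello World" VoiceXML 1.0 document in a short Python script and runs a few basic sanity checks on it before you hand it to an interpreter. The use of Python and the names HELLO_VXML and check_vxml are my own assumptions for the example. An interpreter will perform far more thorough validation than this.

```python
import xml.etree.ElementTree as ET

# A minimal VoiceXML 1.0 document: one form with one block that is spoken
# to the caller.
HELLO_VXML = """<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <block>Hello World!</block>
  </form>
</vxml>
"""


def check_vxml(text):
    """Return a list of problems found; an empty list means the checks passed.

    These are only rough sanity checks (well-formed XML, a <vxml> root
    element, a version attribute), not full validation.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != "vxml":
        problems.append(f"root element is <{root.tag}>, expected <vxml>")
    if root.get("version") is None:
        problems.append("<vxml> has no version attribute")
    return problems


if __name__ == "__main__":
    print(check_vxml(HELLO_VXML) or "basic checks passed")
```

Once the document passes checks like these, you can paste it into an on-line studio tool, register its URI, or load it into a desktop simulator to hear it run.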

You also mentioned the lack of a mentor. In addition to the message boards on the support sites of companies with VoiceXML-related products, there are a few places on-line where VoiceXML application developers and platform implementors congregate. The Yahoo "voicexml" Group is one example that comes to mind. While traffic is rather sporadic at times, there seem to be a number of experts tuned in who are usually happy to provide some mentoring - particularly if you happen to be using their company's product!

Q: I have a VoiceXML application running successfully on my PC using my computer's sound card, speakers, and microphone for input and output. Now I would like to invoke my application using a telephone. Can you provide suggestions on how I might go about doing this?

A: Without knowing the exact details of your particular configuration, it is difficult to suggest a bulletproof solution. You will have to do some more detailed investigation to determine what your options are. It sounds like you have downloaded one of the several available tools that implement a VoiceXML interpreter on your desktop PC. These tools are very convenient for authoring and running VoiceXML applications without requiring expensive telephony hardware and services. One thing to find out is which interfaces, in addition to your PC's sound card, the tool supports. Chances are the software is hard-wired to work only with your sound card. While there is a remote chance that you can purchase additional telephony hardware (analog or digital interface cards, etc.) and configure it to run with your tool, this will involve a fair amount of telephony know-how and will require additional telephony services (analog lines, T1s, etc.) that you may not have readily available. Another alternative is to investigate whether the tool supports VoIP-based interfaces such as SIP/RTP. If so, you should be able to obtain a software-based IP phone and configure it to call your application. There are a number of IP phones freely available for download, such as the Pingtel xpressa softphone.

A far simpler solution is to locate a company that provides VoiceXML hosting services and set up your application on their system. They will provide you with a phone number that you can use to call your application. There are a number of free on-line VoiceXML developer services available for developing and running VoiceXML applications, as mentioned in the answer to the previous question. The downside is that you may have to "port" your application to the hosting platform's interpreter. For example, if the underlying speech recognizers are different, the grammar syntax your application currently uses may not be supported on the hosting platform. Portability will become less of a problem in the future, given the VoiceXML Forum's current efforts in the area of conformance and the W3C's efforts to define standard grammar formats, text-to-speech markup, and additional specifications. These areas were left unspecified in the original VoiceXML 1.0 specification.
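
To get a feel for how much porting work you might face, it can help to take an inventory of the grammars your application uses. The following is a rough sketch under several assumptions of my own: it is written in Python, the function name list_grammars is hypothetical, the sample document uses JSGF-style grammars in the manner of the examples in the 1.0 specification, and the file name cities.gram is made up for illustration. It only looks at grammars attached directly to fields.

```python
import xml.etree.ElementTree as ET

# Sample application with one inline grammar and one external grammar.
# The grammar type and the file name "cities.gram" are illustrative only.
SAMPLE_VXML = """<?xml version="1.0"?>
<vxml version="1.0">
  <form id="order">
    <field name="drink">
      <prompt>Would you like coffee or tea?</prompt>
      <grammar type="application/x-jsgf">coffee | tea</grammar>
    </field>
    <field name="city">
      <prompt>Which city?</prompt>
      <grammar src="cities.gram" type="application/x-jsgf"/>
    </field>
  </form>
</vxml>
"""


def list_grammars(text):
    """Return one dictionary per <grammar> found directly inside a <field>."""
    root = ET.fromstring(text)
    found = []
    for field in root.iter("field"):
        for grammar in field.findall("grammar"):
            found.append({
                "field": field.get("name"),
                "type": grammar.get("type"),    # declared grammar format
                "src": grammar.get("src"),      # external grammar file, if any
                "inline": (grammar.text or "").strip() or None,
            })
    return found


if __name__ == "__main__":
    for entry in list_grammars(SAMPLE_VXML):
        print(entry)
```

Each entry whose type is not supported by the hosting platform's recognizer is a grammar you would need to rewrite, so the list gives you a rough estimate of the porting effort before you commit to a particular service.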

The advantage of letting somebody else host your VoiceXML application is of course that you don't have to worry about purchasing, implementing or maintaining a rather complex telephony platform.




Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).