Volume 1, Issue 4 - April 2001
   
   
 

Answers to Your Questions About VoiceXML

By Jeff Kunins

(Continued from Part 1)

Q: I'm interested in building my own VoiceXML platform. What are the requirements? Can I buy all the components separately and then put them together? What skills will I need, and how much budget should I allocate?

A: It is certainly possible to build your own VoiceXML interpreter and commercial VoiceXML platform. There are at least 10 commercially available VoiceXML platforms today. These platforms cover a variety of deployment options:

  • Outsourced. Vendors providing outsourced VoiceXML solutions allow customers to simply focus on writing VoiceXML applications. Thanks to the distributed, Web-based nature of VoiceXML, these vendors are able to run large-scale, reliable, scalable, and secure deployments of VoiceXML infrastructure on behalf of their customers. Customers' applications live on their own Web servers, and the remote infrastructure simply "browses" to the application when calls come in. The benefit of this solution is that customers retain full control of their applications and data, without having to purchase or manage speech- and telephony-specific hardware and software.

  • On-premises, "Turnkey". Vendors providing on-premises, "turnkey" solutions sell a complete hardware/software package to their customers for VoiceXML development. These systems are designed to include all the requisite hardware and software for delivering VoiceXML applications--voice recognition engine, text-to-speech engine, VoiceXML interpreter, telephony control cards, operating system software, etc. Customers that purchase and install these solutions then connect them to telecom capacity and Internet connectivity which they've provisioned.

  • On-premises, "Software". Many vendors sell various pieces of the overall stack required to deliver a comprehensive VoiceXML platform solution. Platform developers looking to create a custom solution purchase some, build others, and link them together using customized software and hardware.

The skills involved in developing these solutions include expertise in scalable server programming, telephony programming, speech science, databases, XML, and both Web servers and Web browsers.

It is difficult to say how much budget is required to develop a custom VoiceXML platform; the figure can vary dramatically (by several orders of magnitude) depending on the scalability, robustness, and overall level of functionality desired, as well as how much custom development of underlying components you intend to undertake. It is fair to say that most companies offering complete outsourced or on-premises "turnkey" VoiceXML solutions have invested tens of millions of dollars and at least a dozen person-years of effort building and honing their commercial-grade products.

Q: My VoiceXML platform consistently complains that my documents must start with "<?xml version='1.0'?>". My documents do begin this way... what is the problem?

A: This is a common problem that VoiceXML developers encounter. To be considered well-formed XML, a document that includes the XML declaration must begin with it: the declaration has to be the very first thing in the document, and no blank lines, spaces, or other characters may precede it. Be sure to remove any such leading whitespace, and always check your VoiceXML documents in an XML-compatible Web browser (such as Microsoft Internet Explorer 5.x) before running them on your VoiceXML platform. This will help ensure that your Web server is delivering syntactically well-formed XML before you delve into any VoiceXML-specific bugs or errors your documents may contain.
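The same check can also be scripted rather than done by eye in a browser. The following is a minimal sketch in Python, using only the standard library, that reports whether a document starts with the XML declaration and whether it parses as well-formed XML. The function name check_document and the wording of its messages are illustrative assumptions, not part of any VoiceXML platform.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def check_document(raw: bytes) -> str:
    """Return 'ok', or a short description of the first problem found.

    check_document is a hypothetical helper written for this example.
    """
    # The declaration must be the very first bytes of the document.
    if not raw.startswith(b"<?xml"):
        if raw.lstrip().startswith(b"<?xml"):
            return "whitespace before the XML declaration"
        return "no XML declaration at the start of the document"
    # The declaration is in place; now confirm the rest is well-formed.
    try:
        xml.dom.minidom.parseString(raw)
    except ExpatError as exc:
        return f"not well-formed: {exc}"
    return "ok"
```

To test what your Web server actually delivers, pass this function the raw bytes of the HTTP response body rather than the file on disk, since server-side scripts can introduce leading blank lines that the source file does not show.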

Q: Is it possible to send voice alerts like WAP alerts? How?

A: Yes. Several companies today (including many of the VoiceXML Forum's member companies) offer voice notification services that support the VoiceXML standard. These services typically operate by allowing application developers to submit a request using a standard HTTP POST (or even by sending an email). The request contains information such as the phone number to call, when to place the call, and the URL of a VoiceXML application to trigger when the call is made. To protect security and privacy, some form of authentication and encryption--such as a company identifier, a password, and SSL or S/MIME--is also typically required.
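As a rough illustration of such a request, the sketch below builds (but does not send) an HTTP POST in Python. Every specific here is an assumption made for the example: the endpoint URL, the form field names, and the function build_notification_request are hypothetical, and each real notification service defines its own interface.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_notification_request(endpoint, phone_number, call_time, vxml_url,
                               company_id, password):
    """Build (but do not send) a POST asking a service to place a call.

    All field names below are hypothetical; consult your provider's
    documentation for the names it actually expects.
    """
    # Credentials travel in the request body, so insist on SSL.
    if not endpoint.startswith("https://"):
        raise ValueError("notification endpoint must use https")
    fields = {
        "phone_number": phone_number,  # number to call
        "call_time": call_time,        # when to place the call
        "vxml_url": vxml_url,          # VoiceXML application to run
        "company_id": company_id,      # account identifier
        "password": password,
    }
    body = urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(endpoint, data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would submit the request; the service would then place the call at the requested time and fetch the VoiceXML document from the given URL once the call is answered.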

Anyone who operates a VoiceXML platform can theoretically build a notification service that provides this functionality.

VoiceXML itself does not specify a way to deliver voice notifications, just as HTML and WML do not. VoiceXML 1.0 simply defines the interface for voice applications; how the phone call that initiates the conversation gets generated is a separate issue. That said, it is quite conceivable that a separate, widely accepted standard will emerge in the future that specifies how to use HTTP, email, and other open Internet protocols to trigger alerts across all devices, including voice phones, PDAs, and other wireless devices.

 


 

Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).