VoiceXML Review - Columns

Volume 1, Issue 10 - November 2001

The Record Tag

By Rob Marchand

(Continued from Part 1)

About VoiceXML 2.0

VoiceXML 2.0 (if you haven't heard about it, get out from under that rock and check it out at http://www.w3.org/voice) adds a few features to the <record> tag:

dest - This attribute is a URI to which the recorded data will be saved (a file, for example) or posted (a CGI).

There are also two new shadow variables:

name$.maxtime - This Boolean variable will be set to true if the maximum recording time was exceeded. We're using this in our second sample above.
name$.dest - This is a URI referencing the final destination of the recording. If the dest attribute was specified, this shadow variable will receive the actual destination of the data. This accounts for any redirections, etc., made by the data storage mechanism.

Now what do I do with it?

In order to do anything interesting with recorded data, you probably want to save it on your web or application server. You can do this with any server-side technology with which you're comfortable (see the June First Words column for an introduction to this).

The recorded data will be available either as a CGI parameter variable or, depending upon your server-side configuration and the method of encoding the data in the submit tag, as an uploaded file. You'll have to check your server-side documentation for this.

You'll also want to check your VoiceXML platform vendor documentation to verify the default <submit> encoding type. It will likely be either form-url-encoded (application/x-www-form-urlencoded) or multipart/form-data. In the former case, your data will be available in the same way as a regular CGI variable, and in the latter case, it will probably be available as an uploaded file.
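
If you'd rather not rely on documentation alone, you can have the server look at the Content-Type of the incoming request. The following is a minimal Java sketch of that check, not a complete handler. The class and method names are made up for illustration, and in a servlet the argument would typically come from request.getContentType().

```java
import java.util.Locale;

/** Sketch: decide how recorded audio will arrive, based on the request's Content-Type. */
class SubmitEncoding {

    // Hypothetical helper: returns "variable", "upload", or "unknown".
    static String classify(String contentType) {
        if (contentType == null) {
            return "unknown";
        }
        // Drop parameters such as "; boundary=..." and normalise the case.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        if (mediaType.equals("application/x-www-form-urlencoded")) {
            return "variable";   // read it like a regular CGI variable
        }
        if (mediaType.equals("multipart/form-data")) {
            return "upload";     // will probably arrive as an uploaded file
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("multipart/form-data; boundary=abc123"));
        System.out.println(classify("application/x-www-form-urlencoded"));
    }
}
```

Logging this value for a test call is a quick way to confirm what your platform actually sends.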

Here is a simple Perl script to receive the data submitted by example 2, and write it to the (web server local) file /tmp/foo.vox.

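#!/usr/local/bin/perl -w

use strict;
use CGI;

my $q = CGI->new;

# The raw audio data arrives as an ordinary CGI parameter.
my $audiodata = $q->param('recorded_message');

# Write the data to a file local to the web server.
open(my $fh, '>', '/tmp/foo.vox') or die "Cannot open /tmp/foo.vox: $!";
binmode($fh);    # the recording is binary data
print $fh $audiodata;
close($fh);

# Now dump out a thank you page. The XML declaration must be the
# very first thing in the document, so there is no newline before it.
print $q->header;
print qq(<?xml version="1.0"?>
<vxml version="1.0">
<form>
     <block>
          Thanks! Bye.
     </block>
</form>
</vxml>
);

exit;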

See the June First Words for other server-side examples.

Painful Things

Here are some things to watch out for on the server side when dealing with recorded data.

General

Long recordings might take a while to submit; try this out with your platform. As they say, your mileage may vary;
When using multipart/form-data, it helps to know whether your VoiceXML Gateway vendor sends the recording with only a 'name' parameter (variable) or with a 'filename' parameter as well (uploaded file) in the Content-Disposition header of that part (see RFC 2388 for the gory details); this will determine how it shows up in your server-side application;
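
To make the 'name' versus 'filename' distinction concrete, here is a rough Java sketch that inspects the Content-Disposition header of one multipart part. The class name, method names, and sample header values are invented for illustration. The patterns only handle the simple quoted form, so treat this as a diagnostic aid and not as a multipart parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: inspect the Content-Disposition header of a single multipart part. */
class PartHeader {

    // Simplified patterns: quoted values only, no escaped quotes.
    private static final Pattern NAME =
        Pattern.compile(";\\s*name\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern FILENAME =
        Pattern.compile(";\\s*filename\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    private static String find(Pattern pattern, String header) {
        Matcher m = pattern.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    /** The form field name, or null if none was found. */
    static String name(String contentDisposition) {
        return find(NAME, contentDisposition);
    }

    /** The filename parameter, or null if the part carries none. */
    static String filename(String contentDisposition) {
        return find(FILENAME, contentDisposition);
    }

    /** True when the part would normally be treated as an uploaded file. */
    static boolean looksLikeUpload(String contentDisposition) {
        return filename(contentDisposition) != null;
    }

    public static void main(String[] args) {
        // Hypothetical header values, as two different gateways might send them.
        String asVariable = "form-data; name=\"recorded_message\"";
        String asUpload = "form-data; name=\"recorded_message\"; filename=\"msg.wav\"";
        System.out.println(looksLikeUpload(asVariable));
        System.out.println(looksLikeUpload(asUpload));
    }
}
```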

PHP

If you're using PHP, you'll have to use the file upload version (i.e., multipart/form-data encoding), as PHP can't deal with binary data in form-url-encoded information; note that the multipart/form-data version relies upon the filename header mentioned above as well, and will treat the data as an uploaded file;

Perl

I'm not aware of any problems when working with Perl and recorded data; you should be able to use either encoding method and to access the data as a variable or as a file;
You'll probably want to use CGI.pm or some similar package to make your life a little easier;

Java

You can get the recorded data as a variable (this servlet code fragment was kindly provided by Neil Martin of VnP Software):
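// request is the servlet's HttpServletRequest; file is the java.io.File to write to.
// This assumes the container decoded the parameter as ISO-8859-1 (one char per byte).
String paramString = request.getParameter("recorded_message");
byte[] bytes = paramString.getBytes("ISO-8859-1");
OutputStream writer = new BufferedOutputStream(new FileOutputStream(file));
try {
    writer.write(bytes);
} finally {
    writer.close();   // close the file even if the write fails
}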
For multipart/form-data, you'll need a third-party class. You might want to try the com.oreilly.servlet package.
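
The getBytes("ISO-8859-1") call in the servlet fragment matters because ISO-8859-1 maps each of the 256 byte values to exactly one character, so binary data survives the trip through a String. The standalone sketch below (class and method names are illustrative only) shows the round trip succeeding with ISO-8859-1 and failing with UTF-8, where invalid byte sequences are replaced during decoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Sketch: which charsets let arbitrary bytes survive a round trip through a String? */
class Latin1RoundTrip {

    /** Every possible byte value, standing in for recorded audio. */
    static byte[] allByteValues() {
        byte[] bytes = new byte[256];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) i;
        }
        return bytes;
    }

    /** Decodes the bytes to a String and encodes them again, then compares. */
    static boolean survives(byte[] original, Charset charset) {
        String asText = new String(original, charset);
        return Arrays.equals(original, asText.getBytes(charset));
    }

    public static void main(String[] args) {
        byte[] audio = allByteValues();
        System.out.println(survives(audio, StandardCharsets.ISO_8859_1)); // prints "true"
        System.out.println(survives(audio, StandardCharsets.UTF_8));      // prints "false"
    }
}
```

If your servlet container decodes request parameters with some other character set, the fragment above may quietly corrupt the recording, so it is worth comparing the saved file size with what you expect.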

ASP

Using the Request.Form('variable') form of referencing a variable in ASP will fail for contents (of a single form variable) larger than about 102,399 bytes. This is documented by Microsoft in Knowledge Base article Q273482;
You can use the Request.BinaryRead method to retrieve the actual binary data for the entire request. However, you must then parse the form-url-encoded data from the binary data (i.e., pick out variable names, and values, and then decode the values). The result from this would be the raw audio data that was recorded. This could then be written to a file.
The best solution is to get a third party object to do all this dirty work. I've had success with a free one: http://www.codeproject.com/asp/aspupl.asp#xx55436xx
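
For the curious, the parsing step described for Request.BinaryRead looks roughly like the sketch below. It is written in Java to match the servlet example earlier, not in ASP, and it illustrates the decoding logic only. The class name and the sample body are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: split a form-url-encoded request body into names and raw byte values. */
class UrlEncodedBody {

    /** Decodes one encoded token: %XX becomes a byte, '+' becomes a space. */
    static byte[] decode(String token) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c == '%') {
                if (i + 2 >= token.length()) {
                    throw new IllegalArgumentException("truncated escape at index " + i);
                }
                int hi = Character.digit(token.charAt(i + 1), 16);
                int lo = Character.digit(token.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("bad escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 2;   // skip the two hex digits just consumed
            } else if (c == '+') {
                out.write(' ');
            } else {
                out.write(c);   // unreserved characters pass through unchanged
            }
        }
        return out.toByteArray();
    }

    /** Picks out each name=value pair and decodes the value to raw bytes. */
    static Map<String, byte[]> parse(byte[] body) {
        // ISO-8859-1 maps every byte to one char, so nothing is lost here.
        String text = new String(body, StandardCharsets.ISO_8859_1);
        Map<String, byte[]> fields = new LinkedHashMap<>();
        for (String pair : text.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            String name = new String(decode(rawName), StandardCharsets.ISO_8859_1);
            fields.put(name, decode(rawValue));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical request body with a few binary bytes in the recording.
        byte[] body = "caller=rob&recorded_message=%00%FF%80A+B"
            .getBytes(StandardCharsets.ISO_8859_1);
        byte[] audio = parse(body).get("recorded_message");
        System.out.println(audio.length + " bytes of audio");
    }
}
```

The byte array returned for recorded_message is the raw audio, ready to be written to a file.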

I hope I haven't offended anyone by not addressing their favorite server-side tools; hopefully this article will get you started with whatever you might use.

We've had a pretty close look at the <record> tag, and hopefully pointed out some of the pitfalls to watch out for.

What's Next?

Next month, we'll look at some of the changes in VoiceXML 2.0 (in particular, how they affect what we've done in the past), and tidy up some other odds and ends.

If there are particular things you'd like to see covered in a future First Words column, drop me a line at rob@voicegenie.com, and I'll try to cover it.

Watch future issues of VoiceXML Review for more articles about getting started with VoiceXML. Here at the VoiceXML Review, our thoughts are with the families and victims of September 11th.
