VoiceXML Review - Columns - Speak & Listen

Volume 4, Issue 4 - November / December 2004

In this monthly column, an industry expert will answer common questions about VoiceXML and related technologies. Readers are encouraged to submit questions about VoiceXML, including development, voice-user interface design, and speech technology in general, or how VoiceXML is being used commercially in the marketplace. If you have a question about VoiceXML, e-mail it to speak.and.listen@voicexmlreview.org and be sure to read future issues of VoiceXML Review for the answer.

By Matt Oshry

Q: I'm writing a stock quote application: the user says a company or fund name, my voice application recognizes it, and, using the ticker symbol returned by the grammar, the application looks up relevant data about the company or fund including the current price from a back-end stock feed. The application then plays that information back to the user.

Because the list of publicly traded companies and funds is perpetually changing, my library of recorded company and fund names doesn't always have the name the user asked for even though I'm able to retrieve the relevant data from the stock feed. Do you have any suggestions on how I can provide an acceptable user experience when the company name isn't available?

A: If the data returned from your feed includes the company or fund name, and your VoiceXML platform integrates with a decent TTS engine, you can rely on the voice browser to fall back on TTS when a recording of the company or fund name is not available. Reference the recording from an <audio> element and nest a <value> element containing the name inside it; the browser renders the nested content only if the recording cannot be played. Let's make a few assumptions:

The ticker symbol is stored in the variable 'ticker'.
Each recorded company or fund name is stored in a file named after its ticker (e.g. 'msft.wav').
Using the ticker, you make an HTTP request (using <subdialog> in VoiceXML 2.0 or <data> in VoiceXML 2.1) to get back relevant data about the equity. See Section 5 of the VoiceXML 2.1 specification for an example using <data> (http://www.w3.org/TR/2004/WD-voicexml21-20040728/#sec-data).
After fetching the data associated with the ticker, you store the company or fund name in the variable 'equityName'.
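
As a rough sketch of how these assumptions fit together, the helper below pairs the recording path with the text the TTS engine would speak if the recording is missing. The function name 'GetEquityPrompt', the 'equities/' directory, and the sample values are assumptions made for illustration; they do not come from any platform or specification.

```javascript
// Hypothetical helper, shown for illustration only.
// Assumes recordings live under baseUrl and are named after the ticker.
function GetEquityPrompt(baseUrl, ticker, equityName) {
  return {
    'wav': baseUrl + ticker + '.wav', // recording to try first
    'tts': equityName                 // text for the TTS fallback
  };
}

// Sample values; in the application these come from the grammar and the feed.
var equityPrompt = GetEquityPrompt('equities/', 'msft', 'Microsoft Corporation');
// equityPrompt.wav is 'equities/msft.wav'
// equityPrompt.tts is 'Microsoft Corporation'
```

In the VoiceXML document, the expr attribute of the <audio> element would reference equityPrompt.wav, and a nested <value> element would render equityPrompt.tts.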

Q: What if the company or fund name is not available in the data feed?

A: If the company or fund name is unavailable in the data feed, you could resort to playing back the individual characters that make up the ticker symbol. First, determine the number of characters in the longest possible ticker symbol by reviewing a list of the ticker symbols on the stock exchanges supported by your application. The following example assumes that the longest ticker symbol is four characters.

<var name="arrTicker" expr="GetTickerLettersArray(ticker, 4)"/>
<audio expr="'equities/' + ticker + '.wav'">
  <!-- tts fallback -->
  <value expr="arrTicker[0]"/>
  <value expr="arrTicker[1]"/>
  <value expr="arrTicker[2]"/>
  <value expr="arrTicker[3]"/>
</audio>
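
The value 4 passed as the second argument comes from reviewing the supported ticker symbols. If that list is available to your scripts, the review could be automated along these lines. This is only a sketch: 'GetMaxTickerLength' and the sample list are hypothetical, and because the <value> elements are written out by hand, their count still has to match the result.

```javascript
// Hypothetical helper, shown for illustration only.
// Returns the length of the longest ticker symbol in the list.
function GetMaxTickerLength(tickers) {
  var max = 0;
  for (var i = 0; i < tickers.length; i++) {
    if (tickers[i].length > max) {
      max = tickers[i].length;
    }
  }
  return max;
}

// Sample list for illustration; a real application would review every
// symbol on the exchanges it supports.
var sampleTickers = ['F', 'GE', 'IBM', 'MSFT'];
var maxLetters = GetMaxTickerLength(sampleTickers);
// maxLetters is 4
```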

The user-defined JavaScript function 'GetTickerLettersArray' returns an array of the individual letters that make up the ticker symbol. It takes two parameters:

A string representing a ticker symbol
The maximum number of characters expected by the calling function.

If the ticker symbol (e.g. "F" for "Ford Motor Company") contains fewer letters than the number of <value> elements, our function sets the additional array elements to the empty string so that none of the value tags dereference a non-existent element of the array. If we don't pad the array, some platforms may play the string 'undefined' for each non-existent element we attempt to access via the <value> tag.

Here's an implementation of 'GetTickerLettersArray':

function GetTickerLettersArray(ticker, max) {
  // split the ticker into its individual characters
  var arr = ticker.split("");
  // pad array so that there are max elements
  while (arr.length < max) {
    arr.push("");
  }
  return arr;
}
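
To see the padding at work, the sketch below calls the function with a one-letter and a four-letter ticker. It repeats the function, written with an equivalent padding loop, so that the snippet runs on its own; the variable names are chosen for illustration.

```javascript
// Copy of the function so that this snippet is self-contained.
function GetTickerLettersArray(ticker, max) {
  var arr = ticker.split("");
  // pad array so that there are max elements
  while (arr.length < max) {
    arr.push("");
  }
  return arr;
}

var ford = GetTickerLettersArray("F", 4);
var msft = GetTickerLettersArray("MSFT", 4);
// ford is ["F", "", "", ""]
// msft is ["M", "S", "F", "T"]

// Without padding, reading past the end of the array yields undefined,
// which is what some platforms would speak aloud.
var unpadded = "F".split("");
// typeof unpadded[1] is "undefined"
```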

Let's take this a step further. Rather than relying solely on the TTS engine to synthesize each character in the ticker symbol, what if we created a recording for each character that can appear in a ticker symbol? We could then leverage VoiceXML's support for nested audio elements. Instead of returning an array of characters, we revise our user-defined function to return an array of objects, each with two properties: 'wav' and 'tts'. The 'wav' property contains the path to the recording for the character; the 'tts' property contains the character.

<var name="arrTicker" expr="GetTickerLettersArrayEx('letters/', ticker, 4)"/>
<audio expr="'equities/' + ticker + '.wav'">
  <!-- fall back on ticker if company/fund name recording is unavailable -->
  <audio expr="arrTicker[0].wav"><value expr="arrTicker[0].tts"/></audio>
  <audio expr="arrTicker[1].wav"><value expr="arrTicker[1].tts"/></audio>
  <audio expr="arrTicker[2].wav"><value expr="arrTicker[2].tts"/></audio>
  <audio expr="arrTicker[3].wav"><value expr="arrTicker[3].tts"/></audio>
</audio>

Here's an implementation of the revised 'GetTickerLettersArrayEx':

function GetTickerLettersArrayEx(baseUrl, ticker, max) {
  var arr = ticker.split("");
  for (var i = 0; i < arr.length; i++) {
    var ch = arr[i];
    // swap out the single character for an object that
    // references the recorded character and the tts
    arr[i] = {'wav' : baseUrl + ch + '.wav', 'tts' : ch};
  }

  // pad array so that there are max elements
  while (arr.length < max) {
    // reference an empty recording
    // good platforms will cache this
    arr.push({'wav' : baseUrl + 'blank.wav', 'tts' : ''});
  }
  return arr;
}

The baseUrl parameter provides your application with the flexibility of specifying the location of the recorded characters and the empty audio file, 'blank.wav'.
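
As an illustration of the baseUrl parameter, the sketch below points the function at a different directory and inspects the result for a two-letter ticker. The function is repeated, with an equivalent padding loop, so that the snippet runs on its own; the 'prompts/letters/' path is a made-up location, not one your platform requires.

```javascript
// Copy of the function so that this snippet is self-contained.
function GetTickerLettersArrayEx(baseUrl, ticker, max) {
  var arr = ticker.split("");
  for (var i = 0; i < arr.length; i++) {
    var ch = arr[i];
    arr[i] = {'wav' : baseUrl + ch + '.wav', 'tts' : ch};
  }
  // pad with references to the empty recording
  while (arr.length < max) {
    arr.push({'wav' : baseUrl + 'blank.wav', 'tts' : ''});
  }
  return arr;
}

// 'prompts/letters/' is a hypothetical directory used for illustration.
var letters = GetTickerLettersArrayEx('prompts/letters/', 'GE', 4);
// letters[0] is {wav: 'prompts/letters/G.wav', tts: 'G'}
// letters[1] is {wav: 'prompts/letters/E.wav', tts: 'E'}
// letters[2] and letters[3] are {wav: 'prompts/letters/blank.wav', tts: ''}
```

Because the padded entries all reference the same 'blank.wav', moving the recordings only requires changing the first argument.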

Treating the TTS engine as a last resort, and playing recordings whenever possible, helps provide a more seamless user experience. The latter approach, playing back the recorded ticker letters, allows you to use the same audio talent that you have chosen for the rest of your voice application.
