In this monthly column, an industry expert will
answer common questions about VoiceXML and related technologies.
Readers are encouraged to submit questions about VoiceXML,
including development, voice-user interface design,
and speech technology in general, or how VoiceXML is
being used commercially in the marketplace. If you have
a question about VoiceXML, e-mail it to
speak.and.listen@voicexmlreview.org
and be sure to read future issues of VoiceXML Review
for the answer.
By Matt Oshry
Q: I'm writing a stock quote application:
the user says a company or fund name,
my voice application recognizes it, and, using the
ticker symbol returned by the grammar, the application
looks up relevant data about the company or fund including
the current price from a back-end stock feed. The application
then plays that information back to the user.
Because the list of publicly traded companies and
funds is perpetually changing, my library of recorded
company and fund names doesn't always have the name
the user asked for, even though I'm able to retrieve
the relevant data from the stock feed. Do you have any
suggestions on how I can provide an acceptable user
experience when a recording of the company name isn't
available?
A:
If the data returned from your feed includes the company
or fund name, and your VoiceXML platform integrates
with a decent TTS engine, you could rely on the voice
browser to fall back on TTS when a recording of the
company or fund name is not available. In the following
example, let's make a few assumptions:
- The ticker symbol is stored in the variable 'ticker'.
- Each recorded company or fund name is stored in a file
  whose name matches the ticker (e.g. 'msft.wav').
- Using the ticker, you make an HTTP request (using
  <subdialog> in VoiceXML 2.0 or <data> in VoiceXML 2.1)
  to get back relevant data about the equity. See Section
  5 of the VoiceXML 2.1 specification for an example
  using <data>
  (http://www.w3.org/TR/2004/WD-voicexml21-20040728/#sec-data).
- After fetching the data associated with the ticker, you
  store the company or fund name in the variable
  'equityName'.
<audio expr="'equities/' + ticker + '.wav'"><value expr="equityName"/></audio>
Q:
What if the company or fund name is not available in
the data feed?
A:
If the company or fund name is unavailable in the data
feed, you could resort to playing back the individual
characters that make up the ticker symbol. First,
determine the number of characters in the longest
possible ticker symbol by reviewing a list of the
ticker symbols on the stock exchanges supported
by your application. The following example assumes
that the longest ticker symbol is four characters.
<var name="arrTicker" expr="GetTickerLettersArray(ticker, 4)"/>
<audio expr="'equities/' + ticker + '.wav'">
<!-- tts fallback -->
<value expr="arrTicker[0]"/>
<value expr="arrTicker[1]"/>
<value expr="arrTicker[2]"/>
<value expr="arrTicker[3]"/>
</audio>
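The maximum length assumed above does not have to be worked out by hand. As a minimal sketch, assuming you can load the supported ticker symbols into an array, a small helper can compute it. The function name 'getMaxTickerLength' and the sample symbols are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: returns the length of the longest ticker in a list.
function getMaxTickerLength(tickers) {
  var max = 0;
  for (var i = 0; i < tickers.length; i++) {
    if (tickers[i].length > max) {
      max = tickers[i].length;
    }
  }
  return max;
}

// Sample symbols for illustration only; a real list would come from
// the exchanges your application supports.
var sampleTickers = ["f", "ge", "ibm", "msft"];
var maxTickerLength = getMaxTickerLength(sampleTickers);
```

The result tells you how many <value> elements to write inside the <audio> element and what to pass as the second argument of 'GetTickerLettersArray'.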
The user-defined JavaScript function 'GetTickerLettersArray'
returns an array of the individual letters that make up the
ticker symbol. It takes two parameters:
- A string representing a ticker symbol
- The maximum number of characters expected by the calling
  function.
If the ticker symbol (e.g. "F" for "Ford Motor Company")
contains fewer letters than the number of <value> elements,
our function sets the additional array elements to the empty
string so that none of the <value> tags dereference a
non-existent element of the array. If we don't pad the array,
some platforms may play the string 'undefined' for each
non-existent element we attempt to access via the <value> tag.
Here's an implementation of 'GetTickerLettersArray':
function GetTickerLettersArray(ticker, max) {
  var arr = ticker.split("");
  // pad array so that there are max elements
  for (var i = arr.length; i < max; i++) {
    arr.push("");
  }
  return arr;
}
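To see why the padding matters, here is a small sketch that can be run in any ECMAScript interpreter. It uses the one-letter ticker "f" and the four-element assumption from the example above. The String() call only approximates how a platform might convert an expression to text for TTS; actual behavior varies by voice browser.

```javascript
// Without padding: index 1 does not exist in a one-letter ticker.
var unpadded = "f".split("");
var missing = String(unpadded[1]); // the text a platform might speak

// With padding: every index from 0 to 3 holds a string.
var padded = "f".split("");
while (padded.length < 4) {
  padded.push("");
}
var lastElement = String(padded[3]); // empty string, nothing to speak
```

Here 'missing' holds the string "undefined", while 'lastElement' holds the empty string.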
Let's take this a step further. Rather than relying solely on
the TTS engine to synthesize each character in the ticker
symbol, what if we were to create a recording for each of the
possible characters that can be used in a ticker symbol? Then
we can leverage VoiceXML's support for nested audio elements.
Instead of returning an array of characters, we revise our
user-defined function to return an array of objects where each
object consists of two properties: 'wav' and 'tts'. The 'wav'
property contains the path to the recording for the character;
the 'tts' property contains the character.
<var name="arrTicker" expr="GetTickerLettersArrayEx('letters/', ticker, 4)"/>
<audio expr="'equities/' + ticker + '.wav'">
  <!-- fall back on ticker if company/fund name recording is unavailable -->
  <audio expr="arrTicker[0].wav"><value expr="arrTicker[0].tts"/></audio>
  <audio expr="arrTicker[1].wav"><value expr="arrTicker[1].tts"/></audio>
  <audio expr="arrTicker[2].wav"><value expr="arrTicker[2].tts"/></audio>
  <audio expr="arrTicker[3].wav"><value expr="arrTicker[3].tts"/></audio>
</audio>
Here's an implementation of the revised 'GetTickerLettersArrayEx':
function GetTickerLettersArrayEx(baseUrl, ticker, max) {
  var arr = ticker.split("");
  for (var i = 0; i < arr.length; i++) {
    var ch = arr[i];
    // swap out the single character for an object that
    // references the recorded character and the tts
    arr[i] = {'wav' : baseUrl + ch + '.wav', 'tts' : ch};
  }
  // pad array so that there are max elements
  for (var j = arr.length; j < max; j++) {
    // reference an empty recording
    // good platforms will cache this
    arr.push({'wav' : baseUrl + 'blank.wav', 'tts' : ''});
  }
  return arr;
}
The baseUrl parameter provides your application with the
flexibility of specifying the location of the recorded
characters and the empty audio file, 'blank.wav'.
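When planning the recording session, it can help to know which files must exist under that location. The following is a rough sketch, assuming the list of supported tickers is available as an array. The helper name 'listRequiredRecordings' and the sample tickers are hypothetical; the 'letters/' path and 'blank.wav' come from the example above.

```javascript
// Hypothetical helper: lists the audio files needed so that every
// character used in the given tickers has a recording, plus the
// empty recording used for padding.
function listRequiredRecordings(baseUrl, tickers) {
  var seen = {};
  var files = [baseUrl + "blank.wav"];
  for (var i = 0; i < tickers.length; i++) {
    var chars = tickers[i].split("");
    for (var j = 0; j < chars.length; j++) {
      var ch = chars[j];
      // add each distinct character only once
      if (!Object.prototype.hasOwnProperty.call(seen, ch)) {
        seen[ch] = true;
        files.push(baseUrl + ch + ".wav");
      }
    }
  }
  return files.sort();
}

// Sample tickers for illustration only.
var required = listRequiredRecordings("letters/", ["msft", "f", "ge"]);
```

Each distinct character appears once in the result, so the list can be handed to your audio talent as a checklist.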
Using the TTS engine only as a last resort will help provide
a more seamless user experience. The latter approach, playing
back the recorded ticker letters, allows you to use the same
audio talent that you have chosen for the rest of your voice
application.