voice-push.com

Pushing VoiceXML to the masses!



VoiceXML 2.0 & 2.1

VoiceXML 2.0

This is the version that developers are most familiar with. It did a good job of clearing up many of the ambiguities in 1.0, which had allowed platforms to behave quite differently while still conforming to the standard. Furthermore, it really brought the language to maturity, allowing it to become a true standard with widespread acceptance. Version 2.1 builds on it, and work on 3.0 is already in the pipeline.

VoiceXML 2.1

Ok, the best thing about 2.1 has to be the <data> tag - or at least that’s my opinion. In effect it's AJAX for VoiceXML. With 2.0, if you wanted to interact with the server you had to leave the VoiceXML page that you were in, and the only interaction that you had with the caller was playing an audio prompt while the next page was being fetched. The <data> tag in 2.1 lets you get around this by fetching XML from the server while your VoiceXML page stays loaded in the VoiceXML browser, so its variables and dialog state remain intact. One caveat on the AJAX comparison: as specified, the fetch blocks execution until the reply arrives or fetchtimeout expires, so if the tasks to be carried out in the back-end require a significant amount of time (anything over about 10 seconds - i.e. before the caller gets restless) you still need fetchaudio to cover the wait. So basically, you can obtain information from the caller, send it to the server, read the response from ECMAScript and carry on the interaction with the caller, all within one page. Pretty cool.
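
As a rough illustration of the <data> tag, here is a minimal sketch of a 2.1 page that collects a value, sends it to the server and reads the reply without leaving the document. The URL balance.xml, the audio file please_wait.wav, the field and variable names and the shape of the returned XML (a <balance> element) are all assumptions made up for this example, and the built-in digits grammar is assumed to be available on your platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="check_balance">
    <!-- Collect an account number from the caller -->
    <field name="account" type="digits">
      <prompt>Please say or key in your account number.</prompt>
    </field>

    <block>
      <!-- Hypothetical URL, assumed to return XML shaped like
           <result><balance>42.50</balance></result>.
           fetchaudio is played while the browser waits for the reply. -->
      <data name="reply" src="balance.xml" namelist="account"
            fetchaudio="please_wait.wav" fetchtimeout="10s"/>

      <!-- The reply is exposed to ECMAScript as a read-only DOM object -->
      <var name="balance"
           expr="reply.documentElement.getElementsByTagName('balance').item(0).firstChild.data"/>

      <prompt>Your balance is <value expr="balance"/>.</prompt>
    </block>
  </form>
</vxml>
```

If the fetch fails or times out, the browser throws error.badfetch, so a real application would normally add a <catch> handler for it.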

VoiceXML 2.1 has 8 changes:

  • srcexpr can now be used with <grammar> - i.e. the grammar to load can be chosen dynamically at runtime.
  • srcexpr can now be used with <script> - i.e. the script to load can be chosen dynamically at runtime.
  • The <mark> tag can now be used to determine where a caller barged in. You 'mark' parts of the prompts and then, when the caller barges in, you can determine at what point in the prompt they were. Useful if you're listing options and the caller just needs to say "That one!" when they hear the one that they want.
  • There is a new <data> tag which allows the application to send a query to the server and receive information back without having to change documents.
  • There is a new <foreach> tag which iterates through ECMAScript arrays, e.g. of prompts and their properties - useful for playing lists of options or content. Even better in conjunction with <mark>!
  • You can now record user utterances during recognition (the recordutterance property). This could be used for off-line speaker verification or for recording utterances that cause nomatch errors so that they can be analysed as part of the tuning process.
  • A namelist has been added to <disconnect>.
  • The <transfer> tag has been modified to have a type, which can be “blind”, “bridge” or “consultation”.
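
To make a few of these changes more concrete, here is a small sketch that combines a dynamically referenced grammar (srcexpr), <foreach> with <mark>, and utterance recording. It is only an illustration under stated assumptions: the films array, the lang variable, the grammar path and the upload URL save_choice.cgi are hypothetical names invented for this example, and the grammar file is assumed to accept the phrase "that one".

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Keep a recording of what the caller says during recognition -->
  <property name="recordutterance" value="true"/>

  <!-- Hypothetical option list; in practice it might be built from <data> -->
  <script><![CDATA[
    var films = [
      { id: "film_1", title: "Metropolis" },
      { id: "film_2", title: "Casablanca" },
      { id: "film_3", title: "Vertigo" }
    ];
  ]]></script>
  <var name="lang" expr="'en'"/>

  <form id="pick_film">
    <field name="choice">
      <!-- srcexpr: the grammar URI is computed at runtime (hypothetical path) -->
      <grammar type="application/srgs+xml"
               srcexpr="'grammars/that_one_' + lang + '.grxml'"/>

      <prompt bargein="true">
        Say "that one" when you hear the film you want.
        <!-- One mark per option, so a barge-in can be traced to an option -->
        <foreach item="film" array="films">
          <mark nameexpr="film.id"/>
          <value expr="film.title"/>
          <break time="500ms"/>
        </foreach>
      </prompt>

      <filled>
        <!-- markname: last mark reached before the caller barged in;
             marktime: milliseconds elapsed since that mark -->
        <var name="picked" expr="application.lastresult$.markname"/>
        <var name="offset" expr="application.lastresult$.marktime"/>
        <!-- The recorded utterance, available because recordutterance is true -->
        <var name="utterance" expr="application.lastresult$.recording"/>

        <!-- Audio must be posted as multipart/form-data (hypothetical URL) -->
        <submit next="save_choice.cgi" method="post"
                enctype="multipart/form-data"
                namelist="picked offset utterance"/>
      </filled>
    </field>
  </form>
</vxml>
```

Because the mark names reuse the option ids, the server can work out which film was playing when the caller spoke. A short marktime suggests the caller may have been reacting to the previous option, so treat the result as a hint rather than a certainty.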


© voice-push.com