voice-push.com

Pushing VoiceXML to the masses!



DTMF vs. ASR

Poor old DTMF - treated like the poor relation of the more user-friendly ASR! Well, don’t write DTMF off so quickly - it is often a much better choice than ASR. Unfortunately it has a bad rap because of so many "IVR Hell" applications. However, most of the hell was caused by bad user interface design rather than by having to use DTMF keys.

DTMF applications

Don’t get carried away with the speech recognition hype (which has only been going on for like the last decade!) - there’s still plenty of room for DTMF applications. There is also plenty of room to make DTMF applications better, so that callers have less reason to complain about ‘IVR hell’.

How complicated is the application? If it’s very simple then it may be a waste of resources to use ASR. ASR licences cost money - and if you’re using barge-in then you’re using one ASR licence per call. If you have a simple menu with four items in it, then using ASR is probably overkill. If you want someone to enter the name of a city, then ASR is the better choice. ASR is necessary when the number of input possibilities is high, such as an address book or a list of destinations - DTMF is not suited to these tasks.

Another issue is passwords. People don’t like saying passwords out loud. Letting someone enter their PIN code using DTMF rather than saying it out loud is sometimes the better option. Admittedly this doesn’t work if the password is a word and not a PIN code. The alternative is to use speaker verification - where the caller is asked to repeat a random phrase, and they are recognised on the basis of their voice.

Any application that is likely to be used in a noisy environment should probably use DTMF, or at least have DTMF fallback. There’s nothing like heavy-duty background noise to ruin a speech recogniser’s day.
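The rules of thumb above can be boiled down to a small decision function. This is only a rough sketch: the `InputTask` fields, the `SMALL_MENU_LIMIT` cut-off and the returned labels are all made up for illustration, and a real project would weigh these factors with more care.

```python
from dataclasses import dataclass


@dataclass
class InputTask:
    """One piece of input the application needs from the caller (hypothetical)."""
    num_choices: int                 # roughly how many distinct answers are possible
    is_secret: bool = False          # e.g. a PIN the caller won't want to say aloud
    is_numeric: bool = False         # can the answer be keyed in as digits?
    noisy_environment: bool = False  # is the caller likely to be somewhere loud?


# Assumed cut-off: a menu this small fits comfortably on the keypad.
SMALL_MENU_LIMIT = 9


def choose_input_mode(task: InputTask) -> str:
    """Return "dtmf", "asr" or "asr+dtmf-fallback" for a single input task."""
    # Secrets that can be keyed in should be keyed in, not spoken.
    if task.is_secret and task.is_numeric:
        return "dtmf"
    # A small menu rarely justifies an ASR licence.
    if task.num_choices <= SMALL_MENU_LIMIT:
        return "dtmf"
    # Large open sets (cities, address books) need ASR.
    # In noisy conditions, at least keep DTMF available as a fallback.
    if task.noisy_environment:
        return "asr+dtmf-fallback"
    return "asr"
```

A four-item menu comes out as DTMF, a city name as ASR. A secret word (rather than a PIN) falls through to ASR here, which is exactly the case where speaker verification is worth a look.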

DTMF fallback

Sometimes it makes sense to offer DTMF fallback. If there is a lot of background noise and the recognition rate is poor, then it may make sense to let the caller continue with DTMF, which has better recognition rates ;-)

Whatever you do, please don’t write applications with prompts like this:

For accounts say ‘accounts’ or press 1.

Which is only marginally better than:

Press 1 for accounts or say ‘one’.

There should be a law against that kind of thing.

It is possible to use DTMF and ASR in the same application - it’s just that normally they aren’t used at the same time. It’s one or the other. So you need to let your caller know that they can switch to DTMF (maybe by pressing any key) and tell them how they can get back to ASR if they want to. You may only need to impart this information once the caller has had 3 unrecognised phrases in a row, which suggests that recognition is suffering from high background noise.
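The switching logic might look something like the sketch below. It is platform-independent Python rather than VoiceXML, and everything in it is an assumption for illustration: the class name, the star and hash keys, and the prompt wording. The only thing taken from the text is the idea of offering DTMF after three unrecognised phrases in a row.

```python
from typing import Optional

SWITCH_TO_DTMF_KEY = "*"  # hypothetical key for switching to the keypad
BACK_TO_ASR_KEY = "#"     # hypothetical key for going back to speech


class InputModeTracker:
    """Counts consecutive recognition failures and handles mode switches."""

    def __init__(self, max_misses: int = 3) -> None:
        self.max_misses = max_misses
        self.mode = "asr"
        self.consecutive_misses = 0

    def record_asr_result(self, recognised: bool) -> Optional[str]:
        """Record one ASR attempt; return a prompt to play, or None."""
        if self.mode != "asr":
            return None
        if recognised:
            self.consecutive_misses = 0
            return None
        self.consecutive_misses += 1
        if self.consecutive_misses >= self.max_misses:
            # Start counting again so the offer is not repeated on every miss.
            self.consecutive_misses = 0
            return "If it's noisy where you are, press star to use your keypad."
        return None

    def record_key(self, key: str) -> Optional[str]:
        """Handle a key press; return a prompt if the mode changed."""
        if self.mode == "asr" and key == SWITCH_TO_DTMF_KEY:
            self.mode = "dtmf"
            self.consecutive_misses = 0
            return "Keypad it is. To go back to speaking, press hash."
        if self.mode == "dtmf" and key == BACK_TO_ASR_KEY:
            self.mode = "asr"
            return "OK, you can speak again."
        return None
```

Note that the caller only hears about the keypad once recognition has actually gone wrong, and the mode never changes behind their back: they have to press the key themselves.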

For some applications it doesn’t make sense to have DTMF fallback - these are the real ASR applications, which allow you to say “I’d like to fly to Paris tomorrow evening at 8 o’clock”. Try entering that using DTMF!









© voice-push.com