Recording Prompts
Record more rather than less
Make sure to record all possible variations of your prompts. If you're not sure about the exact wording of a prompt, record every variation you can think of. If a product can be referred to in different ways, record all of them. It's cheaper to do that than to book another recording session. Be sure to record all snippets as both open and closed. Remember, the customer may change their mind about the call flow and the menu order. If you've recorded flexible prompts using snippets, you may be able to avoid a trip back to the recording studio.
Formats
Generally speaking, a good recording studio will record a voice in CD quality - i.e. 44.1 kHz, 16 bit stereo. However, there's not much point playing this kind of quality across a telephone line, where the top frequency is about 3.4 kHz, the sampling frequency is 8 kHz and the number of bits is 8. Having said that, always get your recordings in the original quality as well as in your desired format. If the sound studio has got things wrong, you can use the originals to create a better version. A program like CoolEdit (ok, it has a new name, but I still think of it as CoolEdit '96) should handle most of your needs. Don't let the studio run compressors and the like over the file - they aren't really necessary.
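To put rough numbers on the gap between studio and telephone quality, here is a small sketch of the uncompressed bit rates involved. The function name `pcm_bit_rate` is just an illustration, and the CD figure assumes the standard 44.1 kHz sampling rate.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed figures: CD quality as 44.1 kHz / 16 bit / stereo,
# telephone quality as 8 kHz / 8 bit / mono.
cd_rate = pcm_bit_rate(44_100, 16, 2)
phone_rate = pcm_bit_rate(8_000, 8, 1)

print(f"CD quality: {cd_rate / 1000:.1f} kbit/s")
print(f"Telephone:  {phone_rate / 1000:.1f} kbit/s")
print(f"Ratio:      {cd_rate / phone_rate:.2f}x")
```

The studio master carries roughly 22 times as much data as the telephone channel can use, which is why it is worth keeping as an archive copy rather than as the file you deploy.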
The format you choose for your files will often depend on what your platform can actually play. For most telephony applications the sampling rate is going to be 8 kHz. That gives you a top frequency of 4 kHz (half the sampling rate) - which is better than you will probably get across a telephone line, because the channel characteristics diminish the higher frequencies - hence a top frequency of about 3.4 kHz. As for mono/stereo, you won't be surprised when I say that it has to be mono. The number of bits can be 8 or 16 - though a 16 bit file will have to be converted to 8 bit by the platform for the telephone.
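If you want to check what the studio has delivered before loading it onto a platform, something along these lines can help. It is only a sketch using Python's standard `wave` module; the function names and the default of 8000 Hz are my own assumptions, and `wave` reads plain PCM WAV files, so files already encoded as a-law or mu-law may be rejected by it.

```python
import wave


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent (half the rate)."""
    return sample_rate_hz / 2


def check_prompt_format(source, expected_rate=8000):
    """Return a list of problems found in a PCM WAV prompt.

    `source` is a file path or a binary file-like object.
    An empty list means the file is mono, at the expected rate,
    and uses 8 or 16 bit samples.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != 1:
        problems.append(f"{channels} channels found, expected mono")
    if bits not in (8, 16):
        problems.append(f"{bits} bit samples found, expected 8 or 16 bit")
    return problems
```

Calling `check_prompt_format("welcome.wav")` (the file name is hypothetical) on an untouched studio master would be expected to report both the sample rate and the stereo channels, which tells you the file still needs converting.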
In order to optimise the transmission of speech across a telephone channel, two codecs were developed which use companding (a form of compression) to get, in effect, 13 to 14 bit resolution out of 8 bits per sample. These are a-law, used in Europe, and mu-law, used in North America and Japan. They have been optimised for speech and provide better quality than having your 16 bit files downsized to linear 8 bit. As regards amplitude, you may need to experiment a little with the VXML browser that you're using, but a good rule of thumb is that an RMS of -13 dB should be ok.
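To see why companding helps, here is a rough sketch that compares plain 8 bit quantisation with mu-law style companding, and measures RMS level. It uses the textbook continuous mu-law curve with mu = 255, not the bit-exact segmented encoding a real telephony codec uses, and it assumes samples are floats between -1 and 1 and that the dB figure is measured relative to a full-scale amplitude of 1.0. All function names are illustrative.

```python
import math

MU = 255.0  # companding constant for the textbook mu-law curve


def mulaw_compress(x):
    """Map a sample in [-1, 1] onto the mu-law curve (still in [-1, 1])."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize_8bit(v):
    """Round a value in [-1, 1] to the nearest of the levels -127/127 .. 127/127."""
    return round(v * 127) / 127


def linear_roundtrip(x):
    """What survives if a sample is simply reduced to linear 8 bit."""
    return quantize_8bit(x)


def mulaw_roundtrip(x):
    """What survives if a sample is companded, stored in 8 bit, then expanded."""
    return mulaw_expand(quantize_8bit(mulaw_compress(x)))


def rms_db(samples):
    """RMS level in dB relative to an amplitude of 1.0 (an assumed reference)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)


quiet = 0.002  # a quiet sample, well below one linear 8 bit step
print("linear 8 bit error:", abs(linear_roundtrip(quiet) - quiet))
print("mu-law 8 bit error:", abs(mulaw_roundtrip(quiet) - quiet))
```

The quiet sample disappears entirely under linear 8 bit quantisation but comes back close to its original value after companding, which is where the extra effective resolution comes from. To check a prompt against the -13 dB rule of thumb, scale its samples into the -1 to 1 range and pass them to `rms_db`.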
If you have any comments, ideas, issues, etc. about this topic, why not try the voice-push forums?