VoiceXML - Recognize DTMF in Recording

Question

I've been doing IVR work for a while, but we have a case where I'd love some expertise/feedback:

Is it possible to record a message where the user could press a DTMF key to indicate a pause where we would insert our own sound? In this scenario, the user would record something like: "Good Morning, [DTMF], please call the office at [DTMF] to reconcile your account."

Not sure whether we would chop the resulting WAV file into pieces to insert our variables, or do some post-processing before sending out our message.

Does anyone have any experience with something like this?

Thanks

Jim Stanley Blackboard Connect

Answer 1:

In VoiceXML you would use a record element to record a message from a user. The record element has an attribute called dtmfterm which, if set to true (the default), makes any DTMF key press terminate the recording. If it is set to false, the recording ends when the maxtime setting is reached or when silence lasts for the duration of finalsilence, and any DTMF tones simply become part of the recording.

I have created applications that use caller-created recordings, but never one that manipulates the recordings as your requirements describe. What you may be able to do is concatenate recordings together. Here is a QA that shows how to concatenate wav recordings using C#.

What you will have to experiment with is whether you can catch which DTMF key was pressed by using grammars. The spec alludes to this, but it may be somewhat specific to the VoiceXML IVR platform that you are using. If you know which DTMF key was pressed, you can instruct the user to press * to insert silence and # to end the recording. Both keys terminate the current recording, but the logic in your VoiceXML goes right back into recording if * was pressed and stops the recording process completely if # was pressed. You would then concatenate these recordings, inserting a WAV file of pre-recorded silence between the user's recorded snippets.
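As a rough illustration of the concatenation step described above, here is a minimal sketch in Python (not the C# approach from the linked QA), using only the standard library wave module. It assumes the snippets are uncompressed PCM WAV files that share the same channel count, sample width, and sample rate; telephony platforms often record mu-law or other encodings, which would need converting first. The function names silence_bytes and concat_with_gaps are made up for this example, and the sketch generates the silence itself rather than reading a pre-recorded silence file.

```python
import io
import wave


def silence_bytes(seconds, nchannels, sampwidth, framerate):
    """Return PCM silence of the requested length (hypothetical helper)."""
    n_frames = int(seconds * framerate)
    # 8-bit PCM WAV is unsigned (midpoint 0x80); wider samples are signed (0).
    sample = b"\x80" if sampwidth == 1 else b"\x00" * sampwidth
    return sample * (n_frames * nchannels)


def concat_with_gaps(segments, out_file, gap_seconds=0.5):
    """Join WAV segments in order, writing a silent gap between neighbours.

    `segments` holds paths or binary file objects; all must share one format.
    """
    if not segments:
        raise ValueError("need at least one segment")
    writer = None
    fmt = None
    try:
        for segment in segments:
            with wave.open(segment, "rb") as reader:
                this_fmt = (
                    reader.getnchannels(),
                    reader.getsampwidth(),
                    reader.getframerate(),
                )
                if writer is None:
                    # The first segment decides the output format.
                    fmt = this_fmt
                    writer = wave.open(out_file, "wb")
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                else:
                    if this_fmt != fmt:
                        raise ValueError(
                            "segments differ in channels, sample width, or rate"
                        )
                    # Every segment after the first is preceded by a gap.
                    writer.writeframes(silence_bytes(gap_seconds, *fmt))
                writer.writeframes(reader.readframes(reader.getnframes()))
    finally:
        if writer is not None:
            writer.close()
```

A call such as concat_with_gaps(["part1.wav", "part2.wav"], "message.wav") (file names are placeholders) would write the two snippets with half a second of silence between them. To insert your own sound instead of silence, the same loop could write the frames of a filler WAV where the silence is written, provided the filler uses the same format.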

From the tags it looks like you are using C# and MVC for your VoiceXML application. There is an open source project called VoiceModel that makes it easier to develop VoiceXML applications using ASP.NET MVC 4. You can read about how it handles recording in this environment here.

Answer 2:

If you want to insert a pause and stay within the UI tag: in the IVR work I have done so far, the only DTMF key that let us stay within the UI was *. Pressing '*' returned the grammar result "REPEAT", and in the UI condition tag for REPEAT you would add the silence (pause) wav file.

For the recording part, we used osdmtype = record, which mapped to an XSLT that handled the recording and recognised the customer's yes/no answer.
Nevertheless, I'm a bit confused about the exact requirement and would need more details.
Sorry, I can't add comments as I don't have enough rep.
You can mail me, or I can add more answers here.

Source: https://stackoverflow.com/questions/15482620/voicexml-recognize-dtmf-in-recording

Tags

asp.net-mvc

voicexml