Speech-to-text large audio files [Microsoft Speech API]

Question

What is the best way to transcribe medium-to-large audio files (about 6-10 minutes each) using the Microsoft Speech API? Is there something like batch transcription for audio files?

I have used the code provided at https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-to-text-sample for continuously transcribing speech, but it stops transcribing at some point. Is there any restriction on the transcription? I am only using the free trial account at the moment.

Btw, I assume there is no difference between Bing Speech API and the new Speech service API, right?

Thanks everyone!

Answer 1:

Thank you for your feedback.

I agree that the sample (and the documentation you are looking at) is not very clear. We will update this soon.

The sample uses RecognizeAsync, which should really be called RecognizeOnceAsync: it currently returns only the FIRST FinalResult from the service and then stops. To transcribe a whole file, you should use Start/StopRecognizeAsync and register to receive Result events.
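To illustrate the start/stop pattern with result events, here is a minimal sketch. It assumes the Python Speech SDK (the azure-cognitiveservices-speech package) rather than the C# sample from the question, so the method names differ from the ones mentioned above. The function names transcribe_file and join_segments, and the key, region and path arguments, are hypothetical placeholders.

```python
import threading


def join_segments(segments):
    """Join recognized utterances into one transcript, skipping empty ones."""
    return " ".join(s.strip() for s in segments if s.strip())


def transcribe_file(path, key, region):
    """Transcribe a whole audio file using continuous recognition."""
    # Imported here so this module still loads when the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename=path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    segments = []
    done = threading.Event()

    def on_recognized(evt):
        # Fired once per finalized utterance, so a long file yields many events.
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            segments.append(evt.result.text)

    def on_finished(evt):
        # End of audio (session stopped) or an error/cancellation.
        done.set()

    recognizer.recognized.connect(on_recognized)
    recognizer.session_stopped.connect(on_finished)
    recognizer.canceled.connect(on_finished)

    recognizer.start_continuous_recognition()
    done.wait()  # block until the whole file has been processed
    recognizer.stop_continuous_recognition()
    return join_segments(segments)
```

Calling transcribe_file("meeting.wav", key, region) would then return the full transcript instead of only the first recognized phrase.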

Again, sorry for the bad documentation here. We will update it soon, and will probably also rename the API in a refresh.

If you have audio files, you could also use the batch transcription feature. Perhaps that helps? https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription

Cheers, Wolfgang

Answer 2:

During the free trial, the Speech service allows 5,000 transactions per month and 20 per minute, so maybe at some point you exceed the 20-per-minute limit because of real-time continuous recognition.
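If the per-minute quota really is the cause, one way to stay under it when sending many requests is to throttle on the client side. The sketch below is a generic sliding-window limiter, not part of the Speech SDK; the class name MinuteRateLimiter is hypothetical, and the default of 20 calls per 60 seconds simply mirrors the figure quoted above.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Allow at most max_calls within any sliding window of `window` seconds."""

    def __init__(self, max_calls=20, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock    # injectable so the limiter can be tested
        self.sleep = sleep
        self.calls = deque()  # timestamps of calls still inside the window

    def _prune(self, now):
        # Forget calls that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        self._prune(now)
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window - (now - self.calls[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            self.sleep(delay)
        now = self.clock()
        self._prune(now)
        self.calls.append(now)
```

Call limiter.acquire() before each request to the service; it returns immediately while you are under the limit and sleeps only when the window is full.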

Source: https://stackoverflow.com/questions/50796434/speech-to-text-large-audio-files-microsoft-speech-api

Tags

speech-recognition

speech-to-text

microsoft-cognitive

bing-api

microsoft-speech-api