Convert audio to text

失恋的感觉 2020-12-13 16:35

Are there any built-in or external libraries for Java or C# that can take an audio file, parse it, and extract the spoken text from it?

5 Answers
  • 2020-12-13 16:43

    You might check the Microsoft Speech API. I think they provide an SDK that you can use for this.

  • 2020-12-13 16:44

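    For Java, there seems to be a solution from Sun: the javax.speech.recognition package of the Java Speech API (JSAPI). As far as I know, JSAPI is only a specification, so you would also need a third-party implementation of it.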

  • 2020-12-13 16:45

    You can use SoX (the Swiss Army knife of sound processing programs) to convert an audio file to a text file of numeric values that describe the signal (sound frequency/volume). Note that this gives you raw signal data, not a transcript of the spoken words.

    I have done this for a previous project, but I don't remember the exact command options.

    Here is a link to the project: http://sox.sourceforge.net/Main/HomePage

  • 2020-12-13 17:00

    Here is a complete example using C# and System.Speech

    The code can be divided into 2 main parts:

    • configuring the SpeechRecognitionEngine object (and its required elements)
    • handling the SpeechRecognized and SpeechHypothesized events

    Step 1: Configuring the SpeechRecognitionEngine

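    // Create the engine and read from the default microphone.
    _speechRecognitionEngine = new SpeechRecognitionEngine();
    _speechRecognitionEngine.SetInputToDefaultAudioDevice();
    // A dictation grammar accepts free-form speech rather than fixed phrases.
    _dictationGrammar = new DictationGrammar();
    _speechRecognitionEngine.LoadGrammar(_dictationGrammar);
    // Keep recognizing until RecognizeAsyncStop or RecognizeAsyncCancel is called.
    _speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);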
    

    At this point your object is ready to start transcribing audio from the microphone. You need to handle some events though, in order to actually get access to the results.

    Step 2: Handling the SpeechRecognitionEngine Events

    // Detach first so the handlers are never registered twice.
    _speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
    _speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

    _speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
    _speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

    private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
    {
        // real-time results from the engine
        string realTimeResults = e.Result.Text;
    }

    private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // final answer from the engine
        string finalAnswer = e.Result.Text;
    }

    That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use

    _speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile);

    instead of

    _speechRecognitionEngine.SetInputToDefaultAudioDevice();
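
    For the pre-recorded file case, the two steps can be combined into one small console program. This is only a minimal sketch, assuming Windows, a reference to the System.Speech assembly, and a .wav file path passed as the first command-line argument. The class name WavTranscriber and the use of a ManualResetEvent to wait for the end of the file are my own choices for illustration, not part of the original example.

    ```csharp
    using System;
    using System.Speech.Recognition;
    using System.Threading;

    // Hypothetical class name, used only for this sketch.
    class WavTranscriber
    {
        static void Main(string[] args)
        {
            if (args.Length < 1)
            {
                Console.Error.WriteLine("Usage: WavTranscriber <path-to-wav-file>");
                return;
            }

            using (var engine = new SpeechRecognitionEngine())
            using (var finished = new ManualResetEvent(false))
            {
                // Free-form dictation, as in Step 1.
                engine.LoadGrammar(new DictationGrammar());

                // Read from the file instead of the microphone.
                engine.SetInputToWaveFile(args[0]);

                // Print each final recognition result as it arrives.
                engine.SpeechRecognized += (sender, e) => Console.WriteLine(e.Result.Text);

                // Raised when the asynchronous operation ends, e.g. at end of input.
                engine.RecognizeCompleted += (sender, e) => finished.Set();

                engine.RecognizeAsync(RecognizeMode.Multiple);

                // Block the main thread until the whole file has been processed.
                finished.WaitOne();
            }
        }
    }
    ```

    The handlers are attached before RecognizeAsync is called so that no early results are missed.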

    There are a bunch of different options in these classes and they are worth exploring in more detail.

    http://ellismis.com/2012/03/17/converting-or-transcribing-audio-to-text-using-c-and-net-system-speech/

  • 2020-12-13 17:01

    Here are some of your options:

    • Microsoft Speech
    • LumenVox
    • Dragon NaturallySpeaking
    • Sphinx4