I am hacking on a little project using iOS 10's built-in speech recognition. I have working results using the device's microphone, and my speech is recognized very accurately.
Based on my tests on iOS 10, when shouldReportPartialResults is set to false, you have to wait 60 seconds to get the result.
It seems that the isFinal flag doesn't become true when the user stops talking, as you might expect. I guess this is intended behaviour on Apple's part, because "user stops talking" is not a well-defined event.
I believe that the easiest way to achieve your goal is to do the following:
You have to establish an "interval of silence" (e.g. 2 seconds). If the user doesn't talk for longer than that interval, you treat them as having stopped talking.
Create a Timer at the beginning of the audio session:
// didFinishTalk must be exposed to Objective-C (@objc) to be used with #selector
var timer = Timer.scheduledTimer(timeInterval: 2, target: self, selector: #selector(didFinishTalk), userInfo: nil, repeats: false)
When you get new transcriptions in recognitionTask, invalidate and restart your timer:
timer.invalidate()
timer = Timer.scheduledTimer(timeInterval: 2, target: self, selector: #selector(didFinishTalk), userInfo: nil, repeats: false)
If the timer expires, the user hasn't spoken for 2 seconds. You can safely stop the audio session and exit.
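The steps above can be wrapped in a small helper. This is only a sketch and not Apple API: SilenceTimer, restart(), cancel() and onSilence are made-up names, and it uses the block-based Timer API (available from iOS 10) instead of a selector. It assumes it is used from a thread with a running run loop, such as the main thread.

```swift
import Foundation

/// Hypothetical helper (not part of the Speech framework): calls `onSilence`
/// once `restart()` has not been called for `interval` seconds.
final class SilenceTimer {
    private var timer: Timer?
    private let interval: TimeInterval
    private let onSilence: () -> Void

    init(interval: TimeInterval, onSilence: @escaping () -> Void) {
        self.interval = interval
        self.onSilence = onSilence
    }

    /// Call when the audio session starts and again on every new transcription.
    func restart() {
        timer?.invalidate()
        timer = Timer.scheduledTimer(withTimeInterval: interval, repeats: false) { [weak self] _ in
            self?.onSilence()
        }
    }

    /// Call when recognition ends for any other reason (error, user cancel).
    func cancel() {
        timer?.invalidate()
        timer = nil
    }
}
```

In the recognitionTask result handler you would call restart() for each new transcription. In onSilence you would stop the audio engine and, presumably, call endAudio() on the recognition request so the task can finish.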
I am using Speech to text in an app currently and it is working fine for me. My recognitionTask block is as follows:
recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
    var isFinal = false
    if let result = result, result.isFinal {
        // Deliver only the final transcription.
        print("Result: \(result.bestTranscription.formattedString)")
        isFinal = true
        completion(result.bestTranscription.formattedString, nil)
    }
    if error != nil || isFinal {
        // Tear down the audio pipeline once recognition has ended.
        self.audioEngine.stop()
        inputNode.removeTap(onBus: 0)
        self.recognitionRequest = nil
        self.recognitionTask = nil
        // Report the error only if no final result was already delivered.
        if !isFinal, let error = error {
            completion(nil, error)
        }
    }
})
// Inside the recognitionTask result handler:
if let result = result {
    // Every new (partial) result restarts the silence timer.
    self.timerDidFinishTalk.invalidate()
    self.timerDidFinishTalk = Timer.scheduledTimer(timeInterval: TimeInterval(self.listeningTime), target: self, selector: #selector(self.didFinishTalk), userInfo: nil, repeats: false)
    // Keep the latest transcription so it is available when the timer fires.
    let bestString = result.bestTranscription.formattedString
    self.fullsTring = bestString.trimmingCharacters(in: .whitespaces)
    self.st = self.fullsTring
}
Here self.listeningTime is how long to wait after the last transcription before treating the utterance as finished.
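To make the idea of keeping the latest transcription around concrete, here is a rough sketch that stores the most recent text and hands it over once listeningTime has passed without an update. UtteranceCollector, update(with:) and onFinished are hypothetical names of mine, not part of the Speech framework, and the block-based Timer API is swapped in for the selector version. It assumes the calls happen on a thread with a running run loop, such as the main thread.

```swift
import Foundation

/// Hypothetical helper: remembers the latest transcription and passes it to
/// `onFinished` once no update has arrived for `listeningTime` seconds.
final class UtteranceCollector {
    private(set) var latestText = ""
    private var timer: Timer?
    private let listeningTime: TimeInterval
    private let onFinished: (String) -> Void

    init(listeningTime: TimeInterval, onFinished: @escaping (String) -> Void) {
        self.listeningTime = listeningTime
        self.onFinished = onFinished
    }

    /// Pass `result.bestTranscription.formattedString` here from the result handler.
    func update(with transcription: String) {
        latestText = transcription.trimmingCharacters(in: .whitespaces)
        // Restart the countdown; only the last scheduled timer can fire.
        timer?.invalidate()
        timer = Timer.scheduledTimer(withTimeInterval: listeningTime, repeats: false) { [weak self] _ in
            guard let self = self else { return }
            self.onFinished(self.latestText)
        }
    }
}
```

The closure passed as onFinished is where you would stop the audio engine and use the text, much like didFinishTalk in the snippet above.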