Sphinx4 ConfidenceResult and SpeechResult

Posted by 浪子不回头ぞ on 2019-12-08 13:14:24

Yes, you can do this, although it's a little bit roundabout. A confidence result is actually a Sausage (no, not kidding, that's what it's called; see SphinxDocs:Sausage). It's also known as a Word Confusion Network, and it's sometimes referred to as a sausage because of what the graph looks like; see Fig. 1 of Hakkani-Tur et al. That paper is a great reference for understanding confidence in speech recognition. It is a bit long, but I highly recommend reading the sections relevant to you if you're interested in further work in speech. It describes the Pivot Algorithm, which Sphinx 4 uses in the class PivotSausageMaker.

Anyway, the point is that you can get a Lattice from your SpeechResult. A Lattice is a graph that is a condensed form of all the hypotheses the recognizer produced. You can give your Lattice to a SausageMaker and call SausageMaker.makeSausage(), which gives you a Sausage, which is a ConfidenceResult. (Note: calling SausageMaker.score(Result result) just makes a Lattice from the result and then calls its own makeSausage method.) Unfortunately, ASR confidence values are not very clear, and how best to compute, process and understand them is an open research topic.
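
To make the idea of a sausage a bit more concrete, here is a minimal, self-contained toy sketch in plain Java. It does not use the Sphinx4 classes at all: the class name SausageSketch, its methods, and the example words and scores are all made up for illustration. It models a sausage as a list of slots, where each slot holds competing words with log scores, normalizes each slot into posterior probabilities, and picks the most probable word per slot. A real Sausage is built from the Lattice by the SausageMaker and may well differ in its details.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a word confusion network ("sausage").
// Hypothetical illustration only; these are NOT the Sphinx4 classes.
class SausageSketch {

    // Turn the unnormalized log scores of one slot into posteriors that sum to 1.
    static Map<String, Double> posteriors(Map<String, Double> logScores) {
        // Subtract the maximum before exponentiating, for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores.values()) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        for (double s : logScores.values()) {
            sum += Math.exp(s - max);
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : logScores.entrySet()) {
            out.put(e.getKey(), Math.exp(e.getValue() - max) / sum);
        }
        return out;
    }

    // Most probable word in one slot, or null if the slot is empty.
    static String bestWord(Map<String, Double> slotPosteriors) {
        String best = null;
        double bestP = -1.0;
        for (Map.Entry<String, Double> e : slotPosteriors.entrySet()) {
            if (e.getValue() > bestP) {
                bestP = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    // Best hypothesis: the top word of every slot, in slot order.
    static List<String> bestHypothesis(List<Map<String, Double>> sausage) {
        List<String> words = new ArrayList<>();
        for (Map<String, Double> slot : sausage) {
            words.add(bestWord(posteriors(slot)));
        }
        return words;
    }

    public static void main(String[] args) {
        // Two slots with made-up competing words and log scores.
        Map<String, Double> slot1 = new LinkedHashMap<>();
        slot1.put("hello", Math.log(3.0));
        slot1.put("yellow", Math.log(1.0));
        Map<String, Double> slot2 = new LinkedHashMap<>();
        slot2.put("word", Math.log(1.0));
        slot2.put("world", Math.log(4.0));

        List<Map<String, Double>> sausage = new ArrayList<>();
        sausage.add(slot1);
        sausage.add(slot2);

        System.out.println(bestHypothesis(sausage));
        System.out.println(posteriors(slot1));
    }
}
```

The per-slot posterior of the chosen word is what you would read as that word's confidence in this toy model.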

Another possibility would be to use the confidence scores in the WordResults you can get from your SpeechResult.

Hope that helps!
