Detect fluency from Google Speech API results

旧城冷巷雨未停 submitted on 2019-12-11 04:16:43

Question


I am trying to determine the fluency of a speaker using the Google Speech (to text) API.

So far I have found that the API (betav1) can report the time taken to speak each word (its start time and end time).

And from Wikipedia:

Oral fluency or speaking fluency is a measurement both of production and reception of speech, as a fluent speaker must be able to understand and respond to others in conversation. Spoken language is typically characterized by seemingly non-fluent qualities (e.g., fragmentation, pauses, false starts, hesitation, repetition) because of ‘task stress.’ How orally fluent one is can therefore be understood in terms of perception, and whether these qualities of speech can be perceived as expected and natural (i.e., fluent) or unusual and problematic (i.e., non-fluent)

I can see that pauses, repetitions, and similar features can be derived from the word-level output of the API. But a relative measurement is difficult, as I can't find any standard values to compare against.
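As a rough sketch of the kind of raw measurements meant here, assuming the word timings have already been extracted from the API response into plain `(word, start_seconds, end_seconds)` tuples. The function name, the `FluencyFeatures` type, and the 0.5 second pause threshold are illustrative assumptions, not part of the API and not standard values:

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical input shape: (word, start_seconds, end_seconds), e.g. built
# from the per-word start and end times in a speech-to-text response.
Word = Tuple[str, float, float]

# Assumed threshold, NOT a standard value: a silence between two words
# longer than this is counted as a pause.
PAUSE_THRESHOLD_S = 0.5


@dataclass
class FluencyFeatures:
    words_per_minute: float
    pause_count: int
    total_pause_s: float
    repetition_count: int


def fluency_features(
    words: List[Word], pause_threshold_s: float = PAUSE_THRESHOLD_S
) -> FluencyFeatures:
    """Compute simple timing features from word-level timestamps."""
    if not words:
        return FluencyFeatures(0.0, 0, 0.0, 0)

    # Speaking rate over the span from the first word's start
    # to the last word's end.
    duration_s = words[-1][2] - words[0][1]
    wpm = len(words) / duration_s * 60.0 if duration_s > 0 else 0.0

    pause_count = 0
    total_pause_s = 0.0
    repetition_count = 0
    # Walk over adjacent word pairs.
    for (prev_word, _, prev_end), (word, start, _) in zip(words, words[1:]):
        gap = start - prev_end
        if gap > pause_threshold_s:
            pause_count += 1
            total_pause_s += gap
        # Only immediate repeats of the same word are counted.
        if word.lower() == prev_word.lower():
            repetition_count += 1

    return FluencyFeatures(wpm, pause_count, total_pause_s, repetition_count)


if __name__ == "__main__":
    sample = [("I", 0.0, 0.2), ("I", 0.3, 0.5), ("think", 1.5, 2.0), ("so", 2.1, 2.5)]
    print(fluency_features(sample))
```

These numbers are only raw features. Deciding what counts as "fluent" still needs reference values, which is exactly the missing piece.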

Is there a proper approach to achieve this? Can anyone give a guideline for detecting fluency from the Google API (or any other valid approach using some sort of open source speech library or external software)?

It's completely fine if I am going in completely the wrong direction; I just need a proper guideline to achieve the feature.

Source: https://stackoverflow.com/questions/50020796/detect-fluency-from-google-speech-api-results
