Voice matching in Android [closed]

╄→гoц情女王★ posted on 2019-11-27 18:56:57

Question


Is there any way to do voice matching on Android? Consider the following scenario.

  1. User "A" speaks something into the app, and the app records it on the phone.
  2. User "B" speaks something into the app, and the app records it on the phone.
  3. User "C" speaks something into the app, and the app records it on the phone.
  4. After all of these recordings, user "A" comes back and speaks to the app. Since his voice has already been recorded, the app identifies him as user "A".

Or, alternatively, something like this:

  1. User "A" speaks the word "House" into the app, and the app records it on the phone.
  2. User "B" speaks the word "House" into the app, and the app records it on the phone.
  3. User "C" speaks the word "House" into the app, and the app records it on the phone.
  4. After all of these recordings, user "A" comes back and speaks the word "House" to the app. Since his voice has already been recorded, the app identifies him as user "A".

Is this possible on Android? Which approach would work? I haven't seen any built-in libraries for this, but is there a workaround?


Answer 1:


You may want to check out Recognito, which does text-independent speaker recognition in Java.

It's a FOSS lib licensed under Apache 2.0:

https://github.com/amaurycrickx/recognito

Disclaimer: I'm the author :-)

It has a light dependency on Oracle's javax.sound (which Android does not provide) for file handling, but it should be straightforward to remove this dependency from the main Recognito class. There are only a few methods to discard: look for "file" in the parameters and hit delete.

I'm not aware of any other FOSS alternatives that would be Android-compatible without modifications.

There's plenty of Javadoc, and the code should be straightforward.

The one thing you'll wonder about is how to create the double[] with values between -1.0 and 1.0. For a start, you may want to look at the FileHelper class, which does just that with a 16-bit PCM encoded file.
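
To give an idea of what that conversion involves, here is a minimal sketch. It is not the FileHelper code from the library; the class name PcmSamples and the method toNormalizedDoubles are made up for illustration. It assumes the raw audio is signed 16-bit PCM, little-endian, mono, with no file header in the byte array.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Recognito.
final class PcmSamples {

    private PcmSamples() {
    }

    /**
     * Converts raw signed 16-bit PCM bytes into doubles between -1.0 and 1.0.
     * Assumption: little-endian byte order, mono, no header bytes.
     */
    static double[] toNormalizedDoubles(byte[] pcm) {
        if (pcm.length % 2 != 0) {
            throw new IllegalArgumentException("16-bit PCM needs an even number of bytes");
        }
        ByteBuffer buffer = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            // A short ranges from -32768 to 32767, so dividing by 32768
            // keeps every value within [-1.0, 1.0).
            samples[i] = buffer.getShort() / 32768.0;
        }
        return samples;
    }

    public static void main(String[] args) {
        // Three samples: silence, the largest positive value, the largest negative value.
        byte[] pcm = {0x00, 0x00, (byte) 0xFF, 0x7F, 0x00, (byte) 0x80};
        for (double sample : toNormalizedDoubles(pcm)) {
            System.out.println(sample);
        }
    }
}
```

If the recording is stereo or uses a different byte order, the samples would need to be de-interleaved or read with the matching ByteOrder first.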

Please note that a single word won't suffice to extract a good vocal print and to recognize the user afterwards.

For the process, I'd suggest using a phrase repeated 3 times to build an averaged vocal print. Use the same phrase at recognition time.
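
The enroll-then-identify flow can be sketched generically as follows. This is not Recognito's API or its algorithm: SpeakerMatcher, enroll and identify are hypothetical names, and the sketch assumes each recording has already been reduced to a fixed-length feature vector by some extraction step that is not shown. Enrollment averages the vectors from the repeated phrase, and identification picks the enrolled user whose averaged print is closest by Euclidean distance.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not part of Recognito.
final class SpeakerMatcher {

    private final Map<String, double[]> prints = new LinkedHashMap<>();

    /** Stores the element-wise mean of the given feature vectors as the user's print. */
    void enroll(String user, List<double[]> featureVectors) {
        if (featureVectors.isEmpty()) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        int dimension = featureVectors.get(0).length;
        double[] mean = new double[dimension];
        for (double[] vector : featureVectors) {
            if (vector.length != dimension) {
                throw new IllegalArgumentException("feature vectors must have equal length");
            }
            for (int i = 0; i < dimension; i++) {
                mean[i] += vector[i] / featureVectors.size();
            }
        }
        prints.put(user, mean);
    }

    /** Returns the enrolled user with the closest print, or null if nobody is enrolled. */
    String identify(double[] features) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : prints.entrySet()) {
            double distance = euclidean(entry.getValue(), features);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    private static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors must have equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        SpeakerMatcher matcher = new SpeakerMatcher();
        // Made-up two-dimensional vectors standing in for three repetitions of a phrase.
        matcher.enroll("A", Arrays.asList(
                new double[] {1.0, 0.0}, new double[] {1.2, 0.0}, new double[] {0.8, 0.0}));
        matcher.enroll("B", Arrays.asList(
                new double[] {0.0, 1.0}, new double[] {0.0, 1.1}, new double[] {0.0, 0.9}));
        System.out.println(matcher.identify(new double[] {0.9, 0.1}));
    }
}
```

Note that a nearest-match scheme like this always returns one of the enrolled users, even for an unknown speaker, so a real app would also need some distance threshold to reject voices it has never heard.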

The lib is text-independent, but it will help to use the same phrase if you need to keep the recording short. If you want it to be truly text-independent (the user says anything and gets recognized), you'll need longer vocal samples.

HTH



Source: https://stackoverflow.com/questions/22443124/voice-matching-in-android
