Python Voice Recognition Library - Always Listen?

不羁岁月 提交于 2019-12-18 12:07:56

问题


I've recently been working on using a speech recognition library in python in order to launch applications. I Intend to ultimately use the library for voice activated home automation using the Raspberry Pi GPIO.

I have this working, it detects my voice and launches application. The problem is that it seems to hang on the one word I say (for example, I say internet and it launches chrome an infinite number of times)

This is unusual behavior from what I have seen of while loops. I cant figure out how to stop it looping. Do I need to do something out of the loop to make it work properly? Please see the code below.

http://pastebin.com/auquf1bR

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
        audio = r.listen(source)

def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction():
        user = r.recognize(audio)
        print(user)
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()

回答1:


Just in case, here is the example on how to listen continuously for keyword in pocketsphinx, this is going to be way easier than to send audio to google continuously. And you could have way more flexible solution.

import sys, os, pyaudio
from pocketsphinx import *

modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)

decoder = Decoder(config)
decoder.start_utt('spotting')

stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()        

while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    if decoder.hyp() != None and decoder.hyp().hypstr == 'oh mighty computer':
        print "Detected keyword, restarting search"
        decoder.end_utt()
        decoder.start_utt('spotting')



回答2:


The problem is that you only actually listen for speech once at the beginning of the program, and then just repeatedly call recognize on the same bit of saved audio. Move the code that actually listens for speech into the while loop:

import pyaudio,os
import speech_recognition as sr


def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction(source):
    audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()

if __name__ == "__main__":
    r = sr.Recognizer()
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)



回答3:


I've spent a lot of time working on this subject.

Currently I'm developing a Python 3 open-source cross-platform virtual assistant program called Athena Voice: https://github.com/athena-voice/athena-voice-client

Users can use it much like Siri, Cortana, or Amazon Echo.

It also uses a very simple "module" system where users can easily write their own modules to enhance it's functionality. Let me know if that could be of use.

Otherwise, I recommend looking into Pocketsphinx and Google's Python speech-to-text/text-to-speech packages.

On Python 3.4, Pocketsphinx can be installed with:

pip install pocketsphinx

However, you must install the PyAudio dependency separately (unofficial download): http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio

Both google packages can be installed by using the command:

pip install SpeechRecognition gTTS

Google STT: https://pypi.python.org/pypi/SpeechRecognition/

Google TTS: https://pypi.python.org/pypi/gTTS/1.0.2

Pocketsphinx should be used for offline wake-up-word recognition, and Google STT should be used for active listening.




回答4:


That's sad but you have to initialise microphone in every loop and since, this module always have r.adjust_for_ambient_noise(source), which makes sure, that it understands your voice in noisy room too. Setting threshold takes time and can skip some of your words, if you are continuously giving commands

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()


def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction():
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source)
            audio = r.listen(source)
        user = r.recognize(audio)
        print(user)
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()


来源:https://stackoverflow.com/questions/25394329/python-voice-recognition-library-always-listen

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!