speech | 易学教程

Split speech audio file on words in python

Question: I feel like this is a fairly common problem, but I haven't yet found a suitable answer. I have many audio files of human speech that I would like to break on words, which can be done heuristically by looking at pauses in the waveform. Can anyone point me to a function or library in Python that does this automatically?

Answer 1: An easier way to do this is to use the pydub module. Its recent addition of silence utilities does all the heavy lifting, such as setting up the silence threshold, setting up silence
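The pause-based heuristic mentioned in the question can be sketched in plain Python. This is only an illustration of the idea, not pydub's implementation: the function name `split_on_pauses`, its parameters, and the default values are assumptions chosen for the example, and samples are assumed to be floats in the range -1.0 to 1.0. The sketch measures the RMS energy of short frames and ends a segment once the signal has stayed below a threshold for long enough.

```python
import math


def split_on_pauses(samples, rate, silence_thresh=0.02,
                    min_silence_sec=0.2, frame_sec=0.01):
    """Return (start, end) sample-index pairs for the non-silent stretches.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    frame_len = max(1, int(rate * frame_sec))
    min_silent_frames = max(1, int(round(min_silence_sec / frame_sec)))

    # Mark each frame as loud or silent based on its RMS energy.
    loud = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        loud.append(rms >= silence_thresh)

    segments = []
    start = None      # index of the first frame of the current segment
    silent_run = 0    # consecutive silent frames seen inside a segment
    for idx, is_loud in enumerate(loud):
        if is_loud:
            if start is None:
                start = idx
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                # The pause is long enough: close the segment where it began.
                end = idx - silent_run + 1
                segments.append((start * frame_len, end * frame_len))
                start = None
                silent_run = 0

    if start is not None:
        # Close a segment that runs to the end, dropping trailing silence.
        end = len(loud) - silent_run
        segments.append((start * frame_len, min(end * frame_len, len(samples))))
    return segments


# Synthetic test signal: 0.3 s tone, 0.5 s silence, 0.3 s tone at 1 kHz sampling.
rate = 1000
tone = [0.5 * math.sin(2 * math.pi * 100 * n / rate) for n in range(300)]
signal = tone + [0.0] * 500 + tone
segments = split_on_pauses(signal, rate)
print(segments)  # [(0, 300), (800, 1100)]
```

Real speech rarely has clean pauses between every pair of words, so the threshold and minimum pause length usually need tuning per recording, and a split like this tends to find phrases rather than individual words.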

Audio analysis to detect human voice, gender, age and emotion — any prior open-source work done?

Question: Is there prior open-source work in the field of audio analysis to detect a human voice (say, in spite of some background noise), determine the speaker's gender, possibly determine the number of speakers, the age of the speaker(s), and the emotion of the speakers? My hunch is that speech recognition software like CMU Sphinx could be a good place to start, but if there's something better, it'd be great.

Answer 1: I'm a graduate student doing speech recognition research. These are open research problems, and,

Android Speech Recognition not working

Question: I'm using this example from newboston, and it prompts me to record, but after it recognizes what I said, it doesn't update the list view. Here is the code: public class MainActivity extends Activity { private static final int RECOGNIZER_RESULT = 1234; ListView list; @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_main); list = (ListView) findViewById(R.id.list); Button btn_speach = (Button)findViewById(R.id.btn

Using Google Speech API

Question: What is the code for implementing the Google Speech API in my C#-based application? I found out that it is possible to create an audio file, send it to http://slides.html5rocks.com/#speech-input, and receive it back as text. Could you please explain how to do this, or provide me with the code if you have attempted this before? I've been stuck here for a while now. Much appreciated. Code so far: SpeechRecognitionEngine rec = new SpeechRecognitionEngine(); SpeechSynthesizer dummy = new SpeechSynthesizer(

Good speech recognition API

Question: I am working on a college project in which I am using speech recognition. Currently I am developing it on Windows 7 in C#, using the System.Speech API package that comes with .NET. The problem I am facing is that dictation recognition is not accurate enough. Also, whenever I start my application, the desktop speech recognition starts automatically. This is a big nuisance to me. As it is, the words I speak are not clear enough, and conflicting recognitions are interpreted

How can I use speech recognition without the annoying dialog in android phones

Question: Is this possible without modifying the Android APIs? I've found an article about this. One comment there says that I should modify the Android APIs, but it doesn't say how to do the modification. Can anybody give me some suggestions on how to do that? Thanks! I've found this article: SpeechRecognizer. His needs are almost the same as mine, so it is a good reference for me! I've now got this problem completely solved. I found usable sample code on this Chinese website. Here's my source code

Why do MFCC extraction libs return different values?

Question: I am extracting MFCC features using two different libraries: the python_speech_features lib and the BOB lib. However, the outputs of the two are different, and even the shapes are not the same. Is that normal, or is there a parameter that I am missing? The relevant section of my code is the following: import bob.ap import numpy as np from scipy.io.wavfile import read from sklearn import preprocessing from python_speech_features import mfcc, delta, logfbank def bob_extract_features(audio, rate):
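One general reason two feature extractors can return arrays of different shapes is the framing convention: the number of frames depends on the window length, the step, and whether the last partial window is zero-padded or dropped. The sketch below only illustrates that arithmetic. The helper `frame_count` and the 25 ms / 10 ms values are assumptions for the example, and it makes no claim about which convention either library in the question actually uses.

```python
def frame_count(num_samples, win_len, win_step, pad_last):
    """Number of analysis frames for a signal of num_samples samples.

    Hypothetical helper: pad_last=True keeps a zero-padded final frame,
    pad_last=False drops any incomplete final frame.
    """
    if num_samples <= win_len:
        # A short signal yields one frame only if it fills the window
        # exactly or if padding is allowed.
        return 1 if (pad_last or num_samples == win_len) else 0
    remaining = num_samples - win_len
    if pad_last:
        return 1 + -(-remaining // win_step)  # ceiling division
    return 1 + remaining // win_step          # floor division


rate = 16000                   # samples per second (assumed)
num_samples = rate             # one second of audio
win_len = rate * 25 // 1000    # 25 ms window -> 400 samples
win_step = rate * 10 // 1000   # 10 ms step   -> 160 samples

print(frame_count(num_samples, win_len, win_step, pad_last=True))   # 99
print(frame_count(num_samples, win_len, win_step, pad_last=False))  # 98
```

The second dimension of the output can differ for a similar reason, since the number of cepstral coefficients is a configurable parameter. Comparing window length, step, coefficient count, and filterbank settings side by side is a reasonable first check before comparing the values themselves.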

Speech only first time

Question: I am interested in speech recognition on Android, but I can't get it to work continuously: it recognises speech only the first time. If you stop speaking, it doesn't continue, and you have to click the button again. I do not want this behaviour. Does anybody have any suggestions as to what I can do to fix this? Here is the code: private SpeechRecognizer speech; @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R

How can I launch my Android activities with the speech recognizer?

Question: I want to alter this switch so that, instead of clicking the buttons, my activities launch when I speak the name of the associated fruit. For instance, the Apple class will launch when I speak the word "Apple". How should I rewrite this switch? None of my attempts at doing this have worked so far. Any answers provided will be greatly appreciated. Thank you! package com.example.speech; import com.example.speech.R; import android.app.Activity; import android.content.Intent;