Applying neural network to MFCCs for variable-length speech segments

Posted by 空扰寡人 on 2019-12-08 03:10:29

I'm currently trying to create and train a neural network to perform simple speech classification using MFCCs.

Simple neural networks are not invariant to input length: they expect a fixed-size input vector, so they cannot analyze a time series of arbitrary length directly.
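To make the length problem concrete, here is a small sketch. It assumes 13 MFCC coefficients per frame (a common choice, not something stated in the question) and uses a hypothetical `fake_mfcc` helper in place of a real feature extractor. Flattening two utterances of different durations yields input vectors of different sizes, which a network with a fixed number of input units cannot accept.

```python
# Assumption: 13 MFCC coefficients per frame (a common but arbitrary choice).
N_MFCC = 13

def fake_mfcc(num_frames, n_mfcc=N_MFCC):
    """Hypothetical stand-in for a real MFCC extractor.

    Returns a list of num_frames frames, each a list of n_mfcc zeros.
    """
    return [[0.0] * n_mfcc for _ in range(num_frames)]

def flatten(frames):
    """Concatenate all frames into one long input vector."""
    return [coeff for frame in frames for coeff in frame]

# Two utterances with different durations (frame counts are made up).
short_utt = fake_mfcc(num_frames=40)
long_utt = fake_mfcc(num_frames=95)

short_len = len(flatten(short_utt))  # 40 frames * 13 coefficients
long_len = len(flatten(long_utt))    # 95 frames * 13 coefficients

# The two vectors have different sizes, so a single fixed-width
# input layer cannot be fed both of them.
print(short_len, long_len)
```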

To classify a time series such as a sequence of MFCC frames, use a classifier that can handle sequences of varying length. Examples are neural networks combined with hidden Markov models (ANN-HMM), Gaussian mixture models combined with hidden Markov models (GMM-HMM), and recurrent neural networks (RNN). RNN implementations are available for MATLAB and for Theano. Detailed descriptions of these models are easy to find with a web search.
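As a rough illustration of why a recurrent network copes with variable-length input, here is a minimal forward pass of a simple (Elman-style) RNN in plain Python. It is only a sketch under stated assumptions: the weights are random and untrained, the layer sizes (13 inputs, 8 hidden units, 3 classes) are arbitrary, the input frames are random numbers rather than real MFCCs, and names such as `make_rnn` and `rnn_classify` are made up for this example. It is not the MATLAB or Theano implementation mentioned above.

```python
import math
import random

def make_rnn(n_in, n_hidden, n_classes, seed=0):
    """Create small random weights for an Elman-style RNN (untrained)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_xh": mat(n_hidden, n_in),      # input -> hidden
        "W_hh": mat(n_hidden, n_hidden),  # hidden -> hidden (recurrence)
        "b_h": [0.0] * n_hidden,
        "W_hy": mat(n_classes, n_hidden), # hidden -> class scores
        "b_y": [0.0] * n_classes,
    }

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rnn_classify(params, frames):
    """Run the RNN over all frames and return class probabilities.

    The hidden state has a fixed size, so the output size does not
    depend on how many frames the utterance contains.
    """
    h = [0.0] * len(params["b_h"])
    for x in frames:
        from_input = matvec(params["W_xh"], x)
        from_state = matvec(params["W_hh"], h)
        h = [math.tanh(a + r + b)
             for a, r, b in zip(from_input, from_state, params["b_h"])]
    logits = [s + b for s, b in zip(matvec(params["W_hy"], h), params["b_y"])]
    # Numerically stable softmax over the class scores.
    top = max(logits)
    exps = [math.exp(s - top) for s in logits]
    total = sum(exps)
    return [e / total for e in exps]

params = make_rnn(n_in=13, n_hidden=8, n_classes=3)

# Random stand-ins for MFCC frames of two utterances with different lengths.
rng = random.Random(1)
short_utt = [[rng.gauss(0, 1) for _ in range(13)] for _ in range(40)]
long_utt = [[rng.gauss(0, 1) for _ in range(13)] for _ in range(95)]

p_short = rnn_classify(params, short_utt)
p_long = rnn_classify(params, long_utt)
print(len(p_short), len(p_long))  # both have one probability per class
```

Both utterances produce a probability vector of the same size even though they differ in length. A usable classifier would still need training (for example, backpropagation through time), and in practice gated variants such as LSTM are often preferred for longer sequences.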

Speech recognition is not simple to implement; it is better to use existing software such as CMUSphinx.
