Matlab: Finding dominant frequencies in a frame of audio data

北战南征 提交于 2019-12-03 21:11:37

What you are trying to do is called speech activity detection. There are many approaches to this, the simplest might be a simple band pass filter, that passes frequencies where speech is strongest, this is between 1kHz and 8kHz. You could then compare total signal energy with bandpass limited and if majority of energy is in the speech band, classify frame as speech. That's one option, but there are others too.

To get frequencies at peaks you could use FFT to get spectrum and then use peakdetect.m. But this is a very naïve approach, as you will get a lot of peaks, belonging to harmonic frequencies of a base sine.

Theoretically you should use some sort of cepstrum (also known as spectrum of spectrum), which reduces harmonics' periodicity in spectrum to base frequency and then use that with peakdetect. Or, you could use existing tools, that do that, such as praat.

Be aware, that speech analysis is usually done on a frames of around 30ms, stepping in 10ms. You could further filter out false detection by ensuring formant is detected in N sequential frames.

0x90

Why don't you use fft with `fftshift:

  %% Time specifications:
   Fs = 100;                      % samples per second
   dt = 1/Fs;                     % seconds per sample
   StopTime = 1;                  % seconds
   t = (0:dt:StopTime-dt)';
   N = size(t,1);
   %% Sine wave:
   Fc = 12;                       % hertz
   x = cos(2*pi*Fc*t);
   %% Fourier Transform:
   X = fftshift(fft(x));
   %% Frequency specifications:
   dF = Fs/N;                      % hertz
   f = -Fs/2:dF:Fs/2-dF;           % hertz
   %% Plot the spectrum:
   figure;
   plot(f,abs(X)/N);
   xlabel('Frequency (in hertz)');
   title('Magnitude Response');

Why do you want to use complex stuff?

a nice and full solution may found in https://dsp.stackexchange.com/questions/1522/simplest-way-of-detecting-where-audio-envelopes-start-and-stop

Have a look at the STFT (short-time fourier transform) or (even better) the DWT (discrete wavelet transform) which both will estimate the frequency content in blocks (windows) of data, which is what you need if you want to detect sudden changes in amplitude of certain ("speech") frequencies.

Don't use a FFT since it calculates the relative frequency content over the entire duration of the signal, making it impossible to determine when a certain frequency occured in the signal.

If you still use inbuilt STFT function, then to plot the maximum you can use following command

plot(T,(floor(abs(max(S,[],1)))))
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!