Generating a correct spectrogram using FFTW and a window function

傲寒 2021-02-05 21:45

For a project I need to be able to generate a spectrogram from a .WAV file. I've read the following should be done:

  1. Get N (transform size) samples
  2. Apply
3 answers
  •  南旧 (original poster)
    2021-02-05 22:06

    The code you produced was almost correct, so you didn't leave me much to correct:

    void Spectrogram::process(){
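        const int transform_size = 1024;        // FFT length N (const so the arrays below have a fixed size)
        const int half = transform_size/2;      // number of frequency bins that are drawn
        const int step_size = transform_size/2; // hop size: successive frames overlap by 50%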
        double in[transform_size];
        double processed[half];
        fftw_complex *out;
        fftw_plan p;
    
        out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * transform_size);
    
    
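        // Stop one step early so that the last frame (transform_size = 2 * step_size samples)
        // stays inside the file; this assumes getSample() does not pad or bounds-check.
        for (int x=0; x + 1 < wavFile->getSamples()/step_size; x++) {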
    
            // Fill the transformation array with a sample frame and apply the window function.
            // Normalization is performed later
            // (One error was here: you didn't set the last value of the array in)
            for (int j = 0, i = x * step_size; i < x * step_size + transform_size; i++, j++)
                in[j] = wavFile->getSample(i) * windowHanning(j, transform_size);
    
            p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE);
    
            fftw_execute(p); /* repeat as needed */
    
            for (int i=0; i < half; i++) {
                // (Here were some flaws concerning the access of the complex values)
                out[i][0] *= (2./transform_size);                         // real values
                out[i][1] *= (2./transform_size);                         // imaginary values
                processed[i] = out[i][0]*out[i][0] + out[i][1]*out[i][1]; // power spectrum
                processed[i] = 10./log(10.) * log(processed[i] + 1e-6);   // dB
    
                // The resulting spectral values in 'processed' are in dB and related to a maximum
                // value of about 96dB. Normalization to a value range between 0 and 1 can be done
                // in several ways. I would suggest to set values below 0dB to 0dB and divide by 96dB:
    
                // Transform all dB values to a range between 0 and 1:
                if (processed[i] <= 0) {
                    processed[i] = 0;
                } else {
                    processed[i] /= 96.;             // Reduce the divisor if you prefer darker peaks
                    if (processed[i] > 1)
                        processed[i] = 1;
                }
    
                In->setPixel(x,(half-1)-i,processed[i]*255);
            }
    
            // This should be called each time fftw_plan_dft_r2c_1d()
            // was called to avoid a memory leak:
            fftw_destroy_plan(p);
        }
    
        fftw_free(out);
    }
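
    How many frames fit completely inside the file depends on the sample count, transform_size and step_size. A minimal sketch of that arithmetic follows; countFullFrames is a hypothetical helper and not part of the code above:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of frames of transformSize samples that fit
// entirely inside numSamples when consecutive frames start stepSize apart.
std::size_t countFullFrames(std::size_t numSamples,
                            std::size_t transformSize,
                            std::size_t stepSize) {
    assert(transformSize > 0 && stepSize > 0);
    if (numSamples < transformSize)
        return 0;  // not even one complete frame
    return (numSamples - transformSize) / stepSize + 1;
}
```

    With 2048 samples, a transform size of 1024 and a step of 512, this yields 3 frames.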
    

    The two corrected bugs were most probably responsible for the slight variation between successive transform results. The Hanning window is very well suited to minimizing the "noise", so a different window would not have solved the problem. (Actually, @Alex I already pointed out the second bug in his point 2. In his point 3, however, he introduced a -Inf bug: log(0) is undefined, and it can occur if your wave file contains a stretch of exact zero values. Adding the constant 1e-6 is enough to avoid this.)
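
    The dB conversion with the 1e-6 guard and the mapping to a 0..1 range can be isolated as two small functions. This is only a sketch: powerToDb and dbToUnit are hypothetical names, and the 96 dB range is the value suggested in the code comments above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Convert a (non-negative) power value to decibels. The small offset keeps
// the logarithm finite when the input is exactly zero.
double powerToDb(double power, double epsilon = 1e-6) {
    assert(power >= 0.0);
    return 10.0 * std::log10(power + epsilon);
}

// Map a dB value to [0, 1]: values at or below 0 dB become 0,
// values at or above rangeDb become 1.
double dbToUnit(double db, double rangeDb = 96.0) {
    assert(rangeDb > 0.0);
    return std::max(0.0, std::min(1.0, db / rangeDb));
}
```

    A pixel value would then be dbToUnit(powerToDb(power)) * 255. Silence (power 0) gives about -60 dB, which is clamped to 0 instead of producing -Inf.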

    Not asked, but there are some optimizations:

    1. put p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE); outside the main loop,

    2. precalculate the window function outside the main loop,

    3. abandon the array processed and just use a temporary variable to hold one spectral line at a time,

    4. the two multiplications of out[i][0] and out[i][1] can be abandoned in favour of one multiplication with a constant in the following line. I left this (and other things) for you to improve.

    5. Thanks to @Maxime Coorevits, a memory leak could also be avoided: "Each time you call fftw_plan_dft_r2c_1d(), memory is allocated by FFTW3. In your code, you only call fftw_destroy_plan() outside the outer loop. But in fact, you need to call this each time you request a plan."
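
    Optimizations 2 and 4 can be sketched without touching FFTW. The source does not show windowHanning(), so makeHannWindow below is a hypothetical stand-in that assumes the common symmetric Hann definition; scaledPower is likewise an illustrative name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Optimization 2: compute the window once, before the main loop.
// Assumption: symmetric Hann window, w[j] = 0.5 * (1 - cos(2*pi*j / (N - 1))).
std::vector<double> makeHannWindow(std::size_t size) {
    assert(size > 1);
    const double pi = std::acos(-1.0);
    std::vector<double> window(size);
    for (std::size_t j = 0; j < size; ++j)
        window[j] = 0.5 * (1.0 - std::cos(2.0 * pi * j / (size - 1)));
    return window;
}

// Optimization 4: scaling re and im by 2/N and then squaring is the same as
// multiplying the unscaled power by (2/N)^2 once.
double scaledPower(double re, double im, std::size_t size) {
    const double scale = 2.0 / static_cast<double>(size);
    return (re * re + im * im) * scale * scale;
}
```

    Inside the frame loop, the inner line would then read in[j] = sample * window[j], and the power of bin i would be scaledPower(out[i][0], out[i][1], transform_size).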
