Below I have code that will take input from a microphone, and if the average of the audio block passes a certain threshold it will produce a spectrogram of the audio block (
I am not sure that working in pure Python is the best approach for sound processing, and for the FFT in particular. (In my opinion, using Cython is close to mandatory for sound processing in Python.)
Have you evaluated the possibility of binding an external FFT library (FFTW, for example) and using Python only to dispatch work to it and to update the resulting picture?
You may find some information on optimizing FFT in Python here, and you may also want to take a look at SciPy's FFT implementation.
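As a rough illustration of letting a compiled FFT do the heavy lifting while Python only dispatches the work, here is a minimal sketch. It assumes NumPy is installed and uses `scipy.fft` when available, falling back to `numpy.fft` otherwise. The helper names `block_spectrum` and `peak_frequency`, the Hann window, the 8000 Hz sample rate, and the synthetic 1000 Hz tone are all my own assumptions for the example, not taken from your code.

```python
import numpy as np

try:
    import scipy.fft as fft_backend  # compiled FFT from SciPy, if installed
except ImportError:
    import numpy.fft as fft_backend  # fallback: NumPy's FFT, same call names


def block_spectrum(block, sample_rate):
    """Return (frequencies in Hz, magnitudes) for one real-valued audio block."""
    block = np.asarray(block, dtype=np.float64)
    # A Hann window reduces spectral leakage at the block edges.
    window = np.hanning(len(block))
    # rfft computes only the non-negative frequencies of a real signal.
    spectrum = fft_backend.rfft(block * window)
    freqs = fft_backend.rfftfreq(len(block), d=1.0 / sample_rate)
    return freqs, np.abs(spectrum)


def peak_frequency(block, sample_rate):
    """Return the frequency (Hz) of the strongest bin in the block."""
    freqs, magnitudes = block_spectrum(block, sample_rate)
    return float(freqs[np.argmax(magnitudes)])


# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
n_samples = 1024
t = np.arange(n_samples) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
```

Calling `peak_frequency(tone, sample_rate)` on the synthetic tone should report the bin at 1000 Hz. In a real program, `block` would be the samples read from the microphone, and the magnitudes would feed one column of the spectrogram.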