I have found for several times the following guidelines for getting the power spectrum of an audio signal:
Note that a non-rectangular window has both benefits and costs. The result of a window in the time-domain is equivalent to a convolution of the window's transform with the signal's spectrum. A typical window, such as a von Hann window, will reduce the "leakage" from any non-periodic spectral content, which will result in a less noisy looking spectrum; but, in return, the convolution will "blur" any exactly or close to periodic spectral peaks across a few adjacent bins. e.g. all the spectral peaks will become rounder looking which may reduce frequency estimation accuracy. If you know, apriori, that there is no non-periodic content (e.g. data from some rotationally synchronous sampling system), a non-rectangular window could actually make the FFT look worse.
A non-rectangular window is also an informationally lossy process. A significant amount of spectral information near the edges of the window will be thrown away, assuming finite precision arithmetic. So non-rectangular windows are best used with overlapping window processing, and/or when one can assume that the spectrum of interest is either stationary across the entire window width, or centered in the window.