What exactly does a Sample Rate of 44100 sample?

Question

I'm using the FMOD library to extract PCM from an MP3. I get the whole 2 channel, 16 bit thing, and I also get that a sample rate of 44,100 Hz means 44,100 samples of "sound" per second. What I don't get is what exactly the 16 bit value represents. I know how to plot coordinates on an xy axis, but what am I plotting? The y axis represents time; the x axis represents what? Sound level? Is that the same as amplitude? How do I determine the different sounds that compose this value? I mean, how do I get a spectrum from a 16 bit number?

This may be a separate question, but it's what I really need answered: how do I get the amplitude every 25 milliseconds? Do I take 44,100 values and divide by 40 (40 * 0.025 seconds = 1 sec)? That gives 1102.5 samples, so would I feed 1102 values into a black box that gives me the amplitude for that moment in time?

Edit: here is the code I plan to test soon (note that I changed the frame length from 25 ms to 40 ms, i.e. 25 frames per second):

// 44100 / 25 frames = 1764 samples per frame -> 1764 * 2 channels * 2 bytes [16 bit sample] = 7056 bytes
private const int CHUNKSIZE = 7056;
uint    bytesread = 0;
var squares = new double[CHUNKSIZE / 4];
const double scale = 1.0d / 32768.0d;

do
{
    result = sound.readData(data, CHUNKSIZE, ref read);

    Marshal.Copy(data, buffer, 0, CHUNKSIZE);

    //PCM samples are 16 bit little endian
    Array.Reverse(buffer);

    for (var i = 0; i < buffer.Length; i += 4)
    {
        var avg = scale * (Math.Abs((double)BitConverter.ToInt16(buffer, i)) + Math.Abs((double)BitConverter.ToInt16(buffer, i + 2))) / 2.0d;
        squares[i >> 2] = avg * avg;
    }

    var rmsAmplitude = ((int)(Math.Floor(Math.Sqrt(squares.Average()) * 32768.0d))).ToString("X2");

    fs.Write(buffer, 0, (int) read);

    bytesread += read;

    statusBar.Text = "writing " + bytesread + " bytes of " + length + " to output.raw";
} while (result == FMOD.RESULT.OK && read == CHUNKSIZE);

After loading an MP3, my rmsAmplitude seems to stay in the range 3C00 to 4900 (hex). Have I done something wrong? I was expecting a wider spread.

Answer 1:

Yes, a sample represents amplitude (at that point in time).
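To make "a sample is an amplitude" concrete, here is a minimal sketch of turning raw bytes into amplitudes. It assumes interleaved 16 bit little-endian stereo PCM (left sample, then right sample), as described in the question; the class and method names are made up for illustration and are not part of FMOD.

```csharp
using System;

public static class PcmDecode
{
    // Splits interleaved 16 bit little-endian stereo PCM into two arrays of
    // amplitudes normalized to the range [-1.0, 1.0).
    public static (double[] Left, double[] Right) Deinterleave(byte[] pcm)
    {
        int frames = pcm.Length / 4;            // 2 channels * 2 bytes per sample
        var left = new double[frames];
        var right = new double[frames];
        for (int i = 0; i < frames; i++)
        {
            // Assemble each sample byte by byte (low byte first) so the result
            // does not depend on the byte order of the host machine.
            short l = unchecked((short)(pcm[4 * i] | (pcm[4 * i + 1] << 8)));
            short r = unchecked((short)(pcm[4 * i + 2] | (pcm[4 * i + 3] << 8)));
            left[i] = l / 32768.0;
            right[i] = r / 32768.0;
        }
        return (left, right);
    }
}
```

Each resulting value is the signed displacement of the waveform at one instant: 0 is silence, and values near -1 or +1 are the loudest the format can represent.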

To get a spectrum, you need a whole block of samples, not a single 16 bit number: you typically convert the block from the time domain to the frequency domain.

For your last question: several approaches are used. You may want the RMS (root mean square) of the samples in each window.
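As a rough sketch of the RMS approach, the helper below computes one RMS value per fixed-size window. It assumes the samples have already been decoded to signed, normalized doubles for a single channel; `WindowedRms` and its parameters are hypothetical names, not an FMOD API.

```csharp
using System;

public static class WindowedRms
{
    // Returns one RMS value per complete window of `windowSize` samples.
    public static double[] Compute(double[] samples, int windowSize)
    {
        int windows = samples.Length / windowSize;  // a trailing partial window is dropped
        var rms = new double[windows];
        for (int w = 0; w < windows; w++)
        {
            double sumSquares = 0.0;
            for (int i = 0; i < windowSize; i++)
            {
                double s = samples[w * windowSize + i];
                sumSquares += s * s;                // square the signed sample itself
            }
            rms[w] = Math.Sqrt(sumSquares / windowSize);
        }
        return rms;
    }
}
```

At 44,100 Hz a 40 ms window is exactly 1764 samples per channel, while a 25 ms window is 1102.5 samples, so you would round to 1102 or 1103. One thing to check in the question's code: `BitConverter.ToInt16` reads bytes in the host's byte order, so on a little-endian machine little-endian PCM can be read as is. Calling `Array.Reverse` on the whole buffer swaps the two bytes of every sample and also reverses the sample order, which would distort the RMS values.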

Answer 2:

Generally, the x axis is time and the y axis is amplitude. To get the frequencies, you need to take the Fourier transform of the data (most likely using the Fast Fourier Transform, or FFT, algorithm).

To use one of the simplest "sounds", let's assume you have a pure tone with a single frequency f. In the amplitude/time domain this is represented as y = sin(2 * pi * f * x), where x is time in seconds. If you convert that into the frequency domain, you just end up with a single peak at frequency f.
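The single-tone case can be checked with a small experiment. The sketch below generates a sine tone and finds the strongest frequency with a naive O(N^2) discrete Fourier transform; this is for illustration only (a real program would use an FFT library), and the names `ToneDemo`, `Sine`, and `PeakFrequency` are invented for this example.

```csharp
using System;

public static class ToneDemo
{
    // Generates `count` samples of sin(2*pi*f*t), with t = n / sampleRate.
    public static double[] Sine(double frequencyHz, int sampleRate, int count)
    {
        var samples = new double[count];
        for (int n = 0; n < count; n++)
            samples[n] = Math.Sin(2.0 * Math.PI * frequencyHz * n / sampleRate);
        return samples;
    }

    // Naive DFT: returns the frequency in Hz of the strongest bin below Nyquist.
    public static double PeakFrequency(double[] samples, int sampleRate)
    {
        int n = samples.Length;
        int bestBin = 0;
        double bestMagnitude = -1.0;
        for (int k = 0; k < n / 2; k++)
        {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++)
            {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.Cos(angle);
                im -= samples[t] * Math.Sin(angle);
            }
            double magnitude = Math.Sqrt(re * re + im * im);
            if (magnitude > bestMagnitude) { bestMagnitude = magnitude; bestBin = k; }
        }
        return (double)bestBin * sampleRate / n;    // bin k corresponds to k * sampleRate / n Hz
    }
}
```

For example, `ToneDemo.PeakFrequency(ToneDemo.Sine(1000, 44100, 441), 44100)` should report 1000 Hz: with 441 samples the bins are 100 Hz apart, and the tone falls exactly on bin 10. Real audio contains many frequencies at once, so its spectrum has energy spread across many bins instead of one peak.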

Answer 3:

Each sample represents the voltage of the analog signal at a given time.

Source: https://stackoverflow.com/questions/10387845/what-exactly-does-a-sample-rate-of-44100-sample

Tags

audio

pcm