Convert multi-channel PyAudio into NumPy array

Anonymous (unverified), submitted 2019-12-03 02:14:01

Question:

All the examples I can find are mono, with CHANNELS = 1. How do you read stereo or multichannel input using the callback method in PyAudio and convert it into a 2D NumPy array or multiple 1D arrays?

For mono input, something like this works:

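    import numpy as np
    import pyaudio

    # p (a pyaudio.PyAudio() instance) and fs (the sample rate) are
    # assumed to be defined elsewhere.

    def callback(in_data, frame_count, time_info, status):
        global result
        global result_waiting

        if in_data:
            # np.frombuffer replaces the deprecated np.fromstring
            result = np.frombuffer(in_data, dtype=np.float32)
            result_waiting = True
        else:
            print('no input')

        return None, pyaudio.paContinue

    stream = p.open(format=pyaudio.paFloat32,
                    channels=1,
                    rate=fs,
                    output=False,
                    input=True,
                    frames_per_buffer=fs,
                    stream_callback=callback)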

But this does not work for stereo input: the result array is twice as long, so I assume the channels are interleaved or something similar, but I can't find documentation for this.

Answer 1:

It appears to be interleaved sample-by-sample, with left channel first. With signal on left channel input and silence on right channel, I get:

result = [0.2776, -0.0002,  0.2732, -0.0002,  0.2688, -0.0001,  0.2643, -0.0003,  0.2599, ... 

So to separate it out into a stereo stream, reshape into a 2D array:

    # np.frombuffer replaces the deprecated np.fromstring
    result = np.frombuffer(in_data, dtype=np.float32)
    result = np.reshape(result, (frames_per_buffer, 2))

Now to access the left channel, use result[:, 0], and for right channel, use result[:, 1].
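The reshape can be checked without any audio hardware. The following is a minimal sketch, assuming a synthetic float32 stereo buffer laid out the way the answer describes (left sample first, then right); the sample values and the four-frame buffer size are made up for illustration.

```python
import numpy as np

CHANNELS = 2

# Made-up sample values standing in for real input.
left = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
right = np.array([-0.1, -0.2, -0.3, -0.4], dtype=np.float32)

# column_stack gives shape (frames, channels); serialising it row by row
# yields the interleaved byte order L0, R0, L1, R1, ...
in_data = np.column_stack((left, right)).tobytes()

# Decode as in the answer: flat 1D array first, then one row per frame.
flat = np.frombuffer(in_data, dtype=np.float32)
stereo = flat.reshape(-1, CHANNELS)  # -1 lets NumPy infer the frame count

left_out = stereo[:, 0]
right_out = stereo[:, 1]
```

Using -1 for the first dimension avoids having to pass the buffer size around; inside a callback, the `frame_count` argument would serve the same purpose.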

    def decode(in_data, channels):
        """
        Convert a byte stream into a 2D numpy array with
        shape (chunk_size, channels)

        Samples are interleaved, so for a stereo stream with left channel
        of [L0, L1, L2, ...] and right channel of [R0, R1, R2, ...], the
        byte stream is ordered as [L0, R0, L1, R1, ...]
        """
        # TODO: handle data type as parameter, convert between pyaudio/numpy types
        result = np.frombuffer(in_data, dtype=np.float32)

        # Integer division: np.reshape requires integer dimensions
        chunk_length, remainder = divmod(len(result), channels)
        assert remainder == 0

        result = np.reshape(result, (chunk_length, channels))
        return result


    def encode(signal):
        """
        Convert a 2D numpy array into a byte stream for PyAudio

        Signal should be a numpy array with shape (chunk_size, channels)
        """
        interleaved = signal.flatten()

        # TODO: handle data type as parameter, convert between pyaudio/numpy types
        out_data = interleaved.astype(np.float32).tobytes()
        return out_data
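One possible way to address the TODO about data types is to pass the NumPy dtype in explicitly. The sketch below is an assumption rather than part of the original answer: `decode_as` and `encode_as` are hypothetical names, and the caller is expected to pick the dtype that matches the stream format (for example `np.int16` for a stream opened with `pyaudio.paInt16`).

```python
import numpy as np

def decode_as(in_data, channels, dtype=np.float32):
    """Hypothetical variant of decode(): bytes -> array of shape (frames, channels)."""
    flat = np.frombuffer(in_data, dtype=dtype)
    if flat.size % channels != 0:
        raise ValueError("buffer length is not a multiple of the channel count")
    return flat.reshape(-1, channels)

def encode_as(signal, dtype=np.float32):
    """Hypothetical variant of encode(): (frames, channels) array -> interleaved bytes."""
    # tobytes() serialises in row-major order, so each frame's channels
    # end up adjacent in the output: L0, R0, L1, R1, ...
    return np.asarray(signal, dtype=dtype).tobytes()

# Round trip with made-up 16-bit data: 6 frames, 2 channels.
signal = np.arange(12, dtype=np.int16).reshape(6, 2)
raw = encode_as(signal, dtype=np.int16)
restored = decode_as(raw, channels=2, dtype=np.int16)
```

Note that the array returned by `np.frombuffer` is read-only because it shares memory with the input bytes; call `.copy()` on it if the samples need to be modified in place.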

