audio-processing

Java - Adjust playback speed of a WAV file

拈花ヽ惹草 submitted on 2019-12-23 05:15:25
Question: I'm likely being dense, but I cannot seem to find a solution to my issue (NOTE: I CAN find lots of people reporting this issue; it seems to have appeared with a newer Java release (possibly 1.5?). Perhaps SAMPLE_RATE is no longer supported? I am unable to find any solution). I'm trying to adjust the SAMPLE_RATE to speed up/slow down a song. I can successfully play a .wav file without issue, so I looked into FloatControl, which worked for adjusting volume: public void adjustVolume(String audioType, float
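FloatControl.Type.SAMPLE_RATE is indeed unsupported by the default mixer on many modern JVMs, so the usual workaround sidesteps the control entirely: open the output line with an AudioFormat whose sample rate is already scaled by the desired speed factor. A minimal sketch of that approach (like the SAMPLE_RATE control, it shifts pitch along with speed):

```java
import javax.sound.sampled.*;
import java.io.File;

public class SpeedPlayer {
    // Open the line with a format whose clock rate is scaled by `speed`,
    // instead of relying on FloatControl.Type.SAMPLE_RATE.
    public static void play(File wavFile, float speed) throws Exception {
        AudioInputStream in = AudioSystem.getAudioInputStream(wavFile);
        AudioFormat base = in.getFormat();
        // Same encoding and byte layout, but a faster/slower sample rate.
        AudioFormat stretched = new AudioFormat(
                base.getEncoding(),
                base.getSampleRate() * speed,   // e.g. 1.5f = 50% faster
                base.getSampleSizeInBits(),
                base.getChannels(),
                base.getFrameSize(),
                base.getFrameRate() * speed,
                base.isBigEndian());
        SourceDataLine line = AudioSystem.getSourceDataLine(stretched);
        line.open(stretched);
        line.start();
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) != -1; ) {
            line.write(buf, 0, n);              // bytes play out at the new rate
        }
        line.drain();
        line.close();
        in.close();
    }
}
```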

Is there anything special I have to do to create a 24-bit WAV file?

空扰寡人 submitted on 2019-12-23 03:47:08
Question: I can successfully create a 16-bit wav file, but when creating a 24-bit file, all I hear is white noise. I'm writing 24-bit signed integer data chunks. Do I have to set some special audio format at byte 20 in the wav file header? I'm currently using format 1. Edit #1: The wBitsPerSample field is set to 24. The wAvgBytesPerSec (byte rate) field is set to 44100 * (2 * 3), i.e. sampleRate * blockAlign, and wBlockAlign is set to 2 * 3, i.e. numChannels * bytesPerSample. Assuming you already did, the data
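Format code 1 (plain PCM) is valid for 24-bit audio, and the header fields above are consistent for stereo 24-bit at 44.1 kHz, so white noise usually points at the sample packing rather than the header: each sample must occupy exactly 3 little-endian bytes. A minimal sketch of the packing, using a hypothetical writeSample24 helper:

```java
import java.io.IOException;
import java.io.OutputStream;

class Pcm24 {
    // Hypothetical helper: pack a signed sample (already scaled to the
    // 24-bit range, -8388608..8388607) into 3 little-endian bytes, the
    // layout a format-1 PCM WAV expects when wBitsPerSample = 24.
    static void writeSample24(OutputStream out, int sample) throws IOException {
        out.write(sample & 0xFF);          // least significant byte first
        out.write((sample >> 8) & 0xFF);
        out.write((sample >> 16) & 0xFF);  // sign bits live in the top byte
    }
}
```

Writing 4 bytes per sample, big-endian order, or unscaled 16-bit values into a 24-bit data chunk are the typical ways to end up with noise while the header itself is fine.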

Open source FSK decoder library? [closed]

你离开我真会死。 submitted on 2019-12-22 04:13:32
Question (closed 6 years ago as not a good fit for the Q&A format): I'm looking for a library or tool to decode FSK in wav files, e.g. caller ID. Currently using the tools bundled with vpb-driver for
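Caller-ID FSK on analog lines is Bell 202 signaling (1200 baud, 1200 Hz mark, 2200 Hz space), and a serviceable decoder needs little more than a per-bit tone-energy comparison. A minimal sketch of that non-coherent approach, using the Goertzel algorithm (the Bell 202 parameters are an assumption here; adjust for other FSK variants):

```java
class FskDemod {
    // Energy of a single tone over a window of samples (Goertzel).
    static double goertzel(float[] x, int off, int len, double freq, double fs) {
        double w = 2 * Math.PI * freq / fs, c = 2 * Math.cos(w);
        double s0, s1 = 0, s2 = 0;
        for (int i = 0; i < len; i++) {
            s0 = x[off + i] + c * s1 - s2;
            s2 = s1; s1 = s0;
        }
        return s1 * s1 + s2 * s2 - c * s1 * s2;    // tone energy
    }

    // For each bit period, the stronger of the two tones decides the bit.
    static int[] demodulate(float[] samples, double fs) {
        int samplesPerBit = (int) (fs / 1200);     // 1200 baud
        int[] bits = new int[samples.length / samplesPerBit];
        for (int b = 0; b < bits.length; b++) {
            int off = b * samplesPerBit;
            double mark  = goertzel(samples, off, samplesPerBit, 1200, fs);
            double space = goertzel(samples, off, samplesPerBit, 2200, fs);
            bits[b] = mark > space ? 1 : 0;
        }
        return bits;
    }
}
```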

Implementing a post-processed low-pass filter using core audio

风流意气都作罢 submitted on 2019-12-21 21:45:23
Question: I have implemented a rudimentary low-pass filter using a time-based value. This is OK, but finding the correct time slice is guesswork, and it gives different results for different input audio files. Here is what I have now: - (void)processDataWithInBuffer:(const int16_t *)buffer outBuffer:(int16_t *)outBuffer sampleCount:(int)len { BOOL positive; for(int i = 0; i < len; i++) { positive = (buffer[i] >= 0); currentFilteredValueOfSampleAmplitude = LOWPASSFILTERTIMESLICE * (float)abs
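The guesswork goes away if the smoothing coefficient is derived from a cutoff frequency and the file's actual sample rate instead of a fixed time-slice constant; the filter then behaves the same for any input. A minimal sketch of the standard one-pole RC low-pass, shown in Java to illustrate the algorithm itself rather than the Core Audio plumbing:

```java
class OnePoleLowPass {
    // alpha is computed from the cutoff and sample rate, so the same
    // cutoff sounds identical regardless of the input file's rate.
    static void lowPass(short[] in, short[] out, float cutoffHz, float sampleRate) {
        float dt = 1.0f / sampleRate;
        float rc = 1.0f / (2.0f * (float) Math.PI * cutoffHz);
        float alpha = dt / (rc + dt);              // 0 < alpha < 1
        float y = 0;
        for (int i = 0; i < in.length; i++) {
            y += alpha * (in[i] - y);              // y[n] = y[n-1] + a*(x[n] - y[n-1])
            out[i] = (short) y;
        }
    }
}
```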

AVFoundation audio processing using AVPlayer's MTAudioProcessingTap with remote URLs

被刻印的时光 ゝ submitted on 2019-12-20 14:18:06
Question: There is precious little documentation on AVAudioMix and MTAudioProcessingTap, which allow processing to be applied to the audio tracks (PCM access) of media assets in AVFoundation (on iOS). This article and a brief mention in a WWDC 2012 session are all I have found. I have got the setup described here working for local media files, but it doesn't seem to work with remote files (namely HLS streaming URLs). The only indication that this is expected is the note at the end of this Technical Q&A:

AVAudioPlayer rate

老子叫甜甜 submitted on 2019-12-19 03:12:20
Question: So I'm trying to play a sound file at a different rate in iOS 5.1.1, and am having absolutely no luck. So far I have tried setting the rate of the AVAudioPlayer: player = [[AVAudioPlayer alloc] initWithContentsOfURL:referenceURL error:&error]; player.enableRate = YES; player.rate = 1.5; player.numberOfLoops = 0; player.delegate = self; [player prepareToPlay]; [player play]; with no luck at all; the sound plays but simply ignores the rate I give it. I have also tried AVPlayer: avPlayer = [

How to get the fundamental frequency using Harmonic Product Spectrum?

对着背影说爱祢 submitted on 2019-12-17 20:40:11
Question: I'm trying to get the pitch from the microphone input. First I decomposed the signal from the time domain to the frequency domain with an FFT, applying a Hamming window to the signal before transforming. That gives me the complex FFT results, which I passed to a Harmonic Product Spectrum, where the spectrum gets downsampled, the downsampled peaks are multiplied together, and the result is a value as a complex number. What should I do next to get the fundamental frequency? public float[]
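The usual answer is that HPS should run on real magnitudes, not on the complex FFT values: take the magnitude of each bin first, multiply the spectrum by its decimated copies, and the index of the surviving peak converts to Hz via peakBin * sampleRate / fftSize. A minimal sketch, assuming `mag` already holds magnitudes for bins 0..N/2:

```java
class Hps {
    // Multiply the magnitude spectrum by its downsampled copies; the
    // peak that survives all harmonics maps to the fundamental.
    static float fundamentalHz(float[] mag, float sampleRate, int fftSize, int harmonics) {
        int n = mag.length / harmonics;            // usable range after decimation
        double[] hps = new double[n];
        for (int i = 0; i < n; i++) {
            hps[i] = mag[i];
            for (int h = 2; h <= harmonics; h++) {
                hps[i] *= mag[i * h];              // downsampling by h = keep every h-th bin
            }
        }
        int peak = 1;                              // skip the DC bin
        for (int i = 2; i < n; i++) {
            if (hps[i] > hps[peak]) peak = i;
        }
        return peak * sampleRate / fftSize;        // bin index -> frequency
    }
}
```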

keras: how to aggregate over frame-level predictions to song-level prediction

旧时模样 submitted on 2019-12-11 15:36:09
Question: I am doing song genre classification. For each song, I have chopped it into small frames (5 s) and generated a spectrogram as the input feature for a neural network; each frame carries the song's genre label. The data looks like the following:

name            label   feature
song_i_frame1   label   feature_vector_frame1
song_i_frame2   label   feature_vector_frame2
...
song_i_framek   label   feature_vector_framek
...

I can get a prediction accuracy for each frame from Keras with no problem. But currently, I
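The standard aggregation is to average the per-frame class-probability vectors belonging to one song and take the argmax (or, nearly equivalently, majority-vote the frame predictions). The logic is independent of Keras; a minimal sketch of the averaging step, shown in Java with illustrative names:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SongAggregation {
    // framesBySong: song name -> list of per-frame probability vectors
    // (one float per genre class). Returns song name -> predicted class.
    static Map<String, Integer> songLevel(Map<String, List<float[]>> framesBySong) {
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, List<float[]>> e : framesBySong.entrySet()) {
            List<float[]> frames = e.getValue();
            float[] mean = new float[frames.get(0).length];
            for (float[] p : frames)
                for (int c = 0; c < p.length; c++) mean[c] += p[c] / frames.size();
            int best = 0;
            for (int c = 1; c < mean.length; c++)
                if (mean[c] > mean[best]) best = c;    // argmax over averaged probabilities
            result.put(e.getKey(), best);
        }
        return result;
    }
}
```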

Peak frequencies from .wav file

给你一囗甜甜゛ submitted on 2019-12-09 23:29:28
Question: I have a .wav file which I recorded while playing guitar notes. Then I used the program below, built on the NAudio library, to read my .wav file data. AudioFileReader readertest = new AudioFileReader(@"E:\song\music.wav"); int bytesnumber = (int)readertest.Length; var buffer = new float[bytesnumber]; readertest.Read(buffer, 0, bytesnumber); for (int i = 0; i < buffer.Length; i++) { Console.Write(buffer[i] + "\n"); } It outputs like below (part of the output): 0.00567627 0.007659912 0.005187988 0
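Those floats are time-domain samples in [-1, 1]; to get peak frequencies they still have to go through a Fourier transform and a peak search over the magnitude spectrum. A minimal sketch of that step (a naive O(N²) DFT, shown in Java purely to make the math explicit; a real implementation would use an FFT library):

```java
class PeakFinder {
    // Scan every bin up to Nyquist over a window of n samples and
    // return the frequency of the bin with the most energy.
    static double peakFrequency(float[] x, int offset, int n, double sampleRate) {
        double bestMag = -1;
        int bestBin = 0;
        for (int k = 1; k < n / 2; k++) {          // skip DC, stop at Nyquist
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double ang = 2 * Math.PI * k * t / n;
                re += x[offset + t] * Math.cos(ang);
                im -= x[offset + t] * Math.sin(ang);
            }
            double mag = re * re + im * im;
            if (mag > bestMag) { bestMag = mag; bestBin = k; }
        }
        return bestBin * sampleRate / n;           // bin index -> Hz
    }
}
```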

How to train a machine learning algorithm using MFCC coefficient vectors?

点点圈 submitted on 2019-12-08 17:51:33
Question: For my final-year project I am trying to identify dog bark/bird sounds in real time (by recording sound clips). I am using MFCCs as the audio features. Initially I extracted altogether 12 MFCC vectors from a sound clip using the jAudio library. Now I'm trying to train a machine learning algorithm (at the moment I have not decided on the algorithm, but it will most probably be an SVM). The sound clip size is around 3 seconds. I need to clarify some information about this process: Do I have to
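A common answer to this framing question is to collapse the per-frame MFCC vectors from each clip into one fixed-length feature, e.g. the per-coefficient mean and standard deviation, which a clip-level classifier such as an SVM can consume directly. A minimal sketch of that aggregation (the mfcc[frame][coeff] layout is an assumption):

```java
class MfccFeatures {
    // Turn a variable number of MFCC frames into one fixed-length
    // vector: means of each coefficient, then standard deviations.
    static double[] clipFeature(double[][] mfcc) {
        int nCoeff = mfcc[0].length;
        double[] feat = new double[2 * nCoeff];
        for (int c = 0; c < nCoeff; c++) {
            double mean = 0;
            for (double[] frame : mfcc) mean += frame[c];
            mean /= mfcc.length;
            double var = 0;
            for (double[] frame : mfcc) var += (frame[c] - mean) * (frame[c] - mean);
            feat[c] = mean;                                  // first half: means
            feat[nCoeff + c] = Math.sqrt(var / mfcc.length); // second half: std devs
        }
        return feat;
    }
}
```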