Question
This is a follow-up to my earlier question, "Using UWP monitor live audio and detect gun-fire/clap sound".
Thanks to Dernis, I finally got the code working to monitor live audio and trigger events when the decibel level rises above a certain threshold.
This works perfectly when we run it in an office or another closed, quiet area.
But when I take the app out to an open road, there is traffic noise, wind, people talking, and other background sound, and BLOW events are not identified correctly.
- I would like to implement something like a "Learn Environment" button. Before the app starts monitoring, the user clicks "Learn Environment", which works out the sensitivity levels and sets the filtering for my live audio; then I start monitoring blows.
- If it doesn't add too much load, I would like to record the audio to a file.
Any help on where to start would be appreciated.
OnNavigatedTo
protected override async void OnNavigatedTo(NavigationEventArgs e)
{
//other logic
await CreateInputDeviceNodeAsync(_deviceId);
}
CreateInputDeviceNodeAsync
public async Task<bool> CreateInputDeviceNodeAsync(string deviceId)
{
Console.WriteLine("Creating AudioGraphs");
// Create an AudioGraph with default settings
AudioGraphSettings graphSettings = new AudioGraphSettings(AudioRenderCategory.Media)
{
EncodingProperties = new AudioEncodingProperties
{
Subtype = "Float",
SampleRate = 48000,
ChannelCount = 2,
BitsPerSample = 32,
Bitrate = 3072000
}
};
CreateAudioGraphResult audioGraphResult = await AudioGraph.CreateAsync(graphSettings);
if (audioGraphResult.Status != AudioGraphCreationStatus.Success)
{
_rootPage.NotifyUser("Cannot create graph", NotifyType.ErrorMessage);
return false;
}
_audioGraph = audioGraphResult.Graph;
AudioGraphSettings audioGraphSettings =
new AudioGraphSettings(AudioRenderCategory.GameChat)
{
EncodingProperties = AudioEncodingProperties.CreatePcm(48000, 2, 32),
DesiredSamplesPerQuantum = 990,
QuantumSizeSelectionMode = QuantumSizeSelectionMode.ClosestToDesired
};
_frameOutputNode = _audioGraph.CreateFrameOutputNode(_audioGraph.EncodingProperties);
_quantum = 0;
_audioGraph.QuantumStarted += Graph_QuantumStarted;
LoudNoise += BlowDetected;
DeviceInformation selectedDevice = null;
if (!string.IsNullOrWhiteSpace(_deviceId))
selectedDevice = await DeviceInformation.CreateFromIdAsync(_deviceId);
if (selectedDevice == null)
{
string device = Windows.Media.Devices.MediaDevice.GetDefaultAudioCaptureId(
Windows.Media.Devices.AudioDeviceRole.Default);
if (!string.IsNullOrWhiteSpace(device))
selectedDevice = await DeviceInformation.CreateFromIdAsync(device);
else
{
_rootPage.NotifyUser($"Could not select Audio Device {device}", NotifyType.ErrorMessage);
return false;
}
}
CreateAudioDeviceInputNodeResult result =
await _audioGraph.CreateDeviceInputNodeAsync(MediaCategory.Media, audioGraphSettings.EncodingProperties,
selectedDevice);
if (result.Status != AudioDeviceNodeCreationStatus.Success)
{
_rootPage.NotifyUser("Cannot create device input node", NotifyType.ErrorMessage);
return false;
}
_selectedMicrophone = selectedDevice.Name;
_deviceInputNode = result.DeviceInputNode;
_deviceInputNode.AddOutgoingConnection(_frameOutputNode);
_frameOutputNode.Start();
_audioGraph.Start();
return true;
}
Graph_QuantumStarted
private void Graph_QuantumStarted(AudioGraph sender, object args)
{
if (++_quantum % 2 != 0) return;
AudioFrame frame = _frameOutputNode.GetFrame();
float[] dataInFloats;
using (AudioBuffer buffer = frame.LockBuffer(AudioBufferAccessMode.Write))
using (IMemoryBufferReference reference = buffer.CreateReference())
unsafe
{
// Get the buffer from the AudioFrame
// ReSharper disable once SuspiciousTypeConversion.Global
((IMemoryBufferByteAccess) reference).GetBuffer(out byte* dataInBytes,
out var capacityInBytes);
var dataInFloat = (float*) dataInBytes;
dataInFloats = new float[capacityInBytes / sizeof(float)];
for (var i = 0; i < capacityInBytes / sizeof(float); i++)
{
dataInFloats[i] = dataInFloat[i];
}
}
double decibels = dataInFloats.Aggregate<float, double>(0f, (current, sample) => current + Math.Abs(sample));
decibels = 20 * Math.Log10(decibels / dataInFloats.Length);
_decibelList.Add(decibels);
if (double.IsInfinity(decibels) || decibels < _threshold) return;//-45
if (_watch != null && _watch.Elapsed <= TimeSpan.FromSeconds(1)) return;
LoudNoise?.Invoke(this, decibels);
_watch = Stopwatch.StartNew();
}
Answer 1:
This is just statistics. You'll probably want to collect at least 50 frames (1 second) of data before the detection actually starts working (maybe let the user decide how long by holding and releasing a button). Then you'll want to work out where the decibel level usually sits. I can think of 3 ways to do that.
private void Graph_QuantumStarted(AudioGraph sender, object args)
{
...
double decibels = dataInFloats.Aggregate<float, double>(0f, (current, sample) => current + Math.Abs(sample)); // I dislike the fact that the decibels variable is initially inaccurate, but it's your codebase.
decibels = 20 * Math.Log10(decibels / dataInFloats.Length);
if (scanning) // class variable (bool), you can set it from the UI thread like this
{
_decibelList.Add(decibels); // I assume you made this a class variable
}
else if (double.IsNaN(_sensitivity)) // scanning is done but no sensitivity has been computed yet (== Double.NaN is always false)
{
// Code by case below
}
else if (decibels > _sensitivity) //_sensitivity is a class variable(double), initialized to Double.NaN
{
LoudNoise?.Invoke(this, true); // Calling events is a wee bit expensive, you probably want to handle the sensitivity before Invoking it, I'm also going to do it like that to make this demo simpler
}
}
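For reference, the level computed in the handler is the mean absolute amplitude of the frame on a log scale. With float samples in the range -1 to 1 it is normally negative, and a completely silent frame gives negative infinity, which is why the original handler checks double.IsInfinity. A minimal sketch of that math as a standalone helper (Level and ToDecibels are illustrative names, not part of the original code):

```csharp
using System;

static class Level
{
    // Hypothetical helper mirroring the handler's math:
    // 20 * log10(mean absolute amplitude of the frame).
    // An all-zero frame yields negative infinity; an empty array yields NaN.
    public static double ToDecibels(float[] samples)
    {
        double sum = 0;
        foreach (float sample in samples)
        {
            sum += Math.Abs(sample);
        }
        return 20 * Math.Log10(sum / samples.Length);
    }
}
```

For example, Level.ToDecibels(new[] { 0.5f, -0.5f }) is about -6.02, and Level.ToDecibels(new float[4]) is negative infinity.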
- If you can make sure that no spike loud enough to count as a trigger occurs while scanning, you can just take the max value of all those frames and set the sensitivity to
maxDecibels + Math.Abs(maxDecibels* 0.2)
(the decibels could be negative, hence Abs).
double maxDecibels = _decibelList.Max(); // needs using System.Linq
_sensitivity = maxDecibels + Math.Abs(maxDecibels* 0.2);
- If you can't control when there's a spike, then you could collect those frames, sort them, take item [24] (of your 100-item list), and say that's the sensitivity.
_sensitivity = _decibelList.OrderByDescending(x => x).ElementAt(24); // If you do a variable time you can just take Count/4 - 1 as the index
- (I think this is the best one, but I really don't know statistics.) Walk the list of frames' decibel values and track the difference between consecutive values and the index at which it changed the most. Afterwards, find the max value from after that index and say 75% of the change to there is the sensitivity. (Don't use a LinkedList for this.)
// Assumes _decibelList is a non-empty List<double>.
double greatestChange = 0;
int changeIndex = 0;
double p = double.NaN; // Previous
for (int i = 0; i < _decibelList.Count; i++)
{
    if (!double.IsNaN(p)) // == / != against NaN never works, use IsNaN
    {
        double change = Math.Abs(_decibelList[i] - p);
        if (change > greatestChange)
        {
            greatestChange = change;
            changeIndex = i;
        }
    }
    p = _decibelList[i];
}
// Walk forward from the biggest jump until the level stops falling (or the list ends).
int j = changeIndex;
double c = _decibelList[j]; // Current
do
{
    p = c;
    if (j + 1 >= _decibelList.Count) break; // ran out of frames
    c = _decibelList[++j];
} while (c < p);
_sensitivity = ((3 * c) + _decibelList[changeIndex]) / 4;
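The second option above hardcodes index 24 for a 100-item list. A minimal sketch of the variable-length version the comment hints at (Count/4 - 1), assuming the readings are held in a List<double>; Calibration and TopQuarterLevel are illustrative names, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Calibration
{
    // Hypothetical helper: sorts the readings from loudest to quietest and
    // returns the one at index Count/4 - 1 (clamped to 0 for very short lists).
    public static double TopQuarterLevel(IReadOnlyCollection<double> readings)
    {
        if (readings.Count == 0)
        {
            throw new ArgumentException("No readings collected.", nameof(readings));
        }
        int index = Math.Max(0, readings.Count / 4 - 1);
        return readings.OrderByDescending(x => x).ElementAt(index);
    }
}
```

It would be called once scanning ends, for example _sensitivity = Calibration.TopQuarterLevel(_decibelList);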
Note: You can (kind of) remove the need to sort by keeping a LinkedList and inserting each value in the appropriate place.
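A rough sketch of that idea, with one swap: it uses a List<double> and List<T>.BinarySearch instead of a LinkedList, so that the result can still be indexed. SortedReadings and InsertSorted are illustrative names, not part of the original code:

```csharp
using System.Collections.Generic;

static class SortedReadings
{
    // Hypothetical helper: keeps `sorted` in ascending order as readings arrive.
    public static void InsertSorted(List<double> sorted, double value)
    {
        int index = sorted.BinarySearch(value);
        if (index < 0)
        {
            // Not found: the bitwise complement is the insertion point.
            index = ~index;
        }
        sorted.Insert(index, value);
    }
}
```

Because the list is ascending, the item that the second option takes at descending index k is sorted[sorted.Count - 1 - k].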
Source: https://stackoverflow.com/questions/54531704/using-audio-graph-learn-environment-noise-and-filter-in-a-uwp-app