What is the advantage of using a ConcurrentBag(Of MyType) over just using a List(Of MyType)? The MSDN page on the CB states that
ConcurrentBag(Of
The biggest advantage here is that ConcurrentBag<T> is safe to access from multiple threads while List<T> is not. If thread-safe access is important for your scenario, then a type like ConcurrentBag<T> may be preferable to List<T> plus manual locking. We'd need to know a bit more about your scenario before we can really answer this question.
Additionally, List<T> is an ordered collection while ConcurrentBag<T> is not.
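To make the "List<T> plus manual locking" comparison concrete, here is a minimal sketch of both approaches. The class and method names (ThreadSafeAddDemo, AddToBag, AddToLockedList) are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ThreadSafeAddDemo
{
    // Adds `count` items from many threads. ConcurrentBag<T>.Add is
    // thread-safe, so no external lock is needed.
    public static int AddToBag(int count)
    {
        var bag = new ConcurrentBag<int>();
        Parallel.For(0, count, i => bag.Add(i));
        return bag.Count;
    }

    // The same work with List<T>. List<T>.Add is not thread-safe,
    // so every call has to be guarded by a lock.
    public static int AddToLockedList(int count)
    {
        var list = new List<int>();
        var gate = new object();
        Parallel.For(0, count, i =>
        {
            lock (gate) { list.Add(i); }
        });
        return list.Count;
    }

    public static void Main()
    {
        Console.WriteLine(AddToBag(100000));        // 100000
        Console.WriteLine(AddToLockedList(100000)); // 100000
    }
}
```

Both versions end up with every item. The difference is who is responsible for the synchronization, and that neither collection tells you in which order the parallel adds happened.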
I think you should read that as "where multiple threads access the container and each thread may produce and/or consume data". It is definitely intended for parallel scenarios.
Internally, the ConcurrentBag is implemented using several different Lists, one for each writing thread.
What that statement you quoted means is that, when reading from the bag, it will prioritize the list created for that thread. Meaning, it will first check the list for that thread before risking contention on another thread's list.
This way it can minimize lock contention when multiple threads are both reading and writing. When the reading thread doesn't have a list, or its list is empty, it has to lock a list assigned to a different thread. But, if you have multiple threads all reading from and writing to their own list, then you won't ever have lock contention.
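The pattern this describes, where each thread both adds to and takes from the same bag, might look like the sketch below. The names (ProduceConsumeDemo, Run) and the worker and item counts are assumptions for illustration. The per-thread lists are an internal detail that the code cannot observe directly, so it only checks that no item is lost.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ProduceConsumeDemo
{
    // Each worker adds `itemsPerWorker` items and then tries to take the
    // same number back. Returns how many items were taken in total and
    // how many are still in the bag afterwards.
    public static (int Taken, int Remaining) Run(int workers, int itemsPerWorker)
    {
        var bag = new ConcurrentBag<int>();
        int taken = 0;

        Parallel.For(0, workers, w =>
        {
            // Produce: these adds go through the calling thread.
            for (int i = 0; i < itemsPerWorker; i++)
                bag.Add(i);

            // Consume on the same thread that just produced.
            for (int i = 0; i < itemsPerWorker; i++)
            {
                if (bag.TryTake(out int _))
                    Interlocked.Increment(ref taken);
            }
        });

        return (taken, bag.Count);
    }

    public static void Main()
    {
        var result = Run(workers: 8, itemsPerWorker: 10000);
        // Taken + Remaining always equals the number of items added.
        Console.WriteLine(result.Taken + result.Remaining); // 80000
    }
}
```

When every worker mostly takes what it added itself, this is the situation the answer above describes as avoiding contention on other threads' lists.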
TL;DR: I would say local lock is faster, but the difference is negligible (or I cocked up setting up my test).
Performance analysis:
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Threading.Tasks;

    internal static class BagVsLocalLock
    {
        // Every thread adds directly to one shared ConcurrentBag.
        private static IEnumerable<string> UseConcurrentBag(int count)
        {
            Func<string> getString = () => "42";
            var list = new ConcurrentBag<string>();
            Parallel.For(0, count, o => list.Add(getString()));
            return list;
        }

        // Every worker task fills its own private List and merges it into the
        // shared result once, under a lock, when that task finishes.
        private static IEnumerable<string> UseLocalLock(int count)
        {
            Func<string> getString = () => "42";
            var resultCollection = new List<string>();
            object localLockObject = new object();
            Parallel.For(0, count, () => new List<string>(), (i, state, localList) =>
                {
                    localList.Add(getString());
                    return localList;
                },
                finalResult => { lock (localLockObject) resultCollection.AddRange(finalResult); }
            );
            return resultCollection;
        }

        private static void Test()
        {
            var s = string.Empty;

            // Stopwatch is better suited to timing than DateTime.Now.
            var watch = Stopwatch.StartNew();
            var list = UseConcurrentBag(5000000);
            if (list != null)
            {
                s += " 1: " + watch.Elapsed;
            }

            watch.Restart();
            var list1 = UseLocalLock(5000000);
            if (list1 != null)
            {
                s += " 2: " + watch.Elapsed;
            }

            Console.WriteLine(s);
        }
    }
Margin of error using ConcurrentBag running 3 times against itself with 5M records
" 1: 00:00:00.4550455 2: 00:00:00.4090409"
" 1: 00:00:00.4190419 2: 00:00:00.4730473"
" 1: 00:00:00.4780478 2: 00:00:00.3870387"
3 runs ConcurrentBag vs Local lock with 5M records:
" 1: 00:00:00.5070507 2: 00:00:00.3660366"
" 1: 00:00:00.4470447 2: 00:00:00.2470247"
" 1: 00:00:00.4420442 2: 00:00:00.2430243"
With 50M records
" 1: 00:00:04.7354735 2: 00:00:04.7554755"
" 1: 00:00:04.2094209 2: 00:00:03.2413241"
I would say Local lock is marginally faster
UPDATE: On a 6-core Xeon X5650 @ 2.67GHz (64-bit Win7), 'local lock' appears to perform even better with 50M records:
1: 00:00:09.7739773 2: 00:00:06.8076807
1: 00:00:08.8858885 2: 00:00:04.6184618
1: 00:00:12.5532552 2: 00:00:06.4866486
Unlike the other concurrent collections, ConcurrentBag<T> is optimized for the case where each thread mostly consumes the items it added itself.
Unlike List<T>, ConcurrentBag<T> can be used from multiple threads simultaneously.