How might a class like .NET's ConcurrentBag<T> be implemented?

Posted by ぃ、小莉子 on 2019-12-03 05:55:56

If you look at the details of ConcurrentBag<T>, you'll find that internally it is basically a customized linked list.

Since bags can contain duplicates and are not accessible by index, a doubly linked list is a very good option for the implementation. It allows fairly fine-grained locking for insertion and removal: you don't have to lock the entire collection, just the nodes around the point where you're inserting or removing. And since duplicates are allowed, no hashing is needed to detect them. This makes a doubly linked list a good fit.

There's some good info on ConcurrentBag here: http://geekswithblogs.net/BlackRabbitCoder/archive/2011/03/03/c.net-little-wonders-concurrentbag-and-blockingcollection.aspx

ConcurrentBag works by taking advantage of the ThreadLocal<T> type (new in System.Threading for .NET 4.0), so that each thread using the bag has a list local to just that thread.

This means that adding to or removing from a thread-local list requires very little synchronization. The problem arises when a thread goes to consume an item but its local list is empty. In this case the bag performs "work-stealing": it takes an item from another thread that still has items in its list. This requires a higher level of synchronization, which adds a bit of overhead to the take operation.
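The thread-local-plus-stealing idea can be sketched in a few lines. The following is a minimal illustration in Java (chosen as a close stand-in for C#), not the actual .NET implementation: the class name WorkStealingBag, the method tryTake, and the choice of a per-deque lock are all assumptions made for the example. It also never unregisters the deques of threads that have exited, which a real implementation would need to handle.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch: one deque per thread, with stealing when the local deque is empty. */
class WorkStealingBag<T> {
    // Every per-thread deque is registered here so that other threads can steal from it.
    private final List<ArrayDeque<T>> allLocals = new CopyOnWriteArrayList<>();

    // Each thread lazily gets its own deque the first time it touches the bag.
    private final ThreadLocal<ArrayDeque<T>> local = ThreadLocal.withInitial(() -> {
        ArrayDeque<T> deque = new ArrayDeque<>();
        allLocals.add(deque);
        return deque;
    });

    /** Adds a non-null item to the calling thread's own deque. */
    public void add(T item) {
        ArrayDeque<T> mine = local.get();
        synchronized (mine) { // rarely contended: only a stealing thread competes for this lock
            mine.addLast(item);
        }
    }

    /** Takes from the local deque first; if it is empty, steals from another thread's deque. */
    public Optional<T> tryTake() {
        ArrayDeque<T> mine = local.get();
        synchronized (mine) {
            T item = mine.pollLast();
            if (item != null) {
                return Optional.of(item);
            }
        }
        // Local deque is empty: scan the other threads' deques, locking one at a time.
        for (ArrayDeque<T> other : allLocals) {
            if (other == mine) {
                continue;
            }
            synchronized (other) {
                T stolen = other.pollFirst(); // steal from the opposite end to the owner
                if (stolen != null) {
                    return Optional.of(stolen);
                }
            }
        }
        return Optional.empty();
    }
}
```

The owner pushes and pops at one end while a thief takes from the other, and no thread ever holds two deque locks at once, so the common add/take path stays cheap and the stealing path cannot deadlock.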

Since ordering doesn't matter, a ConcurrentBag could use a hash table behind the scenes to allow fast retrieval of data. But unlike a HashSet, a bag accepts duplicates. Maybe each item could be paired with a Count property, which is set to 1 when the item is first added. If you add the same item a second time, you could just increment the Count property of that item.

Then, to remove an item whose count is greater than one, you could just decrement the Count for that item. If the count is one, you would remove the item-Count pair from the hash table.
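The counting scheme described above can be sketched as follows. This is a minimal Java illustration of the idea, not how ConcurrentBag<T> is actually built; the class name CountingBag and its methods are hypothetical. It relies on ConcurrentHashMap.merge and ConcurrentHashMap.computeIfPresent, which update a single key atomically.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a bag stored as item -> occurrence count. */
class CountingBag<T> {
    private final ConcurrentHashMap<T, Integer> counts = new ConcurrentHashMap<>();

    /** The first add stores a count of 1; later adds of an equal item increment it. */
    public void add(T item) {
        counts.merge(item, 1, Integer::sum);
    }

    /** Decrements the count, dropping the entry at zero. Returns false if the item was absent. */
    public boolean remove(T item) {
        boolean[] removed = {false};
        counts.computeIfPresent(item, (key, count) -> {
            removed[0] = true;
            if (count == 1) {
                return null; // returning null removes the item-count pair
            }
            return count - 1;
        });
        return removed[0];
    }

    /** Number of occurrences currently held for the item (0 if absent). */
    public int count(T item) {
        return counts.getOrDefault(item, 0);
    }
}
```

One consequence of this design is that items which compare equal collapse into a single key, so the bag remembers how many there were but not the individual instances.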

Well, in Smalltalk (where the notion of a Bag came from), the collection is basically the same as a hash, albeit one that allows duplicates. Instead of storing the duplicate object, though, it maintains an "occurrence count", i.e., a refcount for each object. If ConcurrentBag is a faithful implementation, this should give you a starting point.

I believe the concept of a 'Bag' is synonymous with 'Multiset'.

There are a number of open-source "Bag"/"Multiset" implementations (these happen to be in Java) if you are interested in how they are implemented.

These implementations show that a 'Bag' can be implemented in any number of ways depending on your needs. Examples include TreeMultiset, HashMultiset, LinkedHashMultiset, and ConcurrentHashMultiset.

Google Collections
Google has a number of "Multiset" implementations, one being a ConcurrentHashMultiset.

Apache Commons
Apache has a number of "Bag" implementations.
