Is there a standard way to represent a \"set\" that can contain duplicate elements.
As I understand it, a set has exactly one or zero of an element. I want functiona
If you need duplicates, use a list, and transform it to a set when you need operate as a set.
An alternative Python multiset implementation uses a sorted list data structure. There are a couple implementations on PyPI. One option is the sortedcontainers module which implements a SortedList data type that efficiently implements set-like methods like add
, remove
, and contains
. The sortedcontainers module is implemented in pure-Python, fast-as-C implementations (even faster), has 100% unit test coverage, and hours of stress testing.
Installation is easy from PyPI:
pip install sortedcontainers
If you can't pip install
then simply pull the sortedlist.py file down from the open-source repository.
Use it as you would a set:
from sortedcontainers import SortedList
survey = SortedList(['blue', 'red', 'blue', 'green']]
survey.add('blue')
print survey.count('blue') # "3"
survey.remove('blue')
The sortedcontainers module also maintains a performance comparison with other popular implementations.