I've got a class representing an interval. This class has two properties "start" and "end" of a comparable type. Now I'm searching for an efficient algorithm to take t
It turns out this problem has been solved many times over, at varying levels of fancy, and goes under several names: interval tree (http://en.wikipedia.org/wiki/Interval_tree), segment tree (http://en.wikipedia.org/wiki/Segment_tree), and also range tree.
(Since the OP's question involves large numbers of intervals, these data structures matter.)
As for my own choice of Python library:
From testing, what best nails it in terms of being full-featured and current (not bit-rotted) is the 'Interval' and 'Union' classes from SymPy; see http://sympystats.wordpress.com/2012/03/30/simplifying-sets/
Another good-looking choice, higher performance but less feature-rich (e.g. range removal didn't work on floating-point values), is Banyan: https://pypi.python.org/pypi/Banyan
Finally, search SO itself for any of IntervalTree, SegmentTree, or RangeTree, and you'll find plenty more answers and pointers.
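To make the 'Union' idea concrete, here is a minimal stdlib-only sketch of what merging a list of intervals comes down to. It is a plain sort-and-sweep, not an interval tree and not SymPy's or Banyan's implementation. The `Interval` alias and the `union_intervals` name are made up for illustration, and I'm assuming closed intervals given as (start, end) pairs with start <= end.

```python
from typing import List, Tuple

# Hypothetical representation: a closed interval as a (start, end) pair.
Interval = Tuple[float, float]


def union_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching closed intervals (sort, then sweep)."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the last merged interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # Starts past the end of everything merged so far: new interval.
            merged.append((start, end))
    return merged


print(union_intervals([(1, 3), (8, 10), (2, 5), (5.5, 6)]))
# -> [(1, 5), (5.5, 6), (8, 10)]
```

The sort dominates, so this is O(n log n) for a one-off union of a static list. If intervals get inserted, removed, or queried repeatedly, that's where the tree structures above start to pay off.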