Let's say I have a huge list containing random numbers, for example:
L = [random.randrange(0, 25000000000) for _ in range(1000000000)]
I need to remove the duplicates from it.
Can't say I like this, but it should work, after a fashion.
Divide the data into N read-only pieces, and distribute one per worker to search. Since everything is read-only, it can all be shared. Each worker i in 1...N checks its own list against all the 'future' lists i+1...N.
Each worker i maintains one keep-bit table per future list i+1...N, starting all ones and clearing a bit whenever one of its own items hits the corresponding future item.
When everyone is done, worker i sends its bit tables back to the master, where they can be ANDed together; the entries whose bits ended up zero then get deleted. No sorting, no sets. The checking is not fast, though.
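A minimal single-machine sketch of the scheme above, assuming equal-sized regions and a multiprocessing.Pool (the function and variable names here are mine, not anything standard):

```python
from multiprocessing import Pool

def check_region(args):
    """Worker i: build keep-bit tables for the 'future' regions i+1..N-1.
    A bit starts True and is cleared when a future item duplicates one of ours."""
    i, regions = args
    tables = {}
    for j in range(i + 1, len(regions)):
        # walk our whole region for every future item -- simple but not fast
        tables[j] = [all(x != y for y in regions[i]) for x in regions[j]]
    return tables

def parallel_dedupe(data, n):
    size = -(-len(data) // n)                       # ceil(len/n)
    regions = [data[k * size:(k + 1) * size] for k in range(n)]
    # the regions are read-only, so they can be handed to every worker
    with Pool(n) as pool:
        results = pool.map(check_region, [(i, regions) for i in range(n)])
    # master ANDs the keep-bit tables; a zero anywhere kills the item
    keep = [[True] * len(r) for r in regions]
    for tables in results:
        for j, bits in tables.items():
            keep[j] = [a and b for a, b in zip(keep[j], bits)]
    return [x for r, bits in zip(regions, keep) for x, b in zip(r, bits) if b]
```

Note this removes only the later copies of cross-region duplicates; a duplicate that sits entirely inside one region survives, which matches the division of responsibility described above.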
If you don't want to bother with multiple bit tables, you can let every worker i write zeroes directly into a single shared table when it finds a dup above its own region of responsibility. HOWEVER, now you run into real shared-memory issues. For that matter, you could even let each worker just delete dups above its region, but ditto.
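For illustration, the single-shared-table variant might look like this with multiprocessing.Array (names are mine; the lock-free writes only stay benign because every writer writes the same value, 0):

```python
from multiprocessing import Process, Array

def clear_dups(i, regions, offsets, flags):
    """Worker i clears the shared keep-flag of any future item it duplicates."""
    mine = set(regions[i])                 # our own region, for O(1) probes
    for j in range(i + 1, len(regions)):
        for k, x in enumerate(regions[j]):
            if x in mine:
                flags[offsets[j] + k] = 0  # racy, but everyone writes 0

def shared_dedupe(data, n):
    size = -(-len(data) // n)
    regions = [data[k * size:(k + 1) * size] for k in range(n)]
    offsets = [k * size for k in range(n)]
    flags = Array('b', [1] * len(data), lock=False)   # one keep-flag per item
    workers = [Process(target=clear_dups, args=(i, regions, offsets, flags))
               for i in range(n)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return [x for x, f in zip(data, flags) if f]
```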
Even dividing up the work only raises the next question: it's expensive for each worker to walk through everyone else's list for each of its own entries, roughly (N-1)*len(region)/2 comparisons per item on average. Each worker could instead build a set of its region, or sort its region; either would permit faster checks, but those costs add up too.
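As a rough sketch of those two per-region speedups (function names are hypothetical): a set gives O(1) probes after an O(m) build, while sorting gives O(log m) probes via bisect after an O(m log m) sort:

```python
import bisect

def set_hits(mine, future):
    s = set(mine)                          # O(m) build, O(1) membership probes
    return [x in s for x in future]

def sorted_hits(mine, future):
    srt = sorted(mine)                     # O(m log m) sort, O(log m) probes
    def hit(x):
        k = bisect.bisect_left(srt, x)
        return k < len(srt) and srt[k] == x
    return [hit(x) for x in future]
```

Either way the build cost is paid once per region, then amortized over every future item checked against it.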