问题
I have 2D list in python
list = [[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]]
I would like to get union of list items if there occurs any intersection. For example [9, 2, 7]
, [9, 7]
, [2, 7]
has intersection of more than one digit. The union of this would be [9,2,7]
.
How can i get the final list as follows in efficient way ?
finalList = [[9,2,7], [0, 1, 5, 4]]
N.B. order of numbers is not important.
回答1:
You have a graph problem. You want to build connected components in a graph whose vertices are elements of your sublists, and where two vertices have an edge between them if they're elements of the same sublist. You could build an adjacency-list representation of your input and run a graph search algorithm over it, or you could iterate over your input and build disjoint sets. Here's a slightly-modified connected components algorithm I wrote up for a similar question:
import collections
# build an adjacency list representation of your input
graph = collections.defaultdict(set)
for l in input_list:
if l:
first = l[0]
for element in l:
graph[first].add(element)
graph[element].add(first)
# breadth-first search the graph to produce the output
output = []
marked = set() # a set of all nodes whose connected component is known
for node in graph:
if node not in marked:
# this node is not in any previously seen connected component
# run a breadth-first search to determine its connected component
frontier = set([node])
connected_component = []
while frontier:
marked |= frontier
connected_component.extend(frontier)
# find all unmarked nodes directly connected to frontier nodes
# they will form the new frontier
new_frontier = set()
for node in frontier:
new_frontier |= graph[node] - marked
frontier = new_frontier
output.append(tuple(connected_component))
回答2:
Here is a theoretical answer: This is a connected component problem: you build a graph as follows:
- there is a vertex for each set is the list
- there is an edge between two sets when they have a common value.
what you want is the union of the connected components of the graph.
回答3:
Here is an answer without any imports:
def func(L):
r = []
cur = set()
for l in L:
if not cur:
cur = set(l)
if any(i in cur for i in l):
cur.update(l)
else:
r.append(cur)
cur = set(l)
r.append(cur)
while len(r)>1:
if any(i in r[0] for i in r[-1]):
r[-1].update(r.pop(0))
else:
break
return r
Using it:
>>> func([[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]])
[set([9, 2, 7]), set([0, 1, 4, 5])]
>>> func([[0],[1],[2],[0,1]])
[set([2]), set([0, 1])]
You can remove the set
and return a list of lists by changing r.append(cur)
into r.append(list(cur))
, but I think it is neater to return sets.
回答4:
This one uses sets:
>>> l = [[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]]
>>> done = []
>>> while len(done) != len(l):
start = min([i for i in range(len(l)) if i not in done])
ref = set(l[start])
for j in [i for i in range(len(l)) if i not in done]:
if set(l[j]) & ref:
done.append(j)
ref |= set(l[j])
print ref
set([2, 7, 9])
set([0, 1, 4, 5])
回答5:
I propose that you examine each pair of list with itertools
import itertools, numpy
ls_tmp_rmv = []
while True:
ls_tmp = []
for s, k in itertools.combinations(lisst, 2):
if len(set(s).intersection( set(k) )) > 0:
ls_tmp = ls_tmp + [numpy.unique(s + k).tolist()]
if [s] not in ls_tmp:
ls_tmp_rmv = ls_tmp_rmv + [s]
if [k] not in ls_tmp:
ls_tmp_rmv = ls_tmp_rmv + [k]
else:
ls_tmp = ls_tmp + [s] + [k]
ls_tmp = [ls_tmp[i] for i in range(len(ls_tmp)) if ls_tmp[i]
not in ls_tmp[i+1:]]
ls_tmp_rmv = [ls_tmp_rmv[i] for i in range(len(ls_tmp_rmv))
if ls_tmp_rmv[i] not in ls_tmp_rmv[i+1:]]
ls_tmp = [X for X in ls_tmp if X not in ls_tmp_rmv]
if ls_tmp == lisst :
break
else:
lisst = ls_tmp
print lisst
You take all combinations of all pairs of lists in your list and check whether there are elements in common. If so, you merge the pair. If not, you add both peers in the pair. You keep in mind the elements you merged to remove them from the resulting list in the end.
With the list
lisst = [[1,2], [2,3], [8,9], [3,4]]
you do get
[[1, 2, 3, 4], [8, 9]]
回答6:
def intersection_groups(lst):
lst = map(set, lst)
a, b = 0, 1
while a < len(lst) - 1:
while b < len(lst):
if not lst[a].isdisjoint(lst[b]):
lst[a].update(lst.pop(b))
else:
b += 1
a, b = a + 1, a + 2
return lst
来源:https://stackoverflow.com/questions/17803919/how-can-i-get-union-of-2d-list-items-when-there-occurs-any-intersection-in-effi