How can I get the union of 2D list items when any intersection occurs (in an efficient way)?

我的未来我决定 submitted on 2019-12-12 02:34:45

Question


I have a 2D list in Python:

list = [[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]]

I would like to get the union of the list items whenever they intersect. For example, [9, 2, 7], [9, 7] and [2, 7] share at least one number with each other, so their union would be [9, 2, 7].

How can I get the final list, shown below, in an efficient way?

finalList = [[9,2,7], [0, 1, 5, 4]]

N.B. The order of the numbers is not important.


Answer 1:


You have a graph problem. You want to build connected components in a graph whose vertices are elements of your sublists, and where two vertices have an edge between them if they're elements of the same sublist. You could build an adjacency-list representation of your input and run a graph search algorithm over it, or you could iterate over your input and build disjoint sets. Here's a slightly-modified connected components algorithm I wrote up for a similar question:

import collections

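# example input from the question; replace with your own list of lists
input_list = [[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]]

# build an adjacency list representation of your input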
graph = collections.defaultdict(set)
for l in input_list:
    if l:
        first = l[0]
        for element in l:
            graph[first].add(element)
            graph[element].add(first)

# breadth-first search the graph to produce the output
output = []
marked = set() # a set of all nodes whose connected component is known
for node in graph:
    if node not in marked:
        # this node is not in any previously seen connected component
        # run a breadth-first search to determine its connected component
        frontier = set([node])
        connected_component = []
        while frontier:
            marked |= frontier
            connected_component.extend(frontier)

            # find all unmarked nodes directly connected to frontier nodes
            # they will form the new frontier
            new_frontier = set()
            for node in frontier:
                new_frontier |= graph[node] - marked
            frontier = new_frontier
        output.append(tuple(connected_component))
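
The disjoint-set alternative mentioned in this answer could look roughly like the sketch below. It is an illustration only: the names merge_overlapping, find and union are hypothetical and not part of the original answer, and the forest is kept in a plain dict for brevity.

```python
def merge_overlapping(lists):
    """Group elements that share a sublist, using a disjoint-set forest."""
    parent = {}  # maps each element to its parent in the forest

    def find(x):
        # Walk up to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for sub in lists:
        # Register every element as its own root the first time it is seen.
        for item in sub:
            parent.setdefault(item, item)
        # Linking every item to the first one joins the whole sublist.
        for item in sub[1:]:
            union(sub[0], item)

    # Collect elements by the root of the tree they ended up in.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


result = merge_overlapping([[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]])
# result holds two sets, one with 9, 2, 7 and one with 0, 1, 4, 5
```

Unlike the adjacency-list version, this makes a single pass over the input and never stores the edges, which is why it is often preferred when the sublists are long.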



Answer 2:


Here is a theoretical answer. This is a connected-components problem; you build a graph as follows:

  • there is a vertex for each set in the list
  • there is an edge between two sets when they have a common value.

What you want is, for each connected component of the graph, the union of the sets in it.
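
One way to turn this description into code is sketched below, assuming the sublists themselves are the vertices. The function name union_of_components and the owners index are illustrative choices, not something given in the answer; the index is there so that edges can be followed without comparing every pair of sublists.

```python
def union_of_components(lists):
    """Treat each sublist as a vertex; sublists sharing a value are adjacent."""
    # Map each value to the indices of the sublists that contain it.
    owners = {}
    for index, sub in enumerate(lists):
        for value in sub:
            owners.setdefault(value, []).append(index)

    seen = set()
    result = []
    for start in range(len(lists)):
        if start in seen or not lists[start]:
            continue  # already assigned to a component, or an empty sublist
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            index = stack.pop()
            component.update(lists[index])
            for value in lists[index]:
                # pop() so that each value's owner list is walked only once
                for neighbour in owners.pop(value, ()):
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
        result.append(component)
    return result


components = union_of_components([[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]])
```

Each component comes back as a set, in the order in which its first sublist appears in the input; empty sublists are skipped in this sketch.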




Answer 3:


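Here is an answer without any imports. It merges sublists that overlap with the ones next to them in the input (and joins the first groups into the last one if those overlap), so it assumes that related sublists are adjacent: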

def func(L):
    r = []
    cur = set()
    for l in L:
        if not cur:
            cur = set(l)
        if any(i in cur for i in l):
            cur.update(l)
        else:
            r.append(cur)
            cur = set(l)
    r.append(cur)
    while len(r)>1:
        if any(i in r[0] for i in r[-1]):
            r[-1].update(r.pop(0))
        else:
            break
    return r

Using it:

>>> func([[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]])
[set([9, 2, 7]), set([0, 1, 4, 5])]
>>> func([[0],[1],[2],[0,1]])
[set([2]), set([0, 1])]

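You can return a list of lists instead by converting at the end, for example with return [list(s) for s in r]. Changing r.append(cur) into r.append(list(cur)) on its own is not enough, because the while loop still calls r[-1].update(...), which lists do not have. I think it is neater to return sets anyway.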




Answer 4:


This one uses sets (shown as a Python 2 interactive session):

>>> l = [[9, 2, 7], [9, 7], [2, 7], [1, 0], [0, 5, 4]]
>>> done = []
>>> while len(done) != len(l):
    start = min([i for i in range(len(l)) if i not in done])
    ref = set(l[start])
    for j in [i for i in range(len(l)) if i not in done]:
        if set(l[j]) & ref:
            done.append(j)
            ref |= set(l[j])
    print ref


set([2, 7, 9])
set([0, 1, 4, 5])



Answer 5:


I propose that you examine each pair of lists with itertools (lisst is the input list; an example value is given after the code):

import itertools, numpy

ls_tmp_rmv = []

while True:
    ls_tmp = []

    for s, k in itertools.combinations(lisst, 2):
        if len(set(s).intersection( set(k) )) > 0:

            ls_tmp = ls_tmp + [numpy.unique(s + k).tolist()]

            if [s] not in ls_tmp:
                ls_tmp_rmv = ls_tmp_rmv + [s]
            if [k] not in ls_tmp:
                ls_tmp_rmv = ls_tmp_rmv + [k]
        else:
            ls_tmp = ls_tmp + [s] + [k]

    ls_tmp = [ls_tmp[i] for i in range(len(ls_tmp)) if ls_tmp[i] 
                    not in ls_tmp[i+1:]]
    ls_tmp_rmv = [ls_tmp_rmv[i] for i in range(len(ls_tmp_rmv)) 
                     if ls_tmp_rmv[i] not in ls_tmp_rmv[i+1:]]

    ls_tmp = [X for X in ls_tmp if X not in ls_tmp_rmv]

    if ls_tmp == lisst :
        break
    else:
        lisst = ls_tmp

print(lisst)

You take every pair of lists in your list and check whether the two have elements in common. If so, you merge the pair. If not, you add both members of the pair. You keep track of the lists you merged so that you can remove them from the resulting list at the end.

With the list

lisst = [[1,2], [2,3], [8,9], [3,4]]

you do get

[[1, 2, 3, 4], [8, 9]]
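
The same pairwise idea can be written more compactly with sets and without numpy. The following is only a sketch of that idea, not the answer's code; merge_until_stable is a hypothetical name, and the approach favours simplicity over speed because it rescans all pairs after every merge.

```python
from itertools import combinations


def merge_until_stable(lists):
    """Repeatedly merge any overlapping pair until no pair overlaps."""
    groups = [set(sub) for sub in lists]
    merged = True
    while merged:
        merged = False
        for a, b in combinations(range(len(groups)), 2):
            if groups[a] & groups[b]:
                # a < b always holds, so deleting index b leaves a valid
                groups[a] |= groups[b]
                del groups[b]
                merged = True
                break  # the remaining index pairs are stale; rescan
    return [sorted(g) for g in groups]


print(merge_until_stable([[1, 2], [2, 3], [8, 9], [3, 4]]))
# [[1, 2, 3, 4], [8, 9]]
```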



Answer 6:


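def intersection_groups(lst):
    # a list comprehension (rather than map) so len() and indexing also work on Python 3
    lst = [set(s) for s in lst]
    a = 0
    while a < len(lst) - 1:
        b = a + 1
        while b < len(lst):
            if not lst[a].isdisjoint(lst[b]):
                lst[a].update(lst.pop(b))
                # lst[a] has grown, so recheck the sets that were skipped earlier
                b = a + 1
            else:
                b += 1
        a += 1
    return lst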


Source: https://stackoverflow.com/questions/17803919/how-can-i-get-union-of-2d-list-items-when-there-occurs-any-intersection-in-effi
