From a set of pairs, find all subsets s.t. no pair in the subset shares any element with a pair not in the subset

有些话、适合烂在心里 提交于 2020-08-20 06:12:57

问题


I have a set of pairs. Each pair is represented as [i,1:2]. That is, the ith pair are the numbers in the first and second column in the ith row.

I need to sort these pairs into distinct groups, s.t. there is no element in any pair in the jth group that is in any group not j. For example:

EXAMPLE 1: DATA

> col1 <- c(3, 4, 6, 7, 10, 8)
> col2 <- c(6, 7, 3, 4, 3,  1)
> 
> dat <- cbind(col1, col2)
> rownames(dat) <- 1:nrow(dat)
> 
> dat
  col1 col2
1    3    6
2    4    7
3    6    3
4    7    4
5   10    3
6    8    1

For all pairs, it doesn't matter if the number is in column 1 or column 2, the pairs should be sorted into groups s.t. every number in every pair in every group exists only in one group. So the solved example would look like this.

  col1 col2 groups
1    3    6      1
2    4    7      2
3    6    3      1
4    7    4      2
5   10    3      1
6    8    1      3

Rows 1, 3, and 5 are grouped together because 1 and 3 contain the same numbers and 5 shares the number 3, so it must be grouped with them. 2 and 4 share the same distinct numbers so they are grouped together and 6 has unique numbers so it is left alone.

If we change the data slightly, note the following.

EXAMPLE 2: NEW DATA

Note what happens when we add a row that shares an element with row 6 and row 5.

  col1 col2 groups
1    3    6      1
2    4    7      2
3    6    3      1
4    7    4      2
5   10    3      1
6    8    1      1
7    1   10      1

The 10 in the 7th row connects it to the first group because it shares an elements with the 5th row. It also shares an element with the 6th row (the number 1), so the 6th row would instead be in group 1.

PROBLEM

Is there a simple way to form the groups? A vector operation? A sorting algorithm? It gets very nasty very quickly if you try to do it with a loop, since each subsequent row can change the membership of earlier rows, as demonstrated in the example.


回答1:


To draw on the old answer at: identify groups of linked episodes which chain together , which assigns a group to each individual value, you could try this to assign a group to each linked pair:

library(igraph)
g <- graph.data.frame(dat)
links <- data.frame(col1=V(g)$name,group=clusters(g)$membership)
merge(dat,links,by="col1",all.x=TRUE,sort=FALSE)

#  col1 col2 group
#1    3    6     1
#2    4    7     2
#3    6    3     1
#4    7    4     2
#5   10    3     1
#6    8    1     3



回答2:


Your elements can be regarded as vertices in an undirected graph, and your pairs can be regarded as edges, and then (assuming that you want to find groups of minimal size -- if you don't, then e.g. the entire set of pairs could be labelled "Group 1") the groups you're looking for are the connected components in this graph. They can all be found in linear time with a depth-first or breadth-first search.



来源:https://stackoverflow.com/questions/25173752/from-a-set-of-pairs-find-all-subsets-s-t-no-pair-in-the-subset-shares-any-elem

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!