Question
I would like to determine whether a list contains any duplicate elements, while considering permutations as equivalent. All vectors are of equal length.
What is the most efficient way (shortest running time) to accomplish this?
## SAMPLE DATA
a <- c(1, 2, 3)
b <- c(4, 5, 6)
a.same <- c(3, 1, 2)
## BOTH OF THESE LISTS SHOULD BE FLAGGED AS HAVING DUPLICATES
myList1 <- list(a, b, a)
myList2 <- list(a, b, a.same)
# CHECK FOR DUPLICATES
anyDuplicated(myList1) > 0 # TRUE
anyDuplicated(myList2) > 0 # FALSE, but I would like TRUE
For now I am resorting to sorting each member of the list before checking for duplicates:
anyDuplicated( lapply(myList2, sort) ) > 0
I am wondering if there is a more efficient alternative. Also, the ?duplicated documentation notes that "Using this for lists is potentially slow". Are there other functions better suited for lists?
Answer 1:
What about this...?
a <- c(1, 2, 3)
b <- c(4, 5, 6)
a.same <- c(3, 1, 2)
myList1 <- list(a, b, a)
myList2 <- list(a, b, a.same)
# For exact duplicated values: List1
DF1 <- do.call(rbind, myList1) # From list to matrix (one vector per row)
ind1 <- apply(DF1, 2, duplicated) # logical matrix for duplicated values
DF1[ind1] # finding duplicated values
[1] 1 2 3
# For permutations: List2
DF2 <- do.call(rbind, myList2)
ind2 <- apply(apply(DF2, 1, sort), 1, duplicated)
DF2[ind2] # duplicated values
[1] 3 1 2
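Note that apply(DF1, 2, duplicated) compares values column by column, so two vectors that merely share one value in the same position would also be flagged. A minimal sketch of a whole-row variant follows, assuming (as the question states) that all vectors have equal length. The name has_dup_rows is hypothetical and not part of the original answer; it sorts each row and then relies on the matrix method of anyDuplicated with MARGIN = 1.

```r
# Sample data from the question
a <- c(1, 2, 3)
b <- c(4, 5, 6)
a.same <- c(3, 1, 2)

# Hypothetical helper: TRUE if any two vectors are permutations of each other
has_dup_rows <- function(mylist) {
  m <- do.call(rbind, mylist)                  # matrix, one vector per row
  # apply() returns the sorted rows as columns; refill them row by row
  sorted <- matrix(apply(m, 1, sort), nrow = nrow(m), byrow = TRUE)
  anyDuplicated(sorted, MARGIN = 1) > 0        # compare whole rows, not single cells
}

has_dup_rows(list(a, b, a))
#[1] TRUE
has_dup_rows(list(a, b, a.same))
#[1] TRUE
has_dup_rows(list(a, b))
#[1] FALSE
```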
Answer 2:
You could use setequal:
myList1 <- list(a, b, a)
myList2 <- list(a, b, a.same)
myList3 <- list(a,b)
test1 <- function(mylist) anyDuplicated( lapply(mylist, sort) ) > 0
test1(myList1)
#[1] TRUE
test1(myList2)
#[1] TRUE
test1(myList3)
#[1] FALSE
test2 <- function(mylist) any(combn(length(mylist),2,
FUN=function(x) setequal(mylist[[x[1]]],mylist[[x[2]]])))
test2(myList1)
#[1] TRUE
test2(myList2)
#[1] TRUE
test2(myList3)
#[1] FALSE
library(microbenchmark)
microbenchmark(test1(myList2),test2(myList2))
#Unit: microseconds
# expr min lq median uq max
#1 test1(myList2) 142.256 150.9235 154.6060 162.8120 247.351
#2 test2(myList2) 63.306 70.5355 73.8955 79.5685 103.113
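Two caveats may be worth checking before relying on test2. First, setequal treats its arguments as sets, so repeated values are ignored. Second, combn visits every pair of list elements, so the number of comparisons grows quadratically with the length of the list, while the benchmark above uses a list of only three vectors. A minimal sketch of a key-based alternative follows; test3 is a hypothetical name, not part of the original answer. It assumes the vectors contain no NA values and that the values survive conversion to character by paste without collisions.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)
a.same <- c(3, 1, 2)

# Hypothetical helper: one string key per vector, built from its sorted values,
# so duplicates are found on a character vector instead of a list
test3 <- function(mylist) {
  keys <- vapply(mylist, function(v) paste(sort(v), collapse = ","), character(1))
  anyDuplicated(keys) > 0
}

test3(list(a, b, a))
#[1] TRUE
test3(list(a, b, a.same))
#[1] TRUE
test3(list(a, b))
#[1] FALSE

# Repeated values: equal as sets, but not permutations of each other
setequal(c(1, 1, 2), c(1, 2, 2))
#[1] TRUE
test3(list(c(1, 1, 2), c(1, 2, 2)))
#[1] FALSE
```

Whether this is faster than test1 or test2 on realistic inputs would need to be measured, for example with microbenchmark on lists of the intended size.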
Answer 3:
a = [1, 2, 3]
b = [4, 5, 6]
samea = [3, 2, 1]
Both list(a + b + a) and list(a + b + samea) create a flat list containing the same elements; the second one gives:
[1, 2, 3, 4, 5, 6, 3, 2, 1]
# So a function for finding duplicates:
def findDup(x):
    # Return True as soon as any element occurs more than once
    for i in x:
        if x.count(i) > 1:
            return True
    return False
Source: https://stackoverflow.com/questions/13334570/finding-duplicates-in-a-list-including-permutations