R: find duplicated rows, regardless of order

遥遥无期 2020-12-04 01:04

I've been thinking about this problem for a whole night. Here is my matrix:

'a'    '#'    3
'#'    'a'    3
0      'I am' 2
'I am' 0      2

.....

3 answers
  •  自闭症患者
    2020-12-04 01:31

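    As a start, you might want to refer to the documentation for the base R function duplicated() (it is a function, not a separate package). As the documentation notes, "duplicated() determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates." Note that duplicated() compares rows exactly as they are, so for an order-insensitive match the values within each row have to be brought into a common order first. Some of the examples the documentation provides are: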

    Example 1:

    duplicated(iris)[140:143]
    

    Example 2:

    duplicated(iris3, MARGIN = c(1, 3))
    

    Example 3:

    anyDuplicated(iris)
    

    Example 4:

    x <- c(9:20, 1:5, 3:7, 0:8)  # x as defined earlier in the ?duplicated examples
    anyDuplicated(x)
    

    Example 5:

    anyDuplicated(x, fromLast = TRUE)
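
Applied to the matrix from the question, one way to make duplicated() ignore the order within a row is to sort each row first. The following is only a minimal sketch, assuming that all three columns take part in the comparison and that the matrix is stored as character (a matrix holds a single type). The names m, sorted_rows and dup are illustrative, not from the original post.

```r
# The four rows shown in the question, stored as character.
m <- matrix(c("a",    "#",    "3",
              "#",    "a",    "3",
              "0",    "I am", "2",
              "I am", "0",    "2"),
            ncol = 3, byrow = TRUE)

# Sort the values inside each row so that element order no longer matters.
# na.last = TRUE keeps any NA values, so every row keeps the same length.
# apply() returns each result as a column, hence the t().
sorted_rows <- t(apply(m, 1, function(r) sort(r, na.last = TRUE)))

# On a matrix, duplicated() compares whole rows (MARGIN = 1 is the default).
dup <- duplicated(sorted_rows)

# Indices of rows that repeat an earlier row, ignoring order.
which(dup)  # 2 4

# Keep one representative of each group of equivalent rows.
unique_rows <- m[!dup, , drop = FALSE]
```

Sorting changes only the helper matrix sorted_rows; the rows kept in unique_rows still have their original element order.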
    

    EDIT: If you want to do it the long way, you could compare every row to every other row in the data frame, character by character. Suppose the first row has 3 characters. For each other row, you loop through and check whether it contains the first of those characters. If it does, you set that character aside and check the next one. A self-written recursive function that compares the values of one row against all other rows in the data frame or matrix (and then keeps ONLY the rows that do not match any other row) could work.
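
The pairwise idea above can be sketched more simply than a character-by-character recursion: sort the two rows being compared and test them for equality. This is a simplification rather than the exact recursive procedure described, and the helpers same_values() and dup_rows_pairwise() are hypothetical names introduced here for illustration.

```r
# Hypothetical helper: TRUE if two rows hold the same values, in any order.
same_values <- function(r1, r2) {
  identical(sort(as.character(r1)), sort(as.character(r2)))
}

# Hypothetical helper: flag every row that matches some earlier row.
dup_rows_pairwise <- function(m) {
  n <- nrow(m)
  is_dup <- logical(n)
  for (i in seq_len(n)) {
    # seq_len(i - 1) is empty for the first row, so it is never flagged.
    for (j in seq_len(i - 1)) {
      if (same_values(m[i, ], m[j, ])) {
        is_dup[i] <- TRUE
        break  # one match is enough, stop scanning earlier rows
      }
    }
  }
  is_dup
}

# The four rows shown in the question, stored as character.
m <- matrix(c("a",    "#",    "3",
              "#",    "a",    "3",
              "0",    "I am", "2",
              "I am", "0",    "2"),
            ncol = 3, byrow = TRUE)

flags <- dup_rows_pairwise(m)
flags  # FALSE TRUE FALSE TRUE
```

The nested loop does on the order of n^2 row comparisons, so it is mainly useful for understanding the logic or for small inputs.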
