R: find duplicated rows, regardless of order

遥遥无期 2020-12-04 01:04

I've been thinking about this problem all night. Here is my matrix:

'a' '#' 3
'#' 'a' 3
 0  'I am' 2
'I am' 0 2

.....

3 answers

自闭症患者 (original poster)

2020-12-04 01:31
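As a start, you might want to refer to the documentation for the base R function duplicated() (it is a function, not a separate package). As the documentation notes, "duplicated() determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates." Note that duplicated() compares rows as they are, so two rows holding the same values in a different order are not flagged on their own. Some examples from the documentation are: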

Example 1:
```
duplicated(iris)[140:143]
```
Example 2:
```
duplicated(iris3, MARGIN = c(1, 3))
```
Example 3:
```
anyDuplicated(iris)
```
Example 4:
```
x <- c(9:20, 1:5, 3:7, 0:8)  # sample vector with repeated values, so that x is defined
anyDuplicated(x)
```
Example 5:
```
anyDuplicated(x, fromLast = TRUE)
```
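To apply duplicated() to the matrix in the question, one option is to sort the values within each row first, so that order stops mattering. The following is only a minimal sketch: the matrix m is a hypothetical reconstruction of the four rows shown in the question (stored as a character matrix), and it assumes that all three columns should be treated as an unordered set of values.

```r
# Hypothetical reconstruction of the matrix from the question (character matrix).
m <- matrix(c("a",    "#",    "3",
              "#",    "a",    "3",
              "0",    "I am", "2",
              "I am", "0",    "2"),
            ncol = 3, byrow = TRUE)

# Sort the values within each row so that their order no longer matters.
# apply() returns the sorted rows as columns, hence the t().
sorted_rows <- t(apply(m, 1, sort))

# duplicated() on a matrix compares whole rows by default (MARGIN = 1).
dup <- duplicated(sorted_rows)

# Keep only the first occurrence of each unordered row.
unique_rows <- m[!dup, , drop = FALSE]
```

If only the first two columns should be compared without regard to order, the same idea works by sorting m[, 1:2] row by row and binding the third column back on before calling duplicated().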
EDIT: If you wanted to do it the long way, you could compare every row to every other row in the data frame, character by character. To do this, imagine that the first row has 3 characters. For each other row, you loop through and check whether it contains the first of these characters. If it does, you narrow the candidates down and check the next character. A self-written recursive function that compares the values of one row against all other rows in the data frame or matrix, and then subsets ONLY on rows that do not match any other row, could work.
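The long way can be sketched roughly as follows. This is not the recursive function described above but a simpler loop-based stand-in for it, and the data, the helper same_values() and the other variable names are all hypothetical. It compares each row with every other row, so it gets slow on large inputs.

```r
# Hypothetical example data: rows 1 and 2 hold the same values in a different order.
m <- matrix(c("a", "#", "3",
              "#", "a", "3",
              "b", "c", "1"),
            ncol = 3, byrow = TRUE)

# TRUE if two rows contain the same values, ignoring their order.
same_values <- function(r1, r2) {
  identical(sort(r1), sort(r2))
}

# For each row, check whether any *other* row matches it.
has_match <- vapply(seq_len(nrow(m)), function(i) {
  others <- setdiff(seq_len(nrow(m)), i)
  any(vapply(others, function(j) same_values(m[i, ], m[j, ]), logical(1)))
}, logical(1))

# Keep ONLY the rows that do not match any other row.
unmatched <- m[!has_match, , drop = FALSE]
```

Here unmatched keeps just the third row, since the first two rows match each other.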