Porting set operations from R's data frames to data tables: How to identify duplicated rows?

梦毁少年i 2020-12-16 05:32

[Update 1: As Matthew Dowle noted, I'm using data.table version 1.6.7 on R-Forge, not CRAN. You won't see the same behavior with an earlier version of

1 Answer
  • 2020-12-16 06:08

    duplicated.data.table needs the same fix unique.data.table got [EDIT: Now done in v1.7.2]. Please raise another bug report: bug.report(package="data.table"). For the benefit of others watching, you're already using v1.6.7 from R-Forge, not 1.6.6 on CRAN.

    On Note 1, though, there is a 'not join' idiom:

    x[-x[y,which=TRUE]]
    

    See also FR#1384 (New 'not' and 'whichna' arguments?), which aims to make this easier for users. It links to the "keys that don't match" thread, which goes into more detail.
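
As a minimal sketch of the 'not join' idiom, assuming two hypothetical tables `x` and `y` keyed on an `id` column (the names and data are illustrative, not from the question). It adds `nomatch = 0`, as the later form of the idiom in this answer does, so that keys in `y` that are absent from `x` do not produce `NA` row numbers:

```r
library(data.table)

# Hypothetical example tables, keyed on `id` so that x[y] joins on it.
x <- data.table(id = c("a", "b", "c", "d"), v = 1:4, key = "id")
y <- data.table(id = c("b", "d"), key = "id")

# Row numbers of x that match a key in y; nomatch = 0 drops keys of y
# that have no match in x instead of returning NA for them.
matched <- x[y, which = TRUE, nomatch = 0]

# Guard the empty case: x[-integer(0)] selects zero rows, not all rows.
not_in_y <- if (length(matched)) x[-matched] else x

print(not_in_y)
```

The guard is needed because negating an empty index vector still gives an empty index, so an unguarded `x[-matched]` would return no rows when nothing matches.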


    Update: not-join has been implemented as of v1.8.3.

    DT[-DT["a",which=TRUE,nomatch=0],...]   # old idiom
    DT[!"a",...]                            # same result, now preferred.
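
A small sketch comparing the two forms, assuming a hypothetical keyed table `DT` (illustrative data only). The last call uses the `on=` argument, which is an assumption here: it comes from data.table versions later than the ones discussed in this answer.

```r
library(data.table)

# Hypothetical keyed table; "a" appears twice so the join matches two rows.
DT <- data.table(id = c("a", "a", "b", "c"), v = 1:4, key = "id")

# Old idiom: find the matching row numbers, then drop them by negative index.
old_way <- DT[-DT["a", which = TRUE, nomatch = 0]]

# Not-join: a leading ! on i keeps only the rows that do not match.
new_way <- DT[!"a"]

# The same prefix works when i is a table of keys (on= needs a newer version).
drop_keys <- data.table(id = c("a", "c"))
remaining <- DT[!drop_keys, on = "id"]

print(old_way)
print(new_way)
print(remaining)
```

Unlike the negative-index form, the `!` form needs no special handling when nothing matches: it simply returns every row.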
    