Warning: 'Invalid .internal.selfref detected' when adding a column to a data.table returned from a function

南方客 2020-12-09 11:29

This looks like an fread bug, but I am not sure.

This example reproduces my problem. I have a function where I read a data.table and return it in a list. i us

2 answers
  •  执念已碎
    2020-12-09 12:17

    Arun's answer is a great explanation. The specific feature of list() in R <= 3.0.2 is that it copies named inputs (things that have been named before the call to list()). In r-devel now (the next version of R), this copy by list() no longer happens and all will be well. It's a very welcome change in R.

    In the meantime, you can work around it by creating the output list in a different way.

    > R.version.string
    [1] "R version 3.0.2 (2013-09-25)"
    

    First, demonstrate that list() copies:

    > DT = data.table(a=1:3)
    > address(DT)
    [1] "0x1d70010"
    > address(list(DT)[[1]])
    [1] "0x21bc178"    # different address => list() copied the data.table named DT
    > data.table:::selfrefok(DT)
    [1] 1
    > data.table:::selfrefok(list(DT)[[1]])
    [1] 0              # i.e. this copied DT is not over-allocated
    

    Now a different way to create the same list:

    > ans = list()
    > ans$DT = DT    # use $<- instead
    > address(DT)
    [1] "0x1d70010"
    > address(ans$DT)
    [1] "0x1d70010"    # good, no copy
    > identical(ans, list(DT=DT))
    [1] TRUE
    > data.table:::selfrefok(ans$DT)
    [1] 1              # good, the list()-ed DT is still over-allocated ok
    

    Convoluted and confusing, I know. Using $<- to create the output list, or even just placing the call to fread inside the call to list() i.e. list(DT=fread(...)) should avoid the copy by list().
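
    As a rough sketch of the two workarounds inside a function that reads a table and returns it in a list: the helper names read_inline and read_assign and the sample CSV text are hypothetical, made up for illustration, and the example assumes a data.table version whose fread() accepts the text= argument. On current R versions list() no longer copies, so both variants should simply run without the warning.

    ```r
    library(data.table)

    # Hypothetical helper: fread() is called inside list(), so the table is
    # never bound to a name before list() receives it.
    read_inline <- function(csv_text) {
      list(DT = fread(text = csv_text))
    }

    # Hypothetical helper: same result, but the list is built with $<-
    # instead of passing a named object to list().
    read_assign <- function(csv_text) {
      DT <- fread(text = csv_text)
      ans <- list()
      ans$DT <- DT
      ans
    }

    csv_text <- "a,b\n1,2\n3,4\n"   # made-up sample data

    res1 <- read_inline(csv_text)
    res2 <- read_assign(csv_text)

    # Add a column by reference to the table held in each returned list.
    res1$DT[, total := a + b]
    res2$DT[, total := a + b]

    print(res1$DT)
    print(res2$DT)
    ```

    Either form keeps the returned data.table over-allocated, which is what := needs in order to add a column by reference.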
