Cannot use dput for data.table in R

小蘑菇 2020-12-03 17:57

I have the following data.table, and I cannot use the output of dput to recreate it:

> ddt
   Unit Anything index new
1:    A      3.4     1   1
2:           


        
3 Answers
  • 2020-12-03 18:33

    If you have already written the object to a file with dput and don't want to edit that file by hand before calling dget, you can use the following:

    data.table.parse<-function (file = "", n = NULL, text = NULL, prompt = "?", keep.source = getOption("keep.source"), 
                                srcfile = NULL, encoding = "unknown") 
    {
      keep.source <- isTRUE(keep.source)
      if (!is.null(text)) {
        if (length(text) == 0L) 
          return(expression())
        if (missing(srcfile)) {
          srcfile <- "<text>"
          if (keep.source) 
            srcfile <- srcfilecopy(srcfile, text)
        }
        file <- stdin()
      }
      else {
        if (is.character(file)) {
          if (file == "") {
            file <- stdin()
            if (missing(srcfile)) 
              srcfile <- "<stdin>"
          }
          else {
            filename <- file
            file <- file(filename, "r")
            if (missing(srcfile)) 
              srcfile <- filename
            if (keep.source) {
              text <- readLines(file, warn = FALSE)
              if (!length(text)) 
                text <- ""
              close(file)
              file <- stdin()
              srcfile <- srcfilecopy(filename, text, file.mtime(filename), 
                                     isFile = TRUE)
            }
            else {
              text <- readLines(file, warn = FALSE)
              if (!length(text)) {
                text <- ""
              } else {
                text <- gsub("(, .internal.selfref = <pointer: 0x[0-9A-Fa-f]+>)","",text,perl=TRUE)
              }
              on.exit(close(file))
            }
          }
        }
      }
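      # Strip the pointer in every branch that filled `text` (direct text input
      # and keep.source = TRUE included), since `<pointer: ...>` cannot be parsed
      if (!is.null(text))
        text <- gsub(", \\.internal\\.selfref = <pointer: 0x[0-9A-Fa-f]+>", "", text, perl = TRUE)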
      .Internal(parse(file, n, text, prompt, srcfile, encoding))
    }
    data.table.get <- function(file, keep.source = FALSE)
      eval(data.table.parse(file = file, keep.source = keep.source))
    dtget <- data.table.get
    

    Then replace your calls to dget with dtget. Because dtget reads every line and runs a regular-expression substitution on it before parsing, it is slower than dget, so use it only where the stored object may be a data.table.
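
    For comparison, here is a minimal sketch of the same idea without the copy of base::parse or the .Internal call. The name dtget_simple, the sample file contents, and the pointer address are made up for illustration, and the sketch assumes the whole file holds a single dput result.

```r
# Hypothetical helper: read a dput() file, strip the self-reference pointer,
# then parse and evaluate what is left.
dtget_simple <- function(file) {
  lines <- readLines(file, warn = FALSE)
  # Collapse first so the pattern still matches if dput() wrapped the output
  # between the comma and the attribute name.
  txt <- paste(lines, collapse = "\n")
  txt <- gsub(",\\s*\\.internal\\.selfref\\s*=\\s*<pointer: 0x[0-9A-Fa-f]+>",
              "", txt, perl = TRUE)
  eval(parse(text = txt))
}

# Usage with a temporary file holding made-up dput() output
tmp <- tempfile(fileext = ".R")
writeLines(c(
  'structure(list(Unit = "A", Anything = 3.4), row.names = c(NA, -1L),',
  '    class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x55d0c0ffee00>)'
), tmp)
obj <- dtget_simple(tmp)
unlink(tmp)
```

    The result carries the data.table class but no self-reference, so the same caveat as for hand-edited dput output applies.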

  • 2020-12-03 18:36

    I have also found this behavior rather annoying. So I have created my own dput function that ignores the .internal.selfref attribute.

    dput <- function (x, file = "", control = c("keepNA", "keepInteger", 
                                        "showAttributes")) 
    {
      if (is.character(file)) 
        if (nzchar(file)) {
          file <- file(file, "wt")
          on.exit(close(file))
        }
      else file <- stdout()
      opts <- .deparseOpts(control)
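      # Added for data.tables (needs library(data.table) for is.data.table).
      # attr<- works on this function's copy of x, so the caller's table keeps
      # its self-reference; setattr() would have removed it by reference.
      if (is.data.table(x)) {
        attr(x, '.internal.selfref') <- NULL
      }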
      if (isS4(x)) {
        clx <- class(x)
        cat("new(\"", clx, "\"\n", file = file, sep = "")
        for (n in .slotNames(clx)) {
          cat("    ,", n, "= ", file = file)
          dput(slot(x, n), file = file, control = control)
        }
        cat(")\n", file = file)
        invisible()
      }
      else .Internal(dput(x, file, opts))
    }
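
    If you would rather not mask base dput, a small wrapper is enough. This is only a sketch: dput_no_selfref is a made-up name, and the example uses a plain data.frame with an external pointer attribute as a stand-in for a data.table, so that it runs without data.table installed.

```r
# Hypothetical wrapper: drop the self-reference attribute from a local copy,
# then hand the object to the regular dput().
dput_no_selfref <- function(x, ...) {
  # attr<- changes only this function's copy; the caller's object is untouched
  attr(x, ".internal.selfref") <- NULL
  dput(x, ...)
}

# Stand-in for a data.table (assumption: an external pointer attribute is
# enough to mimic what dput() sees on a real one).
fake_dt <- data.frame(Unit = "A", Anything = 3.4)
attr(fake_dt, ".internal.selfref") <- methods::new("externalptr")

# The captured output contains no pointer, so it parses back cleanly.
out <- capture.output(dput_no_selfref(fake_dt))
restored <- eval(parse(text = out))
```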
    
  • 2020-12-03 18:45

    The problem is that dput prints out the external pointer address (data.table uses this internally and will reconstruct it when required), which R cannot parse back in.

    If you manually delete the .internal.selfref = <pointer: ...> part from the dput output, it will work just fine, apart from a one-time warning from data.table on some operations.
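
    The manual edit can also be done on the text itself. A rough sketch, where the dput text, its column types, and the pointer address are made up for illustration:

```r
# Illustrative dput() output for a one-row data.table.
dput_text <- paste0(
  'structure(list(Unit = "A", Anything = 3.4, index = 1L, new = 1L), ',
  'row.names = c(NA, -1L), class = c("data.table", "data.frame"), ',
  '.internal.selfref = <pointer: 0x7f8a1b2c3d40>)'
)

# Parsing the raw text fails because `<pointer: ...>` is not valid R syntax.
raw_parse <- try(parse(text = dput_text), silent = TRUE)

# Remove the attribute together with its leading comma, then parse and evaluate.
clean_text <- gsub(",\\s*\\.internal\\.selfref\\s*=\\s*<pointer: 0x[0-9A-Fa-f]+>",
                   "", dput_text, perl = TRUE)
ddt2 <- eval(parse(text = clean_text))

# If data.table is installed, setDT() re-creates the internal self-reference.
if (requireNamespace("data.table", quietly = TRUE)) data.table::setDT(ddt2)
```

    Calling setDT right away is optional; it just restores the self-reference up front instead of leaving data.table to repair it later.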

    You could file a feature request (FR) with data.table about this, but it would require data.table to modify the base function, similar to how rbind is currently handled.
