Pass an object to a function without copying it on change

Submitted by 半世苍凉 on 2020-01-04 09:27:22

Question


My question

If an object x is passed to a function f that modifies it, R creates a modified local copy of x within f's environment rather than changing the original object (R's copy-on-modify semantics). However, I have a situation where x is very large and is not needed once it has been passed to f, so I want to avoid keeping the original x in memory once f is called. Is there a clever way to achieve this?

f is an unknown function to be supplied by a possibly not very clever user.
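
For reference, the copy-on-modify behaviour described above can be seen in a minimal sketch like the following. The name modify_second is only a stand-in for the user-supplied f and is not part of the original question.

```r
# Hypothetical stand-in for the user-supplied function f
modify_second <- function(v) {
    v[2] <- 9000.1   # modifies a local copy, not the caller's object
    v
}

x <- c(1, 2, 3)
y <- modify_second(x)

x   # the original is unchanged: 1 2 3
y   # the modified copy: 1.0 9000.1 3.0
```

Both x and y stay in memory after the call, which is exactly the duplication the question wants to avoid.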

My current solution

The best I have so far is to wrap x in a function, forget, that makes a new local reference to x called y, removes the original reference from the workspace, and then passes on the new reference. The problem is that I am not certain it accomplishes what I want, and it only works in globalenv(), which is a deal breaker in my current case.

forget <- function(x){
    y <- x
    # x and y now refer to the same object, which has not yet been copied
    print(tracemem(y))
    rm(list=deparse(substitute(x)), envir=globalenv())
    # The outside reference is now removed, so modifying `y`
    # should no longer result in a copy (other than the
    # intermediate copy produced in the assignment)
    y
}

f <- function(x){
    print(tracemem(x))
    x[2] <- 9000.1
    x
}

Here is an example of calling the above function.

> a <- 1:3
> tracemem(a)
[1] "<0x2ac1028>"
> b <- f(forget(a))
[1] "<0x2ac1028>"
[1] "<0x2ac1028>"
tracemem[0x2ac1028 -> 0x2ac1e78]: f 
tracemem[0x2ac1e78 -> 0x308f7a0]: f 
> tracemem(b)
[1] "<0x308f7a0>"
> b
[1]    1.0 9000.1    3.0
> a
Error: object 'a' not found
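
One possible way around the globalenv() restriction, offered only as a sketch: remove the binding from the frame that called forget, using parent.frame(), instead of from globalenv(). The names forget_caller, modify_second and run_example below are hypothetical. The sketch only shows that the caller's binding is gone afterwards; whether R actually avoids an extra copy depends on its internal reference tracking and is not guaranteed.

```r
# Hypothetical variant of forget(): drops the binding from the caller's frame
forget_caller <- function(x) {
    arg <- substitute(x)
    stopifnot(is.name(arg))          # only a plain variable name can be removed
    y <- x                           # forces the promise; y refers to the same object
    rm(list = as.character(arg), envir = parent.frame())
    y
}

# Hypothetical stand-in for the user-supplied function f
modify_second <- function(x) {
    x[2] <- 9000.1
    x
}

# Calls forget_caller() from inside a function rather than from globalenv()
run_example <- function() {
    env <- environment()
    big <- c(1, 2, 3)
    out <- modify_second(forget_caller(big))
    list(result = out,
         still_there = exists("big", envir = env, inherits = FALSE))
}

res <- run_example()
res$result        # 1.0 9000.1 3.0
res$still_there   # FALSE: the local binding `big` has been removed
```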

Bottom line

Am I doing what I hope I am doing and is there a better way to do it?


Answer 1:


(1) Environments. You can use environments for that:

e <- new.env()
e$x <- 1:3
f <- function(e) with(e, x <- x + 1)
f(e)
e$x
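
The reason this works is that environments are passed by reference rather than copied on modification. A minimal sketch of that property follows; add_one and alias are illustrative names that do not appear in the answer above.

```r
e <- new.env()
e$x <- 1:3

# Illustrative helper: modifies x inside whatever environment it is given
add_one <- function(env) {
    env$x <- env$x + 1
    invisible(NULL)
}

alias <- e        # a second name for the same environment, not a copy
add_one(alias)

e$x               # 2 3 4: the change made through `alias` is visible via `e`
```

Because only the environment is handed around, the vector it holds is never duplicated just by being passed to a function.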

(2) Reference Classes. Alternatively, since reference classes automatically use environments, use those:

E <- setRefClass("E", fields = "x",
    methods = list(
        f = function() x <<- x + 1
    )
)
e <- E$new(x = 1:3)
e$f()
e$x

(3) proto. proto objects also use environments:

library(proto)
p <- proto(x = 1:3, f = function(.) with(., x <- x + 1))
p$f()
p$x

ADDED: proto solution

UPDATED: Changed function name to f for consistency with question.




Answer 2:


I think the easiest approach is to load only the working copy into memory, instead of loading both the original (in the global environment) and the working copy (in the function's environment). You can sidestep the whole issue by using the 'ff' package to define your 'x' and 'y' data sets as 'ffdf' data frames. As I understand it, 'ffdf' data frames reside on disk; parts of the data frame are loaded into memory only when they are needed and are purged when they are no longer necessary. This would mean, theoretically, that the data would be loaded into memory to be copied into the function's environment and then purged once the copy was complete.

I'll admit that I rarely have to use the 'ff' package, and when I do, I usually don't have any issues at all. I'm not checking specific memory usage, though, and my goal is usually just to perform a large calculation across the data. It works, and I don't ask questions.



Source: https://stackoverflow.com/questions/14647636/pass-an-object-to-a-function-without-copying-it-on-change
