R scoping: disallow global variables in function

佛祖请我去吃肉 2020-12-08 23:37

Is there any way to throw a warning (and fail) if a global variable is used within an R function? I think that would be much safer and would prevent unintended behaviour.

5 Answers
  • 2020-12-08 23:41

    My other answer is more about what approach you can take inside your function. Now I'll provide some insight on what to do once your function is defined.

    To ensure that your function is not using global variables when it shouldn't be, use the codetools package.

    library(codetools)
    
    sUm <- 10
    f <- function(x, y) {
        sum = x + y
        return(sUm)
    }
    
    checkUsage(f)
    

    This will print the message:

    <anonymous> local variable ‘sum’ assigned but may not be used (:1)

    To see if any global variables were used in your function, you can compare the output of the findGlobals() function with the variables in the global environment.

    > findGlobals(f)
    [1] "{"  "+"  "="  "return"  "sUm"
    
    > intersect(findGlobals(f), ls(envir=.GlobalEnv))
    [1] "sUm"
    

    That tells you that the global variable sUm was used inside f() when it probably shouldn't have been.
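
    As a rough sketch, the comparison above could be wrapped in a check that fails outright, which is closer to what the question asks for. The helper name assert_no_globals is made up for this example and is not part of codetools. It only sees names that appear literally in the function body, and only flags those that exist in the global environment at the moment the check runs.

    ```r
    library(codetools)

    # Hypothetical helper (not part of codetools): stop if `fun` refers to a
    # non-function object that currently exists in the global environment.
    assert_no_globals <- function(fun) {
      # merge = FALSE returns a list with $functions and $variables
      used <- findGlobals(fun, merge = FALSE)$variables
      offending <- intersect(used, ls(envir = .GlobalEnv))
      if (length(offending) > 0) {
        stop("global variable(s) used: ", paste(offending, collapse = ", "))
      }
      invisible(TRUE)
    }

    sUm <- 10
    f <- function(x, y) {
      sum = x + y
      return(sUm)
    }
    g <- function(x, y) x + y

    assert_no_globals(g)       # passes silently
    try(assert_no_globals(f))  # stops with an error that names sUm
    ```

    Because the check is static, a lookup such as get("sUm") inside the function would go unnoticed.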

  • 2020-12-08 23:50

    Another way (or style) is to keep all global variables in a special environment:

    with( globals <- new.env(), {
      # here define all "global variables"  
      sUm <- 10
      mEan <- 5
    })
    
    # or add a variable by using $
    globals$another_one <- 42
    

    Then the function won't be able to get them:

    sum <- function(x,y){
      sum = x+y
      return(sUm)
    }
    
    sum(1,2)
    # Error in sum(1, 2) : object 'sUm' not found
    

    But you can always use them with globals$:

    globals$sUm
    [1] 10
    

    To manage the discipline, you can check if there is any global variable (except functions) outside of globals:

    setdiff(ls(), union(lsf.str(), "globals"))
    
  • 2020-12-08 23:57

    You can check whether the variable's name appears in the list of global variables. Note that this is imperfect if the global variable in question has the same name as an argument to your function.

    if (deparse(substitute(var)) %in% ls(envir=.GlobalEnv))
        stop("Do not use a global variable!")
    

    The stop() function will halt execution of the function and display the given error message.
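
    To show where that check would sit, here is a minimal sketch of a complete function. The name double_it and its body are invented for illustration. Note that the check inspects the expression the caller passed as the argument; it does not detect global variables referenced directly in the function body.

    ```r
    # Hypothetical function used only to illustrate the check
    double_it <- function(var) {
      # Text of the expression the caller supplied for `var`
      arg_name <- deparse(substitute(var))
      if (arg_name %in% ls(envir = .GlobalEnv))
        stop("Do not use a global variable!")
      var * 2
    }

    sUm <- 10
    double_it(5)         # 10
    try(double_it(sUm))  # Error: Do not use a global variable!
    ```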

  • 2020-12-08 23:57

    One way is to use get() with inherits = FALSE:

    sUm <- 10
    sum <- function(x,y){
      sum <- x+y
      #with inherits = FALSE below the variable is only searched 
      #in the specified environment in the envir argument below
      get('sUm', envir = environment(), inherits=FALSE) 
    }
    

    Output:

    > sum(1,6)
    Error in get("sUm", envir = environment(), inherits = FALSE) : 
      object 'sUm' not found
    

    Even with the correct name 'sum' passed to get(), the lookup is still confined to the function's environment. So if two variables share a name, one inside the function and one in the global environment, the function always finds the local one and never the global one:

    sum <- 10
    sum2 <- function(x,y){
      sum <- x+y
      get('sum', envir = environment(), inherits=FALSE) 
    }
    
    > sum2(1,7)
    [1] 8
    
  • 2020-12-09 00:07

    There is no way to permanently change how variables are resolved because that would break a lot of functions. The behavior you don't like is actually very useful in many cases.

    If a variable is not found in a function, R will check the environment where the function was defined for such a variable. You can change this environment with the environment() function. For example

    environment(sum) <- baseenv()
    sum(4,5)
    # Error in sum(4, 5) : object 'sUm' not found
    

    This works because baseenv() is the environment of the base package, and its enclosure is the empty environment, so the lookup never reaches the global environment. However, note that you don't have access to other functions with this method

    myfun<-function(x,y) {x+y}
    sum <- function(x,y){sum = myfun(x, y); return(sUm)}
    
    environment(sum)<-baseenv()
    sum(4,5)
    # Error in sum(4, 5) : could not find function "myfun"
    

    because in a functional language such as R, functions are just regular objects bound to names and are looked up through the same chain of environments. myfun is defined in the global environment, so it is not visible from the base environment.

    You would manually have to change the environment for each function you write. Again, there is no way to change this default behavior because many of the base R functions and functions defined in packages rely on this behavior.
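
    One possible way to reduce that manual work, sketched under the assumption that you are willing to list each helper a function needs: give the function a fresh environment whose parent is baseenv() and copy the required helpers into it. The name isolate_fun is hypothetical and not a base R function.

    ```r
    # Hypothetical helper: re-home `fun` in a new environment that sees only
    # the base package plus the objects passed by name in `...`.
    isolate_fun <- function(fun, ...) {
      env <- new.env(parent = baseenv())
      extras <- list(...)
      for (name in names(extras)) {
        assign(name, extras[[name]], envir = env)
      }
      environment(fun) <- env
      fun
    }

    sUm <- 10
    myfun <- function(x, y) x + y

    uses_global <- isolate_fun(function(x, y) sUm)
    uses_helper <- isolate_fun(function(x, y) myfun(x, y), myfun = myfun)

    try(uses_global(4, 5))  # Error: object 'sUm' not found
    uses_helper(4, 5)       # 9
    ```

    Functions from attached packages other than base (for example stats or utils) would also have to be passed in explicitly or called with the :: operator.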
