I want to write a function that can take columns within a data frame or column names and the data frame they come from as arguments.
df <- data.frame(x = c(1:
To piggy-back off of Cettt's answer, something like this may be what you're looking for:
df <- data.frame(x = c(1:5), y = c(6:10), z = LETTERS[1:5])
my_fxn <- function(aaa, bbb, ccc, data) {
  # If a data frame is supplied, treat aaa, bbb and ccc as column names (strings)
  if (!missing(data)) {
    aaa <- as.numeric(data[[aaa]])
    bbb <- as.numeric(data[[bbb]])
    ccc <- as.character(data[[ccc]])
  }
  print(aaa[1])
}
my_fxn("x", "y", "z", df)
#> [1] 1
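Because the lookup only happens when data is supplied, the same pattern also accepts the columns themselves. A quick sketch of both call styles, using a hypothetical one-argument variant called pick_first (the name and the trimmed signature are mine, not from the question):

```r
df <- data.frame(x = 1:5, y = 6:10, z = LETTERS[1:5])

# pick_first is a hypothetical, trimmed-down version of my_fxn
pick_first <- function(aaa, data) {
  if (!missing(data)) {
    # aaa is a column name (string): look it up in data
    aaa <- as.numeric(data[[aaa]])
  }
  # otherwise aaa is already the vector that was passed in
  aaa[1]
}

pick_first("x", df)  # column name plus data frame
pick_first(df$x)     # the column itself, no data argument
```

Both calls return the first element of column x.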
With enquo() and !! (which dplyr re-exports from rlang), the column names no longer have to be passed as character strings:
library(dplyr)
my_fxn <- function(aaa, bbb, ccc, data) {
  if (!missing(data)) {
    # Capture the unevaluated column names and unquote them inside pull()
    aaa <- as.numeric(pull(data, !!enquo(aaa)))
    bbb <- as.numeric(pull(data, !!enquo(bbb)))
    ccc <- as.character(pull(data, !!enquo(ccc)))
  }
  # Without data, aaa is evaluated as an ordinary vector argument
  print(aaa[1])
}
my_fxn(x, y, z, df)
#> [1] 1
More info about function building with enquo() and !! can be found here: https://dplyr.tidyverse.org/articles/programming.html#programming-recipes
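The same vignette also covers the embrace operator {{ }}, which does the enquo() and !! steps in one go. A minimal sketch, assuming dplyr >= 1.0.0 is installed; first_value is a hypothetical helper name, not a dplyr function:

```r
library(dplyr)

df <- data.frame(x = 1:5, y = 6:10, z = LETTERS[1:5])

# first_value is a hypothetical helper used only for illustration
first_value <- function(data, col) {
  # {{ col }} forwards the unquoted column name to pull()
  vals <- pull(data, {{ col }})
  vals[1]
}

first_value(df, x)
first_value(df, z)
```

This keeps the function body shorter, at the cost of requiring a reasonably recent dplyr/rlang.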
Finally, a base R solution using deparse() and substitute():
my_fxn <- function(aaa, bbb, ccc, data) {
  if (!missing(data)) {
    # Turn the unevaluated argument expressions into column-name strings
    aaa <- as.numeric(data[[deparse(substitute(aaa))]])
    bbb <- as.numeric(data[[deparse(substitute(bbb))]])
    ccc <- as.character(data[[deparse(substitute(ccc))]])
  }
  # Without data, aaa is evaluated as an ordinary vector argument
  print(aaa[1])
}
my_fxn(x, y, z, df)
#> [1] 1
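To see what deparse(substitute()) actually hands back, here is a small sketch with a hypothetical helper show_name. It never evaluates its argument, so the objects named in the calls do not have to exist:

```r
# show_name is a hypothetical helper used only for illustration
show_name <- function(arg) {
  # substitute() grabs the expression the caller typed,
  # deparse() turns that expression into a character string
  deparse(substitute(arg))
}

show_name(x)     # "x"
show_name(df$x)  # "df$x"
```

The result is only a string, which is why it is useful as an index into data but is not itself the column.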
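The problem is that when you call my_fxn(x, y, z, df), no object named x exists outside the data frame, so it cannot be passed as a value.
In addition, $ does not evaluate its right-hand side: data$aaa looks for a column literally named aaa, not for the column whose name is stored in aaa, so it returns NULL.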
Consider this small example:
df <- data.frame(x = 1:3, y = 4:6)
x <- "y"
df$x     # returns column x
#> [1] 1 2 3
df[, x]  # returns column y, since the value stored in x is "y"
#> [1] 4 5 6
To work around this, pass the column name as a character string and use data[, aaa] instead of data$aaa.
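The difference shows up as soon as the lookup happens inside a function. A short sketch, where get_dollar and get_bracket are hypothetical names chosen for illustration:

```r
df <- data.frame(x = 1:3, y = 4:6)

# get_dollar and get_bracket are hypothetical helpers
get_dollar  <- function(data, aaa) data$aaa     # looks for a column named "aaa"
get_bracket <- function(data, aaa) data[, aaa]  # uses the value stored in aaa

get_dollar(df, "x")   # NULL, because df has no column called aaa
get_bracket(df, "x")  # the values of column x
```

For a plain data.frame, data[[aaa]] gives the same vector as data[, aaa].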
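Yet another alternative is the dplyr package, where select(data, all_of(aaa)) selects the column whose name is stored in aaa. Note that select() returns a one-column data frame rather than a vector.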
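A minimal sketch of the dplyr route, assuming dplyr >= 1.0.0 (which provides all_of()); the variable name aaa simply mirrors the function argument from the question:

```r
library(dplyr)

df <- data.frame(x = 1:3, y = 4:6)
aaa <- "x"  # column name stored as a string

# select() keeps the data frame structure
one_col <- select(df, all_of(aaa))
names(one_col)

# pull() with no column argument extracts the last (here: only) column as a vector
pull(one_col)
```

all_of() makes it explicit that aaa is an external character vector and not a column of df.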