Question
I am fairly certain this is a really obvious question, but I can't figure it out.
Let's say I have the following dataset:
test <- data.frame(A = c(1:10),
B = c(1:10), C = c(1:10),
P = c(1:10))
I want to test whether there is a column called "P" and, if so, create a new column called "Z" whose content is calculated from P.
I wrote the following code (just to try and get it to conditionally create the column, I've not tried to get it to do anything with that yet!):
Clean <- function(data) {
if("P" %in% colnames(data)) {
data$Z <- NA
}
else {
cat("doobedooo")
}
}
Clean(test)
But it doesn't seem to do anything, and I don't understand why, when simply running test$Z <- NA on the dataset does work.
I put the "doobedooo" in there, to see if it is returning a false at the first condition. It doesn't seem to be doing so.
Have I simply misunderstood how if statements work?
Answer 1:
You have to return a value from your function and then assign that value to an object. Unlike many other languages, R doesn't modify objects in place, at least not without a lot of work: data$Z <- NA inside the function changes only the function's local copy of the data frame.
Clean <- function(data) {
if("P" %in% colnames(data)) {
data$Z <- NA
} else {
cat("doobedooo")
}
return(data)
}
test <- Clean(test)
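To see why the original call appeared to do nothing, here is a minimal sketch. The helper name add_z and the formula P * 2 are hypothetical placeholders (the question does not say how Z should be calculated); the point is only that the caller's data frame stays unchanged until the returned value is assigned back.

```r
# Hypothetical helper: adds Z computed from P when P exists.
# The formula P * 2 is an arbitrary placeholder, not from the question.
add_z <- function(data) {
  if ("P" %in% colnames(data)) {
    data$Z <- data$P * 2   # changes the function's local copy only
  }
  data                     # the last value is what the caller receives
}

test <- data.frame(A = 1:10, B = 1:10, C = 1:10, P = 1:10)

discarded <- add_z(test)                   # result not assigned back to test
has_z_before <- "Z" %in% colnames(test)    # original is still unchanged

test <- add_z(test)                        # assign the returned copy back
has_z_after <- "Z" %in% colnames(test)     # now the column is there
```

Written this way, the function has no side effects on its input, which is the usual style in R: the caller decides whether to overwrite test or keep both versions.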
Answer 2:
@HongOi's answer is the direct answer to your question. Mine is the R way to deal with your problem. Since you want to create another column as a combination of others, you can use transform (or within), for example:
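if ('P' %in% colnames(test)) {
  test <- transform(test, Z = {
    ## you can put any statements here; the value of the last one becomes Z
    x <- P + 1
    x^2                # computed but discarded, since it is not assigned
    round(x / 12, 2)
  })
}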
head(test)
A B C P Z
1 1 1 1 1 0.17
2 2 2 2 2 0.25
3 3 3 3 3 0.33
4 4 4 4 4 0.42
5 5 5 5 5 0.50
6 6 6 6 6 0.58
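The within alternative mentioned above is not shown, so here is a rough equivalent as a sketch. It assumes the same test data frame as in the question and the same formula as the transform example; unlike transform, every variable assigned inside the block becomes a column, so the temporary x has to be removed explicitly.

```r
test <- data.frame(A = 1:10, B = 1:10, C = 1:10, P = 1:10)

if ("P" %in% colnames(test)) {
  test <- within(test, {
    x <- P + 1               # temporary intermediate value
    Z <- round(x / 12, 2)    # the new column
    rm(x)                    # drop x so it does not become a column too
  })
}
head(test)
```

As with transform, within returns a modified copy, so the result still has to be assigned back to test.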
Answer 3:
The previous answers already give everything you need. However, there is another way to deal with these problems. In R you can use an environment to set and add data by reference instead of return()ing the whole table (even if you change only a piece of it).
env <- new.env()
env$test <- test
system.time({
Clean <- function(data) {
if("P" %in% names(data$test)) {
data$test$Z <- NA
}
else {
cat("doobedooo")
}
}
Clean(env)
})
> env$test
A B C P Z
1 1 1 1 1 NA
2 2 2 2 2 NA
3 3 3 3 3 NA
4 4 4 4 4 NA
5 5 5 5 5 NA
6 6 6 6 6 NA
7 7 7 7 7 NA
8 8 8 8 8 NA
9 9 9 9 9 NA
10 10 10 10 10 NA
Source: https://stackoverflow.com/questions/17339995/conditionally-creating-a-new-column