Conditionally creating a new column

Submitted by 不羁的心 on 2020-01-05 07:47:13

Question


I am fairly certain this is a really obvious question, but I can't figure it out.

Let's say I have the following dataset:

test <- data.frame(A = c(1:10),
              B = c(1:10), C = c(1:10),
              P = c(1:10))

I want to test whether there is a column called "P" and, if so, create a new column called "Z" whose content is calculated from P.

I wrote the following code (just to try and get it to conditionally create the column, I've not tried to get it to do anything with that yet!):

Clean <- function(data) {
  if ("P" %in% colnames(data)) {
    data$Z <- NA
  } else {
    cat("doobedooo")
  }
}
Clean(test)

But it doesn't seem to do anything, and I don't understand why, when simply running test$Z <- NA on the dataset does work. I put the "doobedooo" in there, to see if it is returning a false at the first condition. It doesn't seem to be doing so.

Have I simply misunderstood how if statements work?


Answer 1:


You have to return a value from your function and then assign that value to an object. Unlike many other languages, R does not modify function arguments in place: data$Z <- NA changes only the function's local copy of the data frame, and that copy is discarded when the function exits.

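Clean <- function(data) {
    if ("P" %in% colnames(data)) {
        data$Z <- NA
    } else {
        cat("doobedooo")
    }
    # return the (possibly modified) local copy
    return(data)
}
# reassign so that test refers to the modified data frame
test <- Clean(test)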
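To make the copy behaviour concrete, here is a minimal sketch. The helper name add_z and the formula P * 2 are placeholders chosen for illustration, not part of the original question. It shows that calling the function without assigning its result leaves the caller's data frame untouched.

```r
# Hypothetical helper: adds Z only when P exists.
# The formula P * 2 is a placeholder for "some content calculated from P".
add_z <- function(data) {
  if ("P" %in% colnames(data)) {
    data$Z <- data$P * 2
  }
  data  # the last evaluated expression is the function's return value
}

test <- data.frame(A = 1:10, B = 1:10, C = 1:10, P = 1:10)

discarded <- add_z(test)                  # result stored elsewhere; test is untouched
has_z_before <- "Z" %in% colnames(test)   # FALSE

test <- add_z(test)                       # reassign to keep the modified copy
has_z_after <- "Z" %in% colnames(test)    # TRUE
```

A data frame without a P column comes back unchanged, so the same call is safe on any input.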



Answer 2:


@HongOi's answer is the direct answer to your question. Mine is the R way to deal with your problem. Since you want to create another column as a combination of existing ones, you can use transform (or within), for example:

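if ('P' %in% colnames(test)) {
    test <- transform(test, Z = {
        # any statements can go here; the value of the last one becomes Z
        x <- P + 1
        x^2               # computed but discarded, since it is not the last expression
        round(x / 12, 2)
    })
}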

 head(test)
  A B C P    Z
1 1 1 1 1 0.17
2 2 2 2 2 0.25
3 3 3 3 3 0.33
4 4 4 4 4 0.42
5 5 5 5 5 0.50
6 6 6 6 6 0.58
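The answer mentions within as an alternative but does not show it. A rough equivalent of the transform call might look like the sketch below. It assumes the same test data frame as in the question. Every variable assigned inside within becomes a column, so the temporary x is removed explicitly.

```r
test <- data.frame(A = 1:10, B = 1:10, C = 1:10, P = 1:10)

if ("P" %in% colnames(test)) {
  test <- within(test, {
    x <- P + 1              # temporary helper variable
    Z <- round(x / 12, 2)   # the new column
    rm(x)                   # drop the helper so it does not become a column
  })
}

head(test)
```

Like transform, within returns a modified copy, so the result still has to be assigned back to test.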



Answer 3:


The previous answers already give everything you need. However, there is another way to handle this kind of problem. In R you can use an environment to set and add data by reference, instead of return()ing the whole table (even if you changed only a piece of it).

env <- new.env()
env$test <- test

system.time({
Clean <- function(data) {
  if("P" %in% names(data$test)) {        
    data$test$Z <- NA
  }
  else {
    cat("doobedooo")
  }
}
Clean(env)
})

> env$test
    A  B  C  P  Z
1   1  1  1  1 NA
2   2  2  2  2 NA
3   3  3  3  3 NA
4   4  4  4  4 NA
5   5  5  5  5 NA
6   6  6  6  6 NA
7   7  7  7  7 NA
8   8  8  8  8 NA
9   9  9  9  9 NA
10 10 10 10 10 NA
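As a small sketch of what "by reference" means here: the names store and add_z_in_place, and the formula P + 1, are made up for illustration. The function below changes the data frame held in the environment without returning anything. The original test object outside the environment is a separate copy and stays as it was.

```r
test <- data.frame(A = 1:10, B = 1:10, C = 1:10, P = 1:10)

store <- new.env()     # hypothetical container environment
store$test <- test     # the environment holds its own binding to the data

# Hypothetical helper: modifies env$test where it lives, returns nothing useful.
add_z_in_place <- function(env) {
  if ("P" %in% names(env$test)) {
    env$test$Z <- env$test$P + 1   # placeholder calculation from P
  }
  invisible(NULL)
}

add_z_in_place(store)

in_env   <- "Z" %in% names(store$test)   # TRUE: changed through the environment
in_global <- "Z" %in% names(test)        # FALSE: the original object is unchanged
```

This works because environments are not copied when passed to a function, whereas data frames behave as if they were.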


Source: https://stackoverflow.com/questions/17339995/conditionally-creating-a-new-column
