Nested ifelse statement

逝去的感伤 2020-11-22 04:02

I'm still learning how to translate SAS code into R, and I get warnings. I need to understand where I'm making mistakes. What I want to do is create a variable which summ

9 Answers
  •  清歌不尽
    2020-11-22 04:21

    If you use any spreadsheet application, you will know its basic if() function, which has the syntax:

    if(<condition>, <yes>, <no>)
    

    The syntax is exactly the same for ifelse() in R:

    ifelse(<condition>, <yes>, <no>)
    

    The only difference from if() in a spreadsheet is that R's ifelse() is vectorized: it takes vectors as input and returns a vector as output. Compare the formulas in a spreadsheet and in R for an example where we test whether a > b and return 1 if it is and 0 if it is not.

    In a spreadsheet:

      A  B C
    1 3  1 =if(A1 > B1, 1, 0)
    2 2  2 =if(A2 > B2, 1, 0)
    3 1  3 =if(A3 > B3, 1, 0)
    

    In R:

    > a <- 3:1; b <- 1:3
    > ifelse(a > b, 1, 0)
    [1] 1 0 0
    

    ifelse() can be nested in many ways:

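    ifelse(<condition>, <yes>, ifelse(<condition>, <yes>, <no>))
    
    ifelse(<condition>, ifelse(<condition>, <yes>, <no>), <no>)
    
    ifelse(<condition>, 
           ifelse(<condition>, <yes>, <no>), 
           ifelse(<condition>, <yes>, <no>)
          )
    
    ifelse(<condition>, <yes>, 
           ifelse(<condition>, <yes>, 
                  ifelse(<condition>, <yes>, <no>)
                 )
           )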
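
    As a concrete illustration of the last pattern, where each "no" branch holds the next ifelse(), here is a minimal sketch that labels the sign of each number. The vector x and the labels are made up for this example.

    ```r
    # Hypothetical input vector, chosen only for illustration
    x <- c(-2, 0, 5)

    # First test: is the value negative? If not, second test: is it zero?
    # Anything that fails both tests is labelled "positive".
    sign_label <- ifelse(x < 0, "negative",
                         ifelse(x == 0, "zero", "positive"))

    print(sign_label)
    ```

    Each element of x is handled independently, so the result has the same length as x.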
    

    To calculate the column idnat2 you can use:

    df <- read.table(header=TRUE, text="
    idnat idbp idnat2
    french mainland mainland
    french colony overseas
    french overseas overseas
    foreign foreign foreign"
    )
    
    with(df, 
         ifelse(idnat=="french",
           ifelse(idbp %in% c("overseas","colony"),"overseas","mainland"),"foreign")
         )
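
    One way to check the nested ifelse() is to compare its result with the expected idnat2 column that is already in the example data. This is only a sketch: the name computed is introduced here, and the data frame is re-created so the snippet runs on its own.

    ```r
    # Same example data as above; stringsAsFactors = FALSE keeps the columns as character
    df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
    idnat idbp idnat2
    french mainland mainland
    french colony overseas
    french overseas overseas
    foreign foreign foreign")

    # 'computed' is a hypothetical name used only for this comparison
    computed <- with(df,
      ifelse(idnat == "french",
             ifelse(idbp %in% c("overseas", "colony"), "overseas", "mainland"),
             "foreign"))

    # TRUE when every computed value matches the expected column
    print(all(computed == df$idnat2))
    ```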
    

    R Documentation

    What does the warning "the condition has length > 1 and only the first element will be used" mean? (From R 4.2.0 onward the same situation is an error rather than a warning.) Let's see:

    > # What is first condition really testing?
    > with(df, idnat=="french")
    [1]  TRUE  TRUE  TRUE FALSE
    > # This is the result of a vectorized comparison: every element of idnat is
    > # tested for equality with the string "french".
    > # A vector of logical values is returned (it has the same length as idnat)
    > df$idnat2 <- with(df,
    +   if(idnat=="french"){
    +   idnat2 <- "xxx"
    +   }
    +   )
    Warning message:
    In if (idnat == "french") { :
      the condition has length > 1 and only the first element will be used
    > # Note that the first element of the comparison is TRUE, and that's why we get:
    > df
        idnat     idbp idnat2
    1  french mainland    xxx
    2  french   colony    xxx
    3  french overseas    xxx
    4 foreign  foreign    xxx
    > # There is logic to this; you just have to get used to it
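
    if() expects a single TRUE or FALSE. When what you want is one decision about the whole vector, the condition can be collapsed with any() or all() first. A minimal sketch, with idnat re-created as a plain vector so the snippet runs on its own; the names cond, has_french and all_french are introduced here for illustration.

    ```r
    idnat <- c("french", "french", "french", "foreign")

    # One logical value per element, so this has length 4
    cond <- idnat == "french"
    print(length(cond))

    # any() and all() each reduce the vector to a single TRUE/FALSE,
    # which is the kind of condition if() is designed for
    if (any(cond)) {
      has_french <- TRUE
    } else {
      has_french <- FALSE
    }

    all_french <- all(cond)
    ```

    For an element-by-element result, ifelse() remains the right tool.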
    

    Can I still use if()? Yes, you can, but the syntax is not so cool :)

    test <- function(x) {
      if(x=="french") {
        "french"
      } else{
        "not really french"
      }
    }
    
    apply(array(df[["idnat"]]),MARGIN=1, FUN=test)
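
    A possibly tidier way to call a scalar function such as test() on every element is vapply(). This is a sketch of an alternative, not the only option; idnat is re-created as a plain character vector so the snippet is self-contained.

    ```r
    test <- function(x) {
      if (x == "french") {
        "french"
      } else {
        "not really french"
      }
    }

    idnat <- c("french", "french", "french", "foreign")

    # vapply() calls test() once per element, so each if() sees a length-1 condition.
    # character(1) declares that every call must return a single string.
    res <- vapply(idnat, test, character(1), USE.NAMES = FALSE)
    print(res)
    ```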
    

    If you are familiar with SQL, you can also use a CASE statement with the sqldf package.
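
    A rough sketch of the SQL route, assuming the sqldf package is installed (by default it runs the query against a temporary SQLite copy of the data frame). The alias idnat2_sql is a name chosen here so that it does not clash with the existing idnat2 column.

    ```r
    library(sqldf)  # assumption: installed via install.packages("sqldf")

    df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
    idnat idbp idnat2
    french mainland mainland
    french colony overseas
    french overseas overseas
    foreign foreign foreign")

    # CASE evaluates its WHEN branches in order and returns the first match
    res <- sqldf("
      SELECT idnat, idbp,
             CASE
               WHEN idnat = 'french' AND idbp IN ('overseas', 'colony') THEN 'overseas'
               WHEN idnat = 'french' THEN 'mainland'
               ELSE 'foreign'
             END AS idnat2_sql
      FROM df")

    print(res)
    ```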
