I\'m still learning how to translate a SAS code into R and I get warnings. I need to understand where I\'m making mistakes. What I want to do is create a variable which summ
If you are using any spreadsheet application there is a basic function if() with syntax:
if(, , )
Syntax is exactly the same for ifelse() in R:
ifelse(, , )
The only difference to if() in spreadsheet application is that R ifelse() is vectorized (takes vectors as input and return vector on output). Consider the following comparison of formulas in spreadsheet application and in R for an example where we would like to compare if a > b and return 1 if yes and 0 if not.
In spreadsheet:
A B C
1 3 1 =if(A1 > B1, 1, 0)
2 2 2 =if(A2 > B2, 1, 0)
3 1 3 =if(A3 > B3, 1, 0)
In R:
> a <- 3:1; b <- 1:3
> ifelse(a > b, 1, 0)
[1] 1 0 0
ifelse() can be nested in many ways:
ifelse(, , ifelse(, , ))
ifelse(, ifelse(, , ), )
ifelse(,
ifelse(, , ),
ifelse(, , )
)
ifelse(, ,
ifelse(, ,
ifelse(, , )
)
)
To calculate column idnat2 you can:
df <- read.table(header=TRUE, text="
idnat idbp idnat2
french mainland mainland
french colony overseas
french overseas overseas
foreign foreign foreign"
)
with(df,
ifelse(idnat=="french",
ifelse(idbp %in% c("overseas","colony"),"overseas","mainland"),"foreign")
)
R Documentation
What is the condition has length > 1 and only the first element will be used? Let's see:
> # What is first condition really testing?
> with(df, idnat=="french")
[1] TRUE TRUE TRUE FALSE
> # This is result of vectorized function - equality of all elements in idnat and
> # string "french" is tested.
> # Vector of logical values is returned (has the same length as idnat)
> df$idnat2 <- with(df,
+ if(idnat=="french"){
+ idnat2 <- "xxx"
+ }
+ )
Warning message:
In if (idnat == "french") { :
the condition has length > 1 and only the first element will be used
> # Note that the first element of comparison is TRUE and that's whay we get:
> df
idnat idbp idnat2
1 french mainland xxx
2 french colony xxx
3 french overseas xxx
4 foreign foreign xxx
> # There is really logic in it, you have to get used to it
Can I still use if()? Yes, you can, but the syntax is not so cool :)
test <- function(x) {
if(x=="french") {
"french"
} else{
"not really french"
}
}
apply(array(df[["idnat"]]),MARGIN=1, FUN=test)
If you are familiar with SQL, you can also use CASE statement in sqldf package.