I have a very strange problem concerning the ifelse function: it does not return a factor (as I want) but something like the position of the factor.
The dataset I u
The field answer is a factor, so ifelse returns numbers (the integer codes of the factor levels) instead of the labels.
What you need to do is:
aDDs$answer <- as.character(aDDs$answer)
and then it works.
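A minimal sketch of this fix, assuming a small stand-in for aDDs and temp (the original dataset is not shown here, so the values below are hypothetical):

```r
# Hypothetical stand-in data; the real aDDs and temp are not shown in the thread.
aDDs <- data.frame(answer = factor(c("yes", "no", "maybe", "yes")))
temp <- c("yes", "no")

# Convert the factor to character so ifelse sees labels, not integer codes.
aDDs$answer <- as.character(aDDs$answer)

aDDs$top <- ifelse(aDDs$answer %in% temp, aDDs$answer, "Other")
aDDs$top
# [1] "yes"   "no"    "Other" "yes"
```

Note that the result is a character vector; wrap it in factor() if a factor is needed afterwards.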
Modify your ifelse call as follows:
aDDs$top <- ifelse(
  aDDs$answer %in% temp,             ## condition: match aDDs$answer with row.names in summary df
  levels(aDDs$answer)[aDDs$answer],  ## then: use the level label of aDDs$answer (not its integer code)
  "Other"                            ## else: name it "Other"
)
Notice the levels function and the square brackets. levels returns the character vector of the factor's level labels, and indexing that vector with the factor itself uses the factor's integer codes. So, essentially, we are asking for the level label that corresponds to each code.
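To see what the indexing does, here is a small sketch with a hypothetical factor f (not part of the original question):

```r
# Hypothetical example factor; levels are sorted alphabetically by default.
f <- factor(c("India", "USA", "UK"))

levels(f)       # the labels:        "India" "UK" "USA"
as.integer(f)   # the integer codes: 1 3 2
levels(f)[f]    # labels looked up by code: "India" "USA" "UK"
```

For a factor without missing values, levels(f)[f] gives the same result as as.character(f).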
Sample demo:
topCountries <- as.factor(c("India", "USA", "UK"))
AllCountries <- as.factor(c("India", "USA", "UK", "China", "Brazil"))
myData <- data.frame(AllCountries)
myData
myData$top <- ifelse(
  myData$AllCountries %in% topCountries,
  levels(myData$AllCountries)[myData$AllCountries],
  "Other"
)
myData
The top column in myData will contain "Other" for China and Brazil. For rows where AllCountries is one of {India, USA, UK}, it will contain the respective country name. Without levels, it would contain "Other" for China and Brazil and the integer factor codes (as strings) for {India, USA, UK}.
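For comparison, a short sketch of the same demo without levels, showing the codes that come back instead of the names (the variable name without_levels is only for illustration):

```r
AllCountries <- factor(c("India", "USA", "UK", "China", "Brazil"))
topCountries <- c("India", "USA", "UK")

# Passing the factor directly as `yes` yields its integer codes.
# Levels sort as Brazil, China, India, UK, USA, so India = 3, USA = 5, UK = 4.
without_levels <- ifelse(AllCountries %in% topCountries, AllCountries, "Other")
without_levels
# [1] "3"     "5"     "4"     "Other" "Other"
```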
That's because you have a factor:
ifelse(c(TRUE, FALSE), factor(c("a", "b")), "other")
#[1] "1" "other"
Read the warning in help("ifelse"):
The mode of the result may depend on the value of test (see the examples), and the class attribute (see oldClass) of the result is taken from test and may be inappropriate for the values selected from yes and no.
Sometimes it is better to use a construction such as (tmp <- yes; tmp[!test] <- no[!test]; tmp), possibly extended to handle missing values in test.
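Applied to a factor, that construction might look like the sketch below. The data are hypothetical, and the extra step of adding an "Other" level is an assumption on my part: assigning a value that is not an existing level would otherwise produce NA with a warning.

```r
# Hypothetical data standing in for the real column.
answer <- factor(c("a", "b", "c", "a"))
test <- answer %in% c("a", "b")

tmp <- answer
levels(tmp) <- c(levels(tmp), "Other")  # make "Other" a valid level first
tmp[!test] <- "Other"                   # replace values where test is FALSE
tmp <- droplevels(tmp)                  # drop levels that are no longer used
tmp
```

Unlike ifelse, this keeps the result as a factor, which is what the question asked for.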