I have a data set in which some elements are NA. I want to perform clustering without removing the rows that contain NA.
Using as.numeric may help in this case, but I think the original question points to a bug in the daisy function. Specifically, it contains the following code:
if (any(ina <- is.na(type3)))
stop(gettextf("invalid type %s for column numbers %s",
type2[ina], pColl(which(is.na))))
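The intended error message is never printed, because which(is.na) is wrong: is.na there is the function itself, not the logical vector ina, so which() fails before the message can be built. It should be which(ina).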
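The difference between the two calls can be seen outside of daisy with a minimal sketch. The vector type3 below is a made-up stand-in for the lookup result inside the function, not the package's real data; only the names ina and type3 come from the quoted code.

```r
# Hypothetical stand-in for the type lookup inside daisy():
# an NA marks a column whose type was not recognised.
type3 <- c(1L, NA, 2L)
ina <- is.na(type3)

# Intended call: positions of the unrecognised columns.
bad_cols <- which(ina)

# Buggy call: is.na is a function, not a logical vector,
# so which() raises its own error instead of returning positions.
buggy <- tryCatch(which(is.na), error = function(e) conditionMessage(e))

print(bad_cols)
print(buggy)
```

Here bad_cols holds the offending column positions, while buggy holds the text of the unrelated error that hides the intended message.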
I guess I should find out where / how to submit this bug now.
The error is due to the presence of non-numeric variables in the data (numbers encoded as strings). You can convert them to numbers:
mydata <- apply( mtcars, 2, as.numeric )
d <- distfunc(mydata)
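As a rough, self-contained sketch of the same idea with NA values kept in place: the data frame raw below is invented for illustration, and metric = "gower" is an assumption about what distfunc does, since its definition is not shown here.

```r
library(cluster)  # provides daisy()

# Hypothetical data: numbers stored as strings, with one missing value.
raw <- data.frame(a = c("1", "2", NA, "4"),
                  b = c("10", "20", "30", "40"),
                  stringsAsFactors = FALSE)

# Convert every column to numeric while keeping the data frame structure.
num <- as.data.frame(lapply(raw, as.numeric))

# daisy() leaves a missing value out of the pairwise comparison for that
# variable only, so the row containing NA is still part of the result.
d <- daisy(num, metric = "gower")

# The dissimilarities can then be passed to a clustering routine.
hc <- hclust(d)
```

Using lapply instead of apply keeps the result a data frame; whether that matters depends on what the distance function expects.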