Finding non-numeric data in an R data frame or vector

萌比男神i 2020-12-05 19:00

I have read in some lengthy data with read.csv(), and to my surprise the data is coming out as factors rather than numbers, so I'm guessing there must be at least one non-n

2 Answers
  • df <- data.frame(c(1,2,3,4,"five",6,7,8,"nine",10), stringsAsFactors = TRUE)
    

    The trick is that converting to numeric via as.numeric(as.character(.)) turns non-numbers into NA (with a warning).

    which(is.na(as.numeric(as.character(df[[1]]))))
    ## 5 9
    

    (Calling as.numeric(df[[1]]) directly doesn't work on a factor: it discards the level labels and returns the underlying integer codes.)

    You might choose to suppress the warnings:

    which.nonnum <- function(x) {
       which(is.na(suppressWarnings(as.numeric(as.character(x)))))
    }
    which.nonnum(df[[1]])
    

    To be more careful, you should also check that the values weren't NA before conversion:

    which.nonnum <- function(x) {
       badNum <- is.na(suppressWarnings(as.numeric(as.character(x))))
       which(badNum & !is.na(x))
    }
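
    To see why the extra !is.na(x) check matters, here is a minimal sketch on a hypothetical factor (the sample values are assumptions, not from the question) that contains two non-numeric entries and one value that was already missing. It also shows the level-code pitfall and how to pull out the offending values rather than just their positions.

    ```r
    # Hypothetical sample: two bad entries ("five", "nine") and one genuine NA.
    x <- factor(c("1", "2", NA, "4", "five", "6", "nine"))

    # Pitfall: on a factor this returns the internal level codes,
    # not the numbers the labels spell out.
    as.numeric(x)

    # Same function as above, repeated so this block runs on its own.
    which.nonnum <- function(x) {
      badNum <- is.na(suppressWarnings(as.numeric(as.character(x))))
      which(badNum & !is.na(x))
    }

    bad <- which.nonnum(x)
    bad
    ## 5 7
    as.character(x[bad])
    ## "five" "nine"
    ```

    Position 3 is not reported because it was missing before the conversion, so only entries that failed to parse are flagged.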
    
  • 2020-12-05 19:30

    An alternative is to check which entries contain any character other than a digit:

    df <- data.frame(c(1,2,3,4,"five",6,7,8,"nine",10))
    # anchored at both ends, so every character must be a digit
    which(!grepl('^[0-9]+$', df[[1]]))
    ## 5 9 
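
    A pattern built only from [0-9] cannot describe signed, decimal, or exponent-notation values, so real data may need something stricter. The following is a rough sketch, assuming a hypothetical character vector and a hand-written pattern (num_pattern is an illustrative name, not a built-in); it is not a complete description of everything as.numeric() accepts.

    ```r
    # Hypothetical sample values, kept as character to mimic unparsed input.
    x <- c("1", "-2.5", "3e4", "12abc", "five", " 7 ", "")

    # Assumed pattern: optional sign, digits with an optional decimal part
    # (or a leading-dot decimal), then an optional exponent.
    num_pattern <- "^[+-]?([0-9]+\\.?[0-9]*|\\.[0-9]+)([eE][+-]?[0-9]+)?$"

    # trimws() strips surrounding whitespace before matching.
    is_num <- grepl(num_pattern, trimws(x))
    which(!is_num)
    ## 4 5 7
    ```

    Note that as.numeric() also parses forms such as "Inf" and hexadecimal strings like "0x1A", so the regex and the conversion-based check can disagree on edge cases; the conversion-based check is the safer reference for what R itself will accept.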
    