Importing csv file into R - numeric values read as characters

逝去的感伤 2020-12-01 01:26

I am aware that there are similar questions on this site; however, none of them seems to answer my question sufficiently.

This is what I have done so far:

I

6 answers
  •  情深已故
    2020-12-01 01:55

    If you're dealing with large datasets (i.e. datasets with many columns), the solution noted above is cumbersome to apply by hand and requires you to know a priori which columns are numeric.

    Try this instead.

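    # Read every column as character so nothing is coerced on import.
    char_data <- read.csv(input_filename, colClasses = "character")
    # Try to convert each column to numeric; entries that fail to parse become NA.
    num_data <- data.frame(lapply(char_data, function(x) suppressWarnings(as.numeric(x))))
    # Treat a column as numeric if fewer than 50% of its values are NA after conversion.
    numeric_columns <- sapply(num_data, function(x) mean(is.na(x)) < 0.5)
    final_data <- data.frame(num_data[, numeric_columns, drop = FALSE], char_data[, !numeric_columns, drop = FALSE])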
    

    The code does the following:

    1. Imports your data as character columns.
    2. Creates an instance of your data as numeric columns.
    3. Identifies which columns from your data are numeric (assuming columns with less than 50% NAs upon converting your data to numeric are indeed numeric).
    4. Merges the numeric and character columns into a final dataset.

    This essentially automates the import of your .csv file by preserving the data types of the original columns (as character and numeric).
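
    To see the 50% rule in action, here is a minimal, self-contained sketch on made-up data. The helper name `convert_mostly_numeric`, its `max_na` parameter, and the sample CSV are all hypothetical and only serve as illustration. Unlike the code above, this variant writes the converted columns back in place, so the original column order is kept.

    ```r
    # Hypothetical helper: a column becomes numeric when fewer than `max_na`
    # of its values turn into NA under as.numeric(); other columns are left
    # as character.
    convert_mostly_numeric <- function(df, max_na = 0.5) {
      converted <- lapply(df, function(x) suppressWarnings(as.numeric(x)))
      is_num <- vapply(converted, function(x) mean(is.na(x)) < max_na, logical(1))
      df[is_num] <- converted[is_num]
      df
    }

    # Made-up sample data: "price" contains one stray non-numeric entry.
    csv_text <- "id,price,city\n1,10.5,Paris\n2,n/a,Berlin\n3,7.25,Rome"

    raw <- read.csv(text = csv_text, colClasses = "character")
    clean <- convert_mostly_numeric(raw)
    str(clean)
    ```

    In this sample, `id` and `price` should come out numeric (with the unparseable `price` entry as `NA`), while `city` stays character because none of its values parse as numbers.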
