I would like to reassign 128 column classes with a list/vector of column classes?

Posted by 自作多情 on 2019-12-25 00:44:13

Question


I can't find what I need in other posts. Essentially:

  1. I need to reorder the columns of the data.table I read in (I can't pass the column classes to the fread call because my columns are out of order).
  2. I need to change the column classes to the ones listed below.

Most of the other posts are about converting every column of one class to another class:

Change the class of many columns in a data frame

Convert column classes in data.table

I believe my problem is different because there is no "change all factors to characters" rule. Each column has a specific target class, known ahead of time, that it must be converted to.

I have my column names in a vector called selectColumns that I pass to fread.

selectColumns <- c(giantListofColumnsGoesHere)
DT <- fread("DT.csv", select=selectColumns, na.strings=NAsList)

setcolorder(DT, selectColumns)
colClasses <- list('character','character','character','factor','numeric','character','numeric','integer','integer','integer','integer','numeric','numeric','factor','factor','factor','logical','integer','numeric','factor','integer','integer','integer','factor','factor','factor','factor','factor','integer','integer','factor','integer','factor','factor','integer','factor','numeric','factor','numeric','character','factor','factor','factor','factor','factor','factor','factor','factor','factor','factor','integer','factor','numeric','factor','factor','character','factor','factor','factor','integer','numeric','integer','integer','integer','integer','integer','factor','character','factor','factor','factor','factor','integer','factor','factor','character','integer','integer','integer','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical')

#Now the part I can't figure out, I've tried:
lapply(DT, class) <- colClasses
#OR
attr(DT, class) <- colClasses
#Obviously attr(DT, class) just gives "data.table" "data.frame"

I think I need to reach the class attribute of each individual column rather than that of the DT itself, but I'm not great with lists and can't figure out how. I'm sorry if this is too easy a question or has essentially been answered already, but I'm lost, and there is usually an easy way to do this kind of thing.

I'm sorry I can't give data, because it contains private information.

Thanks for any help everyone.


Answer 1:


If the OP forgot to use colClasses inside fread, or there is some technical difficulty in using it, and wants to change the column classes of the data.table after reading, using set is an option:

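for (j in seq_along(selectColumns)) {
     # look up the conversion function named in colClasses (e.g. "as.numeric")
     convert <- get(colClasses[[j]], mode = "function")
     # replace the whole column (i = NULL) by reference with the converted values
     set(DT, i = NULL, j = selectColumns[j], value = convert(DT[[selectColumns[j]]]))
 }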

str(DT)
#Classes ‘data.table’ and 'data.frame':  5 obs. of  6 variables:
#$ V1: num  1 2 3 4 5
#$ V2: chr  "A" "B" "C" "D" ...
#$ V3: int  1 2 3 4 5
#$ V4: chr  "F" "G" "H" "I" ...
#$ V5: chr  "G" "H" "I" "J" ...
#$ V6: Factor w/ 5 levels "6","7","8","9",..: 1 2 3 4 5

Note that the initial classes of the columns in "selectColumns" were:

str(DT)
#Classes ‘data.table’ and 'data.frame':  5 obs. of  6 variables:
#$ V1: int  1 2 3 4 5
#$ V2: chr  "A" "B" "C" "D" ...
#$ V3: num  1 2 3 4 5
#$ V4: chr  "F" "G" "H" "I" ...
#$ V5: chr  "G" "H" "I" "J" ...
#$ V6: int  6 7 8 9 10
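On the first case mentioned at the top of this answer: as far as I know, fread's colClasses also accepts a named character vector keyed by column name, in which case the order of the columns in the file should not matter. A minimal sketch under that assumption follows; the temporary file, the column names a, b, c and the name wantedClasses are all made up for illustration, and behaviour may differ between data.table versions.

```r
library(data.table)

# Write a small CSV whose column order differs from the order we want
csvPath <- tempfile(fileext = ".csv")
fwrite(data.table(b = c("x", "y"), a = 1:2, c = c(0.5, 1.5)), csvPath)

selectColumns <- c("a", "b", "c")

# Classes keyed by column name (hypothetical names), so file order is irrelevant
wantedClasses <- c(a = "numeric", b = "factor", c = "character")

DT <- fread(csvPath, select = selectColumns, colClasses = wantedClasses)

# Put the columns in the requested order, as in the question
setcolorder(DT, selectColumns)
str(DT)
```

If this works with the installed version, the conversion loop is not needed at all; otherwise the set approach above remains the fallback.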

data

 library(data.table)
 DT <- data.table(V1 = 1:5, V2 = LETTERS[1:5], V3 = as.numeric(1:5),
           V4 = LETTERS[6:10], V5 = LETTERS[7:11], V6 = 6:10)
 colClasses <- paste0("as.", c("numeric", "integer", "factor"))
 selectColumns <- c("V1", "V3", "V6")
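In the question, colClasses is a list of bare class names such as 'character' or 'factor' rather than function names. A minimal sketch of bridging the two follows. It assumes every class name has a matching as.<class> function (true for character, factor, numeric, integer and logical); the names wanted and converters and the toy table are illustrative, not from the original post.

```r
library(data.table)

# Toy table standing in for the real data (illustrative only)
DT <- data.table(V1 = 1:5, V3 = as.numeric(1:5), V6 = 6:10)
selectColumns <- c("V1", "V3", "V6")

# Bare class names in a list, as in the question
wanted <- list("numeric", "integer", "factor")

# Build the converter names: "as.numeric", "as.integer", "as.factor"
converters <- paste0("as.", unlist(wanted))

for (j in seq_along(selectColumns)) {
  convert <- match.fun(converters[j])  # look up the function by name
  set(DT, j = selectColumns[j], value = convert(DT[[selectColumns[j]]]))
}

# First class of each column after conversion, checked against the request
newClasses <- vapply(DT, function(x) class(x)[1L], character(1))
stopifnot(identical(unname(newClasses), unlist(wanted)))
```

The final stopifnot is a cheap guard worth keeping with 128 columns, since a single misaligned entry would otherwise go unnoticed.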

NOTE: The as. prefix was added to the entries of "colClasses" so that each entry names a conversion function. When converting 'factor' to 'numeric', do it in two steps, i.e. first convert to 'character' and then to 'numeric', because as.numeric on a factor returns the internal level codes rather than the labels (based on @Frank's suggestion in the comments).
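The factor caveat is easy to see on a small vector. The sketch below uses a made-up factor f; nothing in it comes from the OP's data.

```r
# A factor whose labels look like numbers (illustrative values)
f <- factor(c("10", "20", "10", "30"))

# Direct conversion returns the internal level codes, not the labels
codes <- as.numeric(f)
print(codes)   # 1 2 1 3

# Two-step conversion recovers the values shown in the labels
values <- as.numeric(as.character(f))
print(values)  # 10 20 10 30
```

In the conversion loop, this would mean applying as.character before as.numeric for any column that is currently a factor.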



Source: https://stackoverflow.com/questions/36732331/i-would-like-to-reassign-128-column-classes-with-a-list-vector-of-column-classes
