Make rbindlist skip, ignore or change class attribute of the column

Submitted by 耗尽温柔 on 2019-12-23 06:52:20

Question


I would like to merge a large set of data frames (about 30), each of which has about 200 variables. These datasets are very similar but not identical.

Please find two example dataframes below:

library(data.table)
library(haven)
df1 <- fread(
    "A   B   C  iso   year   
     0   B   1  NLD   2009   
     1   A   2  NLD   2009   
     0   Y   3  AUS   2011   
     1   Q   4  AUS   2011   
     0   NA  7  NLD   2008   
     1   0   1  NLD   2008   
     0   1   3  AUS   2012",
  header = TRUE
)
df2 <- fread(
    "A   B   D  E  iso   year   
     0   1   1  NA ECU   2009   
     1   0   2  0  ECU   2009   
     0   0   3  0  BRA   2011   
     1   0   4  0  BRA   2011   
     0   1   7  NA ECU   2008   
     1   0   1  0  ECU   2008   
     0   0   3  2  BRA   2012   
     1   0   4  NA BRA   2012",
  header = TRUE
)

To recreate the error:

class(df2$B) <- "anything"

When I then run the following:

df_merged <- rbindlist(list(df1, df2), fill=TRUE, use.names=TRUE)

I get this error:

Error in rbindlist(list(df1, df2), fill = TRUE, use.names = TRUE) : 
  Class attribute on column 2 of item 2 does not match with column 2 of item 1.
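
With about 200 variables per table, it can help to list up front which shared columns disagree in class. The following is a minimal sketch, not something data.table provides: `class_mismatches` is a hypothetical helper name, and the two small tables are cut-down stand-ins for the example data above. It reports every difference in `class()`, including plain type differences (e.g. character vs. integer) that rbindlist may be able to handle on its own.

```r
library(data.table)

# Hypothetical helper (not part of data.table): for each column name, collect
# the class() of that column in every table that has it, and keep the columns
# where more than one distinct class vector shows up.
class_mismatches <- function(tables) {
  all_cols <- unique(unlist(lapply(tables, names)))
  out <- list()
  for (col in all_cols) {
    classes <- lapply(tables, function(dt) {
      if (col %in% names(dt)) class(dt[[col]]) else NULL
    })
    classes <- Filter(Negate(is.null), classes)
    # Collapse multi-element class vectors into one string per table
    sigs <- unique(vapply(classes, paste, character(1), collapse = "/"))
    if (length(sigs) > 1L) out[[col]] <- sigs
  }
  out
}

# Cut-down stand-ins for df1 and df2 (assumed data, for illustration only)
df1 <- data.table(A = c(0L, 1L), B = c("B", "A"), C = 1:2)
df2 <- data.table(A = c(0L, 1L), B = c(1L, 0L), D = 1:2)
class(df2$B) <- "anything"

mismatches <- class_mismatches(list(df1, df2))
print(mismatches)
```

Here only `B` is reported, because `A` has the same class in both tables and `C` and `D` each occur in only one table.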

What can I do to either:

  1. Make rbindlist skip the column which does not match and add some suffix.
  2. Change the class of one of the columns to the other one.

Desired result for option 1:

df_merged <- fread(
    "A   B  B.x  C  D   E   iso   year   
     0   B   NA  1  NA  NA  NLD   2009   
     1   A   NA  2  NA  NA  NLD   2009   
     0   Y   NA  3  NA  NA  AUS   2011   
     1   Q   NA  4  NA  NA  AUS   2011   
     0   NA  NA  7  NA  NA  NLD   2008   
     1   0   NA  1  NA  NA  NLD   2008   
     0   1   NA  3  NA  NA  AUS   2012   
     0   NA  1   NA  1  NA  ECU   2009   
     1   NA  0   NA  2  0   ECU   2009   
     0   NA  0   NA  3  0   BRA   2011   
     1   NA  0   NA  4  0   BRA   2011   
     0   NA  1   NA  7  NA  ECU   2008   
     1   NA  0   NA  1  0   ECU   2008   
     0   NA  0   NA  3  2   BRA   2012   
     1   NA  0   NA  4  NA  BRA   2012",
   header = TRUE
)

Desired result for option 2:

df_merged <- fread(
    "A   B   C  D   E   iso   year   
     0   3   1  NA  NA  NLD   2009   
     1   4   2  NA  NA  NLD   2009   
     0   5   3  NA  NA  AUS   2011   
     1   5   4  NA  NA  AUS   2011   
     0   0   7  NA  NA  NLD   2008   
     1   1   1  NA  NA  NLD   2008   
     0   1   3  NA  NA  AUS   2012   
     0   1   NA  1  NA  ECU   2009   
     1   0   NA  2  0   ECU   2009   
     0   0   NA  3  0   BRA   2011   
     1   0   NA  4  0   BRA   2011   
     0   1   NA  7  NA  ECU   2008   
     1   0   NA  1  0   ECU   2008   
     0   0   NA  3  2   BRA   2012   
     1   0   NA  4  NA  BRA   2012",
   header = TRUE
)

Answer 1:


I am having the same issue and have not yet found a solution. The data.table package was recently updated (v1.12.2, April 7, 2019). I believe this update is what causes the problem, which would also explain why some people say the code works fine for them: presumably they are still running an older version. See new features 4 and 5 under v1.12.2 in the link below.

data.table news
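
One possible workaround along the lines of option 2 in the question, given as a sketch rather than a confirmed fix: coerce the offending column to a plain vector before binding, so that no custom class attribute is left to conflict. It relies on the fact that `as.character()` returns a bare character vector without the original attributes. The small tables are assumed stand-ins for the example data, and converting to character is only one choice; it discards whatever meaning the custom class carried.

```r
library(data.table)

# Cut-down stand-ins for df1 and df2 (assumed data, for illustration only)
df1 <- data.table(A = c(0L, 1L), B = c("B", "A"), C = 1:2)
df2 <- data.table(A = c(0L, 1L), B = c(1L, 0L), D = 1:2)
class(df2$B) <- "anything"   # reproduce the mismatch from the question

# Replace B with a plain character vector; as.character() drops the
# custom class attribute along with any other attributes.
df2[, B := as.character(B)]

# B is now an ordinary character column in both tables
df_merged <- rbindlist(list(df1, df2), fill = TRUE, use.names = TRUE)
print(df_merged)
```

For many tables, the same coercion could be applied in a loop over every column whose class differs between tables, before calling rbindlist once on the whole list.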



Source: https://stackoverflow.com/questions/55706560/make-rbindlist-skip-ignore-or-change-class-attribute-of-the-column