How to reproduce all column names when producing a table to cross reference column names and datatypes from multiple dbfs in R

这一生的挚爱 submitted on 2021-01-29 19:39:12

Question


This is a follow-up question to Implementing lists in a for loop in R to produce a table of column names and datatypes from multiple dbfs.

I'm trying to extract the column names and associated datatypes from a number of dbfs and put the results into a table that cross-references which column names and datatypes appear in which dbfs. The dbfs have different numbers of columns, so I've used rbind and lapply to pad the shorter lists with NULL in the resulting table. The script works to an extent, but the column names are kept only from the first dbf. When new column names appear, their data is added to the table, but the columns are named V35, V36, etc. instead of the actual column names.

library(foreign)
files <- list.files("path/", full.names = TRUE, pattern = "\\.dbf$") #List files (pattern is a regular expression, so the dot is escaped)

#Get column names and datatypes from dbfs and put into list
ColnamesDTList <- list()
for (i in 1:14){
  dbfs <- read.dbf(files[i])
  ColnamesDT <- lapply(dbfs,class)
  ColnamesDTList[[i]] <- ColnamesDT
}

maxLength <- max(lengths(ColnamesDTList)) #Get max length of the lists in ColnamesDTList

#Create a df from the lists in ColnamesDTList, with equal length columns
ColnamesDTDf <- as.data.frame(do.call(rbind, lapply(ColnamesDTList, `length<-`, maxLength)))

#Rename rows
years <- 2005:2018
row.names(ColnamesDTDf) <- paste0("dbf", years)

This produces a table like this:

        cname1  cname2  cname3  V4      V5
dbf2005 factor  factor  numeric NULL    NULL
dbf2006 numeric factor  NULL    factor  numeric

So instead of showing the actual column names from dbf2006, the table gives them the generic 'V' plus the column position. How can I get the table to include the column names from dbf2006?


Answer 1:


I found a much simpler solution using the compare_df_cols() function in the janitor package.
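
The answer does not show the call itself, so here is a minimal sketch of how compare_df_cols() might be used. The two small data frames are hypothetical stand-ins for what read.dbf() would return, and the names dbfList and colClasses are made up for illustration.

```r
library(janitor)

# Hypothetical stand-ins for the data frames that read.dbf() would return
dbf2005 <- data.frame(cname1 = factor("a"), cname2 = factor("b"), cname3 = 1.5)
dbf2006 <- data.frame(cname1 = 2.5, cname2 = factor("c"),
                      cname4 = factor("d"), cname5 = 3.5)

# A named list, so each data frame gets a labelled column in the result
dbfList <- list(dbf2005 = dbf2005, dbf2006 = dbf2006)

# One row per column name found in any input, one column per data frame,
# with NA where a data frame lacks that column
colClasses <- compare_df_cols(dbfList)
print(colClasses)
```

With the real files, the list could presumably be built with lapply(files, read.dbf) and named with paste0("dbf", 2005:2018). The result is transposed relative to the table in the question: column names run down the rows and the dbfs run across the columns. Because compare_df_cols() matches columns by name, it also avoids a second problem in the rbind approach, which binds the padded lists by position and so can place different columns from different dbfs under the same heading.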



Source: https://stackoverflow.com/questions/64589397/how-to-reproduce-all-column-names-when-producing-a-table-to-cross-reference-colu
