How to combine different .csv files to one complete file by adding the data of every file as an additional row using R?

天涯浪子 提交于 2019-12-12 17:42:36

问题


I have several different folders which all contain one single .csv file. All of those .csv files have one single column containing the data of one condition of an experiment. I would like to merge those .csv files in such a way that the data of every file is added as a new column.

At the moment, it Looks somehow like this:

C1.csv
102
106
152
196
223
486
553

C2.csv
296
299
843
1033
1996

However, it would like to have one single .csv file, where all the separate files are copied into a new column containing the name of the source file, like:

C1     C2     ...    Cn
102    296    ...    ...
106    299    ...
152    843    ...
196    1033   ...
223    1996   ...
486           ...
553           ...

So far, I the following code:

myFiles = list.files(path = ".", recursive = TRUE, pattern = ".csv", full.names = TRUE)
data <- lapply(myFiles, read.table, sep="\t", header=FALSE)
Max <- max(sapply(data, length))
data <- lapply(data, function(x) c(x, rep(NA, Max - length(x))))
data <- do.call(cbind, data)
names(data) <- sub("^[^[:alnum:]]*([[:alnum:]]+)\\.csv$", "\\1", myFiles)

write.csv(data, "outfile.csv")

It yielded a document that looks like this instead of adding the data of every .csv file in a new column:

enter image description here


回答1:


One can read all files using read.table in a list. Combine all data using dplyr::bind_rows. Afterwards, use reshape2::dcast to spread data in wide format with a column for data from every file.

# Get list of files in directory
fileList <- list.files(".", "*.csv", full.names = TRUE)

# Read file data. This will generate a list containing dataframes
listData <- lapply(fileList, read.table)

# Name list using name of files
names(listData) <- gsub(".csv","",basename(fileList))

library(tidyverse)
library(reshape2)

bind_rows(listData, .id = "FileName") %>%
  group_by(FileName) %>%
  mutate(rowNum = row_number()) %>%
  dcast(rowNum~FileName, value.var = "V1") %>%
  select(-rowNum) %>%
  write.csv(file="Result.csv")

# Content of Result.csv
# "","C1","C2"
# "1",102,296
# "2",106,299
# "3",152,843
# "4",196,1033
# "5",223,1996
# "6",486,NA
# "7",553,NA



回答2:


Is this what you want?
Note that I read the files in with scan. Since the files have only one column there is no need for a complex function like read.csv.

myFiles <- list.files(path = ".", pattern = "^C.*\\.csv", full.names = TRUE, recursive = TRUE)
data <- lapply(myFiles, scan)
Max <- max(sapply(data, length))
data <- lapply(data, function(x) c(x, rep(NA, Max - length(x))))
data <- do.call(cbind, data)
names(data) <- sub("^[^[:alnum:]]*([[:alnum:]]+)\\.csv$", "\\1", myFiles)

write.csv(data, "outfile.csv")

The contents of "outfile.csv" are

"","V1","V2"
"1",102,296
"2",106,299
"3",152,843
"4",196,1033
"5",223,1996
"6",486,NA
"7",553,NA


来源:https://stackoverflow.com/questions/50897193/how-to-combine-different-csv-files-to-one-complete-file-by-adding-the-data-of-e

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!