To stack up results in one masterfile in R

Question

Using this script, I created a separate folder for each csv file and saved all further analysis results for that file in it. Each folder has the same name as its csv file. The csv files themselves are stored in the main (master) directory. I have now created a csv file in each of these folders containing a list of all the fitted values.

I would now like to do the following:

1. Set the working directory to the folder named after the file
2. Read the fitted values file
3. Add a row/column stating the name of the site (the unique ID)
4. Add it to the master file, which is stored in the main directory, with a title specifying the site name/filename. It can be stacked by rows or by columns; it doesn't really matter.
5. Return to the main directory to pick the next file
6. Repeat the loop

Using merge(), rbind() or cbind() combines all the data under one column name. I want to keep the sites separate so that I can compare them at a later stage.

This is what I'm using at the moment and I'm lost on how to proceed further.

setwd("path")   # main directory
path <- "path"  # kept for convenience when switching back to the main directory

# list all csv files in the main directory as a character vector
# (the pattern argument is a regular expression, not a glob)
files <- list.files(path = path, pattern = "\\.csv$")

for (i in seq_along(files)) {

      fileName <- read.csv(files[i])   # read the site's csv file from the main directory
      base <- sub("\\.csv$", "", files[i])   # filename without the extension
      setwd(file.path(path, base))   # set the working directory to the folder of the same name
      master <- read.csv(paste0(base, "_fiited_values curve.csv"))   # paste0() avoids the space that paste() inserts
    # read the fitted value csv file for the site and store it in a list
      setwd(path)   # return to the main directory before picking the next file
    }

I want to construct a for loop that builds one master file from the files in the different directories. I do not want to merge everything under one column name.

For example, if I have 50 similar csv files, each with two columns of data, I would like one csv file that holds all of them in their original format rather than appended to the existing rows/columns, so that I end up with 100 columns of data.

Please tell me what further information I can provide.

Answer 1:

To read a group of files from a number of different directories, say with path names patha, pathb and pathc:

paths = c('patha','pathb','pathc')
# pattern is a regular expression, so match the literal ".csv" extension at the end of the name
files = unlist(lapply(paths, function(path) list.files(path, pattern = "\\.csv$", full.names = TRUE)))

listContainingAllFiles = lapply(files, read.csv)

If you want to be really quick about it, you can grab fread from data.table:

library(data.table)
listContainingAllFiles = lapply(files, fread)

Either way, this gives you a list of all the objects, kept separate. If you want to join them together vertically or horizontally, then:

do.call(rbind, listContainingAllFiles)
do.call(cbind, listContainingAllFiles)

EDIT: Note that the latter (cbind) makes no sense unless row i of one file actually corresponds to row i of every other file. It makes far more sense to stack by rows and create a field tracking which location the data is from.

If you want to use the names of the files to identify the sample location (I don't see where you're getting this information from in your example), then you want to add them as you read in the files, so:
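If the side-by-side layout from the question is still wanted, one possible sketch is to prefix every column with its site name before calling cbind, so no two sites share a column name. The folder layout, the site names (siteA, siteB) and the "_fitted.csv" suffix below are assumptions made up for the demo, and padding shorter tables with NA rows is just one way to cope with unequal lengths.

```r
# Demo data in a temporary directory; site names and file suffix are hypothetical.
main_dir <- file.path(tempdir(), "master_demo_wide")
dir.create(main_dir, showWarnings = FALSE)
sites <- c("siteA", "siteB")
n_rows <- c(siteA = 3, siteB = 2)
for (site in sites) {
  dir.create(file.path(main_dir, site), showWarnings = FALSE)
  demo <- data.frame(x = seq_len(n_rows[[site]]),
                     fitted = seq_len(n_rows[[site]]) * 10)
  write.csv(demo, file.path(main_dir, site, paste0(site, "_fitted.csv")),
            row.names = FALSE)
}

# Read each site's file through its full path, so no setwd() is needed.
fitted_list <- lapply(sites, function(site) {
  df <- read.csv(file.path(main_dir, site, paste0(site, "_fitted.csv")))
  names(df) <- paste(site, names(df), sep = "_")  # e.g. siteA_x, siteA_fitted
  df
})

# Pad shorter tables with NA rows, because cbind() needs equal row counts.
max_rows <- max(vapply(fitted_list, nrow, integer(1)))
padded <- lapply(fitted_list, function(df) df[seq_len(max_rows), , drop = FALSE])

master_wide <- do.call(cbind, padded)
rownames(master_wide) <- NULL
write.csv(master_wide, file.path(main_dir, "master_wide.csv"), row.names = FALSE)
```

With two columns per site this gives two columns per site in the master file, each identifiable by its prefix. The rows still only line up by position, so the caveat about corresponding rows applies.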

listContainingAllFiles = lapply(files, 
                            function(file) data.frame(filename = file,
                                                      read.csv(file)))

Then later you can split that column to get your details (assuming, of course, that you have a standard naming convention).
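As a rough sketch of that last step, assuming each file is named after its site (the demo files siteA.csv and siteB.csv and the fitted column are hypothetical), the site ID can be recovered from the filename column with basename() and tools::file_path_sans_ext(), after which the stacked table can be split per site again:

```r
# Demo files in a temporary directory; the names are hypothetical.
demo_dir <- file.path(tempdir(), "master_demo_long")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(fitted = c(1.5, 2.5)),
          file.path(demo_dir, "siteA.csv"), row.names = FALSE)
write.csv(data.frame(fitted = c(3.5, 4.5, 5.5)),
          file.path(demo_dir, "siteB.csv"), row.names = FALSE)

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Tag every row with its source file while reading, then stack the tables.
tagged <- lapply(files, function(file) data.frame(filename = file, read.csv(file)))
master_long <- do.call(rbind, tagged)

# Derive the site ID from the file name: drop the directory and the extension.
master_long$site <- tools::file_path_sans_ext(
  basename(as.character(master_long$filename)))

# One vector of fitted values per site, ready for later comparison.
by_site <- split(master_long$fitted, master_long$site)
```

Stacking by rows with a site column keeps the sites distinguishable even when the files have different numbers of rows.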

Source: https://stackoverflow.com/questions/33851162/to-stack-up-results-in-one-masterfile-in-r

Tags

csv

for-loop

merge

data.table