To stack up results in one masterfile in R

Submitted by 吃可爱长大的小学妹 on 2020-01-17 00:41:13

Question


Using this script I have created a separate folder for each csv file and saved all my further analysis results in that folder. The folder and the csv file have the same name. The csv files themselves are stored in the main/master directory. I have now created a csv file in each of these folders which contains a list of all the fitted values.

I would now like to do the following:

  1. Set the working directory to the folder named after the particular file
  2. Read the fitted values file
  3. Add a row/column stating the name of the site/unique ID
  4. Add it to the masterfile, which is stored in the main directory, with a title specifying the site name/filename. It can be stacked by rows or by columns; it doesn't really matter.
  5. Return to the main directory to pick the next file
  6. Repeat the loop

Using merge(), rbind() or cbind() combines all the data under one column name. I want to keep all the sites separate for comparison at a later stage.

This is what I'm using at the moment and I'm lost on how to proceed further.

setwd("path")   # main directory
path <- "path"  # kept for convenience when switching back to the main directory

# list all csv files in the main directory as a character vector
files <- list.files(path = path, pattern = "\\.csv$")

for (i in seq_along(files)) {

      fileName <- read.csv(files[i])   # read the site's csv file from the main directory
      base <- tools::file_path_sans_ext(files[i])   # getting the filename without the extension
      setwd(file.path(path, base))   # setting the working directory to the folder with the same name
      master <- read.csv(paste0(base, "_fiited_values curve.csv"))   # paste0() avoids the space that paste() inserts
    # read the fitted value csv file for the site and store it in a list
      setwd(path)   # return to the main directory before picking the next file
    }

I want to construct a for loop that builds one master file from the files in the different directories. I do not want to merge everything under one column name.

For example, if I have 50 similar csv files and each has two columns of data, I would like to have one csv file which accommodates all of them in their original format, rather than appended to the existing rows/columns. I would then have 100 columns of data.

Please tell me what further information I can provide.


Answer 1:


To read a group of files from a number of different directories, with pathnames patha, pathb and pathc:

paths = c('patha','pathb','pathc')
# pattern is a regular expression, not a glob, so match the ".csv" extension explicitly
files = unlist(sapply(paths, function(path) list.files(path, pattern = "\\.csv$", full.names = TRUE)))

listContainingAllFiles = lapply(files, read.csv)
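
To keep track of which element came from which site, the list can be named as it is built. The following is only a sketch under assumptions: the folder layout, the site names and the file name suffix are hypothetical, the demo files are created in a temporary directory, and the parent folder name is assumed to be the site ID.

```r
# Hypothetical layout: <root>/<site>/<site>_fitted.csv, built in a temp dir for illustration
root <- file.path(tempdir(), "named_list_demo")
for (site in c("siteA", "siteB")) {
  dir.create(file.path(root, site), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(x = 1:3, fitted = c(0.1, 0.2, 0.3)),
            file.path(root, site, paste0(site, "_fitted.csv")),
            row.names = FALSE)
}

# recursive = TRUE walks the site folders, so no setwd() calls are needed
files <- list.files(root, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)
listContainingAllFiles <- lapply(files, read.csv)

# Name each element after the folder it came from (assumed to be the site ID)
names(listContainingAllFiles) <- basename(dirname(files))
```

Each site can then be pulled out by name, for example listContainingAllFiles[["siteA"]].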

If you want to be really quick about it, you can grab fread from data.table:

library(data.table)
listContainingAllFiles = lapply(files, fread)

Either way this will give you a list of all objects, kept separate. If you want to join them together vertically/horizontally, then:

do.call(rbind, listContainingAllFiles)
do.call(cbind, listContainingAllFiles)

EDIT: Note that the latter (cbind) makes no sense unless row i means the same thing in every file. It makes far more sense to just create a field tracking which location the data is from.
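
If a side-by-side layout is still wanted (for example 50 files with two columns each giving 100 columns), one possible sketch is to prefix every column with its site name and pad shorter tables with NA before binding. The site names and values below are made up for illustration, and the padding step is an assumption about how unequal row counts should be handled.

```r
# Toy stand-ins for the per-site data frames (hypothetical values)
fits <- list(
  siteA = data.frame(x = 1:3, fitted = c(1.1, 2.1, 3.1)),
  siteB = data.frame(x = 1:2, fitted = c(5.5, 6.5))
)

# Pad every data frame with NA rows up to the longest one so cbind() can align them
n_max <- max(vapply(fits, nrow, integer(1)))
padded <- lapply(fits, function(df) df[seq_len(n_max), , drop = FALSE])

# Prefix each column with its site name so the columns stay distinguishable
for (site in names(padded)) {
  names(padded[[site]]) <- paste(site, names(padded[[site]]), sep = "_")
}

# unname() stops data.frame() from adding its own "<list name>." prefix
wide <- do.call(cbind, unname(padded))
rownames(wide) <- NULL
```

Here wide has one pair of columns per site (siteA_x, siteA_fitted, siteB_x, siteB_fitted), and the rows only line up meaningfully if row i refers to the same thing at every site.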

If you want to use the file names to identify the sample location (I don't see where you're getting this info from in your example), then do this as you read in the files:

listContainingAllFiles = lapply(files, 
                            function(file) data.frame(filename = file,
                                                      read.csv(file)))

Later you can split that column to get your details (assuming, of course, that you have a standard naming convention).
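
As a rough sketch of that last step, assuming the parent folder name is the site ID: the paths and values below are hypothetical, and the master file is written to a temporary directory purely for illustration.

```r
# Toy result of the lapply() above: one data frame per file, tagged with its path
# (paths and values are hypothetical)
listContainingAllFiles <- list(
  data.frame(filename = "main/siteA/siteA_fitted.csv", fitted = c(1.1, 2.1)),
  data.frame(filename = "main/siteB/siteB_fitted.csv", fitted = c(5.5, 6.5))
)

# Stack the sites by rows; the filename column keeps them distinguishable
master <- do.call(rbind, listContainingAllFiles)

# Assumed convention: the parent folder name is the site ID
# (as.character() guards against filename being read as a factor in older R versions)
master$site <- basename(dirname(as.character(master$filename)))

# Write the combined table to a master file (a temp file here, for illustration)
master_path <- file.path(tempdir(), "master.csv")
write.csv(master, master_path, row.names = FALSE)
```

Sites can then be compared with split(master, master$site) or any grouped summary, without needing separate columns per site.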



Source: https://stackoverflow.com/questions/33851162/to-stack-up-results-in-one-masterfile-in-r
