consolidating data frames in R

Backend · Open · 2 answers · 854 views
佛祖请我去吃肉 asked 2020-12-03 08:23

Hi, I have a lot of CSV files to process. Each file is generated by one run of an algorithm. My data always has one key and one value, like this:

csv1:

2 Answers
  • 2020-12-03 09:13

    What I understand from the question is that you want a list containing the data.frames read from your csv (or txt) files, and then to aggregate them.

    Put all your csv files in one directory, then run the following to get their names as a list:

    l <- list.files(pattern = "\\.csv$")

    The object l now contains the names of the csv files. (Note the escaped pattern: a bare "." would match any character, not just a literal dot.)

    m <- Map(read.csv, l)

    Map applies read.csv to every file name, so m is a named list of data.frames, one per csv file.

    dat <- do.call(rbind, m)

    This stacks all the data.frames into one. Now load the plyr library:

    library(plyr)

    res <- ddply(dat, ~index, summarize, value = mean(value))

    The res object contains the mean value for each index.

    I hope this helps you get your desired result.
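    The steps above can be sketched end to end. This is a minimal, self-contained example that writes two throwaway CSVs to a temporary directory; the index/value column names are assumptions taken from the question, and base aggregate stands in for plyr's ddply so the sketch has no package dependency:

    ```r
    # Create a temp directory with two sample CSVs (assumed columns: index, value)
    dir <- tempfile("csvdemo"); dir.create(dir)
    write.csv(data.frame(index = c("a", "b"), value = c(1, 3)),
              file.path(dir, "run1.csv"), row.names = FALSE)
    write.csv(data.frame(index = c("a", "b"), value = c(3, 5)),
              file.path(dir, "run2.csv"), row.names = FALSE)

    l <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
    m <- Map(read.csv, l)                        # list of data.frames, one per file
    dat <- do.call(rbind, m)                     # one stacked data.frame
    res <- aggregate(value ~ index, dat, mean)   # mean value per index
    res
    ```

    With these inputs, res has one row per index: a averages to 2 and b averages to 4.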

  • 2020-12-03 09:15

    Here is a solution. I am following all the excellent comments so far, and hopefully adding value by showing you how to handle any number of files. I am assuming you have all your csv files in the same directory (my.csv.dir below).

    # locate the files
    files <- list.files(my.csv.dir, full.names = TRUE)
    
    # read the files into a list of data.frames
    data.list <- lapply(files, read.csv)
    
    # concatenate into one big data.frame
    data.cat <- do.call(rbind, data.list)
    
    # aggregate
    data.agg <- aggregate(value ~ index, data.cat, mean)
    

    Edit: to handle your updated question in your comment below:

    files     <- list.files(my.csv.dir)
    algo.name <- sub("-.*", "", files)       # algorithm name: everything before the first hyphen
    data.list <- lapply(file.path(my.csv.dir, files), read.csv)
    data.list <- Map(transform, data.list, algorithm = algo.name)
    data.cat  <- do.call(rbind, data.list)
    data.agg  <- aggregate(value ~ algorithm + index, data.cat, mean)
    
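    Here is that per-algorithm version as a runnable sketch. The file names ("fast-run1.csv", "slow-run1.csv") and the index/value columns are illustrative assumptions; the naming convention assumed throughout is that the algorithm name precedes the first hyphen:

    ```r
    # Temp directory with one CSV per algorithm run (assumed naming: <algo>-<run>.csv)
    my.csv.dir <- tempfile("algodemo"); dir.create(my.csv.dir)
    write.csv(data.frame(index = c("a", "b"), value = c(1, 3)),
              file.path(my.csv.dir, "fast-run1.csv"), row.names = FALSE)
    write.csv(data.frame(index = c("a", "b"), value = c(5, 7)),
              file.path(my.csv.dir, "slow-run1.csv"), row.names = FALSE)

    files     <- list.files(my.csv.dir)
    algo.name <- sub("-.*", "", files)       # "fast", "slow"
    data.list <- lapply(file.path(my.csv.dir, files), read.csv)
    data.list <- Map(transform, data.list, algorithm = algo.name)  # tag each frame
    data.cat  <- do.call(rbind, data.list)
    data.agg  <- aggregate(value ~ algorithm + index, data.cat, mean)
    data.agg
    ```

    data.agg now has one row per (algorithm, index) pair, so results from different algorithms are averaged separately rather than pooled together.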