Importing many files at the same time and adding an ID indicator

盖世英雄少女心 2020-12-18 15:47

I have 91 files in .log format:

Trajectory Log File

Rock type: 2 (0: Sphere, 1: Cuboid, 2: Rock)

Nr of Trajectories: 91
Trajectory-Mode: ON
Average Slope (D         


        
2 Answers
  • 2020-12-18 16:20

    First, wrap the reading step in a function:

    read_log_file <- function(path) {
      trjct <- read.table(path, skip = 23)
      trjct <- trjct[,c("V1","V2","V3", "V4", "V15")]
      colnames(trjct) <- c("t", "x", "y", "z", "Etot")
      return(trjct)
    }
    

    Then build a list of data frames with mapply, a member of the apply family that iterates over two or more arguments in parallel (here, the file path and its ID). The DataCamp article on the apply family covers this in more detail.

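    # pattern is a regular expression: escape the dot and anchor to the end
    files.list <- list.files(pattern = "\\.log$")
    ids <- seq_along(files.list)
    
    # Read each file and tag its rows with the matching ID
    df_list <- mapply(function(path, id) {
        df <- read_log_file(path)
        df$ID <- id
        return(df)
    }, files.list, ids, SIMPLIFY = FALSE)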
    

    Note the SIMPLIFY = FALSE argument: without it, mapply tries to simplify the result into a matrix or array; with it, you get a plain list of data frames.
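
    To make the difference concrete, here is a minimal sketch on toy data. The helper fake_read and the file names are hypothetical stand-ins for read_log_file and the real logs; nothing is read from disk. It also shows Map, which is base R's shorthand for mapply with SIMPLIFY = FALSE.

    ```r
    # Hypothetical stand-in for read_log_file(): returns a tiny data frame
    fake_read <- function(path) {
      data.frame(t = c(0, 1), x = c(10, 20))
    }

    # Read one "file" and tag it with its ID
    add_id <- function(path, id) {
      df <- fake_read(path)
      df$ID <- id
      df
    }

    paths <- c("a.log", "b.log")  # made-up names
    ids <- seq_along(paths)

    # SIMPLIFY = FALSE: a list with one data frame per file
    as_list <- mapply(add_id, paths, ids, SIMPLIFY = FALSE)

    # Default SIMPLIFY = TRUE: results are collapsed into a matrix of list cells
    simplified <- mapply(add_id, paths, ids)

    # Map() is mapply() with SIMPLIFY = FALSE built in
    via_map <- Map(add_id, paths, ids)

    is.data.frame(as_list[[1]])
    is.matrix(simplified)
    identical(as_list, via_map)
    ```

    The simplified result is hard to work with, which is why the list form is the one passed on to the next step.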

    Finally, combine all the data frames into one with bind_rows from the dplyr package:

    df = dplyr::bind_rows(df_list)
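
    If dplyr is not available, base R can do the same stacking with do.call and rbind. This is a sketch on made-up data, not the real logs; the values in df_list below are invented for illustration.

    ```r
    # Toy list standing in for the df_list built from the log files
    df_list <- list(
      data.frame(t = c(0, 1), Etot = c(5.0, 4.5), ID = 1L),
      data.frame(t = c(0, 1), Etot = c(6.0, 5.5), ID = 2L)
    )

    # rbind() accepts any number of data frames; do.call() passes the
    # list elements as separate arguments
    combined <- do.call(rbind, df_list)

    nrow(combined)
    combined$ID
    ```

    One difference to keep in mind: rbind requires every data frame to have the same columns, whereas bind_rows fills missing columns with NA.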
    

    Note: in R, the *apply family of functions is generally preferred over explicit for loops for this kind of task.

  • 2020-12-18 16:32

    You could also use purrr::map_df, which iterates like lapply or a for loop but returns a single data frame:

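    # Read one trajectory log, keeping columns 1-4 and 15
    read_traj <- function(fi) {
        df <- read.table(fi, header = FALSE, skip = 23)
        df <- df[, c(1:4, 15)]
        colnames(df) <- c("t", "x", "y", "z", "Etot")
        return(df)
    }
    
    # pattern is a regular expression: escape the dot and anchor to the end
    files.list <- list.files(pattern = "\\.log$")
    library(tidyverse)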
    

    map_df has a handy .id argument that adds a column (here named id) holding the position of each input, 1 to N, where N is the number of files. The values are stored as character strings.

    map_df(files.list, ~read_traj(.x), .id="id")
    

    To store the file name instead, use the id column to index into files.list:

    map_df(files.list, ~read_traj(.x), .id="id") %>%
      mutate(id = files.list[as.numeric(id)])
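
    Another way to get file names into the id column, sketched here under assumptions: when the input vector is named, .id stores the names rather than the positions, so naming the paths up front avoids the mutate step. The temporary files, their contents, and the helper read_toy are made up so the example runs on its own; it assumes purrr and dplyr are installed.

    ```r
    library(purrr)

    # Create two tiny stand-in log files in a temporary directory
    tmp <- tempfile("logs")
    dir.create(tmp)
    writeLines(c("header", "0 1.0", "1 2.0"), file.path(tmp, "run_a.log"))
    writeLines(c("header", "0 3.0", "1 4.0"), file.path(tmp, "run_b.log"))

    # Hypothetical reader: skips one header line instead of 23
    read_toy <- function(fi) {
      df <- read.table(fi, header = FALSE, skip = 1)
      colnames(df) <- c("t", "x")
      df
    }

    files <- list.files(tmp, pattern = "\\.log$", full.names = TRUE)

    # Name each path by its file name; .id then records the names
    names(files) <- basename(files)
    res <- map_df(files, read_toy, .id = "id")

    unique(res$id)
    ```

    Recent purrr releases mark map_df as superseded in favour of map followed by list_rbind, so new code may prefer that form.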
    