When importing CSVs into R, how do I generate a column with the name of the CSV?

情书的邮戳 2020-12-01 06:45

I have a large number of CSV files that I want to read into R. All the column headings in the CSVs are the same. At first I thought I would need to create a loop based on th

6 answers
  •  没有蜡笔的小新
    2020-12-01 07:27

    Here is a solution using the import_list() function from rio, which is designed exactly for this purpose.

    # setup some example files to import
    rio::export(mtcars, "mtcars1.csv")
    rio::export(mtcars, "mtcars2.csv")
    rio::export(mtcars, "mtcars3.csv")
    

    The default behavior of import_list() is to get a list of data frames:

    str(rio::import_list(dir(pattern = "mtcars")), 1)
    ## List of 3
    ##  $ :'data.frame':       32 obs. of  11 variables:
    ##  $ :'data.frame':       32 obs. of  11 variables:
    ##  $ :'data.frame':       32 obs. of  11 variables:
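
    For comparison, here is a minimal base R sketch of the same "list of data frames" idea. It does not use rio; the temporary directory name and the choice to name each list element after its file are assumptions made for illustration.

    ```r
    # Write three small example files into a temporary directory
    # (the directory name "csv_list_demo" is an illustrative assumption).
    tmp <- file.path(tempdir(), "csv_list_demo")
    dir.create(tmp, showWarnings = FALSE)
    for (i in 1:3) {
      write.csv(mtcars, file.path(tmp, paste0("mtcars", i, ".csv")),
                row.names = FALSE)
    }

    # List the files, read each one, and name the list elements after the files.
    files <- list.files(tmp, pattern = "^mtcars.*\\.csv$", full.names = TRUE)
    dfs <- lapply(files, read.csv)
    names(dfs) <- basename(files)

    # One data frame per file, each keyed by its file name.
    str(dfs, 1)
    ```

    Naming the list elements keeps the file name attached to each data frame, which is the information needed later to build a per-row file column.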
    

    But you can use the rbind argument to instead construct a single data frame (note the _file column at the end):

    str(rio::import_list(dir(pattern = "mtcars"), rbind = TRUE))
    ## 'data.frame':   96 obs. of  12 variables:
    ##  $ mpg  : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
    ##  $ cyl  : int  6 6 4 6 8 6 8 4 4 6 ...
    ##  $ disp : num  160 160 108 258 360 ...
    ##  $ hp   : int  110 110 93 110 175 105 245 62 95 123 ...
    ##  $ drat : num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
    ##  $ wt   : num  2.62 2.88 2.32 3.21 3.44 ...
    ##  $ qsec : num  16.5 17 18.6 19.4 17 ...
    ##  $ vs   : int  0 0 1 1 0 1 0 1 1 1 ...
    ##  $ am   : int  1 1 1 0 0 0 0 0 0 0 ...
    ##  $ gear : int  4 4 4 3 3 3 3 4 4 4 ...
    ##  $ carb : int  4 4 1 1 2 1 4 2 2 4 ...
    ##  $ _file: chr  "mtcars1.csv" "mtcars1.csv" "mtcars1.csv" "mtcars1.csv" ...
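
    The combined result can be approximated in base R as a rough sketch. This is not rio's actual implementation; the helper read_tagged() and the directory name are hypothetical, introduced only to show the idea of tagging each row with its file before row-binding. It also shows that a column named _file is not a syntactic R name, so it has to be accessed with [[ ]] or backticks.

    ```r
    # Example files in a temporary directory (name is an illustrative assumption).
    tmp <- file.path(tempdir(), "csv_rbind_demo")
    dir.create(tmp, showWarnings = FALSE)
    for (i in 1:3) {
      write.csv(mtcars, file.path(tmp, paste0("mtcars", i, ".csv")),
                row.names = FALSE)
    }
    files <- list.files(tmp, pattern = "^mtcars.*\\.csv$", full.names = TRUE)

    # Hypothetical helper: read one file and tag every row with its file name.
    read_tagged <- function(path) {
      df <- read.csv(path)
      df[["_file"]] <- basename(path)  # length-1 value is recycled to all rows
      df
    }

    # Row-bind the tagged data frames into a single data frame.
    combined <- do.call(rbind, lapply(files, read_tagged))

    # `_file` starts with an underscore, so combined$_file is a syntax error;
    # use combined[["_file"]] or combined$`_file` instead.
    table(combined[["_file"]])
    ```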
    

    You can also use the rbind_label argument to set the name of the column that identifies each file:

    str(rio::import_list(dir(pattern = "mtcars"), rbind = TRUE, rbind_label = "source"))
    ## 'data.frame':   96 obs. of  12 variables:
    ##  $ mpg   : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
    ##  $ cyl   : int  6 6 4 6 8 6 8 4 4 6 ...
    ##  $ disp  : num  160 160 108 258 360 ...
    ##  $ hp    : int  110 110 93 110 175 105 245 62 95 123 ...
    ##  $ drat  : num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
    ##  $ wt    : num  2.62 2.88 2.32 3.21 3.44 ...
    ##  $ qsec  : num  16.5 17 18.6 19.4 17 ...
    ##  $ vs    : int  0 0 1 1 0 1 0 1 1 1 ...
    ##  $ am    : int  1 1 1 0 0 0 0 0 0 0 ...
    ##  $ gear  : int  4 4 4 3 3 3 3 4 4 4 ...
    ##  $ carb  : int  4 4 1 1 2 1 4 2 2 4 ...
    ##  $ source: chr  "mtcars1.csv" "mtcars1.csv" "mtcars1.csv" "mtcars1.csv" ...
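
    The identifying column holds the file name including its extension. If only the bare name of the CSV is wanted, one possible follow-up step is sketched below. The vector src is a stand-in for the source column, and its values (including the "data/" prefix) are illustrative assumptions.

    ```r
    # Stand-in for the "source" column (values are illustrative).
    src <- c("mtcars1.csv", "mtcars2.csv", "data/mtcars3.csv")

    # basename() drops any directory part; file_path_sans_ext() drops the extension.
    csv_name <- tools::file_path_sans_ext(basename(src))
    print(csv_name)
    ```

    On a real result this would be applied to the column itself, for example by assigning the cleaned values back to the source column.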
    

    For full disclosure: I am the maintainer of rio.
