Splitting dataframes in R based on empty rows

一生所求 2020-12-18 17:14

I have a data frame that contains multiple tables. The tables are separated from one another by empty rows.

A   x   y   z
Name1   12  21  23
Name2   23  21  22
Name3         


        
1 Answer
  • 2020-12-18 17:37

    If possible, see whether you can change how you import the data so that this step is not needed. Otherwise, here is one possible solution that creates a list in which each element is one of your tables.

    dt <- read.table(blank.lines.skip = FALSE,
                     text = "
    A   x   y   z
    Name1   12  21  23
    Name2   23  21  22
    Name3   45  43  21
    
    B   x   y   z
    Name4   32  23  23
    Name5   12  32  33
    Name6   10  34  45
    Name12  11  11  56
    
    C   x   y   z
    Name7   11  56  67
    Name8   90  87  98
    Name9   45  34  34
    Name10  78  8   56
    Name11  92  12  45
    ", stringsAsFactors = FALSE)
    
    ## add column to indicate groups
    dt$tbl_id <- cumsum(!nzchar(dt$V1))
    
    ## remove blank lines
    dt <- dt[nzchar(dt$V1), ]
    
    ## split the data frame
    dt_s <- split(dt[, -ncol(dt)], dt$tbl_id)
    
    ## use first line as header and reset row numbers
    dt_s <- lapply(dt_s, function(x) {
        ## unlist() turns the one-row data frame into a plain character vector
        colnames(x) <- unlist(x[1, ], use.names = FALSE)
        x <- x[-1, ]
        rownames(x) <- NULL
        x
    })
    

    Result:

    > dt_s
    $`1`
          A  x  y  z
    1 Name1 12 21 23
    2 Name2 23 21 22
    3 Name3 45 43 21
    
    $`2`
           B  x  y  z
    1  Name4 32 23 23
    2  Name5 12 32 33
    3  Name6 10 34 45
    4 Name12 11 11 56
    
    $`3`
           C  x  y  z
    1  Name7 11 56 67
    2  Name8 90 87 98
    3  Name9 45 34 34
    4 Name10 78  8 56
    5 Name11 92 12 45
    
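If you can change the import step, one option is to split the raw lines at the blank lines before parsing, so that each table is read with its own header. The following is only a minimal sketch of that idea, not a tested recipe for your file: the names `raw`, `lines`, `is_blank`, `grp`, `chunks` and `tables` are made up for illustration, and the inline string stands in for your real file (with a file you would presumably start from `readLines()` on its path).

```r
## Inline text standing in for the real file (assumption for illustration)
raw <- "A   x   y   z
Name1   12  21  23
Name2   23  21  22
Name3   45  43  21

B   x   y   z
Name4   32  23  23
Name5   12  32  33"

## one element per line of input
lines <- strsplit(raw, "\n", fixed = TRUE)[[1]]

## TRUE for lines that are empty or contain only whitespace
is_blank <- !nzchar(trimws(lines))

## each blank line starts a new group; keep only the non-blank lines
grp <- cumsum(is_blank)[!is_blank]
chunks <- split(lines[!is_blank], grp)

## parse each chunk separately, so its first line becomes the header
tables <- lapply(chunks, function(ch) {
    read.table(text = ch, header = TRUE, stringsAsFactors = FALSE)
})
```

Because each chunk is parsed with `header = TRUE`, the `x`, `y` and `z` columns should come out numeric rather than character, which the split-after-import approach above does not do unless you convert the columns afterwards (for example with `type.convert()`).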