Split strings into columns in R where each string has a potentially different number of column entries

情歌与酒 2021-01-12 17:43

I've got a data frame of the following form:

pages                         count
[page 1, page 2, page 3]      23
[page 2, page 4]              4
[p         


        
3 answers
  •  温柔的废话
    2021-01-12 18:09

    Sample data:

    myDat <- read.table(text=
      "pages|count
    [page 1, page 2, page 3]|23
    [page 2, page 4]|4
    [page 1, page 3, page 4]|12", header=TRUE, sep="|") 
    

    We can pull the pages column out of myDat and work on it separately.

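    # if the column is a factor, convert it to character
    pages <- as.character(myDat$pages)
    
    # remove the brackets.  Note the double escapes needed in R string literals
    pages <- gsub("(\\[|\\])", "", pages)
    
    # split on the comma and any whitespace after it, so entries have no leading space
    pages <- strsplit(pages, ",\\s*")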
    
    # find the length of the longest split
    maxLen <- max(sapply(pages, length))
    
    # pad shorter rows with NA so every row has maxLen entries;
    # t() transposes the sapply() result so that each input string becomes a row
    pages <- t(sapply(pages, function(x)
      # rep(NA, 0) is empty, so rows that are already full length are unchanged
      c(x, rep(NA, maxLen - length(x)))
    ))
    
    # add column names (three here, because the longest row in the sample has three pages)
    colnames(pages) <- paste(c("First", "Second", "Third"), "Page")
    
    # Put it all back together
    data.frame(pages, Count=myDat$count)
    



    Results

    > data.frame(pages, Count=myDat$count)
      First.Page Second.Page Third.Page Count
    1     page 1      page 2     page 3    23
    2     page 2      page 4       <NA>     4
    3     page 1      page 3     page 4    12
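
The steps above hard-code three column names, so they only fit data whose longest row has exactly three entries. As a minimal sketch of the same technique for any number of entries, the steps could be wrapped in a helper that generates the column names from the widest row. The function name `split_to_columns`, its `prefix` argument, and the `Page1`, `Page2`, ... naming scheme are assumptions made for illustration; they are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): split bracketed,
# comma-separated strings into a character matrix padded with NA.
split_to_columns <- function(x, prefix = "Page") {
  x <- as.character(x)                    # handles factor input
  x <- trimws(gsub("\\[|\\]", "", x))     # drop brackets and outer whitespace
  parts <- strsplit(x, ",\\s*")           # split on comma plus optional spaces
  max_len <- max(lengths(parts))          # width of the widest row

  # `length<-` pads a vector with NA up to the requested length
  padded <- lapply(parts, function(p) {
    length(p) <- max_len
    p
  })

  # rbind() stacks the padded vectors, one row per input string
  out <- do.call(rbind, padded)
  colnames(out) <- paste0(prefix, seq_len(max_len))
  out
}

# Same sample data as in the answer above
myDat <- data.frame(
  pages = c("[page 1, page 2, page 3]",
            "[page 2, page 4]",
            "[page 1, page 3, page 4]"),
  count = c(23, 4, 12),
  stringsAsFactors = FALSE
)

result <- data.frame(split_to_columns(myDat$pages), Count = myDat$count)
print(result)
```

Using `do.call(rbind, ...)` rather than `t(sapply(...))` keeps the result a matrix even when every string has a single entry, a case in which `sapply()` would simplify to a plain vector.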
    
