How to repeat rows in dataframe based on the range of a time variable in R? [duplicate]

Submitted by 你说的曾经没有我的故事 on 2020-05-28 05:43:29

Question


I have a df, such that a row looks like:

name datesemployed        university   
Kate Oct 2015 – Jan 2016  Princeton

What I want to do is repeat the entire row for each year in the range of variable datesemployed.

In this case, there would be two rows: one for 2015 and one for 2016.

I've attempted to clean the variable first, but I'm having a tough time even with that:

library(stringr)
# Split each date range at the en dash into two columns (start, end)
df3 <- str_split_fixed(df$datesemployed, "–", 2)
df <- cbind(df3, df)

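Answer 1:


We can use separate_rows from tidyr, specifying sep as zero or more spaces, followed by an en dash (–), followed by zero or more spaces. This yields one row per endpoint of the range, which matches one row per year when the range covers two consecutive calendar years, as in the example.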

library(dplyr)
library(tidyr)
df %>%
     separate_rows(datesemployed,  sep="\\s*–\\s*")
#    name datesemployed university
#1 Kate      Oct 2015  Princeton
#2 Kate      Jan 2016  Princeton

data

df <- structure(list(name = "Kate", datesemployed = "Oct 2015 – Jan 2016", 
    university = "Princeton"), class = "data.frame", row.names = c(NA, 
-1L))
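
The question asks for one row per year in the range, and a range spanning more than two calendar years has years between its endpoints. The following is a minimal sketch of one way to cover that case, not part of the original answer. It assumes every datesemployed value contains at least one four-digit year and that the first and last years found are the bounds of the range (an open-ended value such as "Present" is not handled). The names jobs and expanded, and the second data row, are hypothetical and exist only for illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Kate's row is from the question; the second row is hypothetical,
# added only to show a range longer than two years.
jobs <- data.frame(
  name = c("Kate", "Sam"),
  datesemployed = c("Oct 2015 – Jan 2016", "Sep 2012 – Jun 2015"),
  university = c("Princeton", "Yale"),
  stringsAsFactors = FALSE
)

expanded <- jobs %>%
  mutate(
    # All four-digit years in each string, stored as a list column
    bounds = str_extract_all(datesemployed, "\\d{4}"),
    # Integer sequence from the first year found to the last
    year = lapply(bounds, function(b) seq(as.integer(b[1]), as.integer(b[length(b)])))
  ) %>%
  select(-bounds) %>%
  unnest(year)  # one output row per element of each sequence

expanded
```

Each original row is repeated once per year, with the year in a new year column, so Kate gets rows for 2015 and 2016 and the hypothetical row gets rows for 2012 through 2015.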


Source: https://stackoverflow.com/questions/61192872/how-to-repeat-rows-in-dataframe-based-on-the-range-of-a-time-variable-in-r
