Question
My data frame looks like this:
df <- setNames(data.frame(c("2 June 2004, 5 words, ()(","profit, Insight, 2 May 2004, 188 words, reports, by ()("), stringsAsFactors = F), "split")
What I want is to split the column into the date and the word count. So far I have found "Extract date text from string":
lapply(df2, function(x) gsub(".*(\\d{2} \\w{3} \\d{4}).*", "\\1", x))
But it's not working with my example. Thanks for the help, as always.
Answer 1:
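As there is only a single column, we can use sub/gsub directly after extracting the column. The original pattern fails because \\d{2} requires a two-digit day and \\w{3} requires a month name of exactly three letters. Here the day can have one or more digits, and the month name has 3 ('May') or 4 ('June') characters, so the quantifiers need to change accordingly: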
sub(".*\\b(\\d{1,} \\w{3,4} \\d{4}).*", "\\1", df$split)
#[1] "2 June 2004" "2 May 2004"
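The question also asks for the word count, which the call above does not extract. The following is a minimal sketch of one way to get both pieces in base R, assuming each row contains at most one date of the form "day month year" and one "N words" phrase. The helper name `extract_first` and the new column names `date` and `words` are hypothetical and chosen only for illustration.

```r
# Sample data from the question
df <- setNames(
  data.frame(c("2 June 2004, 5 words, ()(",
               "profit, Insight, 2 May 2004, 188 words, reports, by ()("),
             stringsAsFactors = FALSE),
  "split")

# Hypothetical helper: return the first match of `pattern` in each element
# of `x`, or NA where there is no match (so the result keeps the same length).
extract_first <- function(pattern, x) {
  m <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  out[m != -1] <- regmatches(x, m)
  out
}

# Date: 1-2 digit day, alphabetic month name of any length, 4-digit year
df$date <- extract_first("\\b\\d{1,2} [A-Za-z]+ \\d{4}\\b", df$split)

# Word count: grab "N words", then drop the trailing " words" and convert
df$words <- as.integer(sub(" words$", "",
                           extract_first("\\b\\d+ words\\b", df$split)))

df[, c("date", "words")]
#          date words
# 1 2 June 2004     5
# 2  2 May 2004   188
```

Unlike sub(), which returns the input string unchanged when the pattern does not match, this version yields NA for rows without a match. If a real Date column is needed, as.Date(df$date, format = "%d %B %Y") should work in an English locale, since month-name parsing is locale-dependent.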
Source: https://stackoverflow.com/questions/50557460/extracting-date-from-text-using-r