How to divide text (string) by a certain character using R

Posted by 时光毁灭记忆、已成空白 on 2019-12-06 14:39:33

You need to specify the pattern for the dates more completely, so that it matches a whole date such as 10/1/2017.

## Your sample data
b = "10/1/2017 This was the first restaurant. 9/30/2015 i'm happy. i'm ~~. 6/20/2016  Prices were reasonable.."

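## A date is 1-2 digits, "/", 1-2 digits, "/", then 4 digits
date_pattern <- "\\d{1,2}/\\d{1,2}/\\d{4}"
## Split on the dates to get the messages, then extract the dates themselves
Messages <- strsplit(b, date_pattern)[[1]]
Dates <- regmatches(b, gregexpr(date_pattern, b))[[1]]
## The string starts with a date, so strsplit() returns an empty first element; drop it
if (length(Messages) > length(Dates)) { Messages <- Messages[-1] }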
as.data.frame(cbind(Dates, Messages))
      Dates                         Messages
1 10/1/2017  This was the first restaurant. 
2 9/30/2015              i'm happy. i'm ~~. 
3 6/20/2016         Prices were reasonable..
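
The result above keeps the dates as text and leaves the surrounding spaces on each message. As a minimal sketch of one way to tidy that up, the same steps can be wrapped in a function that trims the messages and converts the dates. The name `split_by_date` is hypothetical (it is not part of the answer), and the `%m/%d/%Y` format is an assumption based on the sample values (9/30/2015 can only be month/day/year).

```r
# Same sample string as in the answer above
b <- "10/1/2017 This was the first restaurant. 9/30/2015 i'm happy. i'm ~~. 6/20/2016  Prices were reasonable.."

# Hypothetical helper, not part of the original answer
split_by_date <- function(x, pattern = "\\d{1,2}/\\d{1,2}/\\d{4}") {
  dates <- regmatches(x, gregexpr(pattern, x))[[1]]
  messages <- strsplit(x, pattern)[[1]]
  # Drop the leading piece produced when the string starts with a date
  if (length(messages) > length(dates)) messages <- messages[-1]
  data.frame(
    Dates = as.Date(dates, format = "%m/%d/%Y"),  # assumes month/day/year order
    Messages = trimws(messages),                  # strip leading/trailing spaces
    stringsAsFactors = FALSE
  )
}

result <- split_by_date(b)
```

Because `Dates` is now a `Date` column, the rows can be sorted chronologically, for example with `result[order(result$Dates), ]`. The sketch assumes every date is followed by some text; a string that ends with a date would need extra handling.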

Another option: the stringr and rebus packages are really helpful and intuitive for building this kind of pattern.

> string<-"10/1/2017 This was the first restaurant. 9/30/2015 i'm happy. i'm ~~. 6/20/2016  Prices were reasonable.."
> library(stringr)
> library(rebus)
> pattern<-
+   capture(dgt(1,2))%R%
+   char_class("/")%R%
+   capture(dgt(1,2))%R%
+   char_class("/")%R%
+   capture(dgt(4))%R%
+   capture(one_or_more(or(WRD,char_class(" ","'"))))
> matches<-str_match_all(string,pattern)
> matches
[[1]]
     [,1]                                      [,2] [,3] [,4]   [,5]                            
[1,] "10/1/2017 This was the first restaurant" "10" "1"  "2017" " This was the first restaurant"
[2,] "9/30/2015 i'm happy"                     "9"  "30" "2015" " i'm happy"                    
[3,] "6/20/2016  Prices were reasonable"       "6"  "20" "2016" "  Prices were reasonable"
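
Note that the character class in the pattern above holds only word characters, spaces and apostrophes, so each match stops at the first period: the second message comes back as " i'm happy" rather than "i'm happy. i'm ~~.". A minimal sketch of one way around this, assuming plain stringr without rebus, is to capture everything lazily up to the next date (or the end of the string) with a lookahead. The pattern below is an illustration, not the answer's own.

```r
library(stringr)

string <- "10/1/2017 This was the first restaurant. 9/30/2015 i'm happy. i'm ~~. 6/20/2016  Prices were reasonable.."

# Illustrative pattern (an assumption, not from the original answer)
date_re <- "\\d{1,2}/\\d{1,2}/\\d{4}"
pattern <- paste0(
  "(", date_re, ")",          # group 1: the date
  "\\s*(.*?)\\s*",            # group 2: the message, matched lazily
  "(?=", date_re, "|$)"       # stop just before the next date or at the end
)

m <- str_match_all(string, pattern)[[1]]
dates    <- m[, 2]  # column 1 is the full match, column 2 is group 1
messages <- m[, 3]
```

Here `messages` keeps the punctuation inside each message, and the `\\s*` on either side of the lazy group drops the surrounding spaces.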