R regular expression: isolate a string between quotes

Submitted by 主宰稳场 on 2019-12-10 15:44:59

Question


I have a string myFunction(arg1=\"hop\",arg2=TRUE). I want to isolate the text between the quotes (\"hop\" in this example).

I have tried so far with no success:

gsub(pattern="(myFunction)(\\({1}))(.*)(\\\"{1}.*\\\"{1})(.*)(\\){1})",replacement="//4",x="myFunction(arg1=\"hop\",arg2=TRUE)")

Any help by a regex guru would be welcome!


Answer 1:


You could also use the regmatches function. sub or gsub only works for one particular input shape; for the general case you should extract the match rather than remove everything around it.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> regmatches(x, gregexpr('"[^"]*"', x))[[1]]
[1] "\"hop\""

To get only the text inside the quotes, pass the result of the call above to gsub, which removes the quotes.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"
> x <- "myFunction(arg1=\"hop\",arg2=\"TRUE\")"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"  "TRUE"
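
The two steps above (extract with regmatches, then strip the quotes) can be wrapped in a small function. This is only a sketch: the name extract_quoted is made up for illustration and is not part of the answer, and it assumes the quoted text never contains an escaped double quote.

```r
# Hypothetical helper (not from the answer): return the text inside each
# pair of double quotes, as one character vector per input string.
extract_quoted <- function(x) {
  # Grab every "..." run, quotes included
  matches <- regmatches(x, gregexpr('"[^"]*"', x))
  # Remove the surrounding quotes from each match
  lapply(matches, function(m) gsub('"', '', m, fixed = TRUE))
}

x <- c("myFunction(arg1=\"hop\",arg2=\"TRUE\")", "noQuotes(arg1=1)")
res <- extract_quoted(x)
res[[1]]  # the two quoted values from the first string
res[[2]]  # empty character vector: the second string has no quotes
```

Returning a list keeps the results aligned with the input vector, so a string without quotes yields an empty element instead of silently disappearing.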



Answer 2:


Try

 sub('[^\"]+\"([^\"]+).*', '\\1', x)
 #[1] "hop"

Or

 sub('[^\"]+(\"[^\"]+.).*', '\\1', x)
 #[1] "\"hop\""

Escaping the quote as \" is not needed inside a single-quoted R string; a plain " works too

 sub('[^"]*("[^"]*.).*', '\\1', x)
 #[1] "\"hop\""
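
One thing to keep in mind with the sub approach: when the pattern does not match, sub returns the input unchanged, so a string without quotes comes back whole. A minimal sketch of a guard follows; the function name first_quoted and the choice of NA as the "no match" value are assumptions for illustration, not part of the answer.

```r
# sub() leaves the input untouched when nothing matches
sub('[^"]*"([^"]*).*', '\\1', "myFunction(arg1=1)")

# Hypothetical helper: first quoted value, or NA when there is none
first_quoted <- function(x) {
  pattern <- '^[^"]*"([^"]*)".*$'
  # Only substitute where the pattern actually matches
  ifelse(grepl(pattern, x), sub(pattern, '\\1', x), NA_character_)
}

inputs <- c("myFunction(arg1=\"hop\",arg2=TRUE)", "myFunction(arg1=1)")
out <- first_quoted(inputs)
out
```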

If there are multiple matches, as @AvinashRaj mentioned in his post, sub is of limited use because it returns only one of them. An option using stringi would be

 library(stringi)
 stri_extract_all_regex(x1, '"[^"]*"')[[1]]
 #[1] "\"hop\""  "\"hop2\""

data

 x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
 x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"



Answer 3:


You can try:

str='myFunction(arg1=\"hop\",arg2=TRUE)'

gsub('.*(\\".*\\").*','\\1',str)
#[1] "\"hop\""
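
Because .* is greedy, the pattern above picks the last quoted value when the string holds more than one. The sketch below illustrates this on a two-value input (the string x1 is borrowed from another answer in this thread). It uses perl = TRUE so that the lazy quantifier .*? is available; the lazy and negated-class variants are suggested alternatives, not part of the original answer.

```r
x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"

# Greedy: the leading .* swallows as much as it can, so the LAST pair wins
greedy <- sub('.*(".*").*', '\\1', x1, perl = TRUE)

# Lazy quantifiers stop at the FIRST pair instead
lazy <- sub('.*?(".*?").*', '\\1', x1, perl = TRUE)

# A negated character class gives the first pair without needing perl = TRUE
negated <- sub('[^"]*("[^"]*").*', '\\1', x1)

greedy
lazy
negated
```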



Answer 4:


x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
unlist(strsplit(x,'"'))[2]
# [1] "hop"
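
The same splitting idea extends to several quoted values: after splitting on the quote character, the quoted texts sit at the even positions. This is a sketch under the assumption that the quotes are balanced and never escaped; the input x1 is borrowed from another answer in this thread.

```r
x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"

# Split on the literal quote character (fixed = TRUE skips regex parsing)
parts <- strsplit(x1, '"', fixed = TRUE)[[1]]

# Pieces alternate outside/inside quotes, so keep the even-indexed ones
quoted <- parts[seq_along(parts) %% 2 == 0]
quoted
```

Indexing with a logical mask rather than seq(2, length(parts), by = 2) avoids an error when the string contains no quotes at all.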


Source: https://stackoverflow.com/questions/29508943/r-regular-expression-isolate-a-string-between-quotes
