Extracting Number and Name from String [r]

陌路散爱 submitted on 2019-12-06 07:14:05

Question


POSIX regular expressions are giving me a headache.

Let's say we have a string:

a = "[question(37), question_pipe(\"Person10\")]"

and ultimately I would like to be able to have:

b = c("37", "Person10")

I've had a look at the stringr package but can't figure out how to extract the information using regular expressions and str_split.

Any help would be greatly appreciated.

Cameron


Answer 1:


So if I understand correctly, you want to extract the elements within parentheses.

You can first extract those elements, including the parentheses, using str_extract_all:

library(stringr)
b1 <- str_extract_all(string = a, pattern = "\\(.*?\\)")
b1
# [[1]]
# [1] "(37)"           "(\"Person10\")"

Since str_extract_all returns a list, let's turn it into a vector:

b2 <- unlist(b1)
b2
# [1] "(37)"           "(\"Person10\")"

Last, you can remove the parentheses (the first and last character of each string) using str_sub:

b3 <- str_sub(string = b2, start = 2L, end = -2L) 
b3
# [1] "37"           "\"Person10\""

Edit: A few comments about the regex pattern: \\( and \\) match the literal opening and closing parentheses. .*? matches any run of characters non-greedily; a greedy .* would instead give one long match from the first ( to the last ).
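To make the greedy versus non-greedy difference concrete, here is a minimal sketch in base R. It uses regmatches and gregexpr instead of stringr so that it runs without extra packages; extract_all is a hypothetical helper name introduced only for this illustration.

```r
a <- "[question(37), question_pipe(\"Person10\")]"

# Hypothetical helper: return every match of `pattern` in the single string `x`.
extract_all <- function(x, pattern) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

# Non-greedy: each match stops at the first closing parenthesis it can reach.
lazy <- extract_all(a, "\\(.*?\\)")

# Greedy: a single match runs from the first "(" to the last ")".
greedy <- extract_all(a, "\\(.*\\)")

print(lazy)
print(greedy)
```

With the string from the question, the non-greedy pattern should yield two matches and the greedy one a single long match.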




Answer 2:


This should work in your specific case:

a <- "[question(37), question_pipe(\"Person10\")]"

# First split into two parts
b <- strsplit(a, ",")[[1]]

# Extract the number (skip as.integer if you want it as character)
x <- as.integer(gsub("[^0-9]","", b[[1]])) # 37

# Extract the stuff in quotes
y <- gsub(".*\"(.*)\".*", "\\1", b[[2]])   # "Person10"

An alternative for extracting everything in parentheses from the first part:

x <- gsub(".*\\((.*)\\).*", "\\1", b[[1]]) # "37"
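Because sub is vectorised, the parentheses pattern above can be applied to both parts at once rather than indexing each part separately. The following is only a sketch under the assumption that the values themselves never contain commas or nested parentheses; the variable names parts and inner are illustrative.

```r
a <- "[question(37), question_pipe(\"Person10\")]"

# Split on the comma, giving one element per question(...) call.
parts <- strsplit(a, ",")[[1]]

# Keep only what sits between the parentheses in each part.
inner <- sub(".*\\((.*)\\).*", "\\1", parts)

# Drop the literal double quotes that surround the name.
b <- gsub("\"", "", inner, fixed = TRUE)

print(b)
```

This produces a character vector; wrap an element in as.integer afterwards if a number is needed.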



Answer 3:


I'd do it this way:

a <- "[question(37), question_pipe(\"Person10\")]"
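# Capture both parenthesised values as one comma-separated string,
# then drop the double quotes and split on the comma
captured <- gsub(".*question\\((.*)\\).*question_pipe\\((.*)\\).*", "\\1,\\2", a)
b <- unlist(strsplit(gsub("\"", "", captured), ","))
print(b)
# [1] "37"       "Person10"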



Answer 4:


Expanding on flodel's answer, this would be the most concise solution, I think:

library(stringr)
a <- "[question(37), question_pipe(\"Person10\")]"
# The backslashes must be doubled inside an R string literal
b1 <- unlist(str_extract_all(string = a, pattern = "\\(.*?\\)"))
b <- gsub("[[:punct:]]", "", b1)
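One caveat worth checking: [[:punct:]] removes every punctuation character, not just the wrapping parentheses and quotes, so a value containing an underscore or a dot would be altered. The sketch below uses a made-up input ("Person_10" is hypothetical, not from the question) to compare that call with a pattern anchored to the start and end of each string.

```r
# Hypothetical matches, shaped like the output of str_extract_all above.
b1 <- c("(37)", "(\"Person_10\")")

# Removes all punctuation, including the underscore inside the name.
stripped_all <- gsub("[[:punct:]]", "", b1)

# Removes only a leading "(" plus optional quote, and an optional quote plus trailing ")".
stripped_wrap <- gsub("^\\(\"?|\"?\\)$", "", b1)

print(stripped_all)
print(stripped_wrap)
```

For the exact string in the question both approaches should give the same result; the anchored pattern only matters when the extracted values may contain punctuation of their own.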


Source: https://stackoverflow.com/questions/9796752/extracting-number-and-name-from-string-r
