Extract text between certain symbols using Regular Expression in R

刺人心 2020-12-05 20:35

I have a series of expressions such as:

\"the text I need to extract
\"

I need to extrac

5 Answers
  •  不思量自难忘°
    2020-12-05 21:35

    If this is HTML (which it looks like it is), you should probably use an HTML parser rather than a regular expression. The XML package can do this:

    library(XML)
    # input string; the <i> element is what the "//i" XPath below selects
    x <- "<i>the text I need to extract</i>"
    xmlValue(getNodeSet(htmlParse(x, asText = TRUE), "//i")[[1]])
    # [1] "the text I need to extract"

    On an entire HTML document, you can extract the text of every matching node at once:

    doc <- htmlParse(x)
    sapply(getNodeSet(doc, "//i"), xmlValue)
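
If a regular expression is still preferred, as the question title asks, the same extraction could be sketched in base R as follows. This is only a rough illustration: the sample strings and the assumption that the wanted text sits inside a single `<i>...</i>` pair with no nested tags are hypothetical, not taken from the original post.

```r
# Hypothetical sample input; assumes the text of interest is wrapped in <i>...</i>
x <- c("<i>the text I need to extract</i>",
       "<b><i>another string</i></b>")

# Capture any run of characters other than "<" between <i> and </i>
pattern <- "<i>([^<]*)</i>"

# regexec() returns the full match plus capture groups; regmatches() pulls them out
hits <- regmatches(x, regexec(pattern, x))

# Keep the first capture group, or NA when a string has no match
extracted <- vapply(
  hits,
  function(hit) if (length(hit) >= 2) hit[2] else NA_character_,
  character(1)
)

print(extracted)
```

A pattern like this breaks as soon as the markup contains nested tags or attributes, which is why the parser-based approach is usually the safer choice.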
    
