removing data with tags from a vector

遇见更好的自我 2021-01-17 03:13

I have a string vector which contains HTML tags, e.g.

  abc<-\"\"welcome abc Ha         


        
2 answers
  •  遇见更好的自我
    2021-01-17 03:32

    Try

    > gsub("(<[^>]*>)","",abc)
    

    What this says is "substitute every instance of < followed by anything that isn't a >, up to a >, with nothing".

    You can't just do gsub("<.*>","",abc) because regexps are greedy: the .* would match up to the last > in your text (and you'd lose the 'abc' in your example).
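
To make the greedy-versus-negated-class difference concrete, here is a minimal base R sketch. The input vector `x` is hypothetical (the string from the question did not survive intact), and `strip_tags` is just an illustrative helper name, not something from the question.

```r
# Hypothetical input; the original example string is not recoverable.
x <- c("<b>welcome</b> abc <i>Ha</i>", "no tags here")

# Negated character class: each tag is matched on its own,
# so the text between tags is kept.
strip_tags <- function(s) gsub("<[^>]*>", "", s)
clean <- strip_tags(x)

# Greedy .* runs from the first < to the last > in each string,
# so everything in between is removed as well.
greedy <- gsub("<.*>", "", x)

# A lazy quantifier stops at the first >, which behaves like the
# negated class here; perl = TRUE selects PCRE syntax.
lazy <- gsub("<.*?>", "", x, perl = TRUE)

print(clean)
print(greedy)
print(lazy)
```

With this input, `clean` and `lazy` both keep "welcome abc Ha", while `greedy` reduces the first element to an empty string because the first < and the last > span the whole text.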

    This solution might fail if you've got a > inside your tags, but is that legal? Doubtless someone will come up with another answer that involves parsing the HTML with a heavyweight XML package.
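
The failure case mentioned above can be reproduced with a short base R sketch. The input `y` is a made-up string, assuming a literal > inside a quoted attribute value, which HTML parsers generally seem to tolerate.

```r
# Hypothetical input: a literal > inside a quoted attribute value.
y <- "<a title=\"x > y\">link</a>"

# The negated class stops at the first >, which here sits inside the
# attribute, so the rest of the attribute leaks into the result.
broken <- gsub("<[^>]*>", "", y)
print(broken)
```

The pattern removes `<a title="x >` and `</a>`, so the remainder of the attribute is left in front of "link". For input like this, a real HTML parser is the safer choice.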
