regex, multiline extract in R

Submitted by 余生颓废 on 2019-12-23 12:49:29

Question


I am having trouble deleting everything from the first occurrence of a pattern onward in R. I imported the data with paste(readLines(url), collapse="\n").

For example, my string is \"id=\"fruit_info\">\n<tr class='thead'>\n<th colspan=2>Strawberries</th></table>\n</tr>\n</table>\n<tr class.

I want to remove the first occurrence of </table> and everything after it. The result I want is:

\"id=\"fruit_info\">\n<tr class='thead'>\n<th colspan=2>Strawberries</th>

The methods I have tried do not seem to register the first occurrence of </table> and do not produce the intended result.

Thanks!


Answer 1:


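Try the inline (?s) modifier, which makes the dot . match newline characters as well. With perl = TRUE the dot does not match a newline by default, so without the modifier .* stops at the end of the current line.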

sub('(?s)</table>.*', '', x, perl = TRUE)
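
As a quick check, here is a minimal sketch that applies this call to the string from the question. The variable names (x, line_only, cleaned, pos, trimmed) are illustrative, and the fixed-string variant at the end is an alternative I am assuming would also suit this case; it is not part of the original answer.

```r
# Example string from the question, written as an R string literal.
x <- "\"id=\"fruit_info\">\n<tr class='thead'>\n<th colspan=2>Strawberries</th></table>\n</tr>\n</table>\n<tr class."

# Without (?s), the PCRE dot stops at the newline, so only the first
# </table> and the rest of its line are removed; later lines survive.
line_only <- sub("</table>.*", "", x, perl = TRUE)

# With (?s), the dot also matches "\n", so .* runs to the end of the string.
# sub() replaces only the first match, which starts at the first </table>.
cleaned <- sub("(?s)</table>.*", "", x, perl = TRUE)

# Regex-free alternative (an assumption, not from the answer):
# locate the first literal </table> and keep what precedes it.
pos <- as.integer(regexpr("</table>", x, fixed = TRUE))
trimmed <- if (pos > 0) substr(x, 1, pos - 1) else x

cat(cleaned, "\n")
```

Because the match starts at the leftmost </table>, the greedy .* does not matter here: everything from that point to the end of the string is dropped either way. The fixed = TRUE variant avoids regex escaping altogether, which may be convenient if the marker ever contains characters that are special in a pattern.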


Source: https://stackoverflow.com/questions/30144325/regex-multiline-extract-in-r
