Match Anything In Between Strings For Linux Grep Command

Submitted by 你。 on 2019-12-24 09:04:07

Question


I have read the post "grep all characters including newline", but I am not working with XML, so my case is a bit different.

I have the following data:

Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>

Using this command cat file.txt | grep -o '<tag.*tag>\|^--.*' I get:

<tag>Example line 1</tag>

However, I want the output to be:

<tag>Example line 1</tag>
<tag>Example line 2</tag>

How can I match anything between the strings, including the newline?

Note: I need to use <tag and tag> as the delimiter strings because other files can contain multiple tags with text in between the lines. I will update the sample data to show that.


Answer 1:


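This is easier to do with GNU awk, using </tag> as the record separator. RT holds the text that matched RS, so a trailing record with no closing </tag> is skipped. The sub() call drops any text that precedes <tag> within a record:

awk -v RS='</tag>' 'RT {gsub(/\n/, ""); sub(/.*<tag>/, "<tag>"); print $0 RT}' file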

<tag>Example line 1</tag>
<tag>Example line 2</tag>
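
RT and a multi-character RS are gawk extensions, so the one-liner above may not behave the same with other awk implementations. As a rough alternative, here is a minimal sketch that uses only POSIX awk features and keeps a small amount of state per line. The function name extract_tags is made up for illustration, and the sketch assumes the literal delimiters <tag> and </tag> with at most one pair per line:

```shell
# Hypothetical helper: prints each <tag>...</tag> block on a single line.
# Assumes literal <tag> and </tag> delimiters, at most one pair per input line.
extract_tags() {
  awk '
    # An opening tag starts a new buffer; drop any text before it.
    /<tag>/ { inside = 1; buf = ""; sub(/.*<tag>/, "<tag>") }
    # While inside a block, append the line without a newline.
    inside { buf = buf $0 }
    # A closing tag ends the block; drop any text after it and print.
    inside && /<\/tag>/ { sub(/<\/tag>.*/, "</tag>", buf); print buf; inside = 0 }
  ' "$1"
}
```

After defining the function, call it as extract_tags file. Text outside the delimiters is ignored because nothing is buffered until an opening tag is seen.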



Answer 2:


Consider this test file:

$ cat file2
Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>

This produces the output that you want (requires GNU sed):

$ sed -z 's|\n||g; s|</tag>|&\n|g; s|[^\n]*<tag>|<tag>|; s|\n[^\n]*<tag>|\n<tag>|g; s|\n[^\n]*$|\n|' file2
<tag>Example line 1</tag>
<tag>Example line 2</tag>
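
The one-liner is dense, so here is the same sequence of substitutions written as a multi-line script with one comment per step. This is only a restatement for readability: the wrapper function name extract_tags_sed is an assumption, and it still requires GNU sed because of -z, which makes sed read the whole file as a single record when the file contains no NUL bytes:

```shell
# Hypothetical wrapper around the same five substitutions (GNU sed only).
extract_tags_sed() {
  sed -z '
    # 1. Join the whole file into one long line.
    s|\n||g
    # 2. Put a newline after every closing tag.
    s|</tag>|&\n|g
    # 3. On the first line, drop everything before the opening tag.
    s|[^\n]*<tag>|<tag>|
    # 4. On every later line, drop everything before the opening tag.
    s|\n[^\n]*<tag>|\n<tag>|g
    # 5. Drop whatever text is left after the last closing tag.
    s|\n[^\n]*$|\n|
  ' "$1"
}
```

Call it as extract_tags_sed file2. Steps 3 and 4 are separate because only the later lines are preceded by a newline that the pattern can anchor on.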

Limitation: Note that processing XML-like text with non-specialized tools can be quite fragile.
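
Since the question asks about grep specifically, a grep-based sketch is also possible, assuming GNU grep built with PCRE support (the -P option). Here -z makes grep treat the whole file as one record so that a match can span newlines, (?s) lets . match a newline, and .*? keeps each match as short as possible. The function name extract_tags_grep is invented for this example:

```shell
# Hypothetical helper; needs GNU grep with -P (PCRE) support.
extract_tags_grep() {
  # With -z, each match printed by -o is terminated by a NUL byte.
  # Remove the newlines inside matches, then turn each NUL into a newline.
  grep -zoP '(?s)<tag.*?tag>' "$1" | tr -d '\n' | tr '\0' '\n'
}
```

Call it as extract_tags_grep file.txt. The same fragility caveat applies: nested or malformed tags are not handled.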



Source: https://stackoverflow.com/questions/40050239/match-anything-in-between-strings-for-linux-grep-command
