Regexp to match a section containing a given phrase without matching anything else

橙三吉。 Submitted on 2020-03-26 04:08:13

Question


Similar topics appear here quite frequently, but even after analyzing them I still can't figure out the proper regexp for my task. I have an XML file with some sections. I need to remove the text sections that contain a given attribute and leave everything else.

The example text section:

<Text FontFamily="Open Sans" FontSize="19" FontStyle="Normal"
    FontWeight="Normal" HorizontalAlign="Left" Left="803.0"
    Name="Back" Stroke="#CCCCCC" TextDecoration="None"
    Top="126.0" Visibility="Hidden">
... More content here ...
</Text>

I need to find and remove only the ones containing Name="Back". Other text sections have different Name attributes, or no Name at all (unnamed). Sections span multiple lines.

The simplest regexp is:

(?s)<Text (.*?)Name="Back"(.*?)</Text>

It is also wrong. If a section contains Name="Back", the regexp matches the right part. But if a section lacks that attribute, the match still starts at its <Text tag and then runs across many other sections, text or otherwise, until it reaches a Name="Back" followed by </Text>, which may be near the end of the file. So it matches almost the whole file, including many text and non-text sections.

There is no point in showing the other regexps I tried, which were based on other people's solutions. They either match nothing or match too much.

I will be grateful for help.

By the way, how do I escape < here? A < followed by Text does not show up in regular text, only in a code block.


Answer 1:


You're close, try this:

(?s)<Text[^>]*? Name="Back".*?>.*?<\/Text>

See the demo at https://regex101.com/r/Dmyq59/1

^ I know regex101 is not Notepad++, but both use Perl-compatible regex syntax, so the pattern should behave the same way.


If you paste your regex into my regex101 example, it will visualize the problem for you. The culprit is Text (.*?)Name="Back": the (.*?) keeps capturing EVERYTHING, across tag boundaries, until it reaches a tag that does contain Name="Back".
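
The difference between the two patterns can also be reproduced outside regex101. Below is a minimal sketch using Python's re module, which is assumed here to treat (?s), lazy quantifiers and negated character classes the same way as the regex101 demo. The SAMPLE document, including its Root and Shape elements, is hypothetical and only serves to show the overmatch.

```python
import re

# Hypothetical sample: a named section to keep, a non-text section,
# and the multi-line section that should be removed.
SAMPLE = '''<Root>
<Text FontSize="19" Name="Title">
keep me
</Text>
<Shape Name="Box"/>
<Text FontSize="19"
    Name="Back" Visibility="Hidden">
remove me
</Text>
</Root>'''

# Pattern from the question: (.*?) may cross the end of the opening tag.
NAIVE = r'(?s)<Text (.*?)Name="Back"(.*?)</Text>'
# Pattern from the answer: [^>]*? cannot leave the opening tag.
FIXED = r'(?s)<Text[^>]*? Name="Back".*?>.*?<\/Text>'

naive_match = re.search(NAIVE, SAMPLE).group(0)
fixed_match = re.search(FIXED, SAMPLE).group(0)

# The naive match starts at the first <Text and swallows the section to keep.
print("keep me" in naive_match)
# The fixed match covers only the section whose opening tag has Name="Back".
print("keep me" in fixed_match)

# Removing the matched sections leaves everything else untouched.
cleaned = re.sub(FIXED, '', SAMPLE)
print(cleaned)
```

Because [^>] excludes the closing bracket of the opening tag, the attribute search fails quickly on sections without Name="Back" and the engine moves on to the next <Text.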


You should strongly consider installing the XPatherizerNPP plugin so that you can use XPath. The equivalent XPath would be //Text[@Name='Back'] (XPath is case-sensitive, so the element and attribute names must match the XML exactly).
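
For readers who would rather script the removal than use the plugin, here is a rough sketch of the same idea with Python's standard xml.etree.ElementTree, which supports a limited XPath subset including attribute predicates. This is not how XPatherizerNPP works internally; the SAMPLE document with its Root and Group elements is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with a nested section to remove.
SAMPLE = '''<Root>
  <Text FontSize="19" Name="Title">keep me</Text>
  <Group>
    <Text FontSize="19" Name="Back" Visibility="Hidden">remove me</Text>
    <Text FontSize="19">unnamed, keep me too</Text>
  </Group>
</Root>'''

root = ET.fromstring(SAMPLE)

# ElementTree elements have no parent pointer, so treat every element
# as a potential parent and remove its matching direct children.
for parent in list(root.iter()):
    for child in parent.findall("Text[@Name='Back']"):
        parent.remove(child)

# Names of the Text sections that are left (None means unnamed).
remaining = [t.get("Name") for t in root.iter("Text")]
result = ET.tostring(root, encoding="unicode")
print(remaining)
print(result)
```

Working on the parsed tree avoids the overmatching problem entirely, since the predicate is evaluated per element rather than across raw text.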



Source: https://stackoverflow.com/questions/60288749/regexp-to-search-for-phrase-containing-other-phrase-and-dont-mark-anything-othe
