Need variable width negative lookbehind replacement

Submitted by 瘦欲@ on 2019-12-14 01:56:12

Question


I have looked at many questions here (and many more websites) and some provided hints but none gave me a definitive answer. I know regular expressions but I am far from being a guru. This particular question deals with regex in PHP.

I need to locate words in a text that are not surrounded by a hyperlink of a given class. For example, I might have

This <a href="blabblah" class="no_check">elephant</a> is green and this elephant is blue while this <a href="blahblah">elephant</a> is red.

I would need to match the second and third elephants but not the first (identified by the test class "no_check"). Note that there could be more attributes than just href and class within hyperlinks. I came up with

((?<!<a .*class="no_check".*>)\belephant\b)

which works beautifully in regex test software but not in PHP.

Any help is greatly appreciated. If you cannot provide a regular expression but can find some sort of PHP code logic that would circumvent the need for it, I would be equally grateful.


Answer 1:


If variable-width negative lookbehind is not available, a quick and dirty solution is to reverse the string in memory, use a variable-width negative lookahead instead (with the pattern reversed as well), and then reverse the string again.
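
A minimal sketch of that reversal trick in PHP follows. The function name findUnlinkedOffsets is hypothetical, and the sketch makes two assumptions: the input is plain ASCII (strrev reverses bytes, not characters), and the word comes directly after the opening tag, as in the example from the question.

```php
<?php
// Hypothetical helper: returns byte offsets (in the ORIGINAL string) of $word
// occurrences that do not directly follow an <a ... class="no_check" ...> tag.
function findUnlinkedOffsets(string $html, string $word): array
{
    $reversed = strrev($html);                    // byte-wise reversal, ASCII assumed
    $revWord  = preg_quote(strrev($word), '/');
    // Reversed form of the opening tag: <a [^<>]*class="no_check"[^<>]*>
    $revTag   = '>[^<>]*"kcehc_on"=ssalc[^<>]* a<';
    // The lookbehind of the original idea becomes a lookahead on the reversed text.
    $pattern  = '/\b' . $revWord . '\b(?!' . $revTag . ')/';

    preg_match_all($pattern, $reversed, $matches, PREG_OFFSET_CAPTURE);

    $offsets = [];
    $total   = strlen($html);
    $length  = strlen($word);
    foreach ($matches[0] as [$text, $revOffset]) {
        // Map an offset in the reversed string back to the original string.
        $offsets[] = $total - $revOffset - $length;
    }
    sort($offsets);                               // reversed matches arrive back to front
    return $offsets;
}
```

Working with offsets avoids reversing the string a second time: each match position is translated back to the original text directly.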

But you may be better off using an HTML parser.




Answer 2:


I think the simplest approach would be to match either a complete <a> element with class="no_check", or the word you're searching for. For example:

<a [^<>]*class="no_check"[^<>]*>.*?</a>|(\belephant\b)

If it was the word you matched, it will be in capture group #1; if not, that group should be empty or null.
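
As a rough illustration, the pattern above could be used from PHP as sketched here. The function name findWordOutsideNoCheck is hypothetical, and the s modifier is an addition so that .*? can span line breaks inside a link.

```php
<?php
// Hypothetical helper illustrating the "match the whole link OR the word" idea.
// Returns the byte offsets of $word occurrences outside no_check links.
function findWordOutsideNoCheck(string $html, string $word): array
{
    $pattern = '~<a [^<>]*class="no_check"[^<>]*>.*?</a>|(\b'
             . preg_quote($word, '~') . '\b)~s';

    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $offsets = [];
    foreach ($matches as $m) {
        // Group 1 is filled only when the word itself matched; when the
        // no_check link was consumed instead, it is missing or has offset -1.
        if (isset($m[1]) && $m[1][1] >= 0) {
            $offsets[] = $m[1][1];
        }
    }
    return $offsets;
}
```

The no_check link is consumed as a whole by the first alternative, so the word inside it is never offered to the second one. Note that this sketch would still match the word inside an attribute value of an ordinary tag.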

Of course, by "simplest approach" I really meant the simplest regex approach. Even simpler would be to use an HTML parser.
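
For comparison, here is a sketch of the parser route using PHP's bundled DOM extension. It assumes ext-dom is available and that the input is ASCII or otherwise safe for loadHTML's default encoding handling. The function name countWordOutsideNoCheck is hypothetical.

```php
<?php
// Hypothetical helper: counts $word in text nodes that are NOT inside
// an <a> element whose class list contains "no_check".
function countWordOutsideNoCheck(string $html, string $word): int
{
    $previous = libxml_use_internal_errors(true); // keep sloppy markup from emitting warnings
    $doc = new DOMDocument();
    $doc->loadHTML('<div>' . $html . '</div>');   // wrap the fragment in a single container
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Select text nodes with no ancestor <a> carrying the no_check class.
    $query = '//text()[not(ancestor::a[contains('
           . 'concat(" ", normalize-space(@class), " "), " no_check ")])]';

    $count = 0;
    foreach ($xpath->query($query) as $textNode) {
        $count += preg_match_all(
            '/\b' . preg_quote($word, '/') . '\b/',
            $textNode->nodeValue
        );
    }
    return $count;
}
```

Because the check runs on text nodes only, attribute order, extra attributes, and additional class names on the link no longer matter.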




Answer 3:


I ended up using a mixed solution. It turns out that I had to parse the text for specific keywords, check whether each one was already part of a link, and if not, wrap it in a hyperlink. The solutions provided here were very interesting but not quite tailored to what I needed.

The idea of using an HTML parser was a good one though and I am currently using one in another project. So hats off to both Alan Moore and Eric Strom for suggesting that solution.



Source: https://stackoverflow.com/questions/2725356/need-variable-width-negative-lookbehind-replacement
