preg_replace words not inside a url

Submitted by 谁说我不能喝 on 2019-12-25 06:39:03

Question


I am using preg_replace to replace a list of words in a text that may contain some URLs. The problem is that I don't want to replace these words if they're part of a URL.

These examples should be ignored:

foo.com

foo.com/foo

foo.com/foo/foo

For a basic example (written in PHP), I tried to ignore strings containing .com, optionally followed by slashes and more characters, using a negative lookahead assertion, but without success:

preg_replace("/(\b)foo(\b)(?!(\w+\.\w+)*(\.com)([\.\/]\w+)*)/", "$1bar$2", $text);

This call only ignores the word directly before .com. Any help would be really appreciated.


Answer 1:


In cases like these, it's much easier to think of the problem inverted. Rather than matching words that are not in a URL, match either a URL or one of the words. The expression would look like this: url_match_here|(?:my|words|here). Because the URL alternative comes first, the regex engine consumes the whole URL before it tries to match those words, so you never have to worry about matching a word inside a URL. If you want to maintain the text structure, you can use preg_replace with the expression (url_match_here)|(?:my|words|here) and replace by \1, which writes every matched URL back unchanged.
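The idea above can be sketched in PHP roughly as follows. This is only an illustration under stated assumptions: the function name replaceOutsideUrls, the word map, and the simplified URL sub-pattern are all hypothetical and not part of the original answer, and the URL sub-pattern is far from a complete URL grammar. Since a single replacement string such as \1 cannot tell the two alternatives apart (and leaves nothing in place of a matched word), the sketch uses preg_replace_callback to return URLs unchanged and substitute the words.

```php
<?php
// Hypothetical helper: replaces whole words from $map unless they are part of a URL.
function replaceOutsideUrls(string $text, array $map): string
{
    if ($map === []) {
        return $text; // nothing to replace
    }

    // Simplified URL matcher (an assumption, not a full URL grammar):
    // optional scheme, a dotted host name, then an optional path.
    $url = '(?:https?://)?[\w-]+(?:\.[\w-]+)+(?:/\S*)?';

    // Build the word alternation, escaping each word for use inside the regex.
    $words = implode('|', array_map(
        function ($word) {
            return preg_quote($word, '~');
        },
        array_keys($map)
    ));

    // The URL alternative comes first, so the engine consumes a whole URL
    // before it gets a chance to match a word inside it.
    $pattern = '~(' . $url . ')|\b(' . $words . ')\b~';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($map) {
            // Group 1 is non-empty when the URL alternative matched: keep it as-is.
            if ($m[1] !== '') {
                return $m[1];
            }
            // Otherwise group 2 holds one of the words: substitute it.
            return $map[$m[2]];
        },
        $text
    );
}

echo replaceOutsideUrls('foo visits foo.com/foo/foo and foo', ['foo' => 'bar']), PHP_EOL;
// prints "bar visits foo.com/foo/foo and bar"
```

A stricter URL sub-pattern can be swapped in without touching the rest, as long as it stays in the first capture group.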

I hope this helps.

Good luck.



Source: https://stackoverflow.com/questions/12139033/preg-replace-words-not-inside-a-url
