preg_match_all - regex to find full urls in string

做~自己de王妃 submitted on 2020-01-12 06:17:28

Question


I have spent over 4 hours trying to find a regex pattern for my PHP code, without luck.

I have a string containing HTML code. It includes URLs in many formats, such as:

site*com
http://site*com
http://www*site*com
http://site*com/some.php
http://site*com/some.php?var1=1
http://site*com/some.php?var1=1&var2=2
etc.

I have the following PHP code, which works in part:

preg_match_all('/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i', $content, $result, PREG_PATTERN_ORDER);

The only thing missing is that I ALSO need to capture URLs with multiple query string parameters joined by "&". I do get them, but not in full. I only receive things like:

http://site*com/asdad.php?var1=1&

(Please note: replace * with a dot. I can't post links.)

The rest is lost.

Can someone help me add the missing part to the pattern?

Thanks so much in advance.


Answer 1:


Well, I finally got it.

The final regex is:

$regex = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";

It works.
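
A likely reason this version works, assuming the HTML source writes "&" as "&amp;": the pattern in the question has no ";" in its middle character class, so the match stops right after "&amp". The sketch below compares both patterns on a made-up sample string ($content here is hypothetical, not the asker's data) and then decodes the entities in the matches.

```php
<?php
// Hypothetical sample input: HTML source in which "&" is encoded as "&amp;".
$content = '<a href="http://site.com/some.php?var1=1&amp;var2=2">link</a> '
         . 'and www.site.com/page.php?a=1&b=2 in plain text';

// Pattern from the question: no ";" in the middle character class.
$old = '/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i';

// Pattern from this answer: ";" is allowed in the middle character class.
$new = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';

preg_match_all($old, $content, $oldResult, PREG_PATTERN_ORDER);
preg_match_all($new, $content, $newResult, PREG_PATTERN_ORDER);

// Turn "&amp;" back into "&" so the matches are usable URLs.
$decoded = array_map('html_entity_decode', $newResult[0]);

print_r($oldResult[0]); // first match is cut off after "&amp"
print_r($newResult[0]); // first match keeps "&amp;var2=2"
print_r($decoded);      // first match now contains "&var2=2"
```

If the string is plain text rather than HTML source, the decoding step is unnecessary, and both patterns should capture "a=1&b=2" in full.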




Answer 2:


Check this pattern, which can be used for most URL types:

$regex = "((https?|ftp)\:\/\/)?"; // Optional scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,3})"; // Host name and/or IP
$regex .= "(\:[0-9]{2,5})?"; // Optional port number
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // The path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // Optional query string parameters
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Optional anchor (fragment)

You can leave out any section you don't need. As you can see, the pattern is built by concatenating the parts.
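
The concatenated string has no delimiters or flags yet, so it cannot be passed to the preg_* functions as it stands. A minimal sketch of one way to use it follows. It assumes the goal is to validate a single candidate string; the "~" delimiters, the ^...$ anchors, the "i" flag and the looksLikeUrl helper are additions for illustration and are not part of the answer.

```php
<?php
// Parts copied from the answer above.
$regex  = "((https?|ftp)\:\/\/)?";                      // scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,3})";                // host name
$regex .= "(\:[0-9]{2,5})?";                            // port
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";                // path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?";   // query string
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?";                // anchor

// Assumption: wrap with "~" delimiters, anchor to the whole string,
// and match case-insensitively.
$pattern = '~^' . $regex . '$~i';

// Hypothetical helper: true when the whole candidate matches the pattern.
function looksLikeUrl(string $pattern, string $candidate): bool
{
    return preg_match($pattern, $candidate) === 1;
}

var_dump(looksLikeUrl($pattern, 'http://site.com/some.php?var1=1&var2=2'));
var_dump(looksLikeUrl($pattern, 'site.com'));
var_dump(looksLikeUrl($pattern, 'not a url'));
var_dump(looksLikeUrl($pattern, 'http://example.info'));
```

The last call returns false because the host part only accepts top-level domains of 2 or 3 letters. Widening {2,3} would be needed for longer ones such as .info.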



Source: https://stackoverflow.com/questions/22202821/preg-match-all-regex-to-find-full-urls-in-string
