Question
How do I write a regex that matches all three of the URLs below? The path, file, and query string have to be exact. The domain part can be any of the following variants (domain name or IP address):
http://www.example.com/path1/path2/foobar.aspx?id=123&key=456
https://www.example.com/path1/path2/foobar.aspx?id=123&key=456
64.123.456.789/path1/path2/foobar.aspx?id=123&key=456
Basically, only /path1/path2/foobar.aspx?id=123&key=456 needs to be matched. The part in front of it can be any of the variants that lead a user to the site.
Answer 1:
Code
\.[^\/]+(.*)
This RegEx captures the relative path of the address. This means that your program needs to read the match's first capture group rather than the full matched text.
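As a rough illustration of reading the capture rather than the full match, here is a minimal Python sketch. The helper name relative_path is hypothetical and not part of the original answer; the pattern is the one given above, and the URLs are the three from the question.

```python
import re

# The answer's pattern: first dot, the rest of the host up to the
# next forward slash, then a capture group holding everything after it.
PATTERN = re.compile(r"\.[^\/]+(.*)")

def relative_path(url):
    """Return the captured relative path, or None if the pattern does not match.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.search(url)
    # group(0) would be the full match (it includes part of the host);
    # group(1) is the capture, i.e. the relative path.
    return match.group(1) if match else None

urls = [
    "http://www.example.com/path1/path2/foobar.aspx?id=123&key=456",
    "https://www.example.com/path1/path2/foobar.aspx?id=123&key=456",
    "64.123.456.789/path1/path2/foobar.aspx?id=123&key=456",
]

for url in urls:
    print(relative_path(url))
```

Under these assumptions, each of the three URLs should yield /path1/path2/foobar.aspx?id=123&key=456 from the capture group.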
Explanation
\. Gets the first dot of the address
[^\/]+ Matches all characters that aren't forward slashes
(.*) Captures the rest of the address
Further Explanation
The reason I can't match (rather than capture) the relative path directly is that I have no expression that definitively marks the beginning of the relative path without also matching other characters.
This is because some addresses have a protocol part (e.g. http://) whereas others don't. The extra two forward slashes mean that the RegEx would become much lengthier in order to verify that we get to the correct forward slash.
I used the first dot since all addresses (as far as I know) have a dot in the domain (www.something.com or 64.123.456.789). Since the domain is always immediately before the relative path, we can just skip to the next forward slash and always arrive at the relative path.
Then we just capture the rest of the address (including the first forward slash), which is then easy to get.
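Since the question asks for the path, file, and query string to be exact, one possible follow-up is to compare the captured text against the expected string. This is only a sketch built on the pattern above; the names EXPECTED and is_target are hypothetical and do not come from the original answer.

```python
import re

# Same pattern as in the answer.
PATTERN = re.compile(r"\.[^\/]+(.*)")

# The exact relative path from the question (assumed to be the only valid target).
EXPECTED = "/path1/path2/foobar.aspx?id=123&key=456"

def is_target(url):
    """Return True when the captured relative path equals EXPECTED exactly."""
    match = PATTERN.search(url)
    return match is not None and match.group(1) == EXPECTED

print(is_target("https://www.example.com/path1/path2/foobar.aspx?id=123&key=456"))
print(is_target("http://www.example.com/other/page.aspx?id=123"))
```

Doing the comparison in code keeps the RegEx short, at the cost of a second step after the match.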
Source: https://stackoverflow.com/questions/52043172/regex-to-match-the-relative-path-of-the-url