Python regular expression again - match url

我在风中等你 2020-12-03 22:37

I have such regexp:

 re.compile(r"((https?):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)", re.MULTILINE|re.UNICODE)

But that

4 Answers
  • 2020-12-03 23:02

    It could be very long, but in practice mine works pretty well. Please try this one: ((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z]){2,6}([a-zA-Z0-9\.\&\/\?\:@\-_=#])*

    It matches all of the examples below:

    http://wwww.stackoverflow.com
    abc.com
    http://test.test-75.1474.stackoverflow.com/
    stackoverflow.com/
    stackoverflow.com
    rfordyce@broadviewnet.com
    http://www.example.com/etcetc
    www.example.com/etcetc
    example.com/etcetc
    user:pass@example.com/etcetc
    (www.itmag.com)
    example.com/etcetc?query=aasd
    example.com/etcetc?query=aasd&dest=asds
    http://stackoverflow.com/questions/6427530/regular-expression-pattern-to-match-url-with
    www/Christina.V.Scott@gmail.com
    line.lundvoll.nilsen@telemed.no.
    s.hossain@unsw.edu.au 
    s.hossain@unsw.edu.au     
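
To make the claim above easy to check, here is a minimal sketch that runs the pattern from this answer against a few of the listed samples. The names `URL_RE`, `SAMPLES`, and `first_match` are made up for illustration and are not part of the original answer. It uses `re.search`, so a sample counts as matched when any substring matches, not necessarily the whole line.

```python
import re

# Pattern copied verbatim from the answer above.
URL_RE = re.compile(
    r"((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z]){2,6}"
    r"([a-zA-Z0-9\.\&\/\?\:@\-_=#])*"
)

# A few of the sample strings listed in the answer.
SAMPLES = [
    "http://wwww.stackoverflow.com",
    "abc.com",
    "http://test.test-75.1474.stackoverflow.com/",
    "rfordyce@broadviewnet.com",
    "user:pass@example.com/etcetc",
    "(www.itmag.com)",
    "example.com/etcetc?query=aasd&dest=asds",
]


def first_match(text):
    """Return the first substring the pattern matches, or None."""
    m = URL_RE.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), "->", repr(first_match(sample)))
```

Because the scheme is optional and `@` is in the character class, this pattern also accepts bare hostnames and e-mail addresses, which may or may not be what you want.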
    
  • 2020-12-03 23:08

    This is a common problem; use the standard library.

    For Python, use urlparse (urllib.parse in Python 3).
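
As a rough sketch of what that could look like in Python 3, where the function lives in `urllib.parse`: note that `urlparse` splits a string you already have into components, it does not search free text for URLs. The helper `looks_like_http_url` is a hypothetical name, and its rule (http/https scheme plus a non-empty host) is just one possible heuristic.

```python
from urllib.parse import urlparse


def looks_like_http_url(text):
    """Heuristic: parses cleanly, has an http/https scheme and a host."""
    try:
        parts = urlparse(text)
    except ValueError:
        # urlparse can raise on malformed input such as a bad IPv6 literal.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    parts = urlparse("http://www.example.com/etcetc?query=aasd&dest=asds")
    print(parts.scheme, parts.netloc, parts.path, parts.query)
    print(looks_like_http_url("example.com/etcetc"))
```

Unlike the regex answers, this check rejects strings without a scheme, such as `example.com/etcetc`, because `urlparse` treats them as a bare path.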

  • 2020-12-03 23:15

    I'll admit that I'm a little bit worried about an application that requires a regex like that to match URLs. That said, this seems to work for me:

    ((https?):((//)|(\\\\))+([\w\d:#@%/;$()~_?\+-=\\\.&](#!)?)*)
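
For anyone who wants to try it, here is a small sketch that compiles the pattern above with the flags from the question and pulls matches out of a string. `URL_RE` and `find_urls` are illustrative names, not something from the original post.

```python
import re

# Pattern from this answer, compiled with the flags used in the question.
URL_RE = re.compile(
    r"((https?):((//)|(\\\\))+([\w\d:#@%/;$()~_?\+-=\\\.&](#!)?)*)",
    re.MULTILINE | re.UNICODE,
)


def find_urls(text):
    """Return every substring of text that the pattern matches."""
    return [m.group(0) for m in URL_RE.finditer(text)]


if __name__ == "__main__":
    print(find_urls("see http://www.example.com/etcetc?query=aasd for details"))
    print(find_urls("hashbang: http://example.com/#!/page"))
    print(find_urls("visit http://example.com, then come back"))
```

One thing to watch: inside the character class, `\+-=` is a range from `+` to `=`, so it also admits characters such as `,` and `<`. Together with `(` and `)` being in the class, this means trailing punctuation right after a URL tends to be swallowed into the match.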
    
  • 2020-12-03 23:25

    Don't try to write your own regular expression for matching URLs; use one from someone who has already solved the problem, like this one.
