Can a URL contain a semicolon and still be valid?

故里飘歌 2020-11-30 04:47

I am using a regular expression to convert plain-text URLs to clickable links.

@(https?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.-]*(\\?\\S+)?)?)?)@

Ho

7 Answers
  •  春和景丽
    2020-11-30 05:18

    Quoting RFCs is not all that helpful in answering this question, because you will encounter URLs with semicolons (and commas, for that matter). We had a regex that did not handle semicolons and commas, and some of our users at NutshellMail complained, because URLs containing them do in fact exist in the wild. Try building a dummy URL in Facebook or Twitter that contains a ';' or ',' and you will see that both services encode the full URL properly.
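For reference, RFC 3986 does list ';' and ',' among the "sub-delims" characters allowed in a path or query, so such URLs are valid as well as common. A minimal sketch, assuming Python's standard `urllib.parse` and a made-up example URL, shows that a general-purpose URL parser keeps both characters intact:

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "http://example.com/path;v=1,2/page?a=1;b=2"

parts = urlsplit(url)

# urlsplit leaves ';' and ',' untouched in both the path and the query.
print(parts.netloc)  # example.com
print(parts.path)    # /path;v=1,2/page
print(parts.query)   # a=1;b=2
```

A link-detection regex that stops at the first ';' would therefore cut this URL short, even though a URL parser accepts it whole.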

    I replaced the Regex we were using with the following pattern (and have tested that it works):

     // The path class accepts ',' and ';'; the final class keeps trailing punctuation out of the match.
     string regex = @"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-zA-Z0-9-]+\.[a-zA-Z0-9\/_:;@=.+?,#%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])";
    

    This Regex came from http://rickyrosario.com/blog/converting-a-url-into-a-link-in-csharp-using-regular-expressions/ (with a slight modification)
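To make the behaviour concrete, here is a rough Python adaptation of the pattern, not the answerer's own code. It assumes ';' is written explicitly in the path character class, and the names `URL_PATTERN` and `linkify` are hypothetical. The last character class is what stops a trailing '.', ',' or ';' from being swallowed into the link.

```python
import html
import re

# Adapted from the pattern quoted in this answer. Assumption of this sketch:
# ';' is listed explicitly in the path class so embedded semicolons are kept.
URL_PATTERN = re.compile(
    r"((www\.|(http|https|ftp|news|file)+://)"  # "www." or a scheme
    r"[_.a-zA-Z0-9-]+\."                        # host labels up to a dot
    r"[a-zA-Z0-9/_:;@=.+?,#%&~-]*"              # rest of the URL, incl. ';' and ','
    r"[^.|'# !(?,><;)])"                        # last char: not trailing punctuation
)


def linkify(text):
    """Escape plain text for HTML and wrap each detected URL in an <a> tag."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        url = match.group(0)
        # Bare "www." matches get a default scheme so the href is absolute.
        href = url if "://" in url else "http://" + url
        parts.append(html.escape(text[last:match.start()]))
        parts.append('<a href="%s">%s</a>' % (html.escape(href), html.escape(url)))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(linkify("See http://example.com/a;b=1,c=2."))
# See <a href="http://example.com/a;b=1,c=2">http://example.com/a;b=1,c=2</a>.
```

The semicolon and comma inside the URL survive, while the sentence-ending period stays outside the link. A pattern like this is a heuristic for spotting links in free text, not a validator, so it is worth testing against the URLs your own users actually post.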
