Regex to match URL

后端 未结 14 2402
耶瑟儿~
耶瑟儿~ 2020-11-22 09:41

I am using the following regex to match a URL:

$search  = \"/([\\S]+\\.(MUSEUM|TRAVEL|AERO|ARPA|ASIA|COOP|INFO|NAME|BIZ|CAT|COM|INT|JOBS|NET|ORG|PRO|TEL|AC|A         


        
14条回答
  •  眼角桃花
    2020-11-22 10:07

    Regex to match all urls (with www, without www, with http or https, without http or https, includes all 2-6 letter top level domain names [for countries, ex 'ly','us'], ports, query strings, and anchors ['#']). It's not 100% but it is better than anything I have seen posted on the web.

    It uses the top level domains from the first answer, combined with other techniques found in my searches. It will return any valid url that has bounds, that is where \b comes into play. Since the trailing '/' is also triggered by \b, the last one, is a match for one or more '?'.

    /\b((http(s?):\/\/)?([a-z0-9\-]+\.)+(MUSEUM|TRAVEL|AERO|ARPA|ASIA|EDU|GOV|MIL|MOBI|COOP|INFO|NAME|BIZ|CAT|COM|INT|JOBS|NET|ORG|PRO|TEL|A[CDEFGILMNOQRSTUWXZ]|B[ABDEFGHIJLMNORSTVWYZ]|C[ACDFGHIKLMNORUVXYZ]|D[EJKMOZ]|E[CEGHRSTU]|F[IJKMOR]|G[ABDEFGHILMNPQRSTUWY]|H[KMNRTU]|I[DELMNOQRST]|J[EMOP]|K[EGHIMNPRWYZ]|L[ABCIKRSTUVY]|M[ACDEFGHKLMNOPQRSTUVWXYZ]|N[ACEFGILOPRUZ]|OM|P[AEFGHKLMNRSTWY]|QA|R[EOSUW]|S[ABCDEGHIJKLMNORTUVYZ]|T[CDFGHJKLMNOPRTVWZ]|U[AGKMSYZ]|V[ACEGINU]|W[FS]|Y[ETU]|Z[AMW])(:[0-9]{1,5})?((\/([a-z0-9_\-\.~]*)*)?((\/)?\?[a-z0-9+_\-\.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:@/?]*)?)/gi
    

提交回复
热议问题