RegEx for matching UK Postcodes

广开言路 2020-11-22 01:38

I'm after a regex that will validate a full complex UK postcode only within an input string. All of the uncommon postcode forms must be covered as well as the usual. For in

30 answers
  •  面向向阳花
    2020-11-22 01:49

    I had a look into some of the answers above and I'd recommend against using the pattern from @Dan's answer (c. Dec 15 '10), since it incorrectly flags almost 0.4% of valid postcodes as invalid, while the others do not.

    Ordnance Survey provide a service called Code Point Open which:

    contains a list of all the current postcode units in Great Britain

    I ran each of the regexes above against the full list of postcodes (Jul 6 '13) from this data using grep:

    cat CSV/*.csv |
        # Strip leading quotes
        sed -e 's/^"//g' |
        # Strip trailing quote and everything after it
        sed -e 's/".*//g' |
        # Strip any spaces
        sed -E -e 's/ +//g' |
        # Find any lines that do not match the expression
        grep --invert-match --perl-regexp "$pattern"
    

    There are 1,686,202 postcodes total.

    The following are the numbers of valid postcodes that do not match each $pattern:

    '^([A-PR-UWYZ0-9][A-HK-Y0-9][AEHMNPRTVXY0-9]?[ABEHMNPRVWXY0-9]?[0-9][ABD-HJLN-UW-Z]{2}|GIR 0AA)$'
    # => 6016 (0.36%)
    
    '^(GIR ?0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]([0-9ABEHMNPRV-Y])?)|[0-9][A-HJKPS-UW]) ?[0-9][ABD-HJLNP-UW-Z]{2})$'
    # => 0
    
    '^GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|BX|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4}$'
    # => 0
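
As a rough sketch of how the counts and percentages above could be produced, the pipeline can be wrapped in two small helpers. The names `count_unmatched` and `percent` are hypothetical, and the sketch uses `--extended-regexp` rather than `--perl-regexp` for portability, which only suits patterns without Perl-only syntax such as `\d`.

```shell
#!/usr/bin/env bash
# Sketch only: count_unmatched and percent are hypothetical helper names.

# Reads postcodes on stdin (one per line, spaces already stripped) and
# prints how many lines the given pattern does NOT match.
count_unmatched() {
    local pattern=$1
    # --count prints the number of selected lines; with --invert-match the
    # selected lines are the non-matching ones. grep exits non-zero when it
    # selects nothing, hence the "|| true".
    LC_ALL=C grep --invert-match --count --extended-regexp "$pattern" || true
}

# Prints $1 as a percentage of $2, rounded to two decimal places.
percent() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f%%\n", 100 * n / t }'
}

# The figures reported for the first pattern: 6016 misses out of 1,686,202.
percent 6016 1686202
```

With the cleaned list saved to a file, `count_unmatched "$pattern" < postcodes.txt` would give the first argument to `percent`; the file name here is an assumption, not something from the original answer.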
    

    Of course, these results only deal with valid postcodes that are incorrectly flagged as invalid. So:

    '^.*$'
    # => 0
    

    I'm saying nothing about which pattern is the best regarding filtering out invalid postcodes.
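
For anyone who wants to look at that other side, a complementary check is to feed each pattern strings known not to be postcodes and count how many it accepts. The following is a minimal sketch of that idea, not something the test above covered: `count_accepted` is a hypothetical helper, the sample strings are made up for illustration, and `--extended-regexp` is used in place of `--perl-regexp` because the second pattern needs no Perl-only syntax.

```shell
#!/usr/bin/env bash
# Sketch only: count_accepted and the sample strings are illustrative.

# Reads candidate strings on stdin and prints how many the pattern accepts.
count_accepted() {
    local pattern=$1
    # grep exits non-zero when nothing matches, hence the "|| true".
    LC_ALL=C grep --count --extended-regexp "$pattern" || true
}

# Made-up strings that are not postcodes, one per line.
invalid_samples() {
    printf '%s\n' 'HELLO' '12345' 'QQ11QQ' 'SW1A1A'
}

# The second pattern from the list above, and the match-everything pattern.
strict='^(GIR ?0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]([0-9ABEHMNPRV-Y])?)|[0-9][A-HJKPS-UW]) ?[0-9][ABD-HJLNP-UW-Z]{2})$'
permissive='^.*$'

invalid_samples | count_accepted "$permissive"   # accepts all four lines
invalid_samples | count_accepted "$strict"       # accepts none of them
```

A pattern that scores 0 on the valid list and also accepts few of the invalid samples is doing more useful work than `^.*$`, though a handful of hand-picked strings says little about real-world input.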
