Regular expression to catch letters beyond a-z

南方客 2020-12-16 02:26

A normal regexp to allow letters only would be "[a-zA-Z]", but I'm from Sweden, so I would have to change that into "[a-zåäöA-ZÅÄÖ]". But suppose I don

6 answers
  •  别那么骄傲
    2020-12-16 02:53

    Is there a way to automatically know what chars are valid in a given locale/language or should I just make a blacklist of chars that I (think I) know I don't want?

    This is not, in general, possible.

    After all, English text does include some accented characters (e.g. in "fête" and "naïve", which in UK English, strictly speaking, still take their accents). In some languages, some of the standard letters are rarely used (e.g. y-diaeresis in French).

    Then consider that foreign words may be included (this is often the case where technical terms are used). Quotations are another source.

    If your requirements are sufficiently narrowly defined, you may be able to create a definition, but this requires linguistic expertise in that language.
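
To make the trade-off concrete, here is a minimal Python sketch contrasting the two options: a hand-written whitelist like the Swedish one from the question, and a check that accepts any Unicode letter regardless of language. The function names are hypothetical, chosen only for illustration. Note that the second check does not answer "which letters belong to language X"; it only sidesteps the question by accepting letters from every script.

```python
import re
import unicodedata

# Hand-written whitelist taken from the question (Swedish letters only).
SWEDISH_LETTERS = re.compile(r"[a-zåäöA-ZÅÄÖ]+")


def is_swedish_letters_only(text: str) -> bool:
    """Return True if text consists solely of letters in the whitelist."""
    return SWEDISH_LETTERS.fullmatch(text) is not None


def is_letters_only(text: str) -> bool:
    """Return True if text is non-empty and every character is a Unicode letter.

    The text is normalized to NFC first, so that a base letter followed by a
    combining accent is folded into one precomposed letter where one exists.
    """
    normalized = unicodedata.normalize("NFC", text)
    # General categories beginning with "L" (Lu, Ll, Lt, Lm, Lo) are letters.
    return bool(normalized) and all(
        unicodedata.category(ch).startswith("L") for ch in normalized
    )
```

With this sketch, a word such as "fête" is rejected by the whitelist but accepted by the Unicode-letter check, which illustrates the answer's point: a per-language character list excludes text that legitimately occurs in that language.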
