Regex for Matching Pinyin

离开以前 2020-12-19 01:32

I'm looking for a regular expression that can correctly match valid pinyin (e.g. "sheng", "sou") while ignoring invalid pinyin (e.g. "shong", "sei"). Most of the re

2 Answers
  •  时光取名叫无心
    2020-12-19 02:10

    I would use a combination approach that is not solely regex.

    Check for valid pinyin:

    1. grab word

    2. grab letters from the beginning of the word as long as they are consonants. This separates the initial sound from the final sound.

    3. check that the initial and final are valid...

    4. ...and if so, see if their combination is allowed (via a table like this, but the entries are simply 1s and 0s).
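
The four steps above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the names `split_syllable`, `is_valid_pinyin`, `INITIALS`, `FINALS` and `ALLOWED` are made up for this sketch, the list of finals is partial, and the combination table is filled in only for the initials "s" and "sh" (enough to cover the examples in the question). A real checker would need the complete initial/final table, tone handling, and support for syllables with no initial.

```python
# Hypothetical sketch of the split-and-lookup approach; not a complete pinyin validator.

VOWELS = set("aeiouü")

# Pinyin initials; "y" and "w" are treated as initials here so the split works.
INITIALS = {
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s", "y", "w",
}

# ASSUMPTION: a partial list of finals, just enough for the demo.
FINALS = {"a", "e", "ai", "ei", "ao", "ou", "an", "en", "ang", "eng", "ong"}

# ASSUMPTION: the "1s and 0s" table, stored as sets of allowed finals per initial.
# Only "s" and "sh" are filled in; every other initial is rejected by this sketch.
ALLOWED = {
    "s": {"a", "e", "ai", "ao", "ou", "an", "en", "ang", "eng", "ong"},
    "sh": {"a", "e", "ai", "ei", "ao", "ou", "an", "en", "ang", "eng"},
}


def split_syllable(word):
    """Step 2: take leading consonants as the initial, the rest as the final."""
    i = 0
    while i < len(word) and word[i] not in VOWELS:
        i += 1
    return word[:i], word[i:]


def is_valid_pinyin(word):
    """Steps 3 and 4: check both parts, then check that they may combine."""
    initial, final = split_syllable(word.lower())
    if initial not in INITIALS or final not in FINALS:
        return False
    return final in ALLOWED.get(initial, set())


if __name__ == "__main__":
    for candidate in ["sheng", "sou", "shong", "sei"]:
        print(candidate, is_valid_pinyin(candidate))
```

Note that "shong" and "sei" both pass step 3 (the initial and the final are each valid on their own) and are only rejected by the table lookup in step 4, which is why the lookup is needed in addition to the split.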
