Convert non-breaking spaces to spaces in Ruby

春和景丽 2020-12-15 03:56

I have cases where user-entered data from an HTML textarea or input is sometimes sent with \u00a0 (non-breaking spaces) instead of regular spaces when encoded as UTF-8.

6 Answers
  •  醉酒成梦
    2020-12-15 04:55

    For whatever reason \s doesn't match \u00a0.

    I think the "whatever reason" is that it is not supposed to. Only the POSIX bracket and \p{...} character classes are Unicode aware. The character-class abbreviations are not:

    Sequence   As[...]        Meaning
         \d    [0-9]          ASCII decimal digit character
         \D    [^0-9]         Any character except a digit
         \h    [0-9a-fA-F]    Hexadecimal digit character
         \H    [^0-9a-fA-F]   Any character except a hex digit
         \s    [ \t\r\n\f]    ASCII whitespace character
         \S    [^ \t\r\n\f]   Any character except whitespace
         \w    [A-Za-z0-9_]   ASCII word character
         \W    [^A-Za-z0-9_]  Any character except a word character
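
As a minimal sketch of what the table implies, the following Ruby snippet checks the different classes against U+00A0 and then converts non-breaking spaces to ordinary spaces. The constant `NBSP` and the helper `nbsp_to_space` are hypothetical names introduced here for illustration, and the input is assumed to be a UTF-8 string:

```ruby
# U+00A0 NO-BREAK SPACE, written with an escape so it stays visible in source.
NBSP = "\u00A0"

# The ASCII-only abbreviation \s does not match it...
ascii_hit = NBSP.match?(/\s/)

# ...but the Unicode-aware POSIX bracket and \p{} classes do.
posix_hit    = NBSP.match?(/[[:space:]]/)
property_hit = NBSP.match?(/\p{Space}/)

# Hypothetical helper (not from the original answer): replace every
# non-breaking space with an ordinary space, leaving other characters alone.
# Assumes str is UTF-8 encoded.
def nbsp_to_space(str)
  str.gsub("\u00A0", " ")
end

puts nbsp_to_space("foo#{NBSP}bar baz")
```

If all Unicode whitespace should be normalized rather than only U+00A0, something like `str.gsub(/[[:space:]]/, " ")` may be a better fit, though it would also turn tabs and newlines into spaces.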
    
