Convert non-breaking spaces to spaces in Ruby

后端未结

关注

 6  780

春和景丽 2020-12-15 03:56

I have cases where user-entered data from an html textarea or input is sometimes sent with \\u00a0 (non-breaking spaces) instead of spaces when encoded as utf-8

6条回答

醉酒成梦 (楼主)

2020-12-15 04:55

For whatever reason \s doesn't match \u00a0.

I think the "whatever reason" is that is not supposed to. Only the POSIX and \p construct character classes are Unicode aware. The character-class abbreviations are not:

Sequence   As[...]        Meaning
     \d    [0-9]          ASCII decimal digit character
     \D    [^0-9]         Any character except a digit
     \h    [0-9a-fA-F]    Hexadecimal digit character
     \H    [^0-9a-fA-F]   Any character except a hex digit
     \s    [ \t\r\n\f]    ASCII whitespace character
     \S    [^ \t\r\n\f]   Any character except whitespace
     \w    [A-Za-z0-9\_]  ASCII word character
     \W    [^A-Za-z0-9\_] Any character except a word character

0 讨论(0)

查看其它6个回答