I have cases where user-entered data from an HTML textarea or input is sometimes sent with \u00a0 (non-breaking spaces) instead of regular spaces when encoded as UTF-8.
Use /\u00a0/ to match non-breaking spaces. For instance, s.gsub(/\u00a0/, ' ') returns a copy of s with all non-breaking spaces converted to regular spaces.
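A minimal sketch of that substitution, assuming the string is already tagged as UTF-8 (the sample input and variable names here are made up for illustration):

```ruby
# Hypothetical sample input: "hello", a non-breaking space (U+00A0), "world".
input = "hello\u00a0world"

# Replace every non-breaking space with a regular ASCII space.
# gsub returns a new string; input itself is left unchanged.
cleaned = input.gsub(/\u00a0/, ' ')

puts cleaned == "hello world"      # prints true
puts cleaned.include?("\u00a0")    # prints false
```

If the data arrives with a binary encoding rather than UTF-8, the regexp may raise an encoding compatibility error, so it is worth checking s.encoding first.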
Use /[[:space:]]/ to match all whitespace, including Unicode whitespace like non-breaking spaces. This is unlike /\s/, which matches only ASCII whitespace.
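The difference between the two character classes can be checked directly. This is a small sketch with made-up sample strings; match? needs Ruby 2.4 or later:

```ruby
nbsp = "\u00a0"

# The POSIX bracket class is Unicode-aware, so it matches U+00A0.
posix_hit = nbsp.match?(/[[:space:]]/)   # true

# \s only covers ASCII whitespace (space, \t, \r, \n, \f, \v).
ascii_hit = nbsp.match?(/\s/)            # false

# Collapse any run of whitespace, Unicode or ASCII, into one regular space.
normalized = "a\u00a0\u00a0b\tc".gsub(/[[:space:]]+/, ' ')

puts posix_hit             # prints true
puts ascii_hit             # prints false
puts normalized == "a b c" # prints true
```

Note that /[[:space:]]/ also matches tabs and newlines, so use /\u00a0/ when only non-breaking spaces should be replaced.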
See also: Ruby Regexp documentation