Delete non-UTF characters from a string in Ruby?

不思量自难忘° 2021-02-05 01:24

How do I delete non-UTF-8 characters from a Ruby string? I have a string that contains, for example, "\xC2". I want to remove that character from the string so that it becomes a valid

7 answers
  •  悲哀的现实
    2021-02-05 01:54

    You can use the /n flag, as in

    text.gsub!(/\xC2/n, '')
    

    to force the Regexp to operate on bytes.
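
    On Ruby 2.1 or later, String#scrub is a more general way to drop every invalid byte sequence rather than one specific byte. A minimal sketch, assuming the string is tagged as UTF-8; strip_invalid_utf8 is a hypothetical helper name, not part of the original answer:

```ruby
# Hypothetical helper (name not from the original answer).
# String#scrub (Ruby 2.1+) replaces every invalid byte sequence with the
# given string; passing "" removes the bytes instead of inserting U+FFFD.
def strip_invalid_utf8(text)
  text.scrub("")
end

# "\xC2" followed by a space is an incomplete two-byte sequence.
broken = "caf\xC2 au lait".dup.force_encoding(Encoding::UTF_8)
cleaned = strip_invalid_utf8(broken)

puts broken.valid_encoding?   # false
puts cleaned.valid_encoding?  # true
puts cleaned                  # caf au lait
```

    Unlike a byte-level gsub, this leaves valid multi-byte characters untouched, even when they happen to contain a \xC2 byte.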

    Are you sure this is what you want, though? Any Unicode character in the range U+0080 to U+00BF has \xC2 as the first byte of its UTF-8 encoded form, so deleting every \xC2 byte would also corrupt those valid characters.
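
    To see the concern concretely, here is a small sketch that checks the lead byte for that range and applies the byte-level deletion to the copyright sign (U+00A9). Both helper names are hypothetical, and the .b / force_encoding round trip is an assumption about how to run the /n regexp on current Ruby versions:

```ruby
# Hypothetical helper: UTF-8 bytes of a code point as uppercase hex strings.
def utf8_hex(codepoint)
  [codepoint].pack("U").bytes.map { |b| format("%02X", b) }
end

# Hypothetical helper: delete every 0xC2 byte from a binary copy of the
# string, then tag the result as UTF-8 again.
def delete_c2_bytes(text)
  text.b.gsub(/\xC2/n, "").force_encoding(Encoding::UTF_8)
end

copyright = [0xA9].pack("U")          # the valid character U+00A9
damaged = delete_c2_bytes(copyright)  # only the continuation byte remains

puts utf8_hex(0xA9).inspect                                  # ["C2", "A9"]
puts (0x80..0xBF).all? { |cp| utf8_hex(cp).first == "C2" }   # true
puts damaged.valid_encoding?                                 # false
```

    The last line shows that removing \xC2 from a valid string leaves a lone continuation byte, which is itself invalid UTF-8.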
