Is there a way in Ruby 1.9 to remove invalid byte sequences from strings?
Suppose you have a string like "€foo\xA0", encoded as UTF-8. Is there a way to remove invalid byte sequences from this string (so you get "€foo")?

In Ruby 1.8 you could use Iconv.iconv('UTF-8//IGNORE', 'UTF-8', "€foo\xA0"), but that is now deprecated. "€foo\xA0".encode('UTF-8') doesn't do anything, since the string is already UTF-8.

I tried "€foo\xA0".force_encoding('BINARY').encode('UTF-8', :undef => :replace, :replace => ''), which yields "foo", but that also loses the valid multibyte character €.

Evgenii: "€foo\xA0".chars.select(&:valid_encoding?).join

Van der Hoorn: "€foo\xA0".encode('UTF-16le',
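
For anyone comparing the approaches in this thread, here is a minimal sketch that puts the character-filtering answer next to the BINARY round trip from the question. The helper name strip_invalid_bytes is hypothetical and only there for illustration, and the sketch assumes a UTF-8 source file on Ruby 1.9 or later.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of Ruby): keeps only the characters that are
# valid in the string's own encoding and joins them back together.
def strip_invalid_bytes(str)
  str.chars.select(&:valid_encoding?).join
end

dirty = "€foo\xA0"   # UTF-8 string with a stray 0xA0 byte at the end

# Character filtering: the invalid byte is dropped, the euro sign survives.
clean = strip_invalid_bytes(dirty)

# BINARY round trip from the question: every byte >= 0x80 is undefined when
# converting from BINARY, so the three bytes of the euro sign are dropped too.
# dup is used so the original string keeps its UTF-8 encoding tag.
lossy = dirty.dup.force_encoding('BINARY').encode('UTF-8', :undef => :replace, :replace => '')

puts dirty.valid_encoding?
puts clean.valid_encoding?
puts clean
puts lossy
```

The filtering version works per character, so it should only discard bytes that do not form a valid character, whereas the BINARY version treats every non-ASCII byte as undefined.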