Ruby 2.0 iconv replacement

Submitted by 穿精又带淫゛_ on 2019-11-29 20:59:35

Iconv was deprecated in Ruby 1.9.3 and removed from the standard library in Ruby 2.0. You can still install it as a gem.

Reference material if you are unsure: https://rvm.io/packages/iconv/

However, the recommendation is not to install it and to use String#encode instead:

string.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?")
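For anyone who used Iconv mainly to convert between encodings, here is a rough sketch of how the same call covers that case. The sample strings and variable names are made up for illustration; only String#encode and String#force_encoding are real API here.

```ruby
# Bytes that arrived as ISO-8859-1 (Latin-1) but are not tagged as such yet.
latin1 = "caf\xE9".dup.force_encoding("ISO-8859-1")

# Transcode to UTF-8 with the same options as above; anything that cannot
# be converted would become "?" instead of raising an exception.
utf8 = latin1.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?")

# :undef is what kicks in when the target encoding has no such character.
ascii = "caf\u00E9".encode("US-ASCII", :undef => :replace, :replace => "?")

puts utf8
puts ascii
```

Note that :invalid covers byte sequences that are malformed in the source encoding, while :undef covers characters that are valid but missing from the target encoding.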

API reference: String#encode

masakielastic

String#scrub is available since Ruby 2.1.

str.scrub('')
str.scrub { |bytes| '' }
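A small sketch of the three ways to call it, assuming Ruby 2.1 or later. The sample string and variable names are only for illustration.

```ruby
# \xFF can never appear in well-formed UTF-8, so this string is invalid.
dirty = "abc\xFFdef"

# Drop the invalid bytes entirely.
removed = dirty.scrub('')

# With no argument, a UTF-8 string gets U+FFFD (the replacement character).
replaced = dirty.scrub

# The block form receives the invalid bytes and returns their replacement.
hexed = dirty.scrub { |bytes| "<#{bytes.unpack('H*').first}>" }

puts removed
puts replaced
puts hexed
```

The block form is handy when you want to keep a trace of what was thrown away rather than silently dropping it.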

Related question: Equivalent of Iconv.conv(“UTF-8//IGNORE”,…) in Ruby 1.9.X?

If you're not on Ruby 2.1 or later and so can't use String#scrub, the following will drop every part of the string that isn't correctly UTF-8 encoded.

string.encode('UTF-16', :invalid => :replace, :replace => '').encode('UTF-8')

The encode method does almost exactly what you want, with one caveat: encode does nothing if it thinks the string is already UTF-8. So you need to change encodings, going via an intermediate encoding that can represent every Unicode character that UTF-8 can, such as UTF-16. (If you don't, you'll corrupt any characters that the intermediate encoding can't represent; 7-bit ASCII would be a really bad choice!)
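To make the round trip concrete, here is a minimal sketch. The helper name strip_invalid_utf8 is hypothetical, and it assumes the input is meant to be UTF-8.

```ruby
# Hypothetical helper name; it just wraps the two-step encode shown above.
def strip_invalid_utf8(str)
  # UTF-16 can represent every Unicode character, so valid text survives
  # the round trip and only the malformed bytes are dropped.
  str.encode('UTF-16', :invalid => :replace, :replace => '').encode('UTF-8')
end

# Valid non-ASCII text on both sides of one stray invalid byte.
dirty = "caf\u00E9 " + "\xFF" + "\u2603"
clean = strip_invalid_utf8(dirty)

puts clean.valid_encoding?
puts clean
```

The non-ASCII characters on either side of the bad byte come through unchanged, which is the reason for picking UTF-16 over a narrower intermediate encoding.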

I have not had luck with the various approaches that use a one-line string.encode by itself.

But I wrote a backfill that implements String#scrub for MRI versions before 2.1, and for other Ruby implementations that do not have it.

https://github.com/jrochkind/scrub_rb
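If you would rather not add a dependency, a rough sketch of the feature-detection idea looks like this. It is not how scrub_rb is implemented; the helper name drop_invalid_bytes is hypothetical, and the fallback branch assumes the string is meant to be UTF-8.

```ruby
# Hypothetical helper; prefers the built-in String#scrub when it exists.
def drop_invalid_bytes(str)
  if str.respond_to?(:scrub)
    # Ruby 2.1+ (or any Ruby with a scrub backfill loaded).
    str.scrub('')
  else
    # Older Rubies: fall back to a round trip through UTF-16.
    str.encode('UTF-16', :invalid => :replace, :replace => '').encode('UTF-8')
  end
end

cleaned = drop_invalid_bytes("abc\xFFdef")
puts cleaned
```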
