I'd like to have a canonical place to pool information about Unicode support in various languages. Is it a part of the core language? Is it provided in libraries? Is it not
The only material I can find for Ruby is pretty old, and since I'm not much of a Rubyist, I'm not sure how accurate it is.
For the record, Ruby (1.8 and earlier) can handle UTF-8 data, but its strings are not multibyte-aware. Internally, it usually treats strings as byte vectors, though there are libraries and tricks you can usually use to make things work.
Found that here.
Ruby 1.9 attaches an encoding to every string. Binary strings use the encoding "ASCII-8BIT". While the default encoding is usually UTF-8 on any modern system, you cannot assume that third-party library functions always return strings in that encoding. They might return strings in any other encoding (some YAML parsers do this in some situations). If you concatenate two strings with different encodings and both contain non-ASCII characters, you get an Encoding::CompatibilityError.