Regexp in Ruby 1.8.7 that will detect a 4-byte Unicode character

ⅰ亾dé卋堺 submitted on 2020-01-05 11:06:33

Question


Can anyone tell me how to write a regexp in Ruby 1.8.7 that detects the presence of a 4-byte Unicode character (that is, one whose UTF-8 encoding takes four bytes, specifically emoji)? I am trying to handle the fact that MySQL does not, by default, allow you to store the 4-byte emoji characters now in use by iOS 5.

Thanks!


Answer 1:


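This appears to match the first two bytes of the four bytes that represent emoji: \360\237 is octal for 0xF0 0x9F, the UTF-8 prefix shared by code points U+1F000 through U+1FFFF. It will not catch 4-byte characters outside that range. This is being run in Ruby 1.8.7.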

str.match(/\360\237/)
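
The prefix match above only covers sequences that start with 0xF0 0x9F. As a minimal sketch of a broader check (not part of the original answer), any 4-byte UTF-8 sequence starts with a lead byte in 0xF0..0xF4 followed by three continuation bytes in 0x80..0xBF. The helper name `four_byte_utf8?` is hypothetical, and the `force_encoding` branch is an assumption added so the same code also runs on Ruby 1.9 and later, where strings carry an encoding.

```ruby
# Hypothetical helper for illustration; not from the original answer.
def four_byte_utf8?(str)
  # Lead byte 0xF0-0xF4 (octal \360-\364) followed by three continuation
  # bytes 0x80-0xBF (octal \200-\277). The /n flag keeps the match byte-oriented.
  pattern = /[\360-\364][\200-\277]{3}/n

  # Ruby 1.8.7 strings are plain bytes. On 1.9+ (an assumption beyond the
  # question's scope) match against a binary-tagged copy instead.
  bytes = str.respond_to?(:force_encoding) ? str.dup.force_encoding('BINARY') : str

  !(bytes =~ pattern).nil?
end

puts four_byte_utf8?("hi \360\237\230\200")  # input contains U+1F600 (F0 9F 98 80)
puts four_byte_utf8?("price \342\202\254")   # input contains U+20AC (three bytes: E2 82 AC)
```

This works on raw bytes and does not validate that the input is well-formed UTF-8, so it is best treated as a pre-insert filter rather than a validator.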



Answer 2:


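Altering the table so that it can store 4-byte characters (for example, by converting it to the utf8mb4 character set available in MySQL 5.5.3 and later) might be feasible using a non-blocking online approach, e.g. pt-online-schema-change from Percona Toolkit, the successor to Maatkit: http://www.percona.com/doc/percona-toolkit/pt-online-schema-change.html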

From the docs:

In brief, this tool works by creating a temporary table which is a copy of the original table (the one being altered). (The temporary table is not created like CREATE TEMPORARY TABLE; we call it temporary because it ultimately replaces the original table.) The temporary table is altered, then triggers are defined on the original table to capture changes made on it and apply them to the temporary table. This keeps the two tables in sync. Then all rows are copied from the original table to the temporary table; this part can take awhile. When done copying rows, the two tables are swapped by using RENAME TABLE. At this point there are two copies of the table: the old table which used to be the original table, and the new table which used to be the temporary table but now has the same name as the original table. If --drop-old-table is specified, then the old table is dropped.



Source: https://stackoverflow.com/questions/7774853/regexp-in-ruby-1-8-7-that-will-detect-a-4-byte-unicode-character
