How does Facebook encode emoji in the JSON Graph API?

Submitted by 元气小坏坏 on 2019-12-19 09:47:40

Question


Does anyone know how Facebook encodes emoji that need UTF-16 surrogate pairs (code points outside the Basic Multilingual Plane) in the Graph API?

Emoji inside the Basic Multilingual Plane seem fine. For example, ❤️ (HEAVY BLACK HEART, though it renders red on iOS/OS X) comes through as \u2764\ufe0f, which matches the UTF-16 hex codes ("Formal Unicode Notation") shown at iemoji.com.

And indeed, in Ruby when parsing the JSON output from the API:

ActiveSupport::JSON.decode('"\u2764\ufe0f"')

you correctly get:

"❤️"

However, to pick another emoji, 💤 (SLEEPING SYMBOL): Facebook returns \udbba\udf59. This corresponds to nothing I can find in any Unicode resource, for example the entry at iemoji.com.

And when I attempt to decode in Ruby using the same method above:

ActiveSupport::JSON.decode('"\udbba\udf59"')

I get:

"󾭙"

Any idea what's going on here?


Answer 1:


Answering my own question, though most of the credit belongs to @bobince for showing me the way in the comments.

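The answer is that Facebook encodes emoji using the "Google" encoding as seen on this Unicode table. The surrogate pair \udbba\udf59 decodes to U+FEB59, a private-use code point, which is why no standard Unicode chart lists it.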
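To see why the value returned by Facebook does not show up in standard Unicode charts, it can help to combine the surrogate pair by hand. The following is a minimal sketch, not part of the original answer; the helper names `surrogate_pair_to_codepoint` and `private_use_area_a?` are made up for illustration.

```ruby
# Combine a UTF-16 high/low surrogate pair into a single Unicode code point.
# (Hypothetical helper, named here for illustration only.)
def surrogate_pair_to_codepoint(high, low)
  raise ArgumentError, 'not a high surrogate' unless (0xD800..0xDBFF).cover?(high)
  raise ArgumentError, 'not a low surrogate' unless (0xDC00..0xDFFF).cover?(low)

  0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
end

# True if the code point lies in Supplementary Private Use Area-A (plane 15).
def private_use_area_a?(codepoint)
  (0xF0000..0xFFFFD).cover?(codepoint)
end

codepoint = surrogate_pair_to_codepoint(0xDBBA, 0xDF59)
puts format('U+%X', codepoint)       # U+FEB59
puts private_use_area_a?(codepoint)  # true

# Round trip: encoding U+FEB59 as UTF-16 gives back the same two code units.
round_trip = [codepoint].pack('U').encode('UTF-16BE').unpack('n*')
puts round_trip == [0xDBBA, 0xDF59]  # true
```

Because U+FEB59 sits in a private-use plane, its meaning depends entirely on the vendor's own table, so a decoder that handles surrogates correctly still cannot know it stands for 💤.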

I have created a Ruby gem called emojivert that can convert from one encoding to another, including from "Google" to "Unified". It is based on an existing project called rails-emoji.

So the failing example above would be fixed by doing:

string = ActiveSupport::JSON.decode('"\udbba\udf59"')
# => "󾭙"
fixed = Emojivert.google_to_unified(string)
# => "💤"
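For readers who cannot use the gem, the idea behind such a conversion can be sketched as a plain lookup table. This is an illustrative assumption, not the gem's actual implementation: `GOOGLE_TO_UNIFIED` and `google_pua_to_unified` are hypothetical names, and the table holds only the single pair taken from the example in this question.

```ruby
# Hypothetical one-entry table: Google private-use code point => unified code point.
# Only this pair comes from the example in the question; a real table has many more
# entries, and some unified emoji are sequences of several code points.
GOOGLE_TO_UNIFIED = {
  0xFEB59 => 0x1F4A4 # SLEEPING SYMBOL
}.freeze

# Replace every character found in the table and leave everything else untouched.
def google_pua_to_unified(text)
  text.each_char.map do |char|
    unified = GOOGLE_TO_UNIFIED[char.ord]
    unified ? [unified].pack('U') : char
  end.join
end

legacy = [0xFEB59].pack('U') # the character Facebook returned
fixed = google_pua_to_unified("zzz #{legacy}")
puts fixed.codepoints.last.to_s(16) # 1f4a4
```

A per-character lookup keeps ordinary text and already-standard emoji unchanged, so it is safe to run over a whole message body.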


Source: https://stackoverflow.com/questions/20045268/how-does-facebook-encode-emoji-in-the-json-graph-api
