In Python, these three commands print the same emoji:
print "\xF0\x9F\x8C\x80"
print u"\U0001F300"
print u"\uD83C\uDF00"
Your first string is a byte string. The fact that it prints a single emoji character means that your console is configured to print UTF-8 encoded characters.
Your second string is a Unicode string with a single codepoint, U+1F300. The \U escape specifies that the next 8 hex digits should be interpreted as a codepoint.
The third string takes advantage of a quirk in the way Unicode strings are stored in Python 2. You've given two UTF-16 surrogates, which together form the single codepoint U+1F300, the same as the previous string. Each \u takes the 4 hex digits that follow it. Individually these characters wouldn't be valid Unicode, but because Python 2 stores its Unicode internally as UTF-16 it works out. In Python 3 this wouldn't be valid: the two escapes remain separate lone surrogates, which can't be encoded to UTF-8.
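To see why those two \u values combine into U+1F300, here is a minimal sketch of the standard UTF-16 surrogate arithmetic. The function names to_surrogate_pair and from_surrogate_pair are made up for illustration; they are not part of Python.

```python
def to_surrogate_pair(codepoint):
    """Split a codepoint above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("only codepoints above U+FFFF use a surrogate pair")
    offset = codepoint - 0x10000       # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go in the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go in the low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a high/low surrogate pair into the original codepoint."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F300)
print("%04X %04X" % (high, low))                 # D83C DF00
print("U+%X" % from_surrogate_pair(high, low))   # U+1F300
```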
When you print out a Unicode string, and your console encoding is known to be UTF-8, the Unicode strings are encoded to UTF-8 bytes. Thus the 3 strings end up producing the same byte sequence on the output, generating the same character.
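If you want to check this without relying on your console, the following sketch compares the three forms directly. It is written for Python 3 (an assumption, since the question uses Python 2), so the surrogate pair has to be joined explicitly; the variable names are only illustrative.

```python
byte_string = b"\xF0\x9F\x8C\x80"   # the UTF-8 bytes from the first command
from_codepoint = "\U0001F300"       # a single codepoint, U+1F300
surrogates = "\uD83C\uDF00"         # in Python 3: two separate lone surrogates

# Join the surrogates the way a UTF-16 decoder would. The "surrogatepass"
# error handler lets the lone surrogates through the encoding step.
combined = surrogates.encode("utf-16-le", "surrogatepass").decode("utf-16-le")

print(len(surrogates), len(combined))              # 2 1
print(from_codepoint.encode("utf-8"))              # b'\xf0\x9f\x8c\x80'
print(combined.encode("utf-8") == byte_string)     # True
print(byte_string.decode("utf-8") == from_codepoint)  # True
```

All three routes end at the same four UTF-8 bytes, which is why the console shows the same character each time.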