I am trying to replace emoji in Arabic tweets using Java.
I used this code:
String line = "اييه تقولي اجل الارسنال تعادل امس بعد ما كان فايز
From the Javadoc for the Pattern class:
A Unicode character can also be represented in a regular-expression by using its Hex notation (hexadecimal code point value) directly as described in construct \x{...}, for example a supplementary character U+2011F can be specified as \x{2011F}, instead of two consecutive Unicode escape sequences of the surrogate pair \uD840\uDD1F.
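As a quick check of what the Javadoc describes, here is a minimal sketch (assuming Java 7 or later) that compares the two notations against the same supplementary character. The class name HexNotationDemo and the helper fullMatch are made up for illustration.

```java
import java.util.regex.Pattern;

class HexNotationDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // U+2011F lies outside the BMP, so it takes two chars (a surrogate pair) in a Java String.
        String supplementary = new String(Character.toChars(0x2011F));

        // Hex code point notation, with the backslash doubled for the Java string literal.
        System.out.println(fullMatch("\\x{2011F}", supplementary));

        // Two consecutive Unicode escapes for the surrogate pair, as mentioned in the Javadoc.
        System.out.println(fullMatch("\\uD840\\uDD1F", supplementary));
    }
}
```

Both calls should report a match, since the two regular expressions describe the same single code point.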
This means that the regular expression that you're looking for is ([\x{1F601}-\x{1F64F}]). Of course, when you write this as a Java String literal, you must escape the backslashes.
Pattern unicodeOutliers = Pattern.compile("([\\x{1F601}-\\x{1F64F}])");
Note that the construct \x{...} is only available from Java 7 onward.
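To put the pattern to work, something like the following should do. It is only a sketch: the class name EmojiStripper, the helper replaceEmoji and the sample text are my own placeholders, not taken from your code, and the range is the same U+1F601 to U+1F64F as in the pattern above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmojiStripper {
    // Same range as the pattern in the answer: U+1F601 through U+1F64F.
    private static final Pattern UNICODE_OUTLIERS =
            Pattern.compile("[\\x{1F601}-\\x{1F64F}]");

    // Hypothetical helper: replaces every code point in the range with the given text.
    static String replaceEmoji(String text, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement from being treated specially.
        return UNICODE_OUTLIERS.matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Placeholder input: Arabic text followed by U+1F602, built from its code point
        // so the source file does not depend on the editor's encoding.
        String emoji = new String(Character.toChars(0x1F602));
        String sample = "مرحبا" + emoji;

        // Pass "" to drop the emoji, or any other string to substitute it.
        System.out.println(replaceEmoji(sample, ""));
    }
}
```

The Arabic letters are left alone because they fall outside the range. Emoji in other blocks are not covered by this pattern, so you would have to add further ranges to the character class for those.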