I have some strings with all kinds of different emojis/images/signs in them.
Not all the strings are in English -- some of them are in other non-Latin languages, for
I'm not super into Java, so I won't try to write example code inline, but the way I would do this is to check what Unicode calls "the general category" of each character. There are a couple letter and punctuation categories.
You can use Character.getType to find the general category of a given character. You should probably retain those characters that fall in these general categories:
COMBINING_SPACING_MARK
CONNECTOR_PUNCTUATION
CURRENCY_SYMBOL
DASH_PUNCTUATION
DECIMAL_DIGIT_NUMBER
ENCLOSING_MARK
END_PUNCTUATION
FINAL_QUOTE_PUNCTUATION
FORMAT
INITIAL_QUOTE_PUNCTUATION
LETTER_NUMBER
LINE_SEPARATOR
LOWERCASE_LETTER
MATH_SYMBOL
MODIFIER_LETTER
MODIFIER_SYMBOL
NON_SPACING_MARK
OTHER_LETTER
OTHER_NUMBER
OTHER_PUNCTUATION
PARAGRAPH_SEPARATOR
SPACE_SEPARATOR
START_PUNCTUATION
TITLECASE_LETTER
UPPERCASE_LETTER
(All of the characters you listed as specifically wanting to remove have general category OTHER_SYMBOL, which I did not include in the above category whitelist.)