I am facing a situation where I get surrogate characters in text that I am saving to MySQL 5.1. Since UTF-16 is not supported there, I want to remove these surrogate pairs.
Java strings are stored as sequences of 16-bit chars, but what they represent is sequences of Unicode characters. In Unicode terminology, they are stored as code units but model code points. Thus, it's somewhat meaningless to talk about removing surrogates, which don't exist in the character / code point representation (unless you have rogue single surrogates, in which case you have other problems).
Rather, what you want to do is remove any characters that require surrogates when encoded as UTF-16. That means any character that lies beyond the Basic Multilingual Plane (BMP), i.e. any code point above U+FFFF. You can do that with a simple regular expression:
return query.replaceAll("[^\u0000-\uffff]", "");
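For anyone who wants to try this out, here is a minimal, self-contained sketch of the same idea. The class and method names (`NonBmpStripper`, `stripNonBmp`, `stripNonBmpNoRegex`) are made up for illustration, and the second method assumes Java 8 or later for `String.codePoints()`. It also shows the code unit vs. code point distinction described above.

```java
import java.util.regex.Pattern;

class NonBmpStripper {
    // Java's regex engine matches by code point, so this negated class
    // matches exactly the supplementary characters (U+10000 and above).
    private static final Pattern NON_BMP = Pattern.compile("[^\\u0000-\\uFFFF]");

    // Regex variant: same pattern as the one-liner above, precompiled.
    static String stripNonBmp(String text) {
        return NON_BMP.matcher(text).replaceAll("");
    }

    // Alternative without regex (assumes Java 8+): keep only BMP code points.
    static String stripNonBmpNoRegex(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
            .filter(Character::isBmpCodePoint)
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so it takes two chars (a surrogate pair).
        String input = "a" + new String(Character.toChars(0x1F600)) + "b";

        System.out.println(input.length());                            // 4 code units
        System.out.println(input.codePointCount(0, input.length()));   // 3 code points
        System.out.println(stripNonBmp(input));                        // ab
        System.out.println(stripNonBmpNoRegex(input));                 // ab
    }
}
```

Note that neither variant touches a lone (unpaired) surrogate, since those are themselves code points in the range U+D800 to U+DFFF and therefore inside the BMP range.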