Basically I would like to decode a given HTML document, and replace all special chars, such as "&nbsp;" -> " ", "&gt;" -> ">".
The most reliable way is with
String cleanedString = StringEscapeUtils.unescapeHtml4(originalString);
from org.apache.commons.lang3.StringEscapeUtils.
And to strip leading and trailing whitespace:
cleanedString = cleanedString.trim();
This ensures that stray leading or trailing whitespace from copy and paste in web forms does not get persisted in the DB.
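
One caveat worth checking: `unescapeHtml4` decodes `&nbsp;` to the non-breaking space character (U+00A0), and `String.trim()` only removes characters up to U+0020, so a decoded non-breaking space at either end survives the trim. Below is a minimal sketch of one way to handle that, assuming you want U+00A0 treated as whitespace too. The class and method names (`WhitespaceCleanup`, `stripWithNbsp`, `isSpace`) are hypothetical, and the input string is hard-coded to stand in for the output of `unescapeHtml4` so the example runs without the Commons dependency.

```java
class WhitespaceCleanup {

    // Hypothetical helper: strips leading and trailing whitespace,
    // additionally treating the non-breaking space (U+00A0) as whitespace.
    static String stripWithNbsp(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isSpace(s.charAt(start))) {
            start++;
        }
        while (end > start && isSpace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    // Character.isWhitespace deliberately excludes non-breaking spaces,
    // so U+00A0 is checked explicitly.
    static boolean isSpace(char c) {
        return Character.isWhitespace(c) || c == '\u00A0';
    }

    public static void main(String[] args) {
        // Stand-in for the result of unescaping "&nbsp; hello world &nbsp;"
        String decoded = "\u00A0 hello world \u00A0";

        // trim() leaves the non-breaking spaces in place
        System.out.println("[" + decoded.trim() + "]");

        // the helper removes them along with the ordinary spaces
        System.out.println("[" + stripWithNbsp(decoded) + "]");
    }
}
```

In real code you would pass `cleanedString` to the helper instead of calling `trim()`. Whether non-breaking spaces should be dropped at all depends on your data, so treat this as an option rather than a required step.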