How do I remove the URLs present in a text? For example:
String str = "Fear psychosis after #AssamRiots - http://www.google.com/LdEbWTgD http://www.yahoo.com/mksVZKBz";
Note that if your URL contains characters such as ?, & or \ then the answers above may not work: replaceAll treats its first argument as a regular expression, so a matched URL passed back into it is interpreted as a pattern rather than as literal text (? in particular is a regex metacharacter). What worked for me was to strip those characters from a copy of the input, strip the same characters from each match returned by m.find(), and then call replaceAll on the copy.
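import java.util.regex.Matcher;
import java.util.regex.Pattern;

private String removeUrl(String commentstr)
{
    // Strip ? and & from a copy of the input, since replaceAll treats its first
    // argument as a regex. Note this also drops ? and & from the non-URL text.
    String commentstr1 = commentstr.replaceAll("\\?", "").replaceAll("\\&", "");
    String urlPattern = "((https?|ftp|gopher|telnet|file|Unsure|http):((//)|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*)";
    Pattern p = Pattern.compile(urlPattern, Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(commentstr);
    while (m.find()) {
        // group() is the whole current match; group(i) with a growing i would
        // select capture groups instead of successive URLs
        String url = m.group().replaceAll("\\?", "").replaceAll("\\&", "");
        // Accumulate in commentstr1 so every URL is removed, not just the last one;
        // Pattern.quote keeps remaining metacharacters (., +, $, \) literal
        commentstr1 = commentstr1.replaceAll(Pattern.quote(url), "");
    }
    return commentstr1.trim();
}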
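
For comparison, here is a minimal sketch of a variant that avoids stripping characters at all. It is not taken from the answers above: the class name UrlStripper, the method names removeUrls and removeLiteral, and the simplified pattern (a scheme, then "://", then a run of non-whitespace) are my own assumptions for illustration. The idea is that Matcher.replaceAll removes the matches directly, so the URL text is never fed back in as a regex, and Pattern.quote covers the case where you do need to pass a known URL to String.replaceAll.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlStripper {
    // Assumed, simplified pattern: http(s) or ftp scheme, "://", then non-whitespace.
    private static final Pattern URL =
            Pattern.compile("\\b(?:https?|ftp)://\\S+", Pattern.CASE_INSENSITIVE);

    // Removes every match in one pass; ?, & and \ inside a URL need no special handling
    // because the matched text is never compiled as a pattern.
    public static String removeUrls(String text) {
        Matcher m = URL.matcher(text);
        return m.replaceAll("").trim();
    }

    // Removes one known URL literally: Pattern.quote escapes any regex metacharacters.
    public static String removeLiteral(String text, String url) {
        return text.replaceAll(Pattern.quote(url), "").trim();
    }

    public static void main(String[] args) {
        String str = "Fear psychosis after #AssamRiots - "
                + "http://www.google.com/LdEbWTgD http://www.yahoo.com/mksVZKBz";
        System.out.println(removeUrls(str)); // prints: Fear psychosis after #AssamRiots -
        System.out.println(removeUrls("see http://example.com/a?b=1&c=2 now"));
    }
}
```

Unlike the stripping approach, this leaves any ? or & in the surrounding text untouched. Only leading and trailing whitespace is trimmed, so a URL removed from the middle of a sentence leaves the spaces on both sides behind.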