Please can someone help me parse these links from an HTML page?
Looks like your regex is doing something wrong. Instead of
Pattern pattern = Pattern.compile("
Try:
Pattern pattern = Pattern.compile("
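The 'a.+' in your first pattern matches the letter a followed by any character, one or more times. Because '.+' is greedy, it can run well past the end of the tag. If you intended to match whitespace there, then use '\s+' instead (written as "\\s+" inside a Java string literal).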
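To make the difference concrete, here is a minimal sketch. The patterns and the sample string are made up for illustration and are not the ones from the question; the class name GreedyVsWhitespace and the helper firstMatch are hypothetical as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsWhitespace {
    // Returns the text of the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Illustrative input only: two anchors on one line.
        String html = "<a href=\"x\">one</a> <a href=\"y\">two</a>";

        // ".+" is greedy: it swallows everything up to the LAST "href" on the line.
        System.out.println(firstMatch("<a.+href", html));
        // prints: <a href="x">one</a> <a href

        // "\\s+" only accepts whitespace, so the match stays inside the first tag.
        System.out.println(firstMatch("<a\\s+href", html));
        // prints: <a href
    }
}
```

With '.+' the first match spans both anchors, which is why a loop over find() would report fewer links than expected.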
The following code works perfectly:
String s = " " +
" " +
"";
Pattern p = Pattern.compile("
Output:
0 : http://nemertes.lis.upatras.gr/dspace/handle/123456789/2299
72 : http://nemertes.lis.upatras.gr/dspace/handle/123456789/3154
145 : http://nemertes.lis.upatras.gr/dspace/handle/123456789/3158
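
For reference, a self-contained sketch that prints results in the same "offset : URL" shape. It is not the code from this answer: the pattern, the class name LinkExtractor, the method extract, and the example.com input are all assumptions chosen for illustration. The pattern only handles an href that is the first attribute and is wrapped in double quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkExtractor {
    // Assumed pattern: "<a", whitespace, then a double-quoted href value (group 1).
    static final Pattern LINK = Pattern.compile("<a\\s+href=\"([^\"]*)\"");

    // Returns one "offset : url" entry per link, where offset is the index
    // in html at which the matched "<a" begins.
    static List<String> extract(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            found.add(m.start() + " : " + m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Placeholder input, not the page from the question.
        String html = "<a href=\"http://example.com/1\">one</a>"
                    + "<a href=\"http://example.com/2\">two</a>";
        for (String line : extract(html)) {
            System.out.println(line);
        }
        // prints:
        // 0 : http://example.com/1
        // 38 : http://example.com/2
    }
}
```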