I am trying to get the text between the tags, and so far I have been successful. But sometimes, when there are special characters or HTML tags inside my custom tags, I am unable to
Since you have now asked this question for several different libraries, here is a solution using XmlParser. The author of this XML perhaps did not have the best understanding of how XML works. If I were you, I'd rather put some filtering in place to make this sane again (e.g. turn each ...Begin/...End marker pair into one real element that wraps the text).
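import groovy.xml.XmlParser // Groovy 3+; drop this on 2.x, where groovy.util.XmlParser is a default import
// Markup reconstructed from the code and the output below. The root name
// <records> and the nested <b> element are assumptions.
def xml = '''\
<records>
<car>
<ae_definedTermTitleBegin/>Australia<ae_definedTermTitleEnd/>
<ae_clauseTitleBegin/>1.02 <b>Accounting Terms.</b><ae_clauseTitleEnd/>
</car>
<car>
<ae_definedTermTitleBegin/>Isle of Man<ae_definedTermTitleEnd/>
<ae_clauseTitleBegin/>Smallest Street-Legal Car at 99cm wide and 59 kg in weight<ae_clauseTitleEnd/>
</car>
<car>
<ae_definedTermTitleBegin/>France<ae_definedTermTitleEnd/>
<ae_clauseTitleBegin/>Most Valuable Car at $15 million<ae_clauseTitleEnd/>
</car>
</records>
'''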
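// Collect the text found between each ...Begin and ...End marker element,
// keyed by the marker name without its suffix.
def underp = { l ->
    l.inject([texts: [:]]) { r, it ->
        if (it.respondsTo('name') && it.name().endsWith('Begin')) {
            // opening marker: start a new entry and remember its key
            r.texts[(r.last = it.name().replaceFirst(/Begin$/, ''))] = ''
        } else if (it.respondsTo('name') && it.name().endsWith('End')) {
            // closing marker: stop collecting
            r.last = null
        } else if (r.last) {
            // between markers: append plain text, or the text of a nested element
            r.texts[r.last] += (it instanceof String) ? it : it.text()
        }
        r
    }.texts
}
def root = new XmlParser().parseText(xml)
root.car.each {
    println underp(it.children()).inspect()
}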
This prints:
['ae_definedTermTitle':'Australia', 'ae_clauseTitle':'1.02 Accounting Terms.']
['ae_definedTermTitle':'Isle of Man', 'ae_clauseTitle':'Smallest Street-Legal Car at 99cm wide and 59 kg in weight']
['ae_definedTermTitle':'France', 'ae_clauseTitle':'Most Valuable Car at $15 million']
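The filtering idea mentioned at the top could look roughly like the sketch below. It is only an illustration: the input document, the root name `records`, the nested `<b>` element and the variable names `raw`, `sane` and `texts` are all made up here, and the regexes assume the markers are always written as empty self-closing elements such as `<fooBegin/>`. It rewrites each marker pair into one real element before parsing, after which the text can be read with plain `text()`.

```groovy
import groovy.xml.XmlParser // Groovy 3+; drop this on 2.x, where groovy.util.XmlParser is a default import

// Hypothetical input: root name and exact marker layout are assumptions.
def raw = '''\
<records>
  <car>
    <ae_definedTermTitleBegin/>Australia<ae_definedTermTitleEnd/>
    <ae_clauseTitleBegin/>1.02 <b>Accounting Terms.</b><ae_clauseTitleEnd/>
  </car>
</records>
'''

// Turn <fooBegin/> into <foo> and <fooEnd/> into </foo>, so each marker
// pair becomes one real element wrapping its content.
def sane = raw.replaceAll(/<(\w+)Begin\s*\/>/, '<$1>')
sane = sane.replaceAll(/<(\w+)End\s*\/>/, '</$1>')

def root = new XmlParser().parseText(sane)

// One map per <car>: element name -> text, including text of nested elements.
def texts = root.car.collect { car ->
    car.children()
        .findAll { it instanceof Node }
        .collectEntries { [(it.name()): it.text()] }
}
println texts.inspect()
```

A text-level rewrite like this is fragile (it would also touch marker-like strings inside comments or CDATA), so it is only worth it if you control the input well enough to trust that the markers are always balanced.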