When I tried to parse an XML document in Python with the BeautifulSoup library, I ran into some problems. The XML document that I want to parse:
-
You could use BeautifulSoup to parse XML:
import bs4 as bs
content='''\
-
2011-10-10 09:00:00
2011-10-17 09:00:00
35000
20000
'''
soup = bs.BeautifulSoup(content, 'xml')
title = soup.title
print(title.string)
# Title Sample
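# The URL is the text node that follows the link tag, so read its next sibling.
# (next_sibling is the current spelling; nextSibling is a deprecated alias.)
link = soup.link.next_sibling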
print(link)
# http://banhada.kr/?cateCode=09&viewCode=S0941580
Under the hood, BeautifulSoup hands the work to lxml when you ask for the 'xml' parser, so lxml has to be installed either way. Although it's not needed here, you might want to use lxml directly, since its XPath support gives you more succinct ways to navigate through XML:
import lxml.etree as ET
content='''\
-
2011-10-10 09:00:00
2011-10-17 09:00:00
35000
20000
'''
doc = ET.fromstring(content)
title = doc.find('title')
print(title.text)
# Title Sample
link = doc.find('link')
# .tail is the text that comes right after the element, not the text inside it
print(link.tail)
# http://banhada.kr/?cateCode=09&viewCode=S0941580
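To give a feel for the path-based navigation mentioned above, here is a minimal sketch. It makes two assumptions: the XML is a made-up catalog (the element names `catalog`, `item`, `title` and `price` are hypothetical, not taken from the document in the question), and it uses the standard library's `xml.etree.ElementTree` rather than lxml so that it runs without extra installs. ElementTree only understands a limited subset of XPath; lxml elements offer the same `find`, `findall` and `findtext` calls, plus an `xpath()` method for full XPath 1.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, invented for illustration only.
SAMPLE = """\
<catalog>
  <item id="a1">
    <title>First sample</title>
    <price currency="KRW">35000</price>
  </item>
  <item id="b2">
    <title>Second sample</title>
    <price currency="KRW">20000</price>
  </item>
</catalog>
"""

root = ET.fromstring(SAMPLE)

# Path expression: every <title> that sits directly under an <item>.
titles = [t.text for t in root.findall("./item/title")]

# Attribute predicate: the <item> whose id is "b2", then its <price> text.
price_b2 = root.findtext("./item[@id='b2']/price")

# Child-text predicate: the <item> whose <title> text matches exactly.
first_id = root.find("./item[title='First sample']").get("id")

print(titles)
print(price_b2)
print(first_id)
```

Each lookup is a single expression, where plain attribute access would need a loop and a few `if` checks. If you switch to lxml, the same three calls should work unchanged after replacing the import with `import lxml.etree as ET`.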