Question
I have imported and modified some XML, but when I write it out using test.prettify(), the first line of the XML changes from
<?xml version="1.0"?>
to
<?xml version="1.0" encoding="utf-8"?>
I don't want this change. How can I keep the first line unchanged? What is the easiest way to do this?
If it matters, I'm using the xml parser.
soup = BeautifulSoup(r.text,'xml')
Answer 1:
I'm sure there's a more elegant way to do this using BeautifulSoup's built-ins, but based on your comment, I'll give you the "strip it out" version:
xml_string = '<?xml version="1.0" encoding="utf-8"?>'
# Cut just before the space that precedes "encoding", then close the declaration again.
print(xml_string[:xml_string.find("encoding") - 1] + "?>")
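This is general enough to strip any encoding from the header (not just utf-8), provided encoding is the last attribute in the declaration; anything that follows it, such as standalone, is dropped as well.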
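The same "strip it out" idea can be applied to a whole serialized document rather than to the header alone. The following is only a sketch under assumptions: strip_encoding is a hypothetical helper name, the sample string stands in for the output of test.prettify(), and it assumes the declaration is the very first thing in the text. It uses a regular expression instead of find() so that a document with no encoding attribute is returned unchanged.

```python
import re

# Matches an XML declaration at the very start of the text and captures
# everything before its encoding attribute (e.g. '<?xml version="1.0"').
_ENCODING_RE = re.compile(r'^(<\?xml[^>]*?)\s+encoding=(["\'])[^"\']*\2')


def strip_encoding(xml_text):
    """Return xml_text with the encoding attribute removed from its declaration.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    Text without a leading declaration, or without an encoding attribute,
    is returned unchanged.
    """
    return _ENCODING_RE.sub(r"\1", xml_text, count=1)


# Stand-in for the string produced by test.prettify() in the question.
sample = '<?xml version="1.0" encoding="utf-8"?>\n<root>\n <item/>\n</root>'
print(strip_encoding(sample))
```

In the question's setting this would be called as strip_encoding(test.prettify()) just before writing the result to a file. Unlike the slicing version, it leaves attributes that come after encoding in place.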
Answer 2:
You could find the XML declaration and use replaceWith() (spelled replace_with() in Beautiful Soup 4) to replace it with the value you want.
Source: https://stackoverflow.com/questions/36503875/how-to-remove-xml-header-in-beautifulsoup