elementtree

How to get xml output in a file with new line using python xml.etree?

早过忘川 submitted on 2020-01-05 09:07:30
Question: I am generating an XML file using "from xml.etree import ElementTree" and writing the generated output to a new file, "test.xml". The output is written to test.xml, but it contains no newlines; it is one very long line. What should I do to get newlines inside "test.xml"? The script follows: from xml.etree import ElementTree from xml.dom import minidom from lxml import etree def prettify(elem): """Return a pretty-printed XML string for the Element. """ rough_string = ElementTree
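A minimal sketch of one common way to get line breaks, assuming the tree is small enough to serialize in memory. It round-trips the ElementTree output through minidom, in the spirit of the truncated prettify() helper above; the sample element names are hypothetical.

```python
from xml.dom import minidom
from xml.etree import ElementTree as ET


def prettify(elem):
    """Return an indented, multi-line XML string for an Element."""
    rough_string = ET.tostring(elem, encoding="unicode")
    return minidom.parseString(rough_string).toprettyxml(indent="  ")


# Hypothetical sample data, not taken from the question.
root = ET.Element("root")
child = ET.SubElement(root, "child")
child.text = "value"

pretty = prettify(root)

# To produce the file from the question, the string could be written out:
# with open("test.xml", "w", encoding="utf-8") as f:
#     f.write(pretty)
```

On Python 3.9 or later, ET.indent(tree) followed by tree.write(...) is another option that avoids the minidom round trip.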

Modify large file containing multiple XML files to create small file depending on condition

天大地大妈咪最大 submitted on 2020-01-05 08:33:48
Question: I have a large file that contains multiple XML documents, one per line. I want to create a new file containing only the lines (XML documents) that satisfy a condition: several tags must match columns of a spreadsheet. For example, I have a large XML file. <?xml version="1.0" encoding="UTF-8"?><data><student><result><grade>A</grade></result><details><name>John</name><house>Red</house><id>100</id><age>16</age><email>john@mail.com</email></details></student></data> <?xml version="1.0" encoding="UTF-8"?><data><student>
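One possible approach, sketched under assumptions: each line is parsed on its own, and a line is kept when its name and id appear together in a set built from the spreadsheet. The second sample line, the wanted set, and the choice of name and id as the matching tags are all hypothetical, since the question's actual condition is cut off.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one XML document per line, as in the question.
lines = [
    '<?xml version="1.0" encoding="UTF-8"?><data><student><details>'
    '<name>John</name><id>100</id></details></student></data>',
    '<?xml version="1.0" encoding="UTF-8"?><data><student><details>'
    '<name>Mary</name><id>200</id></details></student></data>',
]

# Hypothetical spreadsheet rows, reduced to (name, id) pairs.
wanted = {("John", "100")}


def matches(line, wanted):
    """Return True if the document on this line matches a wanted row."""
    root = ET.fromstring(line.encode("utf-8"))
    details = root.find("student/details")
    if details is None:
        return False
    key = (details.findtext("name"), details.findtext("id"))
    return key in wanted


kept = [line for line in lines if matches(line, wanted)]
```

For a real file, the same test could be applied while iterating over the open file object, writing each kept line to the output file, so the whole input never has to sit in memory.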

Return XPath attribute value with ElementTree

倖福魔咒の submitted on 2020-01-05 04:01:06
Question: I want to evaluate XML documents with this kind of structure: <?xml version="1.0" encoding="UTF-8"?> <Service_name version="3.3.0"> ... where the Service_name tag name is not a constant string, but the version attribute is required. For that purpose, the XPath expression /*[1]/@version evaluates fine with any XPath processor, but I can't figure out how to do it with Python's ElementTree. For example: import xml.etree.ElementTree as ET doc = ET.parse('sample.xml') v = doc.find('/*[1]/@version').text raises KeyError: '@
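ElementTree's limited XPath support selects elements, not attribute values, so a path ending in @version cannot work there. A minimal sketch of the usual workaround is to take the root element and read the attribute with get(). The inline sample document is an assumption standing in for sample.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the contents of sample.xml.
xml_text = '<Service_name version="3.3.0"><item/></Service_name>'

root = ET.fromstring(xml_text)

# The root is the first (and only) top-level element, whatever its tag is.
version = root.get("version")

# get() returns None when the attribute is absent, so a default can be given.
missing = root.get("no_such_attribute", "unknown")
```

With a file on disk, ET.parse('sample.xml').getroot().get('version') follows the same pattern.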

How to create “virtual root” with Python's ElementTree?

拈花ヽ惹草 submitted on 2020-01-03 15:11:31
Question: I am trying to use Python's ElementTree to generate an XHTML file. However, ElementTree.Element() only lets me create a single tag (e.g., HTML). I need to create some sort of virtual root, or whatever it is called, so that I can put the various , DOCTYPES, etc. How do I do that? Thanks Answer 1: I don't know if there's a better way, but I've seen this done: Create the base document as a string: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD
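A related sketch, which may differ from what the truncated answer goes on to do: build the element tree as usual, serialize it, and prepend the doctype as a plain string, since ElementTree has no node type for a doctype. The short `<!DOCTYPE html>` declaration and the element contents are assumptions chosen for illustration; any doctype string could be substituted.

```python
import xml.etree.ElementTree as ET

# Assumption: a short doctype used only for illustration.
DOCTYPE = "<!DOCTYPE html>"

html_root = ET.Element("html")
body = ET.SubElement(html_root, "body")
body.text = "Hello"

# The doctype is not part of the tree; it is added at serialization time.
document = DOCTYPE + "\n" + ET.tostring(html_root, encoding="unicode")
```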

'utf8' codec can't decode byte 0xd0 in position 0: invalid continuation byte

独自空忆成欢 submitted on 2020-01-03 04:45:31
Question: I have the following text in an HTML document: <a href="#">�'ам интересна информация</a> and I'm using the following expression to extract the text: row.xpath("string(./td[@class='col2 td-tags']/h3/a/text())") This expression works fine for plain English text, but for the above string it throws this error: 'utf8' codec can't decode byte 0xd0 in position 0: invalid continuation byte Answer 1: In HTML, &#xxx does NOT specify a byte in the document encoding; it's ALWAYS a Unicode code point. Thus, you
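The point about numeric character references can be checked with the standard library. In this small sketch, the reference &#208; is an illustrative value chosen here, not necessarily what the original page contained: it decodes to the code point U+00D0, which is a different thing from the single byte 0xD0.

```python
import html

# A numeric character reference names a Unicode code point (here U+00D0).
decoded = html.unescape("&#208;")

# Encoded as UTF-8, that code point is two bytes, not the lone byte 0xD0.
utf8_bytes = decoded.encode("utf-8")

# A lone 0xD0 byte is an incomplete UTF-8 sequence, hence a decode error.
try:
    b"\xd0".decode("utf-8")
    lone_byte_is_valid = True
except UnicodeDecodeError:
    lone_byte_is_valid = False
```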

How to preserve namespaces when parsing xml via ElementTree in Python

半世苍凉 submitted on 2020-01-03 02:57:21
Question: Assume that I have the following XML, which I want to modify using Python's ElementTree: <root xmlns:prefix="URI"> <child prefix:name="***"/> ... </root> I'm doing some modification on the XML file like this: import xml.etree.ElementTree as ET tree = ET.parse('filename.xml') # XML modification here # save the modifications tree.write('filename.xml') Then the XML file looks like: <root xmlns:ns0="URI"> <child ns0:name="***"/> ... </root> As you can see, the namespace prefix changed to ns0. I'm
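A minimal sketch of the usual remedy, assuming the prefix and URI are known in advance: ET.register_namespace() tells the serializer which prefix to emit for a namespace URI, so it no longer invents ns0. The inline document mirrors the question's example with a placeholder attribute value.

```python
import xml.etree.ElementTree as ET

# Register the preferred prefix before serializing. The registry is global
# to the interpreter, so this affects every later serialization.
ET.register_namespace("prefix", "URI")

root = ET.fromstring('<root xmlns:prefix="URI"><child prefix:name="x"/></root>')

# ... modifications to the tree would go here ...

output = ET.tostring(root, encoding="unicode")
```

The same call works before tree.write('filename.xml') when the document comes from ET.parse().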

Parsing file with ElementTree and BeautifulSoup: is there a way to parse the file by number of tag levels?

点点圈 submitted on 2020-01-03 02:30:48
Question: I have this XML file, and I basically want to record all of the information in a dictionary. I wrote this code: import requests import xml.etree.ElementTree as ET import urllib2 import glob import pprint from bs4 import BeautifulSoup #get the XML file #response = requests.get('https://www.drugbank.ca/drugs/DB01048.xml') #with open('output.txt', 'w') as input: # input.write(response.content) #set up lists etc set_of_files = glob.glob('output*txt') val = lambda x: "{http://www.drugbank.ca}" +
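One way to approach the "by tag level" part of the title, sketched under assumptions: a recursive generator walks the tree and reports each element's depth, so a given level can be selected afterwards. The sample document is hypothetical and much smaller than a real DrugBank record; only the namespace URI comes from the question's code.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.drugbank.ca}"  # namespace prefix used in the question's code


def walk(elem, depth=0):
    """Yield (depth, tag without namespace, stripped text) for every element."""
    yield depth, elem.tag.replace(NS, ""), (elem.text or "").strip()
    for child in elem:
        yield from walk(child, depth + 1)


# Hypothetical sample, not real DrugBank data.
sample = (
    '<drug xmlns="http://www.drugbank.ca">'
    "<name>X</name><groups><group>approved</group></groups>"
    "</drug>"
)

rows = list(walk(ET.fromstring(sample)))

# Select only the elements one level below the root.
top_level = [tag for depth, tag, _ in rows if depth == 1]
```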

Insert XML document into existing XML with Python

丶灬走出姿态 submitted on 2020-01-02 19:15:10
Question: Given these XML documents: Document 1 <root> <element1> </element1> </root> Document 2 <request> <dummyValue>5</dummyValue> </request> Using Python's ElementTree, I'd like to insert the second document into the first so that the result looks as follows. Resulting document <root> <element1> <request> <dummyValue>5</dummyValue> </request> </element1> </root> ET.SubElement(element1, request) gives me a serialization error. Is there a Pythonic way of doing this? Answer 1: SubElement()
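A minimal sketch of the insertion, assuming both documents are already parsed: SubElement() expects a tag name and creates a new element, whereas append() attaches an existing Element, which is what is needed here. Whitespace between tags is omitted in the inline documents to keep the serialized result easy to read.

```python
import xml.etree.ElementTree as ET

doc1 = ET.fromstring("<root><element1></element1></root>")
doc2 = ET.fromstring("<request><dummyValue>5</dummyValue></request>")

# Find the target element and attach the second document's root to it.
element1 = doc1.find("element1")
element1.append(doc2)

result = ET.tostring(doc1, encoding="unicode")
```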

How to read root XML tag in python

拜拜、爱过 submitted on 2020-01-02 06:53:08
Question: My question follows on from another Stack Overflow question: "How to get the root node of an xml file in Python?" from xml.etree import ElementTree as ET path = 'C:\cool.xml' et = ET.parse ( path ) root = et.getroot() When I extract and print the root tag, I get: <Element 'root' at 0x1234abcd> I then want to check that the root tag has a certain name. How do I pull out just the tag name? If I try: if root == "root": print 'something' it doesn't work, so I assume I need to convert 'root'
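The comparison fails because root is an Element object, not a string. A minimal sketch of the check uses the element's tag attribute, which holds the name as a string. The inline document is an assumption standing in for the file from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the parsed file.
root = ET.fromstring("<root><item/></root>")

# Element.tag is the tag name as a plain string.
tag_name = root.tag
is_expected_root = tag_name == "root"

if is_expected_root:
    print("something")
```

If the document declares a default namespace, the tag includes it in braces, for example "{URI}root", so the comparison string has to include that part too.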