elementtree

how to find and edit tags in XML files with namespaces using ElementTree

青春壹個敷衍的年華 submitted on 2021-02-02 09:57:26
Question: I would like to find specific tags in my XML document and edit their text or attributes. My XML file contains namespaces (and, if I understand it correctly, nested namespaces). The tool I'd like to use for this purpose is ElementTree. I managed to read the XML file with iterparse; however, I don't know how to save the edited XML, because iterparse doesn't have a write method. I need a solution that reads the XML file with parse and strips its namespaces and nested namespaces, or a way to save the iterparsed file.
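
One possible approach, sketched under assumptions: instead of stripping namespaces, pass a prefix map to `findall` and register the prefixes before writing. The sample document, namespace URIs, tag names, and the `edited` attribute below are all hypothetical, since the question does not show its XML.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document; URIs and tag names are assumptions.
XML = """<root xmlns="http://example.com/default" xmlns:x="http://example.com/extra">
  <item><x:name>old</x:name></item>
</root>"""

# Prefix map used only for searching; the prefixes here are arbitrary.
NS = {"d": "http://example.com/default", "x": "http://example.com/extra"}

# Register prefixes (globally) so the output keeps them instead of ns0, ns1.
ET.register_namespace("", NS["d"])
ET.register_namespace("x", NS["x"])

tree = ET.parse(io.StringIO(XML))
root = tree.getroot()

# Find namespaced tags and edit their text and attributes.
for name in root.findall(".//x:name", NS):
    name.text = "new"
    name.set("edited", "true")

# Serialize the edited tree; a file path works in place of the buffer.
out = io.BytesIO()
tree.write(out, encoding="utf-8", xml_declaration=True)
result = out.getvalue().decode("utf-8")
```

A root element collected from `iterparse` can be wrapped as `ET.ElementTree(root)` and written the same way.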

how to find and edit tags in XML files with namespaces using ElementTree

╄→尐↘猪︶ㄣ submitted on 2021-02-02 09:56:10
Question: I would like to find specific tags in my XML document and edit their text or attributes. My XML file contains namespaces (and, if I understand it correctly, nested namespaces). The tool I'd like to use for this purpose is ElementTree. I managed to read the XML file with iterparse; however, I don't know how to save the edited XML, because iterparse doesn't have a write method. I need a solution that reads the XML file with parse and strips its namespaces and nested namespaces, or a way to save the iterparsed file.

python xml.etree.ElementTree remove empty tag in the middle of text

你。 submitted on 2021-01-29 14:49:49
Question: I have an XML document from which I want to extract text based on tags. The part that I want to extract text from looks something like this: <BlockText attr1="blah" attr2=657 ID="Bhf76" lang="en"> Simply dummy text of the printing and typesetting industry. It has survived not only<TIP CONTENT="­"/>\n five centuries, electronic typesetting, remaining essentially release. </BlockText> When I do tree = ET.parse("myfile.xml") root = tree.getroot() tags = list(set([elem.tag for elem in root.iter(
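
A minimal sketch of one way to read the text around an empty child element, assuming the goal is the full text of the block with the empty tag ignored. `itertext()` yields an element's own text, then the text and tail of each descendant in document order. The sample is a shortened version of the excerpt, with the attributes trimmed and a placeholder `CONTENT` value.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the excerpt; CONTENT="x" is a placeholder.
XML = (
    '<BlockText ID="Bhf76" lang="en"> It has survived not only'
    '<TIP CONTENT="x"/>\n five centuries. </BlockText>'
)

block = ET.fromstring(XML)

# Join the text before the empty <TIP/> with the tail that follows it.
raw_text = "".join(block.itertext())

# Collapse the line break and the surrounding whitespace.
text = " ".join(raw_text.split())
```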

Include one XML within another XML and parse it with python

混江龙づ霸主 submitted on 2021-01-29 13:23:34
Question: I wanted to include an XML file in another XML file and parse it with Python. I am trying to achieve this through XInclude. There is a file1.xml which looks like <?xml version="1.0"?> <root> <document xmlns:xi="http://www.w3.org/2001/XInclude"> <xi:include href="file2.xml" parse="xml" /> </document> <test>some text</test> </root> and file2.xml which looks like <para>This is a paragraph.</para> Now in my Python code I tried to access it like: from xml.etree import ElementTree, ElementInclude
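
A sketch of how the inclusion might be expanded with the standard library, assuming both files sit in the same directory. ElementTree does not process XInclude while parsing, so `ElementInclude.include` has to be called on the parsed root. The temporary directory and the `loader` helper exist only to make the example self-contained; they are not part of the question.

```python
import os
import tempfile
from xml.etree import ElementTree, ElementInclude

FILE1 = """<?xml version="1.0"?>
<root>
  <document xmlns:xi="http://www.w3.org/2001/XInclude">
    <xi:include href="file2.xml" parse="xml" />
  </document>
  <test>some text</test>
</root>"""
FILE2 = "<para>This is a paragraph.</para>"

with tempfile.TemporaryDirectory() as tmp:
    # Write both sample files into a scratch directory.
    for filename, content in (("file1.xml", FILE1), ("file2.xml", FILE2)):
        with open(os.path.join(tmp, filename), "w", encoding="utf-8") as fh:
            fh.write(content)

    def loader(href, parse, encoding=None):
        # Hypothetical helper: resolve href relative to file1.xml's directory.
        return ElementInclude.default_loader(os.path.join(tmp, href), parse, encoding)

    tree = ElementTree.parse(os.path.join(tmp, "file1.xml"))
    root = tree.getroot()

    # Replace each xi:include element in place with the referenced content.
    ElementInclude.include(root, loader=loader)

para = root.find("./document/para")
```

Without a custom loader, the default loader opens `href` relative to the current working directory, which is a common reason the include appears to fail.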

ParseError: undefined entity while parsing XML file in Python

感情迁移 submitted on 2021-01-29 12:10:37
Question: I have a big XML file with several article nodes. I have included only the one with the problem. I try to parse it in Python to filter some data and I get the error File "<string>", line unknown ParseError: undefined entity &Ouml;: line 90, column 17 Sample of the XML file <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE dblp SYSTEM "dblp.dtd"> <dblp> <article mdate="2019-10-25" key="tr/gte/TR-0146-06-91-165" publtype="informal"> <author>Alejandro P. Buchmann</author> <author>M. Tamer &Ouml;zsu<
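
A sketch of one commonly used workaround, assuming the failing entity is `&Ouml;` and that the DTD itself is not available to the parser: predeclare the entity in the `entity` dictionary of an `XMLParser`. The sample is a trimmed stand-in for the excerpt, and only one entity is mapped here; a real file would likely need every entity the DTD defines.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the excerpt; the DOCTYPE points at an external DTD
# that is not loaded, so &Ouml; is unknown to the parser.
XML = """<?xml version="1.0"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
  <article key="tr/gte/TR-0146-06-91-165">
    <author>M. Tamer &Ouml;zsu</author>
  </article>
</dblp>"""

parser = ET.XMLParser()
# Map the entity name (without & and ;) to its replacement text.
parser.entity["Ouml"] = "\u00d6"

root = ET.fromstring(XML, parser=parser)
author = root.find("./article/author").text
```

For a large file, the same parser can be passed as `ET.parse(path, parser=parser)`.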

'Exception while reading request', 'detail': 'Cannot decode: java.io.StringReader@1aac9ea'}, 'status': 'failure'}

元气小坏坏 submitted on 2021-01-29 11:12:12
Question: My first question is, what does "cannot decode: java.io.stringreader" mean? My second question is, why do some strings work and others do not? I'm guessing that there's an issue with Python passing certain strings with d0rked encoding? My third question is, am I converting the XML values correctly before passing them to the REST body? I'm taking XML files from a directory, parsing the XML, then populating a REST post to a ServiceNow instance. Essentially I'm bringing a TON of legacy tickets

Python write siblings in XML below desired tag

帅比萌擦擦* submitted on 2021-01-29 10:37:37
Question: I am trying to add some sibling tags after the <VIDPOM>10</VIDPOM> tag. My XML looks like: <ZAP> <N_ZAP>999</N_ZAP> <SLUCH> <IDCASE>100100100</IDCASE> <USL_OK>3</USL_OK> <VIDPOM>10</VIDPOM> <IDSP>99</IDSP> <USL> <IDSERV>123456789</IDSERV> <DATE_IN>2020-12-01</DATE_IN> </USL> </SLUCH> </ZAP> But I want to make it like this: <ZAP> <N_ZAP>999</N_ZAP> <SLUCH> <IDCASE>100100100</IDCASE> <USL_OK>3</USL_OK> <VIDPOM>10</VIDPOM> <MY_CUSTOM_TAG>TEXT IS HERE</MY_CUSTOM_TAG> <IDSP>99</IDSP> <USL> <IDSERV
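
A minimal sketch of one way to do this with the standard library, assuming `VIDPOM` is always a direct child of `SLUCH` as in the sample. ElementTree elements have no "insert after" method, so the new element is inserted into the parent at the index following the reference tag. The sample is a shortened version of the question's XML.

```python
import xml.etree.ElementTree as ET

XML = (
    "<ZAP><N_ZAP>999</N_ZAP><SLUCH>"
    "<IDCASE>100100100</IDCASE><USL_OK>3</USL_OK>"
    "<VIDPOM>10</VIDPOM><IDSP>99</IDSP>"
    "</SLUCH></ZAP>"
)

root = ET.fromstring(XML)

for parent in root.iter("SLUCH"):
    vidpom = parent.find("VIDPOM")
    if vidpom is None:
        continue  # nothing to anchor on in this SLUCH

    new = ET.Element("MY_CUSTOM_TAG")
    new.text = "TEXT IS HERE"
    new.tail = vidpom.tail  # reuse the whitespace that follows VIDPOM, if any

    # Insert directly after VIDPOM among the parent's children.
    position = list(parent).index(vidpom) + 1
    parent.insert(position, new)

tags = [child.tag for child in root.find("SLUCH")]
```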

lxml/python reading xml with CDATA section

夙愿已清 submitted on 2021-01-29 09:40:11
Question: In my XML I have a CDATA section. I want to keep the CDATA part, and then strip it. Can someone help with the following? The default does not work: $ from io import StringIO $ from lxml import etree $ xml = '<Subject> My Subject: 美海軍研究船勘查台海水文? 船<![CDATA[é]]>€ </Subject>' $ tree = etree.parse(StringIO(xml)) $ tree.getroot().text ' My Subject: 美海軍研究船勘查台海水文? 船é€ ' This post seems to suggest that a parser option strip_cdata=False may keep the CDATA, but it has no effect: $ parser=etree.XMLParser

ElementTree's .write() changes strings with " to &quot;

纵然是瞬间 submitted on 2021-01-29 08:37:22
Question: In my code, I am changing an existing formatted string in an XML file with a predefined format, using ElementTree in Python. <Value xsi:type='xs:string'>{"name":"Test123","type":"}</Value> The new text is added with: ValueNode.text = '{"name":"NewTextdemo"}' and to save the file I am using doc.write(path_to_XML_file) The problem is that doc.write(path_to_XML_file) changes each " to the entity name &quot;, and so the resulting XML is invalid. Does anybody know how to avoid it? How to set write function
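
A small sketch that may help narrow this down, using only the standard library. In `xml.etree.ElementTree`, double quotes are written as `&quot;` inside attribute values, while element text keeps literal quotes. Either way, `&quot;` is one of XML's predefined entities, so a parser reads it back as `"`. The `note` attribute below is hypothetical and only there to show the attribute case.

```python
import xml.etree.ElementTree as ET

# Element with quotes in both an attribute (hypothetical) and its text.
elem = ET.Element("Value", {"note": 'say "hi"'})
elem.text = '{"name":"NewTextdemo"}'

# Serialize to a string to inspect exactly what would be written.
serialized = ET.tostring(elem, encoding="unicode")

# Parse the output again to check that nothing was lost.
roundtrip = ET.fromstring(serialized)
text_after = roundtrip.text
note_after = roundtrip.get("note")
```

If the consumer of the file rejects `&quot;`, it may be reading the file as plain text rather than parsing it as XML.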

How to properly parse parent/child XML with Python

时光毁灭记忆、已成空白 submitted on 2021-01-28 21:34:42
Question: I have an XML parsing issue that I have been working on for the last few days and I just can't figure it out. I've used both the ElementTree library built into Python and the LXML library, but get the same results. I would like to continue using ElementTree if I can, but if there are limitations to that library then LXML would do. Please see the following XML example. What I am trying to do is find a connection element and see what classes that element contains. I am expecting each connection