Python 3.4 lxml.etree: Start tag expected, '<' not found, line 1, column 1

廉价感情. Submitted on 2019-12-22 06:31:48

Question


Friends,

As a novice at best, I have not been able to figure this out from what is available in the forums. Ultimately, all I want to do is take some simple XML files and convert them all to CSV in one go (though this code handles just one file at a time). It looks to me like there are no official namespaces, but I'm not sure. I have this code (I used one header, 'SubmittingSystemVendor', but I really want to write all of them to CSV):

import csv
import lxml.etree
x = r'C:\Users\...\jh944.xml'

with open('output.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow('SubmittingSystemVendor')
    root = lxml.etree.fromstring(x)

    writer.writerow(row)

Here is a sample of the XML file:

<?xml version="1.0" encoding="utf-8"?>
<EOYGeneralCollectionGroup SchemaVersionMajor="2014-2015" SchemaVersionMinor="1" CollectionId="157" SubmittingSystemName="MISTAR" SubmittingSystemVendor="WayneRESA" SubmittingSystemVersion="2014" xsi:noNamespaceSchemaLocation="http://cepi.state.mi.us/msdsxml/EOYGeneralCollection2014-20151.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <EOYGeneralCollection>
        <SubmittingEntity>
            <SubmittingEntityTypeCode>D</SubmittingEntityTypeCode>
            <SubmittingEntityCode>82730</SubmittingEntityCode>
        </SubmittingEntity>

Thanks in advance!


Answer 1:


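You are using lxml.etree.fromstring, but giving it a file path as the argument. This means it's trying to interpret "C:\Users\...\jh944.xml" as the XML data to be parsed. That string begins with "C" rather than "<", which is why the parser reports "Start tag expected, '<' not found" at line 1, column 1.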

Instead, you want to open the file containing this XML. You can simply replace the call to fromstring with lxml.etree.parse, which will accept a filename or open file object as the argument.
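For illustration, here is a minimal sketch of that change, not a definitive implementation. It assumes the values you want are the attributes on the root element (such as SubmittingSystemVendor in the sample). The function name root_attributes_to_csv is made up for this example, and the fallback import of xml.etree.ElementTree is only there so the sketch still runs where lxml is not installed. Note also that csv.writer.writerow expects a sequence of fields, so passing a bare string like 'SubmittingSystemVendor' writes one character per column; the sketch passes lists instead.

```python
import csv

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption for this sketch only: fall back to the standard library,
    # which offers the same parse()/getroot()/attrib calls used below.
    import xml.etree.ElementTree as etree


def root_attributes_to_csv(xml_path, csv_path):
    """Write the root element's attributes to a two-row CSV (header, values).

    Hypothetical helper for illustration; it is not part of lxml.
    """
    # parse() takes a filename or an open file object, unlike fromstring(),
    # which expects the XML text itself.
    tree = etree.parse(xml_path)
    root = tree.getroot()
    attributes = dict(root.attrib)

    # newline='' is what the csv module documentation recommends on Python 3.
    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(list(attributes.keys()))    # header row
        writer.writerow(list(attributes.values()))  # one row of values
    return attributes
```

With the variable from the question, the call would be root_attributes_to_csv(x, 'output.csv'). Namespaced attributes such as xsi:noNamespaceSchemaLocation come back with the namespace URI in braces as part of the key, so you may want to filter or rename those columns.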



Source: https://stackoverflow.com/questions/32510295/python-3-4-lxml-etree-start-tag-expected-not-found-line-1-column-1
