lxml.objectify and leading zeros

此生再无相见时 posted on 2019-12-24 02:09:11

Question


When the objectify element is printed on the console, the leading zero is lost, but it is preserved in the .text:

>>> from lxml import objectify
>>> 
>>> xml = "<a><b>01</b></a>"
>>> a = objectify.fromstring(xml)
>>> print(a.b)
1
>>> print(a.b.text)
01

As I understand it, objectify automatically makes the b element an IntElement instance. It still does so even when I explicitly declare the element as a string in an XSD schema:

from io import StringIO
from lxml import etree, objectify

f = StringIO('''
   <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
     <xsd:element name="a" type="AType"/>
     <xsd:complexType name="AType">
       <xsd:sequence>
         <xsd:element name="b" type="xsd:string" />
       </xsd:sequence>
     </xsd:complexType>
   </xsd:schema>
 ''')
schema = etree.XMLSchema(file=f)
parser = objectify.makeparser(schema=schema)

xml = "<a><b>01</b></a>"
a = objectify.fromstring(xml, parser)
print(a.b)
print(type(a.b))
print(a.b.text)

Prints:

1
<class 'lxml.objectify.IntElement'>
01

How can I force objectify to recognize this b element as a string element?


Answer 1:


Based on the documentation and the observed behavior, the XSD schema appears to be used only for validation and plays no part in determining the data type of an element.

For example, when an element is declared as xsd:integer in the XSD but the actual element in the XML has the value x01, a validation error is correctly raised:

from io import StringIO
from lxml import etree, objectify

f = StringIO('''
   <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
     <xsd:element name="a" type="AType"/>
     <xsd:complexType name="AType">
       <xsd:sequence>
         <xsd:element name="b" type="xsd:integer" />
       </xsd:sequence>
     </xsd:complexType>
   </xsd:schema>
 ''')
schema = etree.XMLSchema(file=f)
parser = objectify.makeparser(schema=schema)

xml = '''<a><b>x01</b></a>'''
a = objectify.fromstring(xml, parser)
# raises the following exception:
# lxml.etree.XMLSyntaxError: Element 'b': 'x01' is not a valid value of....
# ...the atomic type 'xs:integer'.
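
To make the validation-only role of the schema concrete, here is a minimal sketch that wraps the parse in a try/except. The helper name parse_validated is hypothetical (it is not part of lxml), and the schema is built from a string with etree.XML rather than from a StringIO purely for brevity. It assumes the same xsd:integer schema as above.

```python
from lxml import etree, objectify

# Same schema as above: <b> must be a valid xsd:integer.
XSD_TEXT = '''<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a" type="AType"/>
  <xsd:complexType name="AType">
    <xsd:sequence>
      <xsd:element name="b" type="xsd:integer"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>'''

schema = etree.XMLSchema(etree.XML(XSD_TEXT))
parser = objectify.makeparser(schema=schema)


def parse_validated(xml):
    """Hypothetical helper: return the objectified root, or None if the schema rejects the input."""
    try:
        return objectify.fromstring(xml, parser)
    except etree.XMLSyntaxError as err:
        print("rejected:", err)
        return None


ok = parse_validated("<a><b>01</b></a>")    # passes validation
bad = parse_validated("<a><b>x01</b></a>")  # rejected by the schema

if ok is not None:
    # The schema accepted the document, but the original text is only kept in .text.
    print(ok.b.text)
```

The schema decides whether the document is accepted; it does not change which element class objectify picks for b.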

Although the objectify documentation on how data types are matched mentions XML Schema xsi:type (no. 4 in the linked section), the example code there suggests that this means adding an xsi:type attribute directly to the element in the XML document itself, not declaring the type in a separate XSD file. For example:

xml = '''
<a xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <b xsi:type="string">01</b>
</a>
'''
a = objectify.fromstring(xml)

print(a.b)  # 01
print(type(a.b))  # <class 'lxml.objectify.StringElement'>
print(a.b.text) # 01
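
If the XML cannot be edited by hand, the same xsi:type attribute could be added programmatically before the document is objectified. The following is only a sketch under that assumption: the helper objectify_with_string_tags and its string_tags parameter are hypothetical names, not part of lxml. It parses with plain etree, tags the chosen elements, serializes, and then re-parses with objectify.

```python
from lxml import etree, objectify

# Clark-notation name of the xsi:type attribute.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def objectify_with_string_tags(xml, string_tags):
    """Hypothetical helper: mark the named elements as strings, then objectify."""
    wanted = set(string_tags)
    root = etree.fromstring(xml)
    for el in root.iter():
        if el.tag in wanted:
            el.set(XSI_TYPE, "string")
    # Re-parse the annotated document so objectify sees the xsi:type hints.
    return objectify.fromstring(etree.tostring(root))


a = objectify_with_string_tags("<a><b>01</b></a>", ["b"])
print(a.b)
print(type(a.b))
print(a.b.text)
```

The extra serialize and re-parse step costs a second pass over the document, which is usually negligible for small inputs. Reading a.b.text directly, as shown in the question, remains the simplest option when only the raw string is needed.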


Source: https://stackoverflow.com/questions/35960223/lxml-objectify-and-leading-zeros
