Parsing XML with Python: an example

Submitted by 元气小坏坏 on 2020-02-21 23:33:56

Below is the data exported after annotating (labeling) a photo of a bank card:

<?xml version="1.0" encoding="ISO-8859-1"?>
<annotation>
<filename>a001.jpg</filename>
<folder>users/three33//card</folder>
<source>
<submittedBy>three</submittedBy>
</source>
<imagesize>
<nrows>2240</nrows>
<ncols>3968</ncols>
</imagesize>
<object>
<name>numbers</name>
<deleted>0</deleted>
<verified>0</verified>
<occluded>no</occluded>
<attributes>6228480808055442079</attributes>
<parts>
<hasparts/>
<ispartof/>
</parts>
<date>12-May-2019 06:21:39</date>
<id>0</id>
<type>bounding_box</type>
<polygon>
<username>anonymous</username>
<pt>
<x>927</x>
<y>1278</y>
</pt>
<pt>
<x>3269</x>
<y>1278</y>
</pt>
<pt>
<x>3269</x>
<y>1475</y>
</pt>
<pt>
<x>927</x>
<y>1475</y>
</pt>
</polygon>
</object>
</annotation>


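The goal is to read the four annotated corner coordinates (the <pt> elements) and the bank card number (the <attributes> element) from this file and save them to a text file. Since there are several hundred images, the files need to be processed in a batch.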

Code:

import os
import xml.etree.ElementTree as ET  # xml.etree.cElementTree was removed in Python 3.9

from_path = "./card"     # input folder
to_path = "./cardout"    # output folder
os.makedirs(to_path, exist_ok=True)  # make sure the output folder exists

files = sorted(os.listdir(from_path))  # sort in lexicographic order

for i, filename in enumerate(files, start=1):
    dir1 = os.path.join(from_path, filename)
    tree = ET.parse(dir1)

    # Replace the input extension (e.g. ".xml") with ".txt"
    new_filename = os.path.splitext(filename)[0] + ".txt"
    dir2 = os.path.join(to_path, new_filename)

    print("time: %d, from_filename: %s, to_filename: %s" % (i, dir1, dir2))

    with open(dir2, "w") as fobj:
        # Each <pt> holds an <x> and a <y> child; write them as "x,y,"
        for elem in tree.iterfind("object/polygon/pt"):
            fobj.write(elem[0].text + "," + elem[1].text + ",")

        # The card number is stored in <attributes>
        for elem in tree.iterfind("object/attributes"):
            fobj.write(elem.text)
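
The script above processes every entry that `os.listdir` returns, so a stray non-XML file or a subfolder in `./card` would likely make it fail. Below is a minimal sketch of the same loop restricted to `*.xml` files, assuming Python 3 and `pathlib`. The function name `convert_folder` is hypothetical and not part of the original post, and the sketch assumes one `<attributes>` element per file, as in the sample.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def convert_folder(from_path, to_path):
    """Convert every *.xml annotation in from_path to a .txt file in to_path.

    Hypothetical helper; returns the list of output paths that were written.
    """
    src, dst = Path(from_path), Path(to_path)
    dst.mkdir(parents=True, exist_ok=True)  # create the output folder if missing
    written = []
    # Only *.xml files are picked up; sorted() keeps the order stable
    for xml_file in sorted(src.glob("*.xml")):
        tree = ET.parse(xml_file)
        fields = []
        # Collect x and y of each corner point, in document order
        for pt in tree.iterfind("object/polygon/pt"):
            fields.append(pt.findtext("x"))
            fields.append(pt.findtext("y"))
        # The card number comes last (assumes one <attributes> per file)
        for attr in tree.iterfind("object/attributes"):
            fields.append(attr.text or "")
        out_file = dst / (xml_file.stem + ".txt")
        out_file.write_text(",".join(fields))
        written.append(out_file)
    return written
```

Calling `convert_folder("./card", "./cardout")` would mirror the paths used in the script.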

Result:
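
As a check on the output format, the per-file logic can be run on the sample annotation alone. This is a sketch under stated assumptions: `SAMPLE` is a trimmed copy of the XML shown earlier (only the elements the script reads), and `annotation_to_line` is a hypothetical name, not from the original post.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample annotation (only the elements the script reads)
SAMPLE = """<annotation>
<filename>a001.jpg</filename>
<object>
<attributes>6228480808055442079</attributes>
<polygon>
<pt><x>927</x><y>1278</y></pt>
<pt><x>3269</x><y>1278</y></pt>
<pt><x>3269</x><y>1475</y></pt>
<pt><x>927</x><y>1475</y></pt>
</polygon>
</object>
</annotation>"""


def annotation_to_line(xml_text):
    """Return the text the script would write for one annotation."""
    root = ET.fromstring(xml_text)
    parts = []
    # Same order as the script: all corner points first, as "x,y,"
    for pt in root.iterfind("object/polygon/pt"):
        parts.append(pt[0].text + "," + pt[1].text + ",")
    # Then the card number from <attributes>
    for attr in root.iterfind("object/attributes"):
        parts.append(attr.text)
    return "".join(parts)


print(annotation_to_line(SAMPLE))
# prints: 927,1278,3269,1278,3269,1475,927,1475,6228480808055442079
```

Each output file therefore holds a single line: the eight coordinate values followed by the card number, all comma-separated.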

 
