How do I parse XML in Python?

2018-05-30 18:10:10

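I have many rows in a database that contain XML, and I'm trying to write a Python script that goes through those rows and counts the instances of a particular node attribute. For example, my tree looks like: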

<foo>
   <bar>
      <type foobar="1"/>
      <type foobar="2"/>
   </bar>
</foo>

How can I access the attribute values 1 and 2 in the XML using Python?

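I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree in the Python standard library itself; in this context, though, what they chiefly add is more speed. The ease of programming comes from the API, which ElementTree defines. (cElementTree was deprecated in Python 3.3 and removed in 3.9; xml.etree.ElementTree now uses its C accelerator automatically when one is available.)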
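As a rough sketch of what "compatible API" means in practice: the snippet below tries lxml first and falls back to the standard library, and the rest of the code does not change. The fallback pattern and the inline XML string are my own illustration, and lxml is a third-party package that may not be installed.

```python
# Prefer lxml if it happens to be installed (assumption: it is optional here),
# otherwise fall back to the standard-library implementation of the same API.
try:
    from lxml import etree as ET
except ImportError:
    import xml.etree.ElementTree as ET

# The sample tree from the question, inlined as a string for illustration.
root = ET.fromstring('<foo><bar><type foobar="1"/><type foobar="2"/></bar></foo>')

# fromstring(), iter() and get() behave the same way in both implementations.
values = [node.get('foobar') for node in root.iter('type')]
print(values)  # ['1', '2']
```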

After building an Element instance e from the XML, e.g. with the XML function, or by parsing a file with something like

import xml.etree.ElementTree
e = xml.etree.ElementTree.parse('thefile.xml').getroot()

or any of the many other ways shown in the ElementTree documentation, you just do something like:

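# e is the <foo> root; the <type> elements sit under <bar>, so the path
# must include 'bar' (findall('type') alone only matches direct children)
for atype in e.findall('bar/type'):
    print(atype.get('foobar'))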

and similar, usually pretty simple, code patterns.
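To tie this back to the original goal (counting attribute instances across many rows), here is a minimal sketch. The `rows` list is a hypothetical stand-in for whatever your database query returns, and using `collections.Counter` is my own choice rather than anything ElementTree requires.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for the XML strings stored in the database rows.
rows = [
    '<foo><bar><type foobar="1"/><type foobar="2"/></bar></foo>',
    '<foo><bar><type foobar="1"/></bar></foo>',
]

counts = Counter()
for xml_text in rows:
    root = ET.fromstring(xml_text)  # root is the <foo> element
    # iter('type') walks the whole subtree, so nesting depth does not matter
    for node in root.iter('type'):
        value = node.get('foobar')  # None if the attribute is absent
        if value is not None:
            counts[value] += 1

print(counts)  # Counter({'1': 2, '2': 1})
```

`root.iter('type')` is used here instead of `findall('bar/type')` so the count still works if some rows nest `<type>` at a different depth.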

minidom is the quickest to get going with and pretty straightforward:

XML:

<data>
    <items>
        <item name="item1"></item>
        <item name="item2"></item>
        <item name="item3"></item>
        <item name="item4"></item>
    </items>
</data>

Python:

from xml.dom import minidom
xmldoc = minidom.parse('items.xml')
itemlist = xmldoc.getElementsByTagName('item')
print(len(itemlist))
print(itemlist[0].attributes['name'].value)
for s in itemlist:
    print(s.attributes['name'].value)

OUTPUT

4
item1
item1
item2
item3
item4
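The same approach should work on the XML from the question. This is a small sketch that parses from a string with `minidom.parseString` instead of a file, which is probably closer to the database case; the inline string is just the question's sample tree.

```python
from xml.dom import minidom

# The sample tree from the question, inlined as a string for illustration.
xml_text = '<foo><bar><type foobar="1"/><type foobar="2"/></bar></foo>'

doc = minidom.parseString(xml_text)

# getElementsByTagName searches the whole document, at any depth.
values = [node.getAttribute('foobar') for node in doc.getElementsByTagName('type')]

print(values)       # ['1', '2']
print(len(values))  # 2
```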

You can use BeautifulSoup:

from bs4 import BeautifulSoup

x="""<foo>
   <bar>
      <type foobar="1"/>
      <type foobar="2"/>
   </bar>
</foo>"""

# Name the parser explicitly; bs4 warns and guesses one otherwise.
y = BeautifulSoup(x, "html.parser")
>>> y.foo.bar.type["foobar"]
'1'

>>> y.foo.bar.find_all("type")
[<type foobar="1"></type>, <type foobar="2"></type>]

>>> y.foo.bar.find_all("type")[0]["foobar"]
'1'
>>> y.foo.bar.find_all("type")[1]["foobar"]
'2'
