Python XML Parse with xml attributes
I have a file with many rows of XML, and I'm trying to write a Python script that goes through those rows and counts how many times a particular node attribute appears. For instance, my tree looks like:
<foo>
  <bar>
    <type name="controller">A</type>
    <type name="channel">12</type>
  </bar>
</foo>
I want the text of the element that has name="controller". For the XML above, I need to get "A", not "controller".
I used xml.etree.ElementTree, but it gives me the value of the name attribute, which is "controller".
Assuming your file is input.xml, you can use the following code:
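import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
# findall('bar') searches the direct children of the root element <foo>
for bar in tree.findall('bar'):
    for elem in bar.findall('type'):
        # .get() returns None instead of raising KeyError if the attribute is missing
        if elem.get('name') == 'controller':
            print(elem.text)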
For xml.etree.ElementTree, use the text property of an Element to get the text inside the element. Example:
import xml.etree.ElementTree as ET

x = ET.fromstring('<a>This is the text</a>')
print(x.text)  # This is the text
ElementTree supports a limited subset of XPath (a language for selecting nodes in an XML document). We can use it to find all of the desired nodes, then read the text attribute of each one to get its content.
import xml.etree.ElementTree as ET

tree = ET.parse("filename.xml")
for x in tree.findall(".//type[@name='controller']"):
    print(x.text)
This will loop over all type elements whose name attribute is controller. In XPath, .// means all descendants of the current node, and the name type restricts the match to elements whose tag is type. The bracketed part is a predicate, which keeps only the nodes that satisfy a condition, and @name refers to the name attribute. Put together, the expression selects all type nodes (no matter how deep) whose name attribute equals controller.
In this example, I have just printed the text in the node. You can do whatever you want in the body of that loop.
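Since the question also asks for a count, here is a minimal sketch of collecting the matches instead of printing them. It inlines the XML from the question so it runs on its own. The helper name texts_for_name is made up for illustration, and the name value is assumed not to contain a single quote, since it is pasted directly into the XPath string.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
SAMPLE = """
<foo>
  <bar>
    <type name="controller">A</type>
    <type name="channel">12</type>
  </bar>
</foo>
"""


def texts_for_name(xml_text, name):
    """Return the text of every <type> element whose name attribute equals `name`.

    Hypothetical helper; assumes `name` contains no single quote.
    """
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.findall(f".//type[@name='{name}']")]


controller_texts = texts_for_name(SAMPLE, "controller")
print(controller_texts)       # ['A']
print(len(controller_texts))  # 1
```

The length of the returned list is the number of matching nodes, so the same call covers both "get the text" and "count the instances".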
If you want all nodes with that attribute and not just the type nodes, replace the argument to the findall function with
.//*[@name='controller']
The * matches ANY element node.
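To see the difference between the two expressions, here is a small sketch. The <device> element is not in the original question; it is added purely as an assumption so that the wildcard has something extra to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <device> is invented here to demonstrate the wildcard.
DOC = """
<foo>
  <bar>
    <type name="controller">A</type>
    <device name="controller">B</device>
    <type name="channel">12</type>
  </bar>
</foo>
"""

root = ET.fromstring(DOC)

# Only <type> elements with name="controller".
only_type = [e.text for e in root.findall(".//type[@name='controller']")]

# Any element, whatever its tag, with name="controller".
any_tag = [(e.tag, e.text) for e in root.findall(".//*[@name='controller']")]

print(only_type)  # ['A']
print(any_tag)    # [('type', 'A'), ('device', 'B')]
```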