XML Parsing with Python ElementTree
I'm trying to parse a number of xml files that only sometimes have xmlns set. Is there any way to determine whether it's set w/o using the lxml library?
My main issue is when finding elements using find or findall, nothing is returned if the namespace is set since the tag doesn't match. But I can't hardcode the namespace in because sometimes there is no namespace set. I don't really know how to go about this.
Here's a sample of some of my code
import xml.etree.ElementTree as ET

tree = ET.parse(xml_file_path)
root = tree.getroot()  # ONIXmessage
...
pids = product.findall("productidentifier")
...
So my main issue is with the findall() method
Thanks.
I will shortly be having this problem too. My thought was: use a wrapper function that first tries to get the elements without the namespace specified and, if that returns None (or an empty list, in the case of findall), then tries with the namespace. If both come back empty, then the elements were not present. Using both functions (without if-else) works nicely if no default namespace is provided.
If the choice is between the same namespace either being specified or not, then I think the approach above is okay. If you have multiple optional namespaces, it will make your wrapper more complicated, but it's a one-time effort.
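The wrapper described above could be sketched roughly as follows. This is only an illustration, not tested against real ONIX files: the function name findall_any, the namespace URI and the sample documents are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; replace it with the one your files declare.
NS = "http://example.com/onix"


def findall_any(element, tag, namespace=NS):
    """Try the bare tag first, then the namespace-qualified tag."""
    found = element.findall(tag)
    if not found:
        # ElementTree stores qualified tags as "{uri}localname".
        found = element.findall(f"{{{namespace}}}{tag}")
    return found


# Made-up sample documents: one without xmlns, one with it.
plain = ET.fromstring(
    "<product><productidentifier>a</productidentifier></product>"
)
namespaced = ET.fromstring(
    '<product xmlns="http://example.com/onix">'
    "<productidentifier>b</productidentifier></product>"
)

plain_ids = [e.text for e in findall_any(plain, "productidentifier")]
ns_ids = [e.text for e in findall_any(namespaced, "productidentifier")]
```

If the namespace URI is not known in advance, it can be read from root.tag, which has the form {uri}name when xmlns is set and is just name when it is not.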
Would like to see a more elegant solution for this though. Did DanielHaley's answer work?
Related options:
- find, findall, etc.
- register_namespace, as per the solution here, which works for writing out.
- Passing * to find, but that's too generic to use to find specific elements.
It's kind of a pain, but you could use local-name() in your XPath.
For example, instead of:
/foo/bar/baz
try:
/*[local-name()='foo']/*[local-name()='bar']/*[local-name()='baz']
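Note that local-name() is an XPath 1.0 function, and the standard library's ElementTree implements only a subset of XPath that does not include it, so the expression above needs a full XPath engine such as lxml. As a rough stdlib-only sketch of the same idea, you can compare just the local part of each tag by hand. The helper names local_name and find_by_local_names and the sample documents below are invented for the example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag without its '{namespace}' prefix, if it has one."""
    return tag.rsplit("}", 1)[-1]


def find_by_local_names(root, path):
    """Follow a '/'-separated path of local names, starting at the root.

    Returns every element reached, whatever namespace each step is in.
    """
    names = path.strip("/").split("/")
    if local_name(root.tag) != names[0]:
        return []
    current = [root]
    for name in names[1:]:
        current = [
            child
            for elem in current
            for child in elem
            # Skip comments/processing instructions, whose tag is not a str.
            if isinstance(child.tag, str) and local_name(child.tag) == name
        ]
    return current


# Made-up sample documents: the same structure with and without xmlns.
plain = ET.fromstring("<foo><bar><baz>1</baz></bar></foo>")
namespaced = ET.fromstring(
    '<foo xmlns="http://example.com/ns"><bar><baz>1</baz></bar></foo>'
)

plain_hits = find_by_local_names(plain, "/foo/bar/baz")
ns_hits = find_by_local_names(namespaced, "/foo/bar/baz")
```

On Python 3.8 and later, ElementTree also accepts a {*} wildcard in find and findall, so something like product.findall("{*}productidentifier") should match the tag in any namespace or in none.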