XML parsing with Python ElementTree

I'm trying to parse a number of XML files that only sometimes have xmlns set. Is there any way to determine whether it's set without using the lxml library?

My main issue is that when I look up elements with find or findall, nothing is returned if the namespace is set, because the tag no longer matches. But I can't hardcode the namespace, because sometimes no namespace is set. I don't really know how to go about this.

Here's a sample of some of my code:

 import xml.etree.ElementTree as ET

 tree = ET.parse(xml_file_path)
 root = tree.getroot()  # ONIXmessage
 ...
 pids = product.findall("productidentifier")  # returns [] when xmlns is set
 ...

So my main issue is with the findall() method

Thanks.


I will shortly be having this problem/question too. My thought was: use a wrapper function that first tries to get the elements without the namespace specified and, if that finds nothing (find returns None, findall returns an empty list), tries again with the namespace. If both attempts come back empty, then the elements were not present. Calling both lookups (without an if-else) works nicely if no default namespace is provided.

If the choice is between the same namespace either being specified or not, then I think the approach above is okay. If you have multiple optional namespaces, it will make your wrapper more complicated, but it's a one-time effort.
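
The wrapper idea above could be sketched roughly as follows. The helper name findall_any and the namespace URI are hypothetical and only there for illustration; swap in whatever URI your files actually declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for illustration.
ONIX_NS = "http://example.com/onix"


def findall_any(element, tag, namespace=ONIX_NS):
    """Return children named `tag`, trying the bare name first and
    then the namespace-qualified name."""
    found = element.findall(tag)
    if not found:
        # findall() returns an empty list (not None) when nothing matches,
        # so fall back to the "{uri}tag" form that ElementTree uses.
        found = element.findall(f"{{{namespace}}}{tag}")
    return found


# One document without a namespace, one with a default namespace.
plain = ET.fromstring(
    "<product><productidentifier>1</productidentifier></product>"
)
namespaced = ET.fromstring(
    f'<product xmlns="{ONIX_NS}">'
    "<productidentifier>2</productidentifier></product>"
)

print([e.text for e in findall_any(plain, "productidentifier")])
print([e.text for e in findall_any(namespaced, "productidentifier")])
```

With several optional namespaces, the same function could loop over a list of candidate URIs instead of taking a single one.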

Would like to see a more elegant solution for this though. Did DanielHaley's answer work?

Related options:

  • There's also this answer on specifying the namespace in find, findall, etc.
  • You could try register_namespace as per the solution here, which works for writing out.
  • This one suggests using * to find elements, but that's too generic for finding specific elements.
  • Suppress namespaces altogether.
  • If desperate, you can try using a regex.
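
As a rough sketch of the "suppress namespaces" option using only the standard library: ElementTree stores a namespaced tag as "{uri}name", so you can check whether a namespace is set by looking at the root tag, and strip the prefix from every element before searching. The helper names namespace_of and strip_namespaces and the example URI are made up for this illustration. Note that this only rewrites element tags; namespaced attributes are left alone.

```python
import xml.etree.ElementTree as ET


def namespace_of(elem):
    """Return the namespace URI of elem's tag, or '' if it has none."""
    if elem.tag.startswith("{"):
        return elem.tag[1:].split("}", 1)[0]
    return ""


def strip_namespaces(root):
    """Remove any '{uri}' prefix from every element tag, in place."""
    for elem in root.iter():
        # Comments and processing instructions have non-string tags.
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


# Hypothetical document with a default namespace.
root = ET.fromstring(
    '<ONIXmessage xmlns="http://example.com/onix">'
    "<product><productidentifier>1</productidentifier></product>"
    "</ONIXmessage>"
)

print(namespace_of(root))  # the URI, or '' when xmlns is not set
strip_namespaces(root)
pids = root.findall("product/productidentifier")
print(len(pids))
```

After stripping, the same plain-tag findall calls work whether or not the file declared xmlns.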

  • It's kind of a pain, but you could use local-name() in your XPath.

    For example, instead of:

    /foo/bar/baz
    

    try:

    /*[local-name()='foo']/*[local-name()='bar']/*[local-name()='baz']
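
One caveat: the XPath subset built into xml.etree.ElementTree does not support local-name(), so the expression above needs a full XPath engine such as lxml. On Python 3.8 and later, the standard library offers a similar effect through the {*} wildcard, which matches a tag in any namespace or in none. The sketch below uses that substitute rather than local-name() itself; the example URI is invented, and the path is written relative to the root element because ElementTree does not accept absolute paths on an element.

```python
import xml.etree.ElementTree as ET

# The same structure, once with a default namespace and once without.
with_ns = ET.fromstring(
    '<foo xmlns="http://example.com/ns"><bar><baz>hit</baz></bar></foo>'
)
without_ns = ET.fromstring("<foo><bar><baz>hit</baz></bar></foo>")

# "{*}name" matches "name" in any namespace, or in no namespace at all
# (requires Python 3.8+). The root element <foo> is the starting point,
# so the path begins at its child <bar>.
path = "{*}bar/{*}baz"

print([e.text for e in with_ns.findall(path)])
print([e.text for e in without_ns.findall(path)])
```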
    