Python API XML parsing
I'm parsing some XML returned from my Tableau server via API using Python. Namespaces are involved, and I think I may be lacking some fundamental understanding of how they work. Here is what my XML looks like:
<tsResponse version-and-namespace-settings>
<parent type="Project" id="1f2f3e4e-5d6d-7c8c-9b0b-1a2a3f4f5e6e" />
<permissions>
<workbook id="1a1b1c1d-2e2f-2a2b-3c3d-3e3f4a4b4c4d" name="Finance">
<owner id="9f9e9d9c-8b8a-8f8e-7d7c-7b7a6f6d6e6d"/>
</workbook>
<granteeCapabilities>
<group id="1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d"/>
<capabilities>
<capability name="Read" mode="Allow"/>
<capability name="Filter" mode="Allow"/>
<capability name="ViewUnderlyingData" mode="Allow"/>
<capability name="ExportImage" mode="Allow"/>
<capability name="ExportData" mode="Allow"/>
<capability name="AddComment" mode="Allow"/>
<capability name="ViewComments" mode="Allow"/>
<capability name="ShareView" mode="Allow"/>
</capabilities>
</granteeCapabilities>
</permissions>
</tsResponse>
Here is the code I'm running, pared down to where the problem occurs. My aim is to initially identify each group id under a given workbook id, then find each capability under the group.
xmlns = {'t': 'http://tableau.com/api'}
test_response1 = []
test_response2 = []
url = "tableau.my.org/api/2.4/sites/siteid/workbooks/workbookid/permissions?pageSize=1000".format()
server_response_WB2 = requests.get(url, headers={'x-tableau-auth': auth_token})
test_response1.append(server_response_WB2.text)
server_response_WB2 = ET.fromstring(_encode_for_display(server_response_WB2.text))
permissions = server_response_WB2.findall('.//t:permissions', namespaces=xmlns)
for permission in permissions:
    capabilities = permission.findall('granteeCapabilities')
    test_response2.append(capabilities)
print(test_response1)
print(test_response2)
test_response1 contains a list like:
[[<Element '{http://tableau.com/api}permissions' at 0x3c07d70>],
[<Element '{http://tableau.com/api}permissions' at 0x3bb8dd0>]]
test_response2, however, is a list of empty lists:
[[], [], []]
In the code above, I'm looking for 'granteeCapabilities' as a tag. I've also tried looking for it as a path, using the namespace, like so:
capabilities = permission.findall('.//t:permissions/granteeCapabilities', namespaces=xmlns)
This returns the same result: a list of empty lists. Why am I able to find the data under permissions, but not at lower levels?
If the problem likely has something to do with XML namespaces, then you really should not omit the namespace declarations of your XML from the question, since they are fundamental for diagnosing the problem and for writing a working solution.
I suspect the namespace http://tableau.com/api was declared as the default namespace (the one without a prefix) in the XML, in which case all descendant elements without a prefix implicitly inherit that namespace from their ancestor. This would explain why granteeCapabilities didn't work for you; you should add the prefix t here as well:
capabilities = permission.findall('t:granteeCapabilities', namespaces=xmlns)
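To see the inheritance described above, it can help to print each element's tag. The sketch below uses a trimmed-down, hypothetical response (only the default namespace declaration is assumed to match the real one) and shows that every descendant's tag carries the namespace, so an unprefixed search finds nothing while the prefixed one matches:

```python
from xml.etree import ElementTree as ET

# Hypothetical, trimmed-down response; only the default namespace matters here.
raw = '''<tsResponse xmlns="http://tableau.com/api">
  <permissions>
    <granteeCapabilities>
      <group id="1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d"/>
    </granteeCapabilities>
  </permissions>
</tsResponse>'''

root = ET.fromstring(raw)
xmlns = {'t': 'http://tableau.com/api'}

# ElementTree stores tags as '{namespace-uri}localname', so every element
# below the root carries the inherited default namespace.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)

permission = root.find('t:permissions', namespaces=xmlns)

# An unprefixed name only matches elements that are in no namespace.
unprefixed = permission.findall('granteeCapabilities')
# The prefixed name expands to '{http://tableau.com/api}granteeCapabilities'.
prefixed = permission.findall('t:granteeCapabilities', namespaces=xmlns)

print(len(unprefixed))
print(len(prefixed))
```

The prefix t is just a local alias chosen in the xmlns dictionary; what has to match is the namespace URI, not the prefix used (or not used) in the document.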
If you don't need any information from the permissions element, you can just get granteeCapabilities directly:
capabilities = root.findall('.//t:granteeCapabilities', namespaces=xmlns)
Here is a short but complete example demonstrating the solution:
raw = '''<tsResponse xmlns="http://tableau.com/api" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://tableau.com/api http://tableau.com/api/ts-api-2.4.xsd">
<parent type="Project" id="1f2f3e4e-5d6d-7c8c-9b0b-1a2a3f4f5e6e" />
<permissions>
<granteeCapabilities>
<group id="1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d"/>
<capabilities>
....
</capabilities>
</granteeCapabilities>
</permissions>
</tsResponse>'''
from xml.etree import ElementTree as ET
root = ET.fromstring(raw)
xmlns = {'t': 'http://tableau.com/api'}
permissions = root.findall('.//t:permissions', namespaces=xmlns)
for permission in permissions:
    capabilities = permission.findall('t:granteeCapabilities', namespaces=xmlns)
    print(capabilities)
Output:
[<Element '{http://tableau.com/api}granteeCapabilities' at 0x402f430c>]
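Going one step further toward the stated goal (each group id, then each capability under that group), the same prefixing rule applies at every level of the path. The following is only a sketch: the helper name capabilities_by_group is hypothetical, and the XML is a shortened copy of the response shown in the question with the default namespace assumed as above.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the response from the question; default namespace assumed.
raw = '''<tsResponse xmlns="http://tableau.com/api">
  <permissions>
    <workbook id="1a1b1c1d-2e2f-2a2b-3c3d-3e3f4a4b4c4d" name="Finance"/>
    <granteeCapabilities>
      <group id="1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d"/>
      <capabilities>
        <capability name="Read" mode="Allow"/>
        <capability name="Filter" mode="Allow"/>
      </capabilities>
    </granteeCapabilities>
  </permissions>
</tsResponse>'''

xmlns = {'t': 'http://tableau.com/api'}
root = ET.fromstring(raw)


def capabilities_by_group(root):
    """Map each group id to a list of (capability name, mode) pairs.

    Hypothetical helper, not part of any Tableau client library.
    """
    result = {}
    for grantee in root.findall('.//t:granteeCapabilities', namespaces=xmlns):
        group = grantee.find('t:group', namespaces=xmlns)
        if group is None:
            continue  # skip grantees that have no <group> child
        # Every step of the path needs the prefix, not just the first one.
        caps = grantee.findall('t:capabilities/t:capability', namespaces=xmlns)
        result[group.get('id')] = [(c.get('name'), c.get('mode')) for c in caps]
    return result


group_caps = capabilities_by_group(root)
for group_id, caps in group_caps.items():
    print(group_id)
    for name, mode in caps:
        print('  %s: %s' % (name, mode))
```

Attributes such as id, name and mode are not affected by the default namespace, which is why get('id') works without any prefix.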