groovy findAll expands &lt;
I have an XML document with embedded HTML tags whose angle brackets are escaped as "&lt;" and "&gt;" (it parses cleanly with XmlSlurper.parseText()). When I use Groovy's depthFirst().findAll(), the returned list shows the &lt; and &gt; replaced by < and >. This makes it difficult to subsequently search the original XML content, since the list items returned no longer match the characters in the original XML.
Fragment from the XML:
<label>Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;</label>
This code:
def root = new XmlSlurper().parseText(xml)
def list = root.depthFirst().findAll{ it.name().equalsIgnoreCase('label') }
Gives me:
Read about it <a href="http://whatever">here</a>
Is there a way to prevent sequences such as &lt; and &gt; from being expanded by methods like findAll?
Take a look at this question; it's a similar problem. The proposed solution works nicely in your case too:
def xml = '<label>Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;</label>'
def root = new XmlSlurper().parseText(xml)
def list = root.depthFirst().findAll{ it.name().equalsIgnoreCase('label') }
// text() returns the decoded characters; mkp.yield escapes them again on output
String content = new groovy.xml.StreamingMarkupBuilder().bind {
    mkp.yield list[0].text()
}
assert content == 'Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;'
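For context, findAll itself does not change anything: the XML parser decodes entity references while reading the document, so text() already returns literal < and > characters. The sketch below separates the two steps and then searches the original XML with the re-escaped string. The closure name escapeText is only an illustrative helper, not part of any Groovy API, and the script assumes XmlSlurper is available without an import, as in the snippets above.

```groovy
import groovy.xml.StreamingMarkupBuilder

// Same fragment as in the question.
// Assumption: XmlSlurper resolves without an import, as in the code above;
// on Groovy 4 and later, add `import groovy.xml.XmlSlurper`.
def xml = '<label>Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;</label>'
def label = new XmlSlurper().parseText(xml)

// Step 1: the parser has already decoded the entity references.
String decoded = label.text()
assert decoded.contains('<a href=')
assert !decoded.contains('&lt;')

// Step 2: re-escape the text. escapeText is a hypothetical helper name.
def escapeText = { String s ->
    new StreamingMarkupBuilder().bind { mkp.yield s }.toString()
}
String escaped = escapeText(decoded)

// The re-escaped string can now be located in the original XML.
assert escaped.contains('&lt;a href=')
assert xml.contains(escaped)
```

This round trip only matches the original when it used the same escaping style. If the source XML wrote the same characters differently (for example as numeric character references or inside a CDATA section), the re-escaped string will not match it character for character.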