Groovy findAll expands &lt;

2018-07-01 10:06:18

I have an XML document with embedded HTML tags whose angle brackets are escaped as "&lt;" and "&gt;" (it parses cleanly with XmlSlurper.parseText()). When I use Groovy's depthFirst().findAll(), the returned list shows the &lt; and &gt; replaced by < and >. This makes it difficult to subsequently search the original XML content, since the returned items no longer match the characters in the original XML.

Fragment from the XML:

<label>Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;</label>

This code:

def root = new XmlSlurper().parseText(xml)
def list = root.depthFirst().findAll{ it.name().equalsIgnoreCase('label') }

Gives me:

Read about it <a href="http://whatever">here</a>

Is there a way to prevent sequences such as &lt; and &gt; from being mangled by methods like findAll?

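The entity references are decoded by the XML parser itself, not by findAll, so text() always returns the unescaped string. Take a look at this question, which covers a similar problem. The solution proposed there works nicely in your case too: re-escape the text with StreamingMarkupBuilder.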

def xml = '<label>Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;</label>'

def root = new XmlSlurper().parseText(xml)
def list = root.depthFirst().findAll{ it.name().equalsIgnoreCase('label') }

// mkp.yield writes the text with <, > and & escaped again
String content = new groovy.xml.StreamingMarkupBuilder().bind {
  mkp.yield list[0].text()
}

assert content == 'Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;'
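
The same idea can be applied to every element that findAll returns, so that each result can be searched for in the original XML string. The following is only a sketch: the `escape` closure, the `<form>` wrapper, and the second label are made up for illustration, and on Groovy 4 or later you would probably also need `import groovy.xml.XmlSlurper`.

```groovy
import groovy.xml.StreamingMarkupBuilder

// Sample document (hypothetical); the first label is the fragment from the question
def xml = '''<form>
  <label>Read about it &lt;a href="http://whatever"&gt;here&lt;/a&gt;</label>
  <label>Plain text</label>
</form>'''

def root = new XmlSlurper().parseText(xml)
def labels = root.depthFirst().findAll { it.name().equalsIgnoreCase('label') }

// Hypothetical helper: re-escape <, > and & in a decoded string
def escape = { String raw ->
    new StreamingMarkupBuilder().bind { mkp.yield(raw) }.toString()
}

// Escaped text of every label, in document order
def escaped = labels.collect { escape(it.text()) }

// Each escaped string can now be located in the original XML
assert escaped.every { xml.contains(it) }
```

Note that this only reproduces the original characters when the source used exactly the &lt;, &gt; and &amp; forms. Text that was written with numeric character references or inside a CDATA section will not round-trip to the same characters.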
