Nexpose XML report version 2.0: how to remove HTML from XML?
I've written a PHP parser for the Nexpose XML report version 2.0. It worked fine, but recently the parser has started to fail.
The problem seems to be that the XML I'm trying to parse contains HTML between the XML elements without CDATA sections, so the HTML introduces characters that are invalid in XML. As a result, the document cannot be parsed with the libraries I'm using, XMLReader and SimpleXML.
This is an example of the kind of line that is invalid for these PHP XML libraries:
<Paragraph preformat="true">98: 99: <BODY scroll="AUTO" bgColor="#FFFFFF" text="#000000" onload="setFo... 100: <FORM action="/exchweb/bin/auth/owaauth.dll" method="POST" name="... 101: 98: <INPUT type="hidden" name="destination" value="
http://www.rapid7.com"...</Paragraph>
Any idea how to detect all lines like this one and delete them?
Right now the only pattern I can see for finding these lines is that the HTML code is preceded by a number acting as an identifier, in the following form:
<number>:<html-code>
Thanks in advance for your help guys.
Kind Regards
You should try this:
<Paragraph.+[0-9]:.+</Paragraph>
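Note that this pattern is greedy and that `.` does not match newlines by default, so used as-is it can miss a paragraph that spans several lines (like the example in the question) or swallow several Paragraph elements that share one line. Below is a minimal PHP sketch of the same idea, not a definitive fix. It assumes the embedded HTML never contains a literal `</Paragraph>`, and that the `<number>:` prefix followed by `<` is a reliable marker. The function name `stripHtmlParagraphs` and the sample `$raw` string are made up for illustration.

```php
<?php
// Sketch: drop <Paragraph> elements whose body contains raw "<number>: <html" text.
// Assumption: the offending HTML never contains a literal "</Paragraph>".
function stripHtmlParagraphs(string $xml): string
{
    // Match one innermost, non-self-closing <Paragraph ...>...</Paragraph>.
    // The "s" modifier lets "." cross line breaks; the lazy quantifier stops
    // at the first closing tag instead of the last one.
    $pattern = '~<Paragraph\b[^>]*(?<!/)>((?:(?!<Paragraph\b).)*?)</Paragraph>~s';

    $cleaned = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Digits, a colon, optional whitespace, then "<" (e.g. "98: <BODY").
            return preg_match('~\d+:\s*<~', $m[1]) === 1 ? '' : $m[0];
        },
        $xml
    );

    if ($cleaned === null) {
        // preg_replace_callback returns null on a PCRE failure (e.g. backtrack limit).
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $cleaned;
}

// Hypothetical sample input, modelled on the line from the question.
$raw = <<<'XML'
<Report><Paragraph>ok</Paragraph><Paragraph preformat="true">98: <BODY scroll="AUTO">
http://www.rapid7.com"...</Paragraph></Report>
XML;

// Clean the raw text first, then hand it to SimpleXML as usual.
$doc = simplexml_load_string(stripHtmlParagraphs($raw));
```

The cleaning has to happen on the raw string before XMLReader or SimpleXML sees it, since both refuse the malformed input. For very large reports, a line-by-line or chunked approach may be needed instead of loading the whole file into memory.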