Best way to parse invalid HTML in PHP
Is there a better approach to parsing invalid HTML than running Tidy on it?
Side note: there are some situations where Tidy isn't available. I also understand that regular expressions are not recommended for parsing HTML.
I would try something like this: http://php.net/manual/en/domdocument.loadhtml.php
From that page:
The function parses the HTML contained in the string source. Unlike loading XML, HTML does not have to be well-formed to load. This function may also be called statically to load and create a DOMDocument object.
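For what it's worth, here is a minimal sketch of that suggestion, assuming the DOM extension is enabled. The sample markup and the variable names are made up for illustration. libxml_use_internal_errors(true) is only there so the parser's complaints about the broken markup are collected instead of being emitted as PHP warnings.

```php
<?php
// Hypothetical malformed input: unclosed <p> and <b>, no closing </html>.
$html = '<html><body><p>First paragraph<p>Second <b>bold text</body>';

// Collect libxml parse errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Keep the collected errors for inspection, then restore the old setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Walk the repaired tree like any other DOM document.
$paragraphs = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    $paragraphs[] = trim($p->textContent);
}

print_r($paragraphs);
```

If you need to know what was wrong with the input, each entry in $errors is a LibXMLError object with message and line properties.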
SimpleHTMLDOM is known to be more lenient than PHP's native DOM functions.