Regexp for html
Possible Duplicate:
RegEx match open tags except XHTML self-contained tags
I have the following string:
$str = "
<li>r</li>
<li>a</li>
<li>n</li>
<li>d</li>
...
<li>om</li>
";
How do I get the HTML for the first n <li> tags?
Ex: n = 3; result = "<li>r<...>n</li>";
I would like a regexp if possible.
Like this.
$dom = new DOMDocument();
@$dom->loadHTML($str);
$x = new DOMXPath($dom);
// We want the 4th <li> node.
foreach ($x->query("//li[4]") as $node)
{
    // c14n() returns the node serialized as canonical XML.
    echo $node->c14n();
}
Oh yeah, learn XPath; it will save you lots of trouble in the future.
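The snippet above picks out a single item. For the "first n" case the question asks about, an XPath position() predicate is one option. This is only a rough sketch: firstListItems() is a made-up helper name, the sample string leaves out the "..." placeholder, and it assumes the <li> elements are siblings once libxml has wrapped the fragment in html/body.

```php
<?php
// Sketch: return the markup of the first $n <li> elements in an HTML fragment.
// firstListItems() is a hypothetical helper, not part of the original answer.
function firstListItems(string $html, int $n): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $out = '';
    // position() <= $n keeps the first $n <li> children of each parent node.
    foreach ($xpath->query("//li[position() <= $n]") as $node) {
        // saveHTML($node) serializes just that node as HTML.
        $out .= $dom->saveHTML($node);
    }
    return $out;
}

$str = "<li>r</li>\n<li>a</li>\n<li>n</li>\n<li>d</li>\n<li>om</li>";
echo firstListItems($str, 3);
```

Because $n is an int, interpolating it into the XPath string is safe here; a value coming from user input should be cast or validated first.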
@Byron's solution, but with SimpleXML:
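// SimpleXML needs well-formed XML with a single root, so wrap the fragment.
$xml = simplexml_load_string("<ul>$str</ul>");
foreach ($xml->xpath("//li[4]") as $node) {
    echo $node; // Casting the element to a string yields its text content
}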
EDIT: Another reason I really like SimpleXML is how easy it is to debug the content of a node. You can just use print_r($xml) to print the object with its child nodes.
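For the "first n" case, the same position() idea should work with SimpleXML too. A minimal sketch, assuming the fragment becomes well-formed XML once it is wrapped in a single root element (the <ul> wrapper and the shortened sample string are my assumptions, not part of the question):

```php
<?php
// Assumption: the list items are well-formed XML; <ul> is added only to provide a root.
$str = "<li>r</li><li>a</li><li>n</li><li>d</li><li>om</li>";
$xml = simplexml_load_string("<ul>$str</ul>");

$n = 3;
$first = '';
// The path is evaluated relative to the <ul> root element.
foreach ($xml->xpath("li[position() <= $n]") as $node) {
    // asXML() on a child element returns that element's markup as a string.
    $first .= $node->asXML();
}
echo $first, "\n";

// Dump the parsed structure for debugging, as mentioned above.
print_r($xml);
```

Real-world HTML is often not valid XML, in which case simplexml_load_string() returns false and DOMDocument::loadHTML() is the more forgiving choice.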
As I'm sure you are aware, it is not a good idea to use regular expressions to work through HTML unless you "tidy" it first.
A very viable solution in PHP would be to navigate the HTML structure using SimpleXML (http://php.net/manual/en/book.simplexml.php) or a DOMDocument (http://php.net/manual/en/class.domdocument.php).