Parsing HTML source to extract the href values of anchor and link tags
I am looking for an HTML parser in PHP that can help me extract href values
from HTML source.
I looked at phpQuery, and it is the best I have found, but it is overkill for my needs and consumes a lot of CPU doing extra work that I don't need.
I also tried
$dom = new DomDocument();
$dom->loadHTML($html);
but it has problems parsing HTML5 tags.
Is there a better library or class, or some other way to do it?
Well, you can use a regular expression to extract the data:
$html = "This is some stuff right here. <a href='index.html'>Check this out!</a> <a href=herp.html>And this is another thing!</a> <a href=\"derp.html\">OH MY GOSH</a>";
// Matches single-quoted, double-quoted and unquoted href values.
preg_match_all('/href=[\'"]?([^\s>\'"]*)[\'">]/', $html, $matches);
$hrefs = ($matches[1] ? $matches[1] : false);
print_r($hrefs);
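The pattern above picks up href in any tag. If only anchor and link tags matter, as in the question title, it can be narrowed. This is a rough sketch rather than a robust parser: the function name extractHrefs is made up for illustration, and it assumes reasonably well-formed markup with no ">" inside attribute values.

```php
<?php
// Hypothetical helper (not part of the original answer): collect href
// values from <a> and <link> tags only.
function extractHrefs(string $html): array
{
    // <(?:a|link)\b      an opening <a or <link tag (not <abbr, <area, ...)
    // [^>]*?\shref\s*=   skip other attributes until a whitespace-led href=
    // (?|...)            branch reset: the value lands in group 1 whether it
    //                    is double-quoted, single-quoted or unquoted
    $pattern = '/<(?:a|link)\b[^>]*?\shref\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>"\']+))/i';
    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

$html = '<link rel="stylesheet" href="style.css">'
      . "<a href='index.html'>One</a> <a href=herp.html>Two</a>"
      . '<img src="a.png">';
print_r(extractHrefs($html));
```

An empty array comes back when nothing matches, so the caller can loop over the result without a separate false check.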
simplehtmldom (PHP Simple HTML DOM Parser) is a handy PHP class for parsing HTML:
http://simplehtmldom.sourceforge.net/
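If you would rather avoid an extra dependency, the built-in DOMDocument from the question may still be usable. As far as I know, its trouble with HTML5 tags usually shows up as libxml warnings about unknown elements while the tree is still built, so switching libxml to internal error handling is often enough. A minimal sketch under that assumption (the function name extractHrefsWithDom is hypothetical, and the input is assumed to be a non-empty string):

```php
<?php
// Hypothetical helper: read href values from <a> and <link> elements
// using the DOM extension that ships with PHP.
function extractHrefsWithDom(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them;
    // elements such as <nav> or <section> may otherwise trigger warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    $xpath = new DOMXPath($dom);
    // Select the href attribute nodes of every <a> and <link> element.
    foreach ($xpath->query('//a/@href | //link/@href') as $attr) {
        $hrefs[] = $attr->nodeValue;
    }
    return $hrefs;
}

$html = '<!DOCTYPE html><html><head><link rel="stylesheet" href="style.css"></head>'
      . '<body><nav><a href="index.html">Home</a></nav>'
      . '<section><a href="http://google.com"><img src="images/a.png"></a></section>'
      . '</body></html>';
print_r(extractHrefsWithDom($html));
```

Unlike the regex approach, this ignores href-like text inside comments or scripts, at the cost of building a full document tree.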
I use this:
$html = '<a href="http://google.com"><img src="images/a.png" /></a>';
// $match[1] holds the href value; \s stops the match at whitespace.
preg_match('/href="([^\s"]+)/', $html, $match);
echo '<pre>';
print_r($match);