HTML Scraping in PHP

This question already has an answer here:

  • How do you parse and process HTML/XML in PHP? 28 answers

  • I would recommend PHP Simple HTML DOM Parser after you have scraped the HTML from the page. It supports invalid HTML, and provides a very easy way to handle HTML elements.
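
    For illustration, here is a rough sketch of what that looks like in practice. It assumes simple_html_dom.php from the PHP Simple HTML DOM Parser project has been downloaded next to the script; the markup is a made-up sample with unclosed tags, not a real page.

```php
<?php
// Assumption: simple_html_dom.php (PHP Simple HTML DOM Parser) sits in the
// same directory as this script. It is a third-party file, not part of PHP.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical, deliberately sloppy markup: the <p> and <li> tags are never closed.
$markup = '<div id="news"><p>First item<p>Second item'
        . '<ul><li><a href="/a">A</a><li><a href="/b">B</a></ul></div>';

// str_get_html() builds a DOM from a string; it returns false if the
// string is empty or too large for the parser.
$dom = str_get_html($markup);
if ($dom === false) {
    fwrite(STDERR, "Could not parse the markup" . PHP_EOL);
    exit(1);
}

// find() takes a CSS-style selector and returns the matching elements.
foreach ($dom->find('#news a') as $link) {
    echo $link->href, ' => ', $link->plaintext, PHP_EOL;
}

// Release the parser's internal references once done.
$dom->clear();
```

    To work on a live page instead of a string, the same library offers file_get_html(), which takes a URL or file path.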


    If the page you're scraping is valid X(HT)ML, then any of PHP's built-in XML parsers will do.
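
    As a minimal sketch of the built-in route, the following uses SimpleXML with an XPath query. The markup is a hypothetical well-formed fragment standing in for a scraped page, and the variable names are only illustrative.

```php
<?php
// Hypothetical well-formed XHTML fragment standing in for a scraped page.
$xhtml = <<<XML
<html>
  <body>
    <ul id="links">
      <li><a href="/first">First</a></li>
      <li><a href="/second">Second</a></li>
    </ul>
  </body>
</html>
XML;

// Collect libxml parse errors instead of letting them surface as warnings.
libxml_use_internal_errors(true);

$doc = simplexml_load_string($xhtml);
if ($doc === false) {
    // The input was not well-formed XML; report why and stop.
    foreach (libxml_get_errors() as $error) {
        fwrite(STDERR, trim($error->message) . PHP_EOL);
    }
    exit(1);
}

// Map each link's href attribute to its text content.
$links = [];
foreach ($doc->xpath('//ul[@id="links"]/li/a') as $a) {
    $links[(string) $a['href']] = (string) $a;
}

print_r($links);
```

    One caveat: real XHTML pages usually declare the http://www.w3.org/1999/xhtml namespace on the root element. In that case the XPath query needs a prefix registered with registerXPathNamespace() before it will match anything.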

    I haven't had much success with PHP libraries for scraping. If you're adventurous though, you can try simplehtmldom. I'd recommend Hpricot for Ruby or Beautiful Soup for Python, which are both excellent parsers for HTML.


    I would also recommend 'Simple HTML DOM Parser.' It is a good option, particularly if you're familiar with jQuery or JavaScript selectors, in which case you will find yourself at home.

    I have even blogged about it in the past.
