I'm trying to put together a list of possible solutions for automated browser test suites and headless browser platforms capable of scraping.
BROWSER TESTING / SCRAPING:
Selenium - polyglot flagship of browser automation, with bindings for Python, Ruby, JavaScript, C#, Haskell and more, plus an IDE for Firefox (as an extension) for faster test deployment. Can act as a server and has tons of features.
JAVASCRIPT
PhantomJS - JavaScript, headless testing with screen capture and automation, uses WebKit. As of version 1.8 Selenium's WebDriver API is implemented, so you can use any WebDriver binding and tests will be compatible with Selenium
SlimerJS - similar to PhantomJS, uses Gecko (Firefox) instead of WebKit
CasperJS - JavaScript, built on both PhantomJS and SlimerJS, has extra features
Ghost Driver - JavaScript implementation of the WebDriver Wire Protocol for PhantomJS
(new) PhantomCSS - CSS regression testing. A CasperJS module for automating visual regression testing with PhantomJS and Resemble.js
(new) WebdriverCSS - plugin for Webdriver.io for automating visual regression testing
(new) PhantomFlow - describe and visualize user flows through tests. An experimental approach to Web user interface testing
(new) trifleJS - ports the PhantomJS API to use the Internet Explorer engine
(new) CasperJS IDE (commercial)
NODE.JS
Node-phantom - bridges the gap between PhantomJS and node.js
WebDriverJs - Selenium WebDriver bindings for node.js by the Selenium team
WD.js - node module for WebDriver/Selenium 2
yiewd - WD.js wrapper using the latest Harmony generators! Get rid of the callback pyramid with yield
ZombieJs - insanely fast, headless full-stack testing using node.js
NightwatchJs - Node.js-based testing solution using Selenium WebDriver
Chimera - can do everything PhantomJS does, but in a full JS environment
Dalek.js - automated cross-browser testing with JavaScript through Selenium WebDriver
Webdriver.io - better implementation of WebDriver bindings with 50+ predefined actions
Nightmare - Electron bridge with a high-level API
jsdom - tailored towards web scraping. A very lightweight DOM implemented in Node.js; it supports pages with JavaScript
WEB SCRAPING / MINING
Scrapy - Python, mainly a scraper/miner - fast and well documented; can be linked with Django Dynamic Scraper for nice mining deployments, or Scrapy Cloud for PaaS (server-less) deployment; works in a terminal or as a stand-alone server process; can be used with Celery; built on top of Twisted
Snailer - node.js module, untested yet
Node-Crawler - node.js module, untested yet
ONLINE TOOLS
(new) Online HTTP client - Dedicated SO answer
(dead) CasperBox - Run CasperJS scripts online
RELATED LINKS & RESOURCES
Comparison of Webscraping software
(new) Resemble.js: Image analysis and comparison
Questions:
Any pure Node.js solution or Node.js to PhantomJS/CasperJS module that actually works and is documented?
Answer: Chimera seems to go in that direction, check out Chimera
Other solutions capable of easier JavaScript injection than Selenium?
Do you know any pure Ruby solutions?
Answer: Check out the list of Ruby-based solutions created by rjk
Do you know any related tech or solution? Feel free to reedit this question and add content as you wish! Thank you for your contributions!
Updates
added SlimerJS to the list
added Snailer and Node-Crawler and Node-phantom
added Yiewd WebDriver wrapper
added WebDriverJs and WD.js
added Ghost Driver
added Comparison of Webscraping software on Screen Scraper Blog
added ZombieJs
added Resemble.js and PhantomCSS and PhantomFlow, categorised and re-edited content
04.01.2014, added Chimera, answered 2 questions
added NightWatchJs
added DalekJS
added WebdriverCSS
added CasperBox
added trifleJS
added CasperJS IDE
added Nightmare
added jsdom
added Online HTTP client, updated CasperBox (dead)
If Ruby is your thing, you may also try:
https://github.com/chriskite/anemone (dev stopped)
https://github.com/sparklemotion/mechanize
https://github.com/postmodern/spidr
https://github.com/stewartmckee/cobweb
http://watirwebdriver.com/ (Selenium)
Also, the Nokogiri gem can be used for scraping:
http://nokogiri.org/
There is a dedicated book by Packt Publishing about how to use Nokogiri for scraping.
http://triflejs.org/ is like PhantomJS, but based on IE.
A kind of JS-based Selenium is Dalek.js. It aims not only at automated front-end tests; you can also take screenshots with it. It has webdrivers for all the important browsers. Unfortunately, those webdrivers still seem to need improvement (not to say "buggy", in the case of Firefox).
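Several of the tools in this thread (Ghost Driver, WD.js, Dalek.js, Webdriver.io) talk to the browser over the WebDriver wire protocol, so it may help to see roughly what such a client sends. Below is a minimal sketch in Python that only builds the HTTP requests and never sends them. The address and port are assumptions (what you might get from starting PhantomJS with `phantomjs --webdriver=8910`), and the helper names `wire_request`, `new_session`, `navigate`, `screenshot` and `quit_session` are made up for illustration; they are not part of any library.

```python
import json
from urllib.request import Request

# Assumption: a WebDriver-compatible server (for example PhantomJS started
# with `phantomjs --webdriver=8910`) would listen here. Nothing is sent below.
BASE_URL = "http://127.0.0.1:8910"


def wire_request(method, path, payload=None):
    """Build (but do not send) one JSON Wire Protocol request."""
    body = None if payload is None else json.dumps(payload).encode("utf-8")
    return Request(
        BASE_URL + path,
        data=body,
        method=method,
        headers={"Content-Type": "application/json;charset=UTF-8"},
    )


def new_session(browser_name):
    # POST /session carries the capabilities the client asks for.
    return wire_request(
        "POST", "/session",
        {"desiredCapabilities": {"browserName": browser_name}},
    )


def navigate(session_id, url):
    # POST /session/:sessionId/url asks the browser to load a page.
    return wire_request("POST", f"/session/{session_id}/url", {"url": url})


def screenshot(session_id):
    # GET /session/:sessionId/screenshot asks for a base64-encoded PNG.
    return wire_request("GET", f"/session/{session_id}/screenshot")


def quit_session(session_id):
    # DELETE /session/:sessionId closes the browser session.
    return wire_request("DELETE", f"/session/{session_id}")
```

Against a running server you would pass each request to `urllib.request.urlopen` and read the session id from the JSON reply to the first call. In practice one of the bindings listed above does all of this for you; the sketch is only meant to show why any WebDriver binding can drive any compliant browser.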