headlessbrowsers's Introduction

Headless Browsers

A list of (almost) all headless web browsers in existence

A web browser without a graphical user interface, controlled programmatically. Used for automation, testing, and other purposes.

Browser engines

These browser engines fully render web pages or run JavaScript in a virtual DOM

Name | About | Supported Languages | License
Chromium Embedded Framework | CEF is an open source project based on the Google Chromium project. | JavaScript | BSD
Erik | Headless browser on top of Kanna and WebKit. | Swift | MIT
jBrowserDriver | A Selenium-compatible headless browser written in pure Java. WebKit-based. Works with any of the Selenium Server bindings. | Java | Apache License v2.0
PhantomJS | [Unmaintained] PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG. | JavaScript, Python, Ruby, Java, C#, Haskell, Objective-C, Perl, PHP, R (via Selenium) | BSD 3-Clause
Splash | Splash is a JavaScript rendering service with an HTTP API. It's a lightweight browser implemented in Python using Twisted and Qt. | Any | BSD 3-Clause
Surf | Surf is an open source project that implements a virtual web browser that can be controlled programmatically. | Go | MIT

Multi drivers

These libraries can control multiple browser engines (typically using Selenium)

Name | About | Supported Languages | License
CasperJS | [Unmaintained] CasperJS is an open source navigation scripting & testing utility written in JavaScript for the PhantomJS WebKit headless browser and SlimerJS (Gecko). | JavaScript | MIT
Geb | Geb is a Groovy interface to WebDriver. | Groovy | Apache
Playwright | Playwright is a Node library to automate the Chromium, WebKit and Firefox browsers with a single API. | TypeScript | Apache
playwright-dotnet | Playwright for .NET is a library to automate Chromium, Firefox and WebKit browsers with a single API. | .NET | MIT
playwright-python | Playwright for Python is a library to automate Chromium, Firefox and WebKit browsers with a single API. | Python | Apache
playwright-java | Playwright for Java is a library to automate Chromium, Firefox and WebKit browsers with a single API. | Java | Apache
playwright-go | Playwright for Go is a library to automate Chromium, Firefox and WebKit browsers with a single API. | Go | MIT
Selenium | Selenium is a suite of tools to automate web browsers across many platforms. | JavaScript, Python, Ruby, Java, C#, Haskell, Objective-C, Perl, PHP, R | Apache
Splinter | Splinter is an open source tool for testing web applications using Python. It lets you automate browser actions, such as visiting URLs and interacting with their items. | Python | -
SST | SST (selenium-simple-test) is a web test framework that uses Python to generate functional browser-based tests. | Python | -
Watir | The most elegant way to use Selenium WebDriver with Ruby. | Ruby | MIT

PhantomJS drivers

These libraries control PhantomJS

Name | About | Supported Languages | License
Ghostbuster | Automated browser testing via phantom.js, with all of the pain taken out! That means you get a real browser, with a real DOM, and can do real testing! | JavaScript | Not specified
jedi-crawler | Lightsabing Node/PhantomJS crawler; scrape dynamic content without the hassle. | JavaScript | Not specified
Lotte | Lotte is a headless, automated testing framework built on top of PhantomJS and inspired by Ghostbuster. | JavaScript | MIT
phantompy | Phantompy is a headless WebKit engine with a powerful Pythonic API built on top of Qt5 WebKit. | Python | LGPL-2.1
X-RAY | Supports strings, arrays, arrays of objects, nested object structures, selector API, pagination, crawler, concurrency, throttles, delays, timeouts, and pluggable drivers (PhantomJS, HTTP). | JavaScript | MIT
Horseman | Promise-based Node.js module for PhantomJS. Features a chainable API, understandable control flow, support for multiple tabs, and built-in jQuery. | JavaScript | MIT

Chromium drivers

These libraries control Chromium

Name | About | Supported Languages | License
Awesomium | Chromium-based headless browser engine. | C++, .NET | Free/Commercial
Headless Chromium | Chromium feature activated with the --headless flag; available in stable Chrome since version 59. | C++ | Open source
Puppeteer | Headless Chrome Node API from the Chrome DevTools team. | JavaScript | Apache
PuppeteerSharp | PuppeteerSharp is a .NET port of the official Headless Chrome Node.js Puppeteer API. | .NET | MIT
chrome-remote-interface | Chrome Debugging Protocol interface for Node.js. | JavaScript | MIT
Chromy | Features a chainable API, mobile emulation, and fundamental APIs such as JavaScript evaluation. | JavaScript | MIT
chromedp | A faster, simpler way to drive browsers (Chrome, Edge, Safari, Android, etc.) without external dependencies (i.e., Selenium, PhantomJS, etc.) using the Chrome Debugging Protocol. | Go | MIT
Chromeless | Chrome automation made simple. Runs locally or headless on AWS Lambda. | JavaScript | MIT
Chrome PHP | PHP API to drive Chromium or Google Chrome via the Chrome DevTools Protocol. | PHP | Fair
Wendigo | Test-oriented browser automation library using Puppeteer. | JavaScript | GPL-3.0
cdp4j | A web-automation library for Java. It can be used for automating the use of web pages and for testing web pages. It uses the Google Chrome DevTools Protocol to automate Chrome/Chromium-based browsers. | Java | cdp4j Commercial License
Pyppeteer | Python port of Puppeteer, the JavaScript (headless) Chrome/Chromium browser automation library. | Python | MIT
Headless Chrome | A high-level API to control headless Chrome or Chromium over the DevTools Protocol. | Rust | None
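
As a rough illustration of what the Headless Chromium entry refers to, the Python sketch below builds (and can optionally run) a command line that asks Chromium to load a page without opening a window and print the resulting DOM. The binary name `chromium` and both helper names are assumptions made for this example; the executable is called something else on many systems (for example `google-chrome` or `chromium-browser`).

```python
import subprocess


def headless_dump_dom_cmd(url, binary="chromium"):
    """Build the argument list for a headless DOM dump.

    `binary` is an assumed executable name; adjust it for your system.
    """
    return [binary, "--headless", "--dump-dom", url]


def dump_dom(url, binary="chromium", timeout=60):
    """Run the browser headlessly and return the serialized DOM as text."""
    result = subprocess.run(
        headless_dump_dom_cmd(url, binary),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the browser exits non-zero
    )
    return result.stdout
```

Calling `dump_dom("https://example.org")` only works where a Chromium-family browser is installed; the libraries in the table above wrap the same browser behind richer APIs.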

Webkit drivers

These drivers control an in-process instance of Webkit

Name | About | Supported Languages | License
Browserjet | Runs a custom build of WebKit, controlled by a Node.js interface. | JavaScript | Not specified
ghost.py | ghost.py is a WebKit web client written in Python. | Python | MIT
headless_browser | Headless browser based on WebKit written in C++. | C++ | Not specified
Jabba-Webkit | Jabba's headless WebKit browser for scraping AJAX-powered webpages. | Python | Not specified
Jasmine-Headless-Webkit | jasmine-headless-webkit uses the QtWebKit widget to run your specs without needing to render a pixel. | Python, JavaScript, Ruby | Free
Python-Webkit | Python-Webkit is a Python extension to WebKit to add full, complete access to WebKit's DOM. | Python | GNU
Spynner | Programmatic web browsing module with AJAX support for Python. | Python | Not specified
Webloop | Scriptable, headless WebKit with a Go API. | Go | BSD 3-Clause
wkhtmltopdf, wkhtmltox, wkhtmltoimage | Command line tools rendering HTML into PDF and various image formats. | shell, C | LGPLv3
WKZombie | Functional headless browser (with JSON support) for iOS using WebKit and hpple/libxml2. | Swift | MIT

Other drivers

These libraries control lesser known browsers or OS-provided web libraries

Name | About | Supported Languages | License
Cypress | Cypress supports end-to-end, integration and unit tests and makes debugging tests simple. Default engine is headless Electron. | JavaScript | MIT
Nightmare | Nightmare is a high-level browser automation library built as an easier alternative to PhantomJS. It runs on the Electron engine. | JavaScript | MIT
grope | A RubyCocoa interface to the macOS WebKit Framework. | RubyCocoa | MIT
SlimerJS | SlimerJS is similar to PhantomJS, except that it runs Gecko, the browser engine of Mozilla Firefox, instead of WebKit (and it is not yet truly headless). | JavaScript | Mozilla 2.0
SpecterJS | A scriptable headless Internet Explorer port of PhantomJS. | JavaScript | MIT
trifleJS | A headless Internet Explorer browser using the .NET WebBrowser Class with a JavaScript API running on the V8 engine. | JavaScript | MIT

Fake Browser Engine

These libraries are typically naive or HTML-only browsers

Name | About | Supported Languages | License
AngleSharp | .NET HTML parsing library. | .NET | MIT
Guillotine | A .NET headless browser, written in C#. | .NET | LGPL-3.0
benv | Stub a browser environment in Node.js and headlessly test your client-side code. | JavaScript | MIT
browser.rb | Headless Ruby browser on top of Nokogiri and TheRubyRacer. | Ruby | Not specified
BrowserKit | BrowserKit simulates the behavior of a web browser. | PHP | MIT
DamonJS | Bot navigating URLs and doing tasks. | JavaScript | Apache
Headless | Headless browser support for fast web acceptance testing in .NET. | .NET | MIT
HeadlessBrowser | A very miniature headless browser, for testing the DOM on Node.js. | JavaScript | Not specified
HtmlUnit | HtmlUnit is a "GUI-Less browser for Java programs". | Java | Apache
Jaunt | Java Web Scraping & Automation API. | Java | Apache (monthly edition)
Jauntium | Free Java library that allows you to easily automate Chrome, Firefox, Safari, Edge, IE, and other modern web browsers. | Java | Apache
JSDom | A JavaScript implementation of the WHATWG DOM and HTML standards, for use with Node.js. | JavaScript | MIT
MechanicalSoup | A Python library for automating interaction with websites. | Python | MIT
mechanize | Stateful programmatic web browsing. | Python | BSD 3-Clause, ZPL 2.1
node-as-browser | Create a browser-like environment within Node.js. | JavaScript | MIT
RoboBrowser | A simple, Pythonic library for browsing the web without a standalone web browser. | Python | BSD 3-Clause
SimpleBrowser | A flexible and intuitive web browser engine designed for automation tasks. Built on the .NET 4 framework. | .NET | BSD 3-Clause
stanislaw | Naive, mechanize-like HTML parser/form driver. | Python | Not specified
twill | Twill is a simple language that interacts with basic HTML pages (no JavaScript support). | Python | MIT
WeasyPrint | WeasyPrint is a visual rendering engine for HTML and CSS that can export to PDF. It aims to support web standards for printing. | Python | BSD 3-Clause
WWW::Mechanize | Headless browser for Perl with many plugins and extensions, notably Test::WWW::Mechanize for testing. | Perl | Perl 5
X-RAY | Supports strings, arrays, arrays of objects, nested object structures, selector API, pagination, crawler, concurrency, throttles, delays, timeouts, and pluggable drivers (PhantomJS, HTTP). | JavaScript | MIT
Xidel (Internet Tools) | An XQuery-based CLI web scraper for static X/HTML pages and JSON APIs. | FreePascal, XQuery | GPL-2
Zombie.js | Zombie.js is a lightweight framework for testing client-side JavaScript code in a simulated environment. No browser required. | JavaScript | MIT
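
To make "HTML-only" concrete, here is a small sketch of the kind of work such a library does: it parses the markup it is given and collects links, but never executes scripts, so anything a page builds with JavaScript is invisible to it. The class and function names are hypothetical and use only the Python standard library; none of the projects listed above is claimed to work exactly this way.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags in static HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return the links present in the markup itself (no JavaScript is run)."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links
```

A link that only exists inside a `<script>` body, for instance one written with `innerHTML`, is treated as script text and is not reported.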

Runs in a browser

Name | About | Supported Languages | License
DalekJS | [Unmaintained; the project recommends TestCafé] Automated cross-browser testing with JavaScript. | JavaScript | MIT
TestCafé | Automated browser testing for the modern web development stack. | JavaScript | MIT
Sahi | Sahi is a cross-browser automation/testing tool with the facility to record and play back scripts. | JavaScript, Java, Ruby, PHP | Apache / Commercial
WatiN | Web Application Testing In .NET. | .NET | Apache 2.0

Misc tools

Name | About | Supported Languages | License
browser-launcher | Detect and launch browser versions, headlessly or otherwise. | JavaScript | MIT
Headless Recorder | Chrome extension that records your browser interactions and generates a Playwright or Puppeteer script. | JavaScript | MIT

headlessbrowsers's People

Contributors

1pav, a0viedo, androm3da, andyjansson, angrykoala, beckler, dhamaniasad, dotneet, electblake, gsouf, hughsw, jayfang, likemusic, mkoehnke, nathanielinman, native-api, neilstuartcraig, nexzhu, phimage, readmecritic, sangupta, shamsulamry, snowyu, spekulatius, spudley, stefre, techpavan, thepont, vallens, yoannmoinet

headlessbrowsers's Issues

What would be good alternatives to Puppeteer? It uses a lot of resources

Hi,
Of the options on this list, what would be good alternatives to Puppeteer?

I basically need two things: to set the user agent and to wait for the site to load completely.

await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36');
await page.goto('https://www.site.org', {
  waitUntil: 'networkidle0',
});

But Puppeteer uses a lot of resources 😅
Thanks in advance!
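
One lightweight option, if the target page turns out not to need JavaScript at all, is a plain HTTP request with a custom User-Agent, which avoids starting a browser entirely. The sketch below uses only Python's standard library; the function names are hypothetical, and it does not wait for scripts or network idle the way `waitUntil: 'networkidle0'` does, so it is not a drop-in replacement for the snippet above.

```python
from urllib.request import Request, urlopen

# User-Agent string taken from the question above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
)


def build_request(url, user_agent=USER_AGENT):
    """Create a request that identifies itself with the given User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=30):
    """Download the raw HTML as served; no JavaScript is executed."""
    with urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

If the content is generated client-side, a real browser engine is still required and this approach will return only the initial markup.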

Add performance measurements

Although defining performance tests is a hard task, I think performance is one of the most useful criteria for choosing among these tools, and measurements would be a nice addition to this repo.

Nightmare uses Electron, not PhantomJS

From their README:

Under the covers it uses Electron, which is similar to PhantomJS but roughly twice as fast and more modern. Because Nightmare uses Electron, it is your responsibility to ensure that the webpages loaded by Nightmare are not malicious. If you do load a malicious website, that website can execute arbitrary code on your computer.

DalekJS and PhantomJS are not actively maintained

First of all, thanks for your list; it really helps.

  1. According to its README, DalekJS is no longer maintained, and it now points to the TestCafé website and repo.

TestCafé mainly uses TypeScript.

  2. Sadly, it seems PhantomJS will no longer be actively developed, according to the maintainer's announcement.

Before making a PR I'd like to ask: in my opinion it would be good to add one more piece of information to the list, namely the main use of each headless browser. AFAIK DalekJS and TestCafé are used for testing web pages, while PhantomJS has a wider range of uses.

I'd welcome any opinions on this; hopefully we can restructure the list to include this additional info. 😄

Wendigo

I've been working on Wendigo, a Puppeteer wrapper that makes testing easier.

I wanted to know whether it is suitable for this list, as it is not a headless browser itself but a wrapper around one.

zombie.js and jsdom

They are not fake browsers, as they use their own implementation of a stripped-down browser to run JavaScript, much like what Splash does, for example. Right?

Listing which of these browsers are actually headless would be a great addition!

Basically, some of these browsers (nightmare comes to mind) use electron as their runtime, and electron requires a working X11 install!

Personally, I think if a browser requires X to be installed, it's not really headless, but in any event, this is a critical distinction for many use cases (I do a bit of browser automation, entirely in environments that don't have any X install whatsoever).

If a "headless" browser requires X and xvfb, at that point you might as well just run full chromium or firefox with webdriver or similar.

It'd be nice to disambiguate between "headless", as in "doesn't open a visible window" and actually headless, as in has no requirement for a framebuffer or X11/etc...
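
A quick way to see which kind of environment a script is running in is to check whether any display server is advertised at all. The sketch below is only a heuristic for Unix-like systems and assumes the conventional `DISPLAY` (X11) and `WAYLAND_DISPLAY` variables; the function name is made up for this example, and a missing display says nothing about whether a given tool can actually start without one.

```python
import os


def has_display(env=None):
    """Return True if an X11 or Wayland display is advertised in `env`.

    Pass a mapping to inspect a specific environment; by default the
    current process environment is used.
    """
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


if __name__ == "__main__":
    if has_display():
        print("A display is available; windowed browsers can start here.")
    else:
        print("No display found; only truly headless tools will work.")
```

Running a candidate tool on a machine where this reports no display (and without xvfb) is a simple way to test the distinction described above.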

Case Scenario: Usage with nodemailer

Hello, I am trying to load my HTML page in order to send it as an email. I am using nodemailer for this, and so far I have been using Puppeteer to retrieve the HTML of the page AFTER the JS has run, because 99.99% of my DOM elements are created by JS scripts.
I am not happy at all with Puppeteer: the HTML I get after the JS has run still contains all the scripts (which can't be sent over email), and there is no easy way to handle pictures.
Even after creating the HTML of the content I want by hand, nodemailer does not deliver the backgrounds and pictures. That is off topic, but you might have experience with something similar.
Which Node module do you suggest I use for this?
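
Whatever tool produces the rendered HTML, the leftover scripts can be removed in a separate post-processing step. The sketch below shows that idea in Python with the standard-library parser; the class and function names are hypothetical, it is not a nodemailer or Puppeteer feature, and it does not address images or backgrounds, which email clients handle in their own ways.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emit HTML while dropping <script> elements and their contents."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        else:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if tag != "script":
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        else:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def strip_scripts(html):
    """Return `html` with every <script> element removed."""
    stripper = ScriptStripper()
    stripper.feed(html)
    stripper.close()
    return "".join(stripper.parts)
```

The same filtering could be done inside the page before serializing it; doing it afterwards keeps the step independent of the browser used.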

Please have an up-to-date howto on at least how to open a webpage

After following this:

https://cat.ninja/using-selenium-with-headless-firefox-on-freebsd/

and many other tutorials, I still cannot get what I need: I want to fetch and save the content of a webpage that uses JavaScript, as it appears in the browser.

After spending hours on dead projects like console crawler and PhantomJS, I found Selenium. It MIGHT be the solution for what I need, but I still cannot get it to work.

OS: latest stable FreeBSD (FreeBSD 11.2-RELEASE)

pkg install py36-pip
pip-3.6 install selenium
pkg install geckodriver
pkg install firefox-62.0.2,1

Example 1:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()
driver.get("http://www.python.org")
assert "Python" in driver.title
elem = driver.find_element_by_name("q")
elem.clear()
elem.send_keys("pycon")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source
driver.close()

Traceback (most recent call last):
  File "open.py", line 1, in <module>
    from selenium import webdriver
ImportError: No module named selenium

Then I tried:

python3.6 a.sel
Traceback (most recent call last):
  File "a.sel", line 5, in <module>
    br = webdriver.Firefox()
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/firefox/webdriver.py", line 162, in __init__
    keep_alive=True)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 154, in __init__
    self.start_session(desired_capabilities, browser_profile)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 243, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: Unable to find a matching set of capabilities


Example 2:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

br = webdriver.Firefox()
br.get('http://www.google.com/')

save_me = ActionChains(br).key_down(Keys.CONTROL)\
         .key_down('s').key_up(Keys.CONTROL).key_up('s')
save_me.perform()

To start with, these examples are Python. Why on earth is it not mentioned anywhere that this is freaking Python? Why can't at the very least a simple example be provided for opening a god damn webpage?!?
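
The first traceback above (`ImportError: No module named selenium`) often means the script was run by a different Python interpreter than the one `pip-3.6` installed the package into. That is an assumption about this particular setup, not something the traceback proves. A small check like the following, run with the same command used for the script, shows which interpreter is active and whether it can see the module; the helper name is hypothetical.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` is importable here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Compare this path with the interpreter that pip installed into.
    print("interpreter:", sys.executable)
    print("selenium importable:", module_available("selenium"))
```

If the module is reported as missing, invoking the script explicitly with the matching interpreter (for example `python3.6 open.py`) is the usual next step.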
