puppeteer's Introduction

Puppeteer

Puppeteer is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full ("headful") Chrome/Chromium.

What can I do?

Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:

  • Generate screenshots and PDFs of pages.
  • Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)).
  • Automate form submission, UI testing, keyboard input, etc.
  • Create an automated testing environment using the latest JavaScript and browser features.
  • Capture a timeline trace of your site to help diagnose performance issues.
  • Test Chrome Extensions.

Getting Started

Installation

To use Puppeteer in your project, run:

npm i puppeteer
# or using yarn
yarn add puppeteer
# or using pnpm
pnpm i puppeteer

When you install Puppeteer, it automatically downloads a recent version of Chrome for Testing (~170MB macOS, ~282MB Linux, ~280MB Windows) and a chrome-headless-shell binary (starting with Puppeteer v21.6.0) that is guaranteed to work with Puppeteer. The browser is downloaded to the $HOME/.cache/puppeteer folder by default (starting with Puppeteer v19.0.0). See configuration for configuration options and environment variables to control the download behavior.

If you deploy a project using Puppeteer to a hosting provider, such as Render or Heroku, you might need to reconfigure the location of the cache to be within your project folder (see an example below) because not all hosting providers include $HOME/.cache in the project's deployment.

For a version of Puppeteer without the browser installation, see puppeteer-core.

If used with TypeScript, the minimum supported TypeScript version is 4.7.4.

Configuration

Puppeteer uses several defaults that can be customized through configuration files.

For example, to change the default cache directory Puppeteer uses to install browsers, you can add a .puppeteerrc.cjs (or puppeteer.config.cjs) at the root of your application with the contents

const {join} = require('path');

/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  // Changes the cache location for Puppeteer.
  cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};

After adding the configuration file, you will need to remove and reinstall puppeteer for it to take effect.

See the configuration guide for more information.

puppeteer-core

For every release since v1.7.0 we publish two packages:

puppeteer is a product for browser automation. When installed, it downloads a version of Chrome, which it then drives using puppeteer-core. Being an end-user product, puppeteer automates several workflows using reasonable defaults that can be customized.

puppeteer-core is a library to help drive anything that supports DevTools protocol. Being a library, puppeteer-core is fully driven through its programmatic interface implying no defaults are assumed and puppeteer-core will not download Chrome when installed.

You should use puppeteer-core if you are connecting to a remote browser or managing browsers yourself. If you are managing browsers yourself, you will need to call puppeteer.launch with an explicit executablePath (or channel if it's installed in a standard location).

When using puppeteer-core, remember to change the import:

import puppeteer from 'puppeteer-core';

Usage

Puppeteer follows the latest maintenance LTS version of Node.

Puppeteer will be familiar to people using other browser testing frameworks. You launch/connect a browser, create some pages, and then manipulate them with Puppeteer's API.

For more in-depth usage, check our guides and examples.

Example

The following example searches developer.chrome.com for blog posts with text "automate beyond recorder", clicks on the first result, and prints the full title of the blog post.

import puppeteer from 'puppeteer';

(async () => {
  // Launch the browser and open a new blank page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Navigate the page to a URL
  await page.goto('https://developer.chrome.com/');

  // Set screen size
  await page.setViewport({width: 1080, height: 1024});

  // Type into search box
  await page.type('.devsite-search-field', 'automate beyond recorder');

  // Wait and click on first result
  const searchResultSelector = '.devsite-result-item-link';
  await page.waitForSelector(searchResultSelector);
  await page.click(searchResultSelector);

  // Locate the full title with a unique string
  const textSelector = await page.waitForSelector(
    'text/Customize and automate'
  );
  const fullTitle = await textSelector?.evaluate(el => el.textContent);

  // Print the full title
  console.log('The title of this blog post is "%s".', fullTitle);

  await browser.close();
})();

Default runtime settings

1. Uses Headless mode

By default Puppeteer launches Chrome in the Headless mode.

const browser = await puppeteer.launch();
// Equivalent to
const browser = await puppeteer.launch({headless: true});

Before v22, Puppeteer launched the old Headless mode by default. The old headless mode is now known as chrome-headless-shell and ships as a separate binary. chrome-headless-shell does not completely match the behavior of regular Chrome, but it is currently more performant for automation tasks where the complete Chrome feature set is not needed. If performance is more important for your use case, switch to chrome-headless-shell as follows:

const browser = await puppeteer.launch({headless: 'shell'});

To launch a "headful" version of Chrome, set the headless option to false when launching a browser:

const browser = await puppeteer.launch({headless: false});

2. Runs a bundled version of Chrome

By default, Puppeteer downloads and uses a specific version of Chrome so its API is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium, pass in the executable's path when creating a Browser instance:

const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});

You can also use Puppeteer with Firefox. See status of cross-browser support for more information.

See this article for a description of the differences between Chromium and Chrome. This article describes some differences for Linux users.

3. Creates a fresh user profile

Puppeteer creates its own browser user profile which it cleans up on every run.

Using Docker

See our Docker guide.

Using Chrome Extensions

See our Chrome extensions guide.

Resources

Contributing

Check out our contributing guide to get an overview of Puppeteer development.

FAQ

Our FAQ has migrated to our site.

puppeteer's Issues

Implement full-page screenshots

Once #5 is done, there should be a fullPage option in the page.screenshot method.
There should also be an alias, page.takeFullScreenshot or page.takeFullPageScreenshot (or maybe both?)

Bikeshed the API

Lots of things I think could be simpler

"override" "set" "get" ...

2 space indentation across project

@aslushnikov

Mind if we switch to 2-space indentation? That's more common in JS projects these days and we tend to stick with Google's style guide in the GoogleChrome org.

We can extend the google style guide in eslintrc.js to enforce it :)

Document `Page.setSize()` method

Thanks for the awesome module 🏆

I realize the docs are a WIP.

From the README:

Puppeteer sets an initial page size to 400px x 300px, which defines the screenshot size. The page size can be changed with Page.setSize() method

setSize() does not however appear to have any API docs over in https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md. Looking through the repo via code-search the README entry for this method appears to be the only one in there :)

Is there another issue tracking adding docs for methods like this?

Add core input APIs

The following should be implemented:

.click(selector, options)
.type(selector, text, options)

Refactor screenshots API

The screenshots API could be improved:

  1. Instead of passing multiple arguments in page.screenshot, accept single optional options object
  2. Make screenshot's type optional and default it to either jpeg or png.
  3. Add filePath option to the screenshot's options object.
  4. Make aliases:
    • saveScreenshot is an alias to the screenshot with filePath option
    • takeScreenshot is an alias for the screenshot
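
Points 1 to 3 could be sketched roughly as below. normalizeScreenshotOptions is a hypothetical helper, not an existing Puppeteer function; falling back to png and inferring the type from the file extension are assumptions made for illustration only.

```javascript
// Hypothetical helper: turns the single optional options object into a
// fully populated one. This is a sketch, not Puppeteer's implementation.
function normalizeScreenshotOptions(options = {}) {
  const {filePath = null} = options;
  let {type} = options;

  // Assumption: when no type is given, infer it from the file extension.
  if (!type && filePath) {
    const lower = filePath.toLowerCase();
    if (lower.endsWith('.jpg') || lower.endsWith('.jpeg')) type = 'jpeg';
    else if (lower.endsWith('.png')) type = 'png';
  }

  // Assumption: png is the fallback default.
  type = type || 'png';
  if (type !== 'png' && type !== 'jpeg') {
    throw new Error('Unknown screenshot type: ' + type);
  }
  return {type, filePath};
}

console.log(normalizeScreenshotOptions());
console.log(normalizeScreenshotOptions({filePath: 'home.JPG'}));
```

With a helper like this, saveScreenshot and takeScreenshot would both reduce to a call to screenshot with or without filePath set.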

Get rid of chrome-remote-interface dependency

The chrome-remote-interface dependency served us well. However, our usage of chrome-remote-interface is minimal: we use only .send() method and event subscription. This is not hard to implement (e.g. there's a lighthouse implementation: connection.js).
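
To give a feel for how small such a replacement could be, here is a minimal sketch. The Connection class and the transport object (anything with send(text) and an onmessage callback) are hypothetical names for this example. The message shapes assume the protocol's JSON format: commands carry id, method and params, responses echo the id, and events have a method but no id.

```javascript
// Minimal sketch of a protocol connection: send() plus event subscription.
// `transport` is a hypothetical object exposing send(text) and an
// onmessage callback; a real one would wrap a WebSocket.
class Connection {
  constructor(transport) {
    this._transport = transport;
    this._lastId = 0;
    this._callbacks = new Map(); // request id -> {resolve, reject}
    this._listeners = new Map(); // event name -> Set of handlers
    transport.onmessage = text => this._onMessage(text);
  }

  // Sends a command and resolves with the matching response's result.
  send(method, params = {}) {
    const id = ++this._lastId;
    return new Promise((resolve, reject) => {
      this._callbacks.set(id, {resolve, reject});
      this._transport.send(JSON.stringify({id, method, params}));
    });
  }

  // Subscribes `handler` to protocol events named `eventName`.
  on(eventName, handler) {
    if (!this._listeners.has(eventName)) {
      this._listeners.set(eventName, new Set());
    }
    this._listeners.get(eventName).add(handler);
  }

  _onMessage(text) {
    const message = JSON.parse(text);
    if (message.id !== undefined) {
      // A response: settle the pending promise for this id.
      const callback = this._callbacks.get(message.id);
      if (!callback) return;
      this._callbacks.delete(message.id);
      if (message.error) callback.reject(new Error(message.error.message));
      else callback.resolve(message.result);
    } else {
      // An event: fan out to subscribers.
      for (const handler of this._listeners.get(message.method) || []) {
        handler(message.params);
      }
    }
  }
}
```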

lint jsdoc types

We should lint jsdoc types.

I spent some time trying to make closure compiler work with node.js, but it didn't work due to issues with node module requires.

Emulation API proposal

Currently we have the following api to emulate devices:

let names = page.emulatedDevices(); // to get list of emulatable devices
page.emulate('iphone 6'); // to emulate "iphone 6"

Problems with this api:

  • having emulatedDevices() as a getter and under a page makes me feel that the device list might change over time and from page to page.
  • it's hard to introspect what kind of options are actually set for the device, e.g. to see device's screen size
  • it's hard to do my own device, or import someone's pre-made from npm

How about:

// Puppeteer defines the "Device" class
class Device {
  constructor(name, userAgent, viewport, isMobile = true, hasTouch = true) {
    this.name = name;
    this.userAgent = userAgent;
    this.viewport = viewport;
    this.isMobile = isMobile;
    this.hasTouch = hasTouch;
  }
}

// The Device class and a list of all available devices are exposed
// in the top-level require
const {Browser, Device, devices} = require('puppeteer');

// `devices` list is an array with additional string properties. Devices
// could be both iterated and accessed via device names
devices[0];
devices.nexus5;

// If needed, a hand-made device could be done
let myDevice = new Device('handmade', 'handmade-webkit', {
  width: 100,
  height: 100,
  dpi: 2,
});

// Devices would be convenient to import from npm
let exoticDevice = require('list-of-exotic-devices-for-puppeteer');

// Finally, page can start emulating a device
page.emulateDevice(device);
page.emulatedDevice(); // returns currently-emulated device

// And we can change page's orientation
page.setOrientation('landscape'); // also could be 'portrait'

This kind of API would make emulation a first-class citizen in Puppeteer - which makes a lot of sense since devtools device emulation is very comprehensive.
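
The "array with additional string properties" idea could be built along these lines. This is only a sketch: createDeviceList and the key rule (lowercase the name, drop non-alphanumerics) are assumptions, and the user agents and viewports are placeholders rather than real device data.

```javascript
// Sketch only: mirrors the Device class proposed above.
class Device {
  constructor(name, userAgent, viewport, isMobile = true, hasTouch = true) {
    Object.assign(this, {name, userAgent, viewport, isMobile, hasTouch});
  }
}

// Builds an array that can also be indexed by a key derived from the name.
// The key rule (lowercase, strip non-alphanumerics) is an assumption.
function createDeviceList(deviceArray) {
  const devices = [...deviceArray];
  for (const device of devices) {
    const key = device.name.toLowerCase().replace(/[^a-z0-9]/g, '');
    // Non-enumerable, so enumerating the array's keys lists only the indexes.
    Object.defineProperty(devices, key, {value: device, enumerable: false});
  }
  return devices;
}

// Placeholder user agents and viewports, not real device data.
const devices = createDeviceList([
  new Device('Nexus 5', 'example-ua-nexus', {width: 360, height: 640, dpi: 3}),
  new Device('iPhone 6', 'example-ua-iphone', {width: 375, height: 667, dpi: 2}),
]);

console.log(devices[0] === devices.nexus5);
console.log(devices.map(device => device.name));
```

Because the named properties are non-enumerable, the list still behaves like a plain array in loops and in Object.keys.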

Emulate offline

(taken from gr2m/headless-chrome-test#1)

Puppeteer should have an offline mode to test PWA's. For example:

page.setOfflineMode(true);

Emulation of other network conditions might be useful as well, but we can wait with them until there are good scenarios.

Refactor JavaScript dialogs API

The phantom's approach to handling javascript dialogs is convenient and familiar: instead of firing Alert, Prompt and Confirm events, we should use onAlert, onPrompt and onConfirm callbacks. It would also be convenient to have a complementary Dialog event that dispatches an instance of a dialog class.

Way to debug test

We need a way to run a test with --inspect-brk flag to make it debuggable!

Generalize page.waitFor, make waitForSelector a utility on top of it.

Rationale: waitForSelector is rarely useful on its own. I have a DOM element, but it is off screen / display:none, etc.

We can make waitFor generic and execute it either upon every task (by default) or upon tasks that had specific activities in them (optional).

utilities.waitForSelector = (selector) => {
  return page.waitFor(() => document.querySelector(selector), ['style'], selector);
}

utilities.waitForLayout = () => {
  return page.waitFor(() => true, ['layout']);
}

utilities.waitForFrame = () => {
  return page.waitFor(() => true, ['frame']);
}

Stress test: scrape search results user story

I attempted to create the following stress test:

  • Navigate to google.com
  • Enter "Blin"
  • Capture screenshot with results
  • Go over the results
  • Capture screenshot for every result page

This is pretty much impossible using existing API :) Dumping observations here...

  • page.waitFor resolves even if element is not on the screen (display:none, etc). Many elements are in DOM too early, no way to click-via-screen them.
  • page.waitFor does not time out
  • keyboard API surface is large and cryptic. Did you figure out how to press 'Enter' using just page.*?
  • keyboard API does not allow for natural click delays, does not populate timestamps. google.com does not like that
  • Expose custom raf or timer-based page.waitFor(predicate)
  • Expose page.sleep(time)
  • Need to be able to page.waitForNavigation in case navigation is initiated via click [i'll fix]
  • Reload brings me back to the originally navigated page [i'll fix]
  • page.waitFor(selector) in my code is always followed by either page.click or page.focus with that element. Handles would make it look like (await page.waitFor(selector)).click().
  • navigation API is not exposed, I need to be able to click browser's back/forward.
  • Taking screenshot halts on mac headless good half of the time [i'll look at it]

^^ @aslushnikov @JoelEinbinder @dgozman @paulirish
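
For the timer-based waitFor(predicate) and sleep(time) items, something like the following might be a starting point. It is a plain polling sketch that runs outside any page; the option names timeout and interval and their defaults are assumptions, not an existing API.

```javascript
// Resolves after `time` milliseconds.
function sleep(time) {
  return new Promise(resolve => setTimeout(resolve, time));
}

// Polls `predicate` until it returns a truthy value, then resolves with it.
// Rejects once `timeout` milliseconds have passed. Option names and
// defaults are assumptions for this sketch.
async function waitFor(predicate, {timeout = 1000, interval = 20} = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    const value = await predicate();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error('waitFor timed out after ' + timeout + 'ms');
    }
    await sleep(interval);
  }
}
```

Since the predicate may be async, a page-bound version could pass a function that evaluates a visibility check in the page, which would also cover the display:none case from the first observation.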

Add waitFor APIs

.wait(selector) // wait until the selector you specified appears in the DOM tree.

Page.evaluate doesn't return NaN or Infinity

Page.evaluate should handle unserializable values properly.
For example, the following returns false:

let result = await page.evaluate(() => NaN);
console.log(Number.isNaN(result));

Whereas it should be true.
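
The root cause is that JSON has no representation for these values (JSON.stringify(NaN) produces "null"). One possible fix is to send them as tagged strings, similar in spirit to the unserializableValue field of the protocol's Runtime.RemoteObject. The helper names below are hypothetical, and this is a sketch rather than the actual fix.

```javascript
// Values that JSON cannot carry: NaN and the infinities become null,
// and -0 loses its sign.
const UNSERIALIZABLE = ['NaN', 'Infinity', '-Infinity', '-0'];

// Wraps a value so that it survives JSON.stringify.
function encodeValue(value) {
  if (Object.is(value, -0)) return {unserializableValue: '-0'};
  if (typeof value === 'number' && !Number.isFinite(value)) {
    return {unserializableValue: String(value)};
  }
  return {value};
}

// Reverses encodeValue after JSON.parse.
function decodeValue(payload) {
  if (payload.unserializableValue !== undefined) {
    if (!UNSERIALIZABLE.includes(payload.unserializableValue)) {
      throw new Error('Unsupported value: ' + payload.unserializableValue);
    }
    return Number(payload.unserializableValue);
  }
  return payload.value;
}

// Simulates the trip over the wire.
function roundTrip(value) {
  return decodeValue(JSON.parse(JSON.stringify(encodeValue(value))));
}

console.log(Number.isNaN(roundTrip(NaN)));
console.log(roundTrip(Infinity));
```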

Support browser contexts to launch different sessions

Support browser contexts (Target.createBrowserContext) so as to avoid launching multiple instances if one wants a pristine session. See proposal #66 and original discussion at cyrus-and/chrome-remote-interface#118.

A viable user scenario might be testing several users logged in simultaneously into the service.
We might expose browser contexts as a string literal option to the browser.newPage:

browser.newPage(); // creates a new page in a default browser context
browser.newPage({ context: 'default' }); // same as previous call
browser.newPage({ context: 'another-context' }); // creates a page in another browser context

New project name

While "puppeteer" is apt, there are a few downsides: 3 syllables (and kinda a mouthful), difficult spelling (surprisingly), kinda plain, the npm name is taken. There's been an interest amongst a few folks to retitle the project before shipping.

I have collected a few possibilities below.

I'm not explaining each name for simplicity, I'd rather them stand on their own. However, most have connotations and associations with things like: legend of sleepy hollow, headlessness, phantom of the opera, scientific method/testing, shadow puppets.


blacksmith
silhouette
christine
theremin
bradshaw
dudley
leroux
sobaka
irving
mombi
scrim
curie


Any other proposals? Comment and I will integrate new ones into the list above.

page.click() doesn't always work

The following script hangs for me waiting for "click" to happen on the second link.
However, if I make viewport width 1000 instead of 300, the script works just fine.

const {Browser} = require('puppeteer');
const browser = new Browser({headless: false});

browser.newPage().then(async page => {
  page.on('load', () => {
    console.log('LOADED: ' + page.url());
  });
  // Make width 1000 instead of 300 and the script works just fine
  await page.setViewport({width: 300, height: 300});
  await page.navigate('https://google.com');
  await page.waitFor('input[name=q]');
  await page.focus('input[name=q]');
  await page.type('blin');
  await page.press('Enter');
  for (let i = 0; i < 10; ++i) {
    let searchResult = `div.g:nth-child(${i + 1}) h3 a`;
    await page.waitFor(searchResult, {visible: true});
    page.click(searchResult);
    await page.waitForNavigation();

    await page.screenshot({path: `screenshot-${i + 1}.png`});
    await page.goBack();
  }
  browser.close();
});

Running the script with DEBUG=*page node scrape.js shows what's going on.

Looks like we're clicking outside of viewport and the click turns out to be a noop.

@JoelEinbinder, wdyt?

docs/api.md should be linted

Currently, the docs/api.md is written manually to resemble the node.js documentation. The motivation was to make the doc look familiar to an average node dev.

However, the markup is tiring to maintain manually. It would be nice to generate the doc somehow from a simplistic format which would be easy to edit.

@paulirish, @ebidel - do you guys have any suggestions? @ebidel mentioned it's easy to setup a documentation website somehow, could you please share a link?

Emulate devices?

Thanks to the emulation domain, we can emulate mobile devices.

Is there a need for such an api in puppeteer? If yes, what's the scenario and what would api look like?

Implement POST navigation

There's a need to navigate to a URL with the POST method rather than the GET method, e.g. to simulate form submission.

For this, the page.navigate() method should accept an options object with a method parameter.

Add fancy input APIs

.mouseMoved(x, y, options = {})
.mousePressed(x, y, options = {})
.mouseReleased(x, y, options = {})
.tap(x, y, options = {})
.touchmove()
.touchend()

What to do with `page.evaluateAsync`?

Today, there are two methods in page's API:

  • page.evaluate which evaluates code in the page and returns result.
  • page.evaluateAsync, which also evaluates code in the page and, if the evaluation result is a promise, waits for the promise to resolve and returns the resolved value.

The page.evaluate method is straightforward, but page.evaluateAsync has a cryptic name and is hard to grasp. Should we remove page.evaluateAsync and always await promises in page.evaluate?

If we want to keep them separate, is there a better name for the page.evaluateAsync?


Illustrating with examples, the following snippet returns "42":

var Browser = require('puppeteer').Browser;
var browser = new Browser();
browser.newPage().then(async page => {
    console.log(await page.evaluate(() => 6 * 7)); // prints '42'
    browser.close();
});

However, if we try to return a promise, we will get an empty object as a result:

var Browser = require('puppeteer').Browser;
var browser = new Browser();
browser.newPage().then(async page => {
    console.log(await page.evaluate(() => Promise.resolve(6 * 7))); // prints '{}'
    browser.close();
});

In order to get the '42' as a returned value, one would need to use page.evaluateAsync:

var Browser = require('puppeteer').Browser;
var browser = new Browser();
browser.newPage().then(async page => {
    console.log(await page.evaluateAsync(() => Promise.resolve(6 * 7))); // prints '42'
    browser.close();
});

This kind of situation could be avoided if we were to await promise in page.evaluate.
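
The merged behavior is easy to picture outside the browser. In the sketch below, evaluate is a local stand-in that runs the function in the current process rather than in a page; it only illustrates that a single await handles both plain values and promises.

```javascript
// Local stand-in for evaluation: runs `fn` here instead of in a page.
// `await` passes plain values through unchanged and unwraps promises.
async function evaluate(fn, ...args) {
  return await fn(...args);
}

async function demo() {
  const plain = await evaluate(() => 6 * 7);
  const fromPromise = await evaluate(() => Promise.resolve(6 * 7));
  return [plain, fromPromise];
}

demo().then(results => console.log(results));
```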


For the record: in case of merging evaluate with the evaluateAsync methods, we should make sure that evaluating the code from-inside the inpagecallback works successfully (which is a bit tricky since page is on a pause)

Make a sane README.md

There should be a nice README which

  • explains Puppeteer's motivation and goals
  • provides FAQ
  • points to other useful resources, such as API documentation

[headless] Impossible to accept/dismiss javascript dialogs

Example:

var Browser = require('puppeteer').Browser;
var browser = new Browser();
browser.newPage().then(async page => {
    page.on('dialog', dialog => {
        dialog.accept('test');
    });
    console.log(await page.evaluate(() => prompt('q?')));
    browser.close();
});

Dialog accepting/dismissing throws an error in headless mode:

Error: Protocol error (Page.handleJavaScriptDialog): Could not handle JavaScript dialog

Upstream bug: crbug.com/718235

Exception while trying to handle non-serializable object

const {Browser} = require('.');
const browser = new Browser({headless: true});

browser.newPage().then(async page => {
  await page.evaluate(() => window);
  browser.close();
});

This code throws an error:

Object reference chain is too long

which happens because window is non-serializable.

PuppeteerScript

I've been long thinking about scripting interactions using something similar to the webpagetest test scripts.

It'd be great to have some interoperability with that preexisting script syntax/format/language. While I'm not a huge fan of the API, it has been established for some people already.

I'm certain that Patrick would have some feels regarding this idea too.


eg

visit https://google.com
focus #searchInput
type 'where is the world’s best cheesecake?'
press button[type='submit']
screenshot my-image.jpg

Is this in scope for puppeteer? Or perhaps something that could just use puppeteer under the covers?
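
As a rough feasibility check, a layer on top of puppeteer would first need to turn such a script into steps. The parser below is a sketch under several assumptions: the command set is just the five commands from the example, one command per line, and an argument may be wrapped in a single pair of single quotes. It does not follow the actual webpagetest syntax.

```javascript
// Only the five commands from the example are recognized (assumption).
const COMMANDS = new Set(['visit', 'focus', 'type', 'press', 'screenshot']);

// Splits a script into {command, argument} steps, one per non-empty line.
function parseScript(source) {
  const steps = [];
  for (const rawLine of source.split('\n')) {
    const line = rawLine.trim();
    if (!line) continue;
    const space = line.indexOf(' ');
    const command = space === -1 ? line : line.slice(0, space);
    let argument = space === -1 ? '' : line.slice(space + 1).trim();
    if (!COMMANDS.has(command)) {
      throw new Error('Unknown command: ' + command);
    }
    // Strip one pair of surrounding single quotes, if present.
    if (argument.length >= 2 && argument.startsWith("'") && argument.endsWith("'")) {
      argument = argument.slice(1, -1);
    }
    steps.push({command, argument});
  }
  return steps;
}

const steps = parseScript(`
visit https://google.com
focus #searchInput
type 'best cheesecake?'
press button[type='submit']
screenshot my-image.jpg
`);
console.log(steps);
```

Each step could then be mapped onto a page method by a small interpreter, which would keep the script format out of puppeteer itself.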

page.screenshot scrollbar artifacts

Hey,

I noticed that when taking a full page screenshot, we can see random scrollbar artifacts.

Is there a chance that this is related to Emulation.setVisibleSize?

Expose raw protocol in the API

While using puppeteer there are still occasions where I want to use the raw protocol.
I'd certainly want to do this against the Page, and potentially the browser as well.

I want to send methods and get responses. Additionally, I want to listen for specific events.

Not sure of the right API, or whether it needs to be transactional so as not to mix up state.

Add form APIs

e.g.

// Fill an element 
await browser.fill('#username', 'myUser')

// set an element's `value` value
await browser.setValue('#comment', 'my feelings')

// Type in an element 
await browser.type('#password', 'Yey!ImAPassword!')

Add frames api

There should be a frames inspection API with an ability to evaluate code in frame

Suggestions from chrome-remote-interface

I did a (very) quick survey of the issues submitted to chrome-remote-interface so far and selected some features that users seem to struggle with using the low-level Chrome Debugging Protocol API.

  • Support browser contexts (Target.createBrowserContext) so as to avoid launching multiple instances if one wants a pristine session (e.g., browser.newIsolatedPage() or even expose the context concept through the API).
  • Provide easy access to CSS rules (e.g., adding, toggling, etc.). This may be tricky and too broad to discuss but users particularly struggle with text-ranges.

These may or may not suit this project, so I'm not filing multiple issues; this is mainly a place to discuss them.

Improve navigation

Navigation is hard.
Make sure the following navigation scenarios work properly:

  • page.navigate('not-a-url') should return false
  • page.navigate('https://expired.badssl.com/') should return false
  • page.navigate('http://example.com/non-existing-page') should return false
  • page.navigate('http://example.com') with no internet should return false
  • page.navigate('data:text/html,hello') should return true
  • Page's navigation via inner javascript's window.location.href = 'http://example.com' should be reported to puppeteer, probably via the Navigated event.

All of this should be also applicable to frame navigation in #4

Parallel calls to Page.screenshot interfere with each other

Consider the following example, which takes 10 screenshots, each of a different 50px x 50px square.

browser.newPage().then(async page => {
    await page.navigate('http://example.com');
    var promises = [];
    for (var i = 0; i < 10; ++i) {
        promises.push(page.screenshot({
            path: i + '.png',
            clip: {x: 50 * i, y: 0, width: 50, height: 50},
        }));
    }
    await Promise.all(promises);
    browser.close();
});

Unfortunately, this doesn't work - calls to page.screenshot interfere with each other, failing to clip proper rect.
