SoFunction
Updated on 2025-03-10

A detailed introductory tutorial on Puppeteer

1. Introduction to Puppeteer

Puppeteer is a Node library that provides a set of APIs for controlling Chrome, usually in headless mode (it runs headless by default, but you can configure it to show a UI). Because it drives a real browser, Puppeteer can do nearly everything you can do by hand in the browser. As the name "puppeteer" suggests, it makes the browser easy to manipulate, and you can use it to:

1) Generate screenshots or PDFs of pages
2) Build advanced crawlers that scrape large numbers of pages whose content is rendered asynchronously
3) Simulate keyboard input, form submission, login, and so on, to automate UI testing
4) Capture a timeline trace of a site to help diagnose performance issues

If you have used PhantomJS, you will find the two somewhat similar, but Puppeteer is maintained by the official Chrome team, so it has stronger backing and better long-term prospects.

2. Operating environment

Puppeteer's official API documentation uses async and await throughout. These were standardized in ES2017 (often loosely called ES7), so you need:

  1. Node.js v7.6.0 or later, which supports async and await natively.
  2. A recent Chromium build, which is downloaded automatically when you install Puppeteer through npm:

npm install puppeteer --save
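The version requirement can be checked before anything else runs. Below is a minimal sketch; the helper name meetsMinimumVersion is my own and not part of Puppeteer or Node.js.

```javascript
// Minimum Node.js version with native async/await, as stated above.
const MIN_NODE_VERSION = '7.6.0';

// Compare two dotted version strings; returns true when `actual` >= `required`.
// (Hypothetical helper, written only for this illustration.)
function meetsMinimumVersion(actual, required) {
  const a = actual.replace(/^v/, '').split('.').map(Number);
  const r = required.replace(/^v/, '').split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    const x = a[i] || 0;
    const y = r[i] || 0;
    if (x !== y) return x > y;
  }
  return true;
}

// process.version looks like "v18.17.0".
if (!meetsMinimumVersion(process.version, MIN_NODE_VERSION)) {
  console.error(`Node.js ${MIN_NODE_VERSION}+ is required, found ${process.version}`);
}
```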

3. Basic usage

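Let's start with the demo from the official introduction:

const puppeteer = require('puppeteer');

(async () => {
  // Launch a browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // The target URL and the output path were elided in the source
  await page.goto('');
  await page.screenshot({path: ''});

  await browser.close();
})();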

The code above takes a screenshot of a web page. Line by line:

  1. puppeteer.launch() creates a browser instance and returns a Browser object
  2. browser.newPage() creates a Page object from the Browser object
  3. page.goto() navigates to the specified page
  4. page.screenshot() takes a screenshot of the page
  5. browser.close() closes the browser
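The five steps above can be wrapped in a reusable function. This is only a sketch under my own assumptions: the name captureScreenshot is hypothetical, and the puppeteer module is passed in as an argument so the function does not depend on a global. The try/finally makes sure the browser is closed even when navigation fails.

```javascript
// Hypothetical helper: `puppeteer` is the module returned by require('puppeteer').
async function captureScreenshot(puppeteer, url, outputPath) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    await page.screenshot({ path: outputPath });
    return outputPath;
  } finally {
    // Runs on success and on failure, so Chromium is not left running.
    await browser.close();
  }
}

// Usage (requires puppeteer to be installed; URL and file name are placeholders):
//   captureScreenshot(require('puppeteer'), 'https://example.com', 'example.png');
```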

Simple, isn't it? I find it simpler than PhantomJS, let alone selenium-webdriver. The following sections cover some commonly used Puppeteer APIs.

3.1 puppeteer.launch(options)

puppeteer.launch() starts the browser and returns a Promise; you can obtain the Browser instance with then. Recent versions of Node.js support await, so the example above uses the await keyword instead. One point deserves emphasis: almost all Puppeteer operations are asynchronous. To avoid long then chains that hurt readability, all demo code in this article uses async and await, which is also the style the Puppeteer documentation recommends. If async/await is unfamiliar, it is worth reading up on it before continuing.

The options parameter in detail

Parameter | Type | Description
--- | --- | ---
ignoreHTTPSErrors | boolean | Whether to ignore HTTPS errors during requests. Defaults to false.
headless | boolean | Whether to run Chrome in "headless" mode, that is, without showing a UI. Defaults to true.
executablePath | string | Path to the browser executable. Puppeteer uses its bundled Chromium by default; set this parameter to run a different browser binary.
slowMo | number | Slows down each Puppeteer operation by the given number of milliseconds. Very useful when you want to watch what Puppeteer is doing.
args | Array(String) | Additional arguments passed to the Chrome instance. For example, "--ash-host-window-bounds=1024x768" sets the browser window size. See the Chromium command-line switch list for more flags.
handleSIGINT | boolean | Whether the Chrome process may be controlled through process signals, that is, whether CTRL+C closes the browser and exits.
timeout | number | Maximum time in milliseconds to wait for the Chrome instance to start. Defaults to 30000 (30 seconds). Pass 0 to disable the timeout.
dumpio | boolean | Whether to pipe the browser process's stdout and stderr into process.stdout and process.stderr. Defaults to false.
userDataDir | string | The user data directory. The default on Linux is under ~/.config; the default on Windows is C:\Users\{USER}\AppData\Local\Google\Chrome\User Data, where {USER} is the currently logged-in user name.
env | Object | The environment variables visible to Chromium. Defaults to process.env.
devtools | boolean | Whether to open the DevTools panel automatically for each tab. Only takes effect when headless is set to false.
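To show how these options combine, here is a small sketch that builds one options object for a visible debugging run and one for a normal headless run. The function name buildLaunchOptions and the chosen values are my own assumptions; --window-size is a standard Chromium flag.

```javascript
// Hypothetical helper: returns launch options for a visible "debug" run
// or for a default headless run, using only options described above.
function buildLaunchOptions(debug) {
  if (!debug) {
    return { headless: true, ignoreHTTPSErrors: false, timeout: 30000 };
  }
  return {
    headless: false,  // show the browser window
    devtools: true,   // only takes effect because headless is false
    slowMo: 250,      // slow each operation down by 250 ms
    timeout: 30000,
    args: ['--window-size=1280,800'], // Chromium flag for the window size
  };
}

// Usage (requires puppeteer to be installed):
//   const browser = await require('puppeteer').launch(buildLaunchOptions(true));
```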

3.2 Browser Object

Puppeteer creates a Browser object when it connects to a Chrome instance. There are two ways to do so: puppeteer.launch and puppeteer.connect.

The following demo disconnects from a browser instance and then reconnects to it:

const puppeteer = require('puppeteer');

puppeteer.launch().then(async browser => {
  // Save the endpoint so that we can reconnect to Chromium
  const browserWSEndpoint = browser.wsEndpoint();
  // Disconnect from Chromium
  browser.disconnect();

  // Use the endpoint to reconnect to Chromium
  const browser2 = await puppeteer.connect({browserWSEndpoint});
  // Close Chromium
  await browser2.close();
});

Browser object API

Method | Return value | Description
--- | --- | ---
browser.close() | Promise | Closes the browser
browser.disconnect() | void | Disconnects from the browser
browser.newPage() | Promise(Page) | Creates a Page instance
browser.pages() | Promise(Array(Page)) | Gets all open Page instances
browser.targets() | Array(Target) | Gets all active targets
browser.version() | Promise(String) | Gets the browser version
browser.wsEndpoint() | String | Returns the WebSocket URL of the browser instance, which can be used to reconnect to the Chrome instance
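As a quick illustration of the Browser API, the sketch below gathers a few facts about a running browser. The name describeBrowser is hypothetical; the function only calls browser.version(), browser.pages(), and browser.wsEndpoint().

```javascript
// Hypothetical helper: `browser` is a Browser object as returned by
// puppeteer.launch() or puppeteer.connect().
async function describeBrowser(browser) {
  const version = await browser.version();
  const pages = await browser.pages();
  return {
    version,
    endpoint: browser.wsEndpoint(), // can be passed to puppeteer.connect() later
    openPages: pages.length,
  };
}
```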

I won't walk through every Puppeteer API here; the official API documentation describes them all in detail.

4. Puppeteer in practice

With the API basics covered, we can try something practical. First, a word on Puppeteer's design philosophy. Put simply, the biggest difference between Puppeteer and webdriver or PhantomJS is that Puppeteer is designed from the perspective of a user browsing the web, whereas webdriver and PhantomJS were originally designed for automated testing and so take the perspective of a machine. For example, suppose I need to open JD's home page and search for a product. Compare how Puppeteer and webdriver go about it:

The Puppeteer flow:

  1. Open the JD home page
  2. Move the cursor focus to the search input box
  3. Type the text on the keyboard
  4. Click the search button

The webdriver flow:

  1. Open the JD home page
  2. Find the input element of the search box
  3. Set the input's value to the search text
  4. Trigger the click event of the search button

Personally, I feel that Puppeteer's design philosophy matches the way people actually operate a browser, and is more natural.
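The Puppeteer flow maps almost one-to-one onto API calls. A minimal sketch follows, assuming a Page that has already navigated to the site; the function name searchFor and both selector parameters are placeholders that must be adapted to the site being automated.

```javascript
// Hypothetical helper: `page` is a Puppeteer Page object.
async function searchFor(page, inputSelector, buttonSelector, keyword) {
  await page.focus(inputSelector);    // put the cursor in the search box
  await page.keyboard.type(keyword);  // type the text key by key
  await page.click(buttonSelector);   // click the search button
}
```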

Next, we will learn Puppeteer by implementing a simple requirement:

Crawl 10 mobile phone products on JD.com and take a screenshot of each product's details page.

First, let's lay out the steps:

  1. Open the JD home page
  2. Enter the "mobile phone" keyword and search
  3. Get the A tags of the first 10 products and read their href attributes to obtain the product detail links
  4. Open the details page of each of the 10 products and take a screenshot of it

To implement this we need to find elements, read attributes, send keyboard events, and so on, so let's go through these one by one.

4.1 Get elements

The Page object provides two APIs for obtaining page elements.

(1). Page.$(selector) gets a single element. Under the hood it calls document.querySelector(), so the selector follows the CSS selector specification. It takes only the selector; any extra argument is ignored.

const inputElement = await page.$('#search');

(2). Page.$$(selector) gets a set of elements. Under the hood it calls document.querySelectorAll(). It returns a Promise(Array(ElementHandle)).

const links = await page.$$('a');

Both APIs ultimately return ElementHandle objects.
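One detail worth handling: page.$ resolves to null when nothing matches the selector. The sketch below turns that case into an explicit error; the name requireElement is my own.

```javascript
// Hypothetical helper: like page.$, but fails loudly when nothing matches.
async function requireElement(page, selector) {
  const handle = await page.$(selector);
  if (handle === null) {
    throw new Error(`No element matches selector: ${selector}`);
  }
  return handle;
}
```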

4.2 Get element attributes

Obtaining element attributes in Puppeteer works a little differently from the front-end JavaScript we usually write. The usual logic is to get the element first and then read its attributes. But as shown above, the element APIs return ElementHandle objects, and if you look at the ElementHandle API you will find that it has no method for reading element attributes.

Instead, Puppeteer provides a dedicated pair of APIs for this: Page.$eval() and Page.$$eval().

(1). Page.$eval(selector, pageFunction[, …args]) obtains the attributes of a single element. The selector follows the same rules as Page.$(selector) above.

const value = await page.$eval('input[name=search]', input => input.value);
const href = await page.$eval('#a', ele => ele.href);
const content = await page.$eval('.content', ele => ele.outerHTML);
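Page.$$eval() works the same way for a set of elements: the callback receives an array of all matching DOM elements and runs inside the page. A minimal sketch, with a function name (collectLinks) and a result shape of my own choosing:

```javascript
// Hypothetical helper: collects href and text of every element matching `selector`.
async function collectLinks(page, selector) {
  // The callback runs in the page and receives an array of DOM elements.
  return page.$$eval(selector, elements =>
    elements.map(el => ({ href: el.href, text: el.textContent.trim() }))
  );
}
```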

4.3 Execute custom JS scripts

Puppeteer's Page object provides a family of evaluate methods for running custom JavaScript in the page. There are three main APIs.

(1). page.evaluate(pageFunction, …args) returns a serializable plain object. pageFunction is the function to run in the page, and args are the arguments passed to it. pageFunction and args have the same meaning in the APIs below.

const result = await page.evaluate(() => {
  return Promise.resolve(8 * 7);
});
console.log(result); // prints "56"

This method is very useful. For example, by default a screenshot covers only the current browser viewport, which is 800x600 by default, so a plain screenshot does not capture the whole page. The page.screenshot() method accepts a clip parameter that sets the screenshot area, so we can solve the problem by reading the page's width and height once the page has loaded.

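const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({headless: true});
  const page = await browser.newPage();
  await page.goto('https://jr.'); // URL truncated in the source
  await page.setViewport({width: 1920, height: 1080});
  // Measure the rendered document inside the page
  const documentSize = await page.evaluate(() => {
    return {
      width: document.documentElement.clientWidth,
      height: document.body.clientHeight,
    }
  })
  // Output path elided in the source
  await page.screenshot({path: "", clip: {x: 0, y: 0, width: 1920, height: documentSize.height}});

  await browser.close();
})();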

(2). page.evaluateHandle(pageFunction, …args) executes pageFunction in the page context and returns a JSHandle.

const aWindowHandle = await page.evaluateHandle(() => Promise.resolve(window));
aWindowHandle; // Handle for the window object.

const aHandle = await page.evaluateHandle('document'); // Handle for the 'document'.

As the code above shows, page.evaluateHandle() also resolves the Promise returned by pageFunction; the only difference is that it wraps the final result in a JSHandle. In essence it works the same way as evaluate.

The following code obtains the page's live HTML (including elements dynamically inserted by JavaScript).

const aHandle = await page.evaluateHandle(() => document.body);
const resultHandle = await page.evaluateHandle(body => body.innerHTML, aHandle);
console.log(await resultHandle.jsonValue());
await resultHandle.dispose();

(3). page.evaluateOnNewDocument(pageFunction, …args) calls pageFunction before the document loads. If the page contains an iframe or frame, the function is also invoked in the context of that child frame. Because it runs before the page loads, it is generally used to initialize the JavaScript environment, for example to reset or initialize global variables.
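A small sketch of page.evaluateOnNewDocument initializing a global variable before any page script runs. The function name stampDocuments and the global __initAt are illustrative names of my own, not part of Puppeteer.

```javascript
// Hypothetical helper: registers a script that records, in every new document,
// the time at which the initialization script ran.
async function stampDocuments(page) {
  await page.evaluateOnNewDocument(() => {
    window.__initAt = Date.now(); // illustrative global name
  });
}

// Usage: call stampDocuments(page) before page.goto(); afterwards
//   await page.evaluate(() => window.__initAt)
// reads the value back from the page.
```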

4.4 page.exposeFunction

In addition to the three APIs above, there is a similar and very useful API, page.exposeFunction, which registers a global function on the page.

Sometimes a page operation needs helper functions. You can define a function in the page through the page.evaluate() API, for example:

const docSize = await page.evaluate(() => {
  function getPageSize() {
    return {
      width: document.documentElement.clientWidth,
      height: document.body.clientHeight,
    }
  }

  return getPageSize();
});

However, such a function is not global: it has to be redefined in every evaluate call and cannot be reused. In addition, Node.js has many packages that make complex functionality easy. Computing an md5 hash, for example, is awkward in plain browser JavaScript but takes only a few lines in Node.js.

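The following code adds an md5 function to the window object in the Page context:

const puppeteer = require('puppeteer');
const crypto = require('crypto');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  // Forward the page's console output to the Node.js console
  // (the v0.13 API exposed this as the property msg.text rather than a method)
  page.on('console', msg => console.log(msg.text()));
  await page.exposeFunction('md5', text =>
    crypto.createHash('md5').update(text).digest('hex')
  );
  await page.evaluate(async () => {
    // use window.md5 to compute hashes
    const myString = 'PUPPETEER';
    const myHash = await window.md5(myString);
    console.log(`md5 of ${myString} is ${myHash}`);
  });
  await browser.close();
});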

As you can see, page.exposeFunction is convenient and useful. As another example, the following registers a global readfile function on the window object:

const puppeteer = require('puppeteer');
const fs = require('fs');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  // Forward the page's console output to the Node.js console
  page.on('console', msg => console.log(msg.text()));
  await page.exposeFunction('readfile', async filePath => {
    return new Promise((resolve, reject) => {
      fs.readFile(filePath, 'utf8', (err, text) => {
        if (err)
          reject(err);
        else
          resolve(text);
      });
    });
  });
  await page.evaluate(async () => {
    // use window.readfile to read contents of a file
    const content = await window.readfile('/etc/hosts');
    console.log(content);
  });
  await browser.close();
});

5. Modifying the emulated client configuration

Puppeteer provides several APIs for changing the configuration of the browser client:

  1. page.setViewport() changes the browser viewport size
  2. page.setUserAgent() sets the browser's User-Agent string
  3. page.emulateMedia() changes the page's CSS media type for media emulation. Valid values are 'screen', 'print', and null; passing null disables media emulation.
  4. page.emulate() emulates a device. The argument is a device descriptor, such as iPhone, Mac, or Android.

page.setViewport({width:1920, height:1080}); //Set the viewport size to 1920x1080
page.setUserAgent('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36');
page.emulateMedia('print'); //Use the print media style

We can also emulate non-PC devices. The following code emulates an iPhone 6 and visits Google:

const puppeteer = require('puppeteer');
const devices = require('puppeteer/DeviceDescriptors');
const iPhone = devices['iPhone 6'];

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  await page.emulate(iPhone);
  await page.goto(''); // the Google URL was elided in the source
  // other actions...
  await browser.close();
});

Puppeteer can emulate many devices, such as Galaxy, iPhone, and iPad models. For the full list of supported devices, see the DeviceDescriptors module.
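page.emulate() is not limited to the predefined descriptors: it accepts any object with a userAgent string and a viewport object. The sketch below builds such a descriptor by hand. The function name makeDevice and the sample values in the usage note are my own assumptions.

```javascript
// Hypothetical helper: builds a descriptor in the shape page.emulate() accepts.
function makeDevice(name, userAgent, width, height, scale) {
  return {
    name,
    userAgent,
    viewport: {
      width,
      height,
      deviceScaleFactor: scale,
      isMobile: true,
      hasTouch: true,
      isLandscape: width > height,
    },
  };
}

// Usage (values are illustrative only):
//   await page.emulate(makeDevice('My phone', 'MyAgent/1.0', 375, 667, 2));
```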

6. Keyboard and mouse

The keyboard and mouse APIs are fairly simple. The keyboard APIs are:

  1. keyboard.down(key[, options]) triggers a keydown event
  2. keyboard.press(key[, options]) presses a key. key is the name of the key, such as 'ArrowLeft' for the left arrow key; see the official key name mapping for the full list.
  3. keyboard.sendCharacter(char) enters a single character
  4. keyboard.type(text, options) enters a string
  5. keyboard.up(key) triggers a keyup event

page.keyboard.press("Shift"); //Press the Shift key
page.keyboard.sendCharacter('Hi');
page.keyboard.type('Hello'); // Enter the whole string at once
page.keyboard.type('World', {delay: 100}); // Enter slowly, like a user

Mouse operations:

mouse.click(x, y, [options]) moves the mouse pointer to the given position and clicks. It is a shortcut for mouse.move(), mouse.down(), and mouse.up().

mouse.down([options]) triggers a mousedown event. The options are:

  1. button: which button is pressed, one of left, right, or middle. Defaults to left, the left mouse button.
  2. clickCount: the number of clicks, for a single click, a double click, and so on
  3. delay: how long the button is held down

mouse.move(x, y, [options]) moves the mouse to the given position; options.steps sets the number of intermediate steps in the movement

mouse.up([options]) triggers a mouseup event
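To make the relationship between these calls concrete, here is a sketch that spells out a click as the three lower-level mouse calls. The function name clickAt is hypothetical; in practice page.mouse.click(x, y) does this in a single call.

```javascript
// Hypothetical helper: a left click at (x, y) built from move, down, and up.
async function clickAt(page, x, y) {
  await page.mouse.move(x, y);
  await page.mouse.down({ button: 'left' });
  await page.mouse.up({ button: 'left' });
}
```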

7. Several other useful APIs

Puppeteer also provides several other very useful APIs, such as:

7.1 The page.waitFor family

  1. page.waitFor(selectorOrFunctionOrTimeout[, options[, …args]]) combines the three APIs below
  2. page.waitForFunction(pageFunction[, options[, …args]]) waits until pageFunction returns a truthy value
  3. page.waitForNavigation(options) waits for a page navigation to finish, that is, for the basic page content such as HTML, CSS, and JS to load
  4. page.waitForSelector(selector[, options]) waits for an element matching the selector to appear. The element may be loaded asynchronously, which makes this API especially useful.

For example, an element loaded asynchronously through JavaScript cannot be obtained right after the page loads. page.waitForSelector solves this:

//Wait for the element to load; otherwise asynchronously loaded elements cannot be obtained
await page.waitForSelector('.gl-item');
const links = await page.$$eval('.gl-item > .gl-i-wrap > .p-img > a', links => {
  return links.map(a => {
    return {
      href: a.href.trim(),
      name: a.title
    }
  });
});

In fact, the code above already covers the core of our requirement, crawling JD's products. Because the products are loaded asynchronously, this is the method to use.
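page.waitForSelector rejects when its timeout elapses, so a crawler usually wants to handle that case instead of crashing. A minimal sketch, with a hypothetical function name (appearsWithin); note that it treats any rejection as "did not appear".

```javascript
// Hypothetical helper: resolves to true if an element matching `selector`
// appears within `timeoutMs` milliseconds, and to false otherwise.
async function appearsWithin(page, selector, timeoutMs) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return true;
  } catch (err) {
    // waitForSelector rejects when the timeout elapses.
    return false;
  }
}
```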

7.2 page.metrics()

page.metrics() returns page performance data captured from the site's timeline, which helps diagnose performance problems.

  1. Timestamp: the timestamp when the metrics sample was taken
  2. Documents: number of documents in the page
  3. Frames: number of frames in the page
  4. JSEventListeners: number of event listeners in the page
  5. Nodes: number of DOM nodes in the page
  6. LayoutCount: total number of page layouts
  7. RecalcStyleCount: total number of style recalculations
  8. LayoutDuration: combined duration of all page layouts
  9. RecalcStyleDuration: combined duration of all style recalculations
  10. ScriptDuration: combined duration of script execution
  11. TaskDuration: combined duration of all browser tasks
  12. JSHeapUsedSize: JavaScript heap size in use
  13. JSHeapTotalSize: total JavaScript heap size
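Most of these values are cumulative, so they are often most useful as a difference between two samples, for example one taken before and one after an interaction. A sketch of that idea, assuming two objects as returned by page.metrics(); the function name diffMetrics is my own.

```javascript
// Hypothetical helper: per-metric difference between two page.metrics() samples.
function diffMetrics(before, after) {
  const delta = {};
  for (const key of Object.keys(after)) {
    if (typeof after[key] === 'number' && typeof before[key] === 'number') {
      delta[key] = after[key] - before[key];
    }
  }
  return delta;
}

// Usage:
//   const before = await page.metrics();
//   ... interact with the page ...
//   console.log(diffMetrics(before, await page.metrics()));
```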

8. Summary and source code

This article covered some of the most commonly used Puppeteer APIs by working through a real requirement. The API version used is v0.13.0-alpha. For the latest version of the API, refer to the official Puppeteer API documentation.

Overall, Puppeteer is an excellent headless-browser tool: easy to use and powerful. It is well suited to UI automation testing and to building small utilities.

Below is the source code for the requirement we started with, for reference only:

//Delay function
function sleep(delay) {
  return new Promise(resolve => setTimeout(resolve, delay));
}

const puppeteer = require('puppeteer');
puppeteer.launch({
  ignoreHTTPSErrors: true,
  headless: false,
  slowMo: 250,
  timeout: 0
}).then(async browser => {

  let page = await browser.newPage();
  await page.setJavaScriptEnabled(true);
  await page.goto("/"); //JD home page; the full URL was truncated in the source
  const searchInput = await page.$("#key");
  await searchInput.focus(); //Focus the search box
  await page.keyboard.type("cell phone");
  const searchBtn = await page.$(".button");
  await searchBtn.click();
  //Wait for the results to load; asynchronously loaded elements cannot be obtained before that
  await page.waitForSelector('.gl-item');
  const links = await page.$$eval('.gl-item > .gl-i-wrap > .p-img > a', links => {
    return links.map(a => {
      return {
        href: a.href.trim(),
        title: a.title
      }
    });
  });
  await page.close();

  const aTags = links.slice(0, 10);
  //Start at 0 so that all 10 products are captured
  for (let i = 0; i < aTags.length; i++) {
    page = await browser.newPage();
    await page.setJavaScriptEnabled(true);
    await page.setViewport({width: 1920, height: 1080});
    const a = aTags[i];
    await page.goto(a.href, {timeout: 0}); //Disable the timeout in case a long page loads slowly
    //Scroll slowly to the bottom so that all lazily loaded elements are loaded
    let scrollEnable = true;
    const scrollStep = 500; //Distance of each scroll
    while (scrollEnable) {
      scrollEnable = await page.evaluate((scrollStep) => {
        const scrollTop = document.scrollingElement.scrollTop;
        document.scrollingElement.scrollTop = scrollTop + scrollStep;
        return document.body.clientHeight > scrollTop + 1080;
      }, scrollStep);
      await sleep(100);
    }
    await page.waitForSelector("#footer-2014", {timeout: 0}); //Check that the bottom has been reached
    const filename = "images/items-" + i + ".png";
    //Unresolved Puppeteer issue: screenshots appear to be capped at 16384px in height, and anything beyond that is cut off
    await page.screenshot({path: filename, fullPage: true});
    await page.close();
  }

  await browser.close();
});
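One possible workaround for the 16384px limit noted in the source code is to capture a tall page in several pieces, using the clip option of page.screenshot(). The sketch below only computes the clip rectangles. The function name splitIntoClips is hypothetical, and the 16384 figure is the observation from the code comment above, not a documented Puppeteer constant.

```javascript
// Maximum screenshot height observed in this article (not a documented constant).
const MAX_SHOT_HEIGHT = 16384;

// Hypothetical helper: split a page of the given size into clip rectangles
// that are each no taller than maxHeight.
function splitIntoClips(pageWidth, pageHeight, maxHeight = MAX_SHOT_HEIGHT) {
  const clips = [];
  for (let y = 0; y < pageHeight; y += maxHeight) {
    clips.push({
      x: 0,
      y,
      width: pageWidth,
      height: Math.min(maxHeight, pageHeight - y),
    });
  }
  return clips;
}

// Each rectangle can then be passed as the `clip` option of page.screenshot(),
// producing one image file per piece.
```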
