This article walks through an example of using Cheerio to build a simple web crawler. The details are as follows:
1. Goal
- Retrieve the title information from a website
- Write the retrieved information to a new file
- Tools: Cheerio, installed with npm: npm install cheerio
- Cheerio's API is used in basically the same way as jQuery's
- If you are proficient in jQuery, you can get started with Cheerio quickly
2. Code section
Introduction: get the list titles from the SegmentFault page and write the numbered title list to a file.
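const https = require('https');
const fs = require('fs');
const cheerio = require('cheerio');

// URL of the page to crawl (set this to the full https:// address)
const url = '/';

https.get(url, (res) => {
  let html = '';
  // Decode chunks as UTF-8 so multi-byte characters are not split
  res.setEncoding('utf8');
  // Accumulate the response body chunk by chunk
  res.on('data', (data) => {
    html += data;
  });
  // Once the whole page has arrived, extract the titles
  res.on('end', () => {
    getPageTitle(html);
  });
}).on('error', () => {
  console.log('Error getting web page information');
});

function getPageTitle(html) {
  // Load the HTML so it can be queried with jQuery-style selectors
  const $ = cheerio.load(html);
  let chapters = $('.news__item-title');
  let data = [];
  let index = 0;
  // Name of the output file (set this before running)
  let fileName = '';
  for (let i = 0; i < chapters.length; i++) {
    let chapterTitle = $(chapters[i]).find('a').text().trim();
    index++;
    data.push(`\n${index}, ${chapterTitle}`);
  }
  // fs.writeFile expects a string or buffer, so join the lines first
  fs.writeFile(fileName, data.join(''), 'utf8', (err) => {
    if (err) {
      console.log('Failed to create the new file', err);
      return;
    }
    console.log(`The obtained titles have been written to the new file ${fileName}`);
  });
}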
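The numbering and file-writing step can also be isolated into a small helper that is easy to check on its own. The following is a minimal sketch, not the article's code: `formatTitles`, `saveTitles`, the sample titles, and the `titles.txt` file name are all hypothetical, chosen only for illustration. It joins the numbered lines into a single string before writing, since the `fs` write functions expect a string or buffer rather than an array.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: turn an array of titles into one numbered line per title.
function formatTitles(titles) {
  return titles
    .map((title, i) => `${i + 1}, ${title.trim()}`)
    .join('\n');
}

// Hypothetical helper: write the numbered list to a file and return its path.
function saveTitles(titles, filePath) {
  fs.writeFileSync(filePath, formatTitles(titles), 'utf8');
  return filePath;
}

// Usage with made-up titles; the output location is an assumption for this sketch.
const outFile = path.join(os.tmpdir(), 'titles.txt');
saveTitles(['  First title ', 'Second title'], outFile);
console.log(fs.readFileSync(outFile, 'utf8'));
```

In the crawler, the array passed to `formatTitles` would be the trimmed link texts collected from the `.news__item-title` elements. Keeping the formatting separate from the network request makes it possible to test the output format without fetching a page.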
That is all for this article. I hope it is helpful for your study.