SoFunction
Updated on 2025-03-03

An example of building a simple web crawler with Cheerio

This article walks through an example of building a simple web crawler with Cheerio. The details are as follows:

1. Goal

  1. Retrieve the title information from a website
  2. Write the retrieved information to a new file
  3. Tool: Cheerio, installed with npm: npm install cheerio
  4. Cheerio's API is used in basically the same way as jQuery's
  5. If you are proficient in jQuery, you will pick up Cheerio quickly

2. Code section

Introduction: Get the list of titles from the SegmentFault page and write the numbered title list to a file.

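const https = require('https');
const fs = require('fs');
const cheerio = require('cheerio');
// The address is truncated in the source; set it to the full https:// URL of the page to crawl.
const url = '/';

// Request the page and accumulate the response body chunk by chunk.
https.get(url, (res) => {
  let html = '';
  res.on('data', (data) => {
    html += data;
  });
  // Once the whole body has arrived, extract the titles.
  res.on('end', () => {
    getPageTitle(html);
  });
}).on('error', () => {
  console.log('Error getting web page information');
});

function getPageTitle(html) {
  // Load the HTML so it can be queried with jQuery-style selectors.
  const $ = cheerio.load(html);
  let chapters = $('.news__item-title');
  let data = [];
  let index = 0;
  // The output file name is empty in the source; set it before running.
  let fileName = '';
  for (let i = 0; i < chapters.length; i++) {
    let chapterTitle = $(chapters[i]).find('a').text().trim();
    index++;
    data.push(`\n${index}, ${chapterTitle}`);
  }
  // Join the entries into one string, because fs.writeFile expects a string or buffer.
  fs.writeFile(fileName, data.join(''), 'utf8', (err) => {
    if (err) {
      console.log('Failed to create the new file', err);
      return;
    }
    console.log(`The titles have been written to ${fileName}`);
  });
}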
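
The formatting and file-writing step can also be separated from the scraping step, which makes it easy to check without network access. The following is a minimal sketch using only Node's built-in modules; the helper names formatTitleList and saveTitleList, the sample titles, and the temporary file name are assumptions for illustration and are not part of the original code.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: builds the same "\n1, Title" entries as the loop in getPageTitle.
function formatTitleList(titles) {
  return titles
    .map((title, i) => `\n${i + 1}, ${title.trim()}`)
    .join('');
}

// Hypothetical helper: writes the formatted list synchronously and returns the path.
function saveTitleList(titles, fileName) {
  fs.writeFileSync(fileName, formatTitleList(titles), 'utf8');
  return fileName;
}

// Write two sample titles to a file in the system temp directory and read them back.
const outFile = path.join(os.tmpdir(), 'titles-example.txt');
saveTitleList(['  First title ', 'Second title'], outFile);
const written = fs.readFileSync(outFile, 'utf8');
console.log(written);
```

The synchronous write keeps the sketch short; in the crawler itself the asynchronous fs.writeFile with a callback, as used above, avoids blocking while the file is written.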
