This article explains in detail how to implement recurring (repeated) crawls with Python's Scrapy framework. The sample code is walked through step by step, so it should be a useful reference for anyone learning or working with Scrapy.
Scrapy is an application framework written to crawl websites and extract structured data; only a small amount of code is needed to get a crawl running quickly.
Scrapy's main modules:
1. Scheduler: maintains the queue of URLs waiting to be crawled
2. Downloader: sends requests and fetches the responses
3. Spiders: extract structured data and new URLs from responses (a minimal spider sketch follows this list)
4. Item Pipeline: saves the extracted data
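To make the components concrete, here is a minimal spider sketch. The class name, spider name, site URL, and CSS selectors are illustrative assumptions, not part of the original article; the demo site quotes.toscrape.com is a public scraping sandbox.

    import scrapy

    class ExampleSpider(scrapy.Spider):
        name = "example"              # the name you would pass to runner.crawl(...)
        start_urls = ["https://quotes.toscrape.com"]  # assumed demo site

        def parse(self, response):
            # the spider extracts structured data (items)...
            for quote in response.css("div.quote"):
                yield {"text": quote.css("span.text::text").get()}
            # ...and new URLs, which go back to the scheduler to be downloaded
            next_page = response.css("li.next a::attr(href)").get()
            if next_page:
                yield response.follow(next_page, callback=self.parse)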
The following script uses CrawlerRunner to run a spider in an endless loop, so the crawl repeats:

    from twisted.internet import reactor, defer
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings
    import time
    import logging

    # Print logs on the console
    configure_logging()

    # CrawlerRunner fetches the setting information from the project's settings.py
    runner = CrawlerRunner(get_project_settings())

    @defer.inlineCallbacks
    def crawl():
        while True:
            logging.info("new cycle starting")
            # "xxxxx" is the name of the spider to run
            yield runner.crawl("xxxxx")
            # run once every 1s
            time.sleep(1)
        reactor.stop()

    crawl()
    reactor.run()
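One caveat with the script above: time.sleep blocks the Twisted reactor thread while it waits. A non-blocking variant is sketched below, assuming the same project settings and the same placeholder spider name "xxxxx"; it waits between runs with twisted.internet.task.deferLater instead of sleeping.

    from twisted.internet import reactor, defer, task
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings

    configure_logging()
    runner = CrawlerRunner(get_project_settings())

    @defer.inlineCallbacks
    def loop_crawl():
        while True:
            # "xxxxx" stands for your spider's name, as in the snippet above
            yield runner.crawl("xxxxx")
            # wait 1 second without blocking the reactor
            yield task.deferLater(reactor, 1, lambda: None)

    loop_crawl()
    reactor.run()

deferLater returns a Deferred that fires after the delay, so the reactor stays free to service other work while the loop waits.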
That is the whole content of this article. I hope it helps you in your learning.