Preamble
Let's start with the idea: first, send a request to the target website to get the HTML source code, then filter the image links out of that source. After that, send a request for each image link and save the result.
With the idea in place, let's go straight to the code:
Modules used:
import requests  # HTTP request library; third-party, install with: pip install requests
import re  # regular expressions for filtering; part of the Python standard library, no installation required
Find the interface:
Press F12 to open the developer tools, click Network, then Fetch/XHR, and load the page. Click through the requests in order and you can see that there are two query parameters:
word: landscape diagrams
queryWord: landscape diagrams
We can use these two query parameters for customization:
We have to find the real URL and then customize its query parameters. Click the first item in the list alongside the request, and we see the query parameters from earlier: word and queryWord.
Next, we let the user enter a value and pass that value into the word and queryWord parameters of the URL.
In that case, the word and queryWord parameters in the URL can no longer carry fixed values. We put {} in their place as placeholders, and the format function later fills the user's input into each {}. In the end, we have the URL we need.
word = input('Please enter the image to be searched:')
url = '/search/acjson?tn=resultjson_com&logid=5853806806594529489&ipn=rj&ct=201326592&is=&fp=result&fr=ala&word={}&queryWord={}&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=&z=&ic=&hd=&latest=&copyright=&s=&se=&tab=&width=&height=&face=&istype=&qc=&nc=&expermode=&nojc=&isAsync=&pn=30&rn=30&gsm=1e&1658411978178='.format(word, word)
print(url)

Open the printed URL and the result is what you typed.
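`str.format` inserts the search term exactly as typed. As a minimal sketch of an alternative, assuming only the two customized parameters matter for the illustration, the standard library's `urllib.parse.urlencode` can percent-encode spaces and non-ASCII characters explicitly. The helper name `build_query` is hypothetical, and the remaining parameters of the captured URL are left out.

```python
from urllib.parse import urlencode, quote


def build_query(word):
    """Return the word/queryWord part of the query string, percent-encoded."""
    # Hypothetical helper: only the two customized parameters are shown;
    # the other parameters of the captured URL are omitted here.
    params = {"word": word, "queryWord": word}
    # quote_via=quote encodes a space as %20 instead of '+'
    return urlencode(params, quote_via=quote)


print(build_query("shiba inu"))  # word=shiba%20inu&queryWord=shiba%20inu
```

The returned string could then replace the `word={}&queryWord={}` part of the URL.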
The next step is to disguise the request headers so that the server does not recognize the request as coming from a crawler:
headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36 Edg/99.0.1150.39'}
Check whether the folder exists and create it if it doesn't, then send the request and print the source code:
import os  # needed for the folder check

files = 'D:/{}/'.format(word)  # folder path
if not os.path.exists(files):  # if the folder does not exist yet
    os.makedirs(files)  # create it
req = requests.get(url=url, headers=headers).text  # get the source code
print(req)  # output the source code
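The folder check and creation can also be done in one call. The sketch below assumes a hypothetical helper name, `ensure_folder`, and uses a temporary directory instead of `D:/` so that it runs on any system.

```python
import os
import tempfile


def ensure_folder(path):
    """Create the folder (and any missing parents) if it is not there yet."""
    # exist_ok=True makes the call a no-op when the folder already exists
    os.makedirs(path, exist_ok=True)
    return path


# Demonstration in a temporary directory rather than on D:/
base = tempfile.mkdtemp()
target = os.path.join(base, "shiba inu")
ensure_folder(target)
ensure_folder(target)  # the second call does not raise
print(os.path.isdir(target))  # True
```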
Regular expression:
res = '"thumbURL":"(.*?)"'  # regular expression
zhengze = re.findall(res, req)  # call the findall function to match every thumbnail URL
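To see what the pattern captures, here is a small sketch run against a made-up response. The `sample` string and its example.com URLs are hypothetical stand-ins for the much longer text the interface returns.

```python
import re

# Hypothetical, shortened sample of the returned text;
# the example.com URLs are invented for illustration.
sample = (
    '{"data":[{"thumbURL":"https://example.com/a.jpg","width":100},'
    '{"thumbURL":"https://example.com/b.jpg","width":200}]}'
)

pattern = r'"thumbURL":"(.*?)"'
# The non-greedy (.*?) stops at the first closing quote,
# so each match is exactly one URL.
links = re.findall(pattern, sample)
print(links)  # ['https://example.com/a.jpg', 'https://example.com/b.jpg']
```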
Iterate over the URLs and send a request for each one:
for i, a in enumerate(zhengze, start=1):  # iterate over the matched URLs, counting from 1
    print(a)  # print the address
    response = requests.get(url=a, headers=headers).content  # get the image as binary data
Set the file type and save location (still inside the loop):
    file = files + word + str(i) + 'Zhang.jpg'  # folder path + file name and type (full path)
    with open(file, 'wb') as f:  # open the file in binary write mode
        f.write(response)  # write the fetched binary data to the file
    print(word + str(i) + 'Zhang.jpg' + ' saved successfully')  # report success
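The save step can be isolated into a small function, which makes it easy to try without downloading anything. This is a sketch under assumptions: `save_image` is a hypothetical helper name, and the bytes written are a placeholder rather than a real image.

```python
import os
import tempfile


def save_image(data, folder, word, index):
    """Write image bytes to <folder>/<word><index>Zhang.jpg and return the path."""
    path = os.path.join(folder, "{}{}Zhang.jpg".format(word, index))
    with open(path, "wb") as f:  # 'wb' because image data is binary
        f.write(data)
    return path


# Demonstration with placeholder bytes instead of a downloaded image
folder = tempfile.mkdtemp()
saved = save_image(b"\xff\xd8\xff", folder, "shiba", 1)
print(os.path.basename(saved))  # shiba1Zhang.jpg
```

Using `os.path.join` instead of string concatenation means the folder path does not need a trailing slash.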
The full source code follows:
import re  # filter URLs
import requests  # send requests
import os  # create folders

word = input('Please enter the image to be searched:')
url = '/search/acjson?tn=resultjson_com&logid=5853806806594529489&ipn=rj&ct=201326592&is=&fp=result&fr=ala&word={}&queryWord={}&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=&z=&ic=&hd=&latest=&copyright=&s=&se=&tab=&width=&height=&face=&istype=&qc=&nc=&expermode=&nojc=&isAsync=&pn=30&rn=30&gsm=1e&1658411978178='.format(word, word)
# Disguise the request as coming from a browser
headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36 Edg/99.0.1150.39'}
files = 'D:/{}/'.format(word)  # folder path
if not os.path.exists(files):  # if the folder does not exist yet
    os.makedirs(files)  # create it
req = requests.get(url=url, headers=headers).text  # get the source code
res = '"thumbURL":"(.*?)"'  # regular expression
zhengze = re.findall(res, req)  # filter out the image links
for i, a in enumerate(zhengze, start=1):  # iterate over the links, counting from 1
    print(a)  # print the address
    response = requests.get(url=a, headers=headers).content  # get the image as binary data
    file = files + word + str(i) + 'Zhang.jpg'  # folder path + file name and type (full path)
    with open(file, 'wb') as f:  # open the file in binary write mode
        f.write(response)  # write the fetched binary data to the file
    print(word + str(i) + 'Zhang.jpg' + ' saved successfully')  # report success
Let's see how the run turns out:
You can see that I searched for Shiba Inu; the program requested each image link found in the source code and saved the result.
So is the picture I saved a Shiba Inu? Let's see:
You can see that the Shiba Inu picture is saved and a folder is created!