Preface
This article introduces how to use Python's requests module to perform network requests, covering file downloads, cookie handling, redirection, and request history. Detailed sample code shows how to implement these operations efficiently, helping developers handle HTTP requests and manage data more easily.
1. Downloading files from the network
(I) Basic steps
Use the requests.get() method to send an HTTP GET request and download a file from a given URL. The typical steps are:
Send a request: use requests.get() to send a request to the URL of the file.
Get the file content: the content attribute of the Response object holds the file's binary data, which can be saved to a local file.
Save the file: use with open() to create a local file and write the downloaded content to it.
Example:
import requests

# URL of the file to be downloaded (replace '/' with the full URL)
url = '/'

# Send a GET request
response = requests.get(url)

# Check whether the request is successful
if response.status_code == 200:
    # Open a local file in binary mode and write the file contents to it
    # ('' stands for the local file path; replace it with a real one)
    with open('', 'wb') as file:
        file.write(response.content)
    print("File downloaded successfully")
else:
    print(f"File download failed, status code: {response.status_code}")
(II) Download large files in segments
If the file is large, download it in chunks. Passing stream=True and iterating with the iter_content() method avoids loading the entire file into memory at once; the data is processed block by block instead, which suits large downloads.
Example:
import requests

# URL of the file to be downloaded (replace '/' with the full URL)
url = '/'

# Send a GET request and stream the file
response = requests.get(url, stream=True)

# Check whether the request is successful
if response.status_code == 200:
    # Open a local file in binary mode and write data block by block
    # ('' stands for the local file path; replace it with a real one)
    with open('', 'wb') as file:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:  # Skip empty keep-alive chunks
                file.write(chunk)
    print("Large file downloaded successfully")
else:
    print(f"File download failed, status code: {response.status_code}")
(III) FAQ
There are two common problems:
1. Timeout setting: Use the timeout parameter to keep a request from hanging indefinitely. For example:
response = requests.get(url, timeout=10)  # Set a 10-second timeout
2. Error handling: It is recommended to add exception handling to catch network errors. For example:
try:
    response = requests.get(url)
    response.raise_for_status()  # Raise an exception for 4xx/5xx responses
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
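The streaming, timeout, and error-handling advice above can be combined into one helper. The following is a minimal sketch, not a definitive implementation: download_file is a hypothetical name, and the default chunk size and timeout are assumptions chosen for illustration.

```python
import requests


def download_file(url, dest_path, chunk_size=8192, timeout=10):
    """Stream url into dest_path; return bytes written, or None on failure.

    Hypothetical helper for illustration only.
    """
    try:
        # Using the response as a context manager releases the connection
        # even if writing to disk fails part-way through.
        with requests.get(url, stream=True, timeout=timeout) as response:
            response.raise_for_status()  # raises HTTPError on 4xx/5xx
            written = 0
            with open(dest_path, 'wb') as file:
                for chunk in response.iter_content(chunk_size=chunk_size):
                    if chunk:  # skip empty keep-alive chunks
                        file.write(chunk)
                        written += len(chunk)
            return written
    except requests.exceptions.RequestException as e:
        # Covers connection errors, timeouts, and HTTP error statuses
        print(f"Download failed: {e}")
        return None
```

Note that timeout limits how long requests waits to connect and how long it waits between bytes from the server; it is not a cap on the total download time.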
2. Handling cookies with the requests module
The requests module can easily handle cookies in HTTP requests, including sending requests with cookies and getting cookies in responses. Here are some common methods and examples of how to use the requests module to handle cookies.
(I) Send a request with cookies
When sending a request, cookies can be sent to the server through the cookies parameter. This parameter accepts a dictionary of cookie data, where each key is a cookie name and each value is that cookie's value.
Example:
import requests

# Define cookies
cookies = {
    'session_id': '123456',
    'user': 'john_doe'
}

# Send a request with cookies (replace '' with the target URL)
response = requests.get('', cookies=cookies)

# Print response content
print(response.text)
In this example, session_id and user are cookies sent to the server.
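To see what the cookies parameter actually puts on the wire, a request can be prepared without being sent. This is a small sketch under the assumption that https://example.com/ is only a placeholder URL; no network traffic occurs.

```python
import requests

cookies = {'session_id': '123456', 'user': 'john_doe'}

# Build the request object and prepare it, but do not send it.
# 'https://example.com/' is a placeholder URL used for illustration.
request = requests.Request('GET', 'https://example.com/', cookies=cookies)
prepared = request.prepare()

# The dictionary has been folded into a single Cookie request header.
cookie_header = prepared.headers['Cookie']
print(cookie_header)
```

Each dictionary entry appears as a name=value pair in the Cookie header, separated by semicolons.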
(II) Get cookies from the response
The server can also return Set-Cookie headers in its response, and the requests module automatically stores these cookies in the response.cookies attribute.
Example:
import requests

# Send a request (replace '' with the target URL)
response = requests.get('')

# Get cookies in the response
cookies = response.cookies

# Iterate over the cookies
for cookie in cookies:
    print(f"{cookie.name}: {cookie.value}")
response.cookies is a RequestsCookieJar object. It behaves like a dictionary, so a specific cookie can be looked up by name.
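The dictionary-style access can be tried without a server by filling a RequestsCookieJar by hand. In this sketch the cookie names, values, and the example.com domain are made up for illustration.

```python
from requests.cookies import RequestsCookieJar

# Build a jar manually; a real one would come from response.cookies.
jar = RequestsCookieJar()
jar.set('session_id', '123456', domain='example.com', path='/')
jar.set('theme', 'dark', domain='example.com', path='/')

session_id = jar['session_id']                  # lookup by name, like a dict
missing = jar.get('no_such_cookie', 'fallback')  # default when absent
names = sorted(jar.keys())

print(session_id, missing, names)
```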
(III) Managing cookies with a Session object
When you use a requests.Session() object, cookies are saved automatically and sent with subsequent requests. This is useful when login authentication is required, because the Session object keeps the session state for you.
Example:
import requests

# Create a Session object
session = requests.Session()

# '/login' and '/dashboard' are paths only; prefix them with the site's base URL

# Cookies may be set by the first request (for example, login)
response = session.get('/login')

# In subsequent requests, cookies are sent automatically
response = session.get('/dashboard')

# View cookies in the current session
print(session.cookies)
In this example, the session object automatically manages cookies received from the response and sends them in subsequent requests. This way, a session can be maintained (such as after login).
(IV) Manually set and modify cookies
If you want to manage the cookies of a Session object manually, you can set or modify them through the session.cookies.set() method.
Example:
import requests

# Create a Session object
session = requests.Session()

# Set a new cookie
session.cookies.set('my_cookie', 'cookie_value')

# Send a request; this cookie is attached automatically (replace '' with the target URL)
response = session.get('')

# Print response content
print(response.text)
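session.cookies.set() also accepts domain and path arguments, which restrict where the cookie is sent. The sketch below checks this offline by preparing requests through the session instead of sending them; example.com and example.org are placeholder hosts chosen for illustration.

```python
import requests

session = requests.Session()

# Scope the cookie to one domain (placeholder host, for illustration).
session.cookies.set('my_cookie', 'cookie_value',
                    domain='example.com', path='/')

# prepare_request() merges session cookies without touching the network.
matching = session.prepare_request(
    requests.Request('GET', 'https://example.com/profile'))
other = session.prepare_request(
    requests.Request('GET', 'https://example.org/profile'))

print(matching.headers.get('Cookie'))  # the cookie is attached
print(other.headers.get('Cookie'))     # None: different domain
```

Omitting domain, as in the example above, produces a cookie that is not tied to a single host, so scoping it is usually the safer choice when a session talks to more than one site.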
(V) Converting a RequestsCookieJar to a dictionary
response.cookies returns a RequestsCookieJar object, which can be converted into a plain dictionary for easier processing.
Example:
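import requests

# Convert cookies to a dictionary
# (response is the Response object returned by an earlier request)
cookies_dict = requests.utils.dict_from_cookiejar(response.cookies)
print(cookies_dict)
(VI) Converting a dictionary to a RequestsCookieJar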
A dictionary can be converted to a RequestsCookieJar, which allows easy management of cookies.
Example:
import requests
from requests.utils import cookiejar_from_dict

# Define a cookies dictionary
cookies_dict = {'session_id': '123456', 'user': 'john_doe'}

# Convert the dictionary to a RequestsCookieJar
jar = cookiejar_from_dict(cookies_dict)

# Use this CookieJar when sending a request (replace '' with the target URL)
response = requests.get('', cookies=jar)
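The two conversions are inverses for simple name/value data, which a quick round trip can confirm. This sketch runs entirely offline; the cookie values are the same illustrative ones used above.

```python
from requests.cookies import RequestsCookieJar
from requests.utils import cookiejar_from_dict, dict_from_cookiejar

original = {'session_id': '123456', 'user': 'john_doe'}

jar = cookiejar_from_dict(original)   # dict -> RequestsCookieJar
restored = dict_from_cookiejar(jar)   # RequestsCookieJar -> dict

print(isinstance(jar, RequestsCookieJar))
print(restored == original)
```

A plain dictionary keeps only names and values, so attributes such as domain, path, and expiry are dropped when a jar is converted to a dictionary.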
(VII) Summary
Send cookies: pass a dictionary through the cookies parameter.
Get cookies: read response.cookies to get the cookies returned by the server.
Manage cookies automatically: requests.Session() carries cookies across multiple requests.
Set and modify manually: use session.cookies.set() to set cookies by hand.
Convert between CookieJar and dictionary: requests.utils.dict_from_cookiejar() and requests.utils.cookiejar_from_dict() convert in each direction.
3. Redirection and request history
Redirection and historical requests are common network request processing requirements. The requests module automatically handles HTTP redirects by default and provides the ability to view redirect history.
(I) The concept of redirection
Redirection is when the server tells the client that the resource currently requested has been moved to another URL and the client needs to access the new URL. Common redirection status codes include:
301 Moved Permanently: Permanent redirect; the resource has been permanently relocated to the new URL.
302 Found: Temporary redirect; the resource is temporarily at another URL, and the client should keep using the original URL for future requests.
303 See Other: Tells the client that the requested resource can be retrieved with a GET request at another URL.
307 Temporary Redirect: The resource has moved temporarily; the client must repeat the request with the same method and data.
308 Permanent Redirect: Similar to 301, but the client must use the same request method.
(II) Automatic redirection
The requests module automatically handles redirects by default. If the server returns a 3xx redirect response, requests follows the URL given in the Location header. You can inspect every response in the redirect chain through response.history.
Example:
import requests

# Send a request that may redirect (replace '' with the target URL)
response = requests.get('')

# Print the URL of the final response
print(f"The final URL: {response.url}")

# Check the redirect history
if response.history:
    print("Redirection occurred")
    for resp in response.history:
        print(f"Status code: {resp.status_code}, URL: {resp.url}")
else:
    print("No redirection occurred")
In this example, response.history is a list holding the response object for each redirect, in order; the final response is stored in response.
(III) Disabling redirection
If you do not want to follow redirects automatically, you can disable them by passing allow_redirects=False. In this case, requests returns the redirect response itself and does not follow it.
Example:
import requests

# Disable automatic redirection (replace '' with the target URL)
response = requests.get('', allow_redirects=False)

# View the response status code and redirected URL
print(f"Status code: {response.status_code}")
if response.is_redirect or response.status_code in [301, 302, 303, 307, 308]:
    # .get() avoids a KeyError if the server sent no Location header
    print(f"Redirected URL: {response.headers.get('Location')}")
If the server returns a redirect status code (such as 301 or 302), the Location header contains the new URL, and requests returns this information without automatically making a new request.
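When redirects are handled by hand, keep in mind that a Location header may hold a relative reference rather than a full URL. A minimal sketch of resolving it against the URL that was requested, using only the standard library; resolve_redirect_target is a hypothetical helper name and the URLs are placeholders.

```python
from urllib.parse import urljoin


def resolve_redirect_target(request_url, location):
    """Turn a Location header value into an absolute URL (illustrative helper)."""
    return urljoin(request_url, location)


base = 'https://example.com/old/page'  # placeholder URL

absolute_path = resolve_redirect_target(base, '/new/page')
relative_path = resolve_redirect_target(base, 'other')
full_url = resolve_redirect_target(base, 'https://example.org/moved')

print(absolute_path)  # https://example.com/new/page
print(relative_path)  # https://example.com/old/other
print(full_url)       # https://example.org/moved
```

In practice, request_url would be response.url and location would be response.headers.get('Location') from a request made with allow_redirects=False.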
(IV) Redirection of POST request
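When a POST request receives a 302 or 303 redirect, requests automatically switches the method to GET for the follow-up request. For 303 this is what the HTTP specification calls for; for 302 it matches common browser behavior. A 307 or 308 redirect keeps the POST method and body.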
Example:
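import requests

# Send a POST request that triggers a redirect
# ('/login' is a path only; prefix it with the site's base URL)
response = requests.post('/login', data={'username': 'user', 'password': 'pass'})

# Print the request method used after redirection
if response.history:
    print(f"Request method used after redirection: {response.request.method}")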
In this case, the POST request may be redirected as a GET request.
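The method change can be written down as a small decision function. This is a plain-Python sketch of the rule as requests is understood to apply it when following a redirect, not code taken from the library; method_after_redirect is a hypothetical name.

```python
def method_after_redirect(status_code, method):
    """Return the HTTP method used for the follow-up request (illustrative)."""
    # 303 See Other: everything except HEAD becomes GET
    if status_code == 303 and method != 'HEAD':
        return 'GET'
    # 302 Found: treated like 303, matching browser behavior
    if status_code == 302 and method != 'HEAD':
        return 'GET'
    # 301 Moved Permanently: only POST is downgraded to GET
    if status_code == 301 and method == 'POST':
        return 'GET'
    # 307 and 308 (and all remaining cases) keep the original method
    return method


for status in (301, 302, 303, 307, 308):
    print(status, 'POST ->', method_after_redirect(status, 'POST'))
```

The loop prints GET for 301, 302, and 303, and POST for 307 and 308, which is why 307 or 308 is the right choice for a server that needs the request body to be resent.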
(V) Redirect chain and request history
You can track the entire request chain by reading the status code and URL of each redirect from response.history.
Example:
import requests

# Send a request that may have multiple redirects (replace '' with the target URL)
response = requests.get('')

# Print each request's information in the redirect chain
for resp in response.history:
    print(f"Status code: {resp.status_code}, URL: {resp.url}, Request method: {resp.request.method}")
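To watch a redirect chain without depending on an external site, a throwaway local server can issue the redirects. The sketch below is only an illustration: the /start, /middle, and /final paths, the RedirectHandler class, and the trace_redirects helper are all made up for this demo.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

# Hypothetical redirect map for the demo: /start -> /middle -> /final
REDIRECTS = {'/start': (301, '/middle'), '/middle': (302, '/final')}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in REDIRECTS:
            status, target = REDIRECTS[self.path]
            self.send_response(status)
            self.send_header('Location', target)
            self.send_header('Content-Length', '0')
            self.end_headers()
        else:
            body = b'done'
            self.send_response(200)
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def trace_redirects(url):
    """Return (status, url, method) for every hop, final response last."""
    response = requests.get(url, timeout=5)
    hops = [(r.status_code, r.url, r.request.method) for r in response.history]
    hops.append((response.status_code, response.url, response.request.method))
    return hops


server = HTTPServer(('127.0.0.1', 0), RedirectHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f'http://127.0.0.1:{server.server_port}'

hops = trace_redirects(base + '/start')
for status, url, method in hops:
    print(status, method, url)

server.shutdown()
server.server_close()
```

The history entries cover the 301 and 302 hops in order, and the last tuple describes the final 200 response, which is not part of response.history.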
(VI) Limit the number of redirects
The requests module follows at most 30 redirects by default. To change the limit, set the max_redirects attribute on a Session object; it is not a keyword argument of requests.get().
Example:
import requests

# Limit the maximum number of redirects to 5
session = requests.Session()
session.max_redirects = 5

# Send the request through the session (replace '' with the target URL)
response = session.get('')

# View the response status code
print(f"Final status code: {response.status_code}")
If the number of redirects exceeds the set limit, requests throws a TooManyRedirects exception.
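The exception can be observed against a local server that redirects forever. As a hedged sketch: LoopHandler and fetch_with_limit are hypothetical names for this demo, and the limit of 3 is arbitrary.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class LoopHandler(BaseHTTPRequestHandler):
    """Demo handler that redirects every request back to /loop."""

    def do_GET(self):
        self.send_response(302)
        self.send_header('Location', '/loop')
        self.send_header('Content-Length', '0')
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_with_limit(url, limit):
    """Return the final status code, or None if the redirect limit is hit."""
    session = requests.Session()
    session.max_redirects = limit  # the limit lives on the Session
    try:
        return session.get(url, timeout=5).status_code
    except requests.exceptions.TooManyRedirects:
        return None
    finally:
        session.close()


server = HTTPServer(('127.0.0.1', 0), LoopHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

result = fetch_with_limit(f'http://127.0.0.1:{server.server_port}/loop', limit=3)
print(result)  # None, because the loop never ends

server.shutdown()
server.server_close()
```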
(VII) Summary
Automatic redirection: requests follows 3xx redirects by default; the redirect chain is available through response.history.
Disabling redirection: pass allow_redirects=False to stop requests from following redirects.
POST request redirection: on a 302 or 303 status code, the POST request is automatically converted to a GET request.
Request history: response.history gives the status code, URL, and request method of each redirect, which shows how the request proceeded.
Limiting redirects: set max_redirects on a Session to cap the number of redirects and avoid an infinite redirect loop.
4. Summary
With the requests module, developers can easily implement file downloads, manage cookies automatically, and handle redirects and request history. Through step-by-step explanations and code examples, this article has shown how to meet common network request needs and how to deal with harder cases such as large file downloads, request timeouts, and multiple redirects.