A comprehensive guide to HTTP requests in Python

In modern networked applications, HTTP (HyperText Transfer Protocol) is the core mechanism for moving data between clients and servers. As a Python developer, you need to know how to send and process HTTP requests, whether you are building web applications, writing crawlers, or integrating with APIs. This article walks through HTTP request handling in Python, from the basics to more advanced techniques.

1. Basic knowledge of HTTP requests

HTTP is a stateless application-layer protocol for transferring data between clients and servers. Its main features are a request-response model, statelessness, and support for a variety of data formats.

Request-response model: The client sends a request; the server processes it and returns a response.
Stateless: Each request is independent, and the server does not remember the state of previous requests.
Supports multiple data formats: HTTP can carry many types of data, such as text, images, and video.

An HTTP request consists of three parts: a request line, request headers, and a request body:

Request line: Contains the request method (such as GET or POST), the request URL, and the HTTP version.
Request headers: Contain additional information about the request, such as the client type and the accepted content types.
Request body: Optional; typically used with POST and PUT requests to carry the data being sent.
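
These three parts can be inspected without sending anything over the network. The following is a minimal sketch using the requests library (introduced in the next section); the host, header, and payload are placeholders chosen for illustration.

```python
import requests

# Build a request without sending it. The host and payload are placeholders.
request = requests.Request(
    "POST",
    "https://example.com/posts",
    headers={"Accept": "application/json"},
    json={"title": "foo"},
)
prepared = request.prepare()

# Request line: method and path (the HTTP version is added when the request is sent)
print(prepared.method, prepared.path_url)

# Request headers: the Accept header set above plus Content-Type and Content-Length
for name, value in prepared.headers.items():
    print(f"{name}: {value}")

# Request body: the JSON payload, serialized for transmission
print(prepared.body)
```

Preparing a request this way is handy for debugging, because it shows what would be sent before any connection is opened.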

An HTTP response consists of a status line, response headers, and a response body:

Status line: Contains the HTTP version, the status code (such as 200 or 404), and a short reason phrase.
Response headers: Contain additional information about the response, such as the content type and content length.
Response body: The actual response data, such as an HTML page or JSON data.

2. HTTP request library in Python

Python offers several modules and libraries for handling HTTP requests and responses, including urllib in the standard library. The most commonly used is the third-party requests library, which is powerful, easy to use, and a popular choice for sending HTTP requests.

Install the requests library

You can use the pip command to install the requests library:

pip install requests

Send HTTP requests using the requests library

The requests library provides a simple API to send HTTP requests, including common methods such as GET, POST, PUT, DELETE, etc.

GET request

A GET request is used to obtain data from the server. Here is a simple GET request example:

import requests

# example.com is a placeholder host; substitute the base URL of your API
response = requests.get('https://example.com/posts')
print(response.status_code)  # Print the status code
print(response.json())  # Print the returned JSON data

In this example, we send a GET request to /posts and print the status code and JSON data of the response.

POST request

A POST request is used to send data to the server. Here is a simple POST request example:

import requests

data = {'title': 'foo', 'body': 'bar', 'userId': 1}
# example.com is a placeholder host; substitute the base URL of your API
response = requests.post('https://example.com/posts', json=data)
print(response.status_code)  # Print the status code
print(response.json())  # Print the returned JSON data

In this example, we send a POST request to /posts with JSON data containing the title, body, and user ID.
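
The json= argument used above is one of two common ways to attach a body. As a rough sketch, the two can be compared by preparing requests without sending them; the host is a placeholder and the variable names are only for illustration.

```python
import requests

payload = {"title": "foo", "body": "bar", "userId": 1}
url = "https://example.com/posts"  # placeholder host

# json= serializes the dict to JSON and sets Content-Type: application/json
as_json = requests.Request("POST", url, json=payload).prepare()

# data= with a dict form-encodes it instead (application/x-www-form-urlencoded)
as_form = requests.Request("POST", url, data=payload).prepare()

print(as_json.headers["Content-Type"], as_json.body)
print(as_form.headers["Content-Type"], as_form.body)
```

Which one to use depends on what the server expects: JSON APIs usually want json=, while HTML login forms usually want data=.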

PUT request

A PUT request is used to update a resource on the server. Here is a simple PUT request example:

import requests

data = {'id': 1, 'title': 'updated title', 'body': 'updated body', 'userId': 1}
# example.com is a placeholder host; substitute the base URL of your API
response = requests.put('https://example.com/posts/1', json=data)
print(response.status_code)  # Print the status code
print(response.json())  # Print the returned JSON data

In this example, we send a PUT request to /posts/1 to update the title, body, and user ID of the specified post.

DELETE request

A DELETE request is used to delete a resource on the server. Here is a simple DELETE request example:

import requests

# example.com is a placeholder host; substitute the base URL of your API
response = requests.delete('https://example.com/posts/1')
print(response.status_code)  # Print the status code

In this example, we send a DELETE request to /posts/1 to delete the specified post.

3. Process HTTP response

When processing HTTP responses, we usually need to get the status code, response header, and response body.

Get the status code

The status code represents the processing result of the request, and common status codes include:

200: The request was successful.
404: The requested resource was not found.
500: Internal server error.
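
Status codes fall into classes by their first digit. The sketch below uses the standard library's http.HTTPStatus to look up the reason phrase for each code listed above; describe_status is a hypothetical helper written for this illustration.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "informational"
    return f"{code} {status.phrase} ({category})"


for code in (200, 404, 500):
    print(describe_status(code))
```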

Example of getting the status code:

import requests

response = requests.get('https://example.com/posts')  # example.com is a placeholder host
print(f"Status code: {response.status_code}")

Get the response header

The response headers contain additional information returned by the server and are available through the headers attribute:

import requests

response = requests.get('https://example.com/posts')  # example.com is a placeholder host
print("Response headers:")
for key, value in response.headers.items():
    print(f"{key}: {value}")

Get the response body

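The response body is the actual data content. It is available as a string through the text attribute, or as parsed JSON through the json() method:

import requests

response = requests.get('https://example.com/posts')  # example.com is a placeholder host
print("Response body:")
print(response.text)  # Get the body as a string
print(response.json())  # Parse the body as JSON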

4. Advanced skills and practice

In addition to basic HTTP requests and response processing, there are some advanced tips and practices that can help you send and process HTTP requests more efficiently.

Using a connection pool

Establishing an HTTP connection is time-consuming. To reduce that overhead, you can use a connection pool to reuse existing connections. In the requests library, a Session object pools connections for you, so repeated requests to the same host reuse the underlying connection.

import requests

session = requests.Session()

# Use the session to send multiple requests (example.com is a placeholder host)
response1 = session.get('https://example.com/posts/1')
response2 = session.get('https://example.com/posts/2')

# Close the session to release its pooled connections
session.close()
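
If the default pool is too small for your workload, the pool can be tuned by mounting an HTTPAdapter on the session. This is a sketch only: the pool sizes are illustrative values rather than recommendations, and nothing is sent over the network.

```python
import requests
from requests.adapters import HTTPAdapter

# Pool sizes are illustrative values, not recommendations
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=20)

# The with block closes the session (and its pooled connections) on exit
with requests.Session() as session:
    # Use this adapter for every https:// URL requested through the session
    session.mount("https://", adapter)
    mounted = session.get_adapter("https://example.com/posts/1")
    print(mounted is adapter)
```

Using the session as a context manager avoids having to remember the explicit close() call.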

Set request header

Setting appropriate request headers often matters when sending HTTP requests. For example, setting User-Agent makes the request look like it comes from a particular browser, and Accept-Encoding tells the server which compression formats the client accepts, which reduces the amount of data transferred. By default, requests already sends an Accept-Encoding header that advertises gzip and deflate.

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
}
response = requests.get('https://example.com', headers=headers)  # example.com is a placeholder host

Handle cookies and sessions

If you need to keep session state across multiple requests, use requests.Session() to manage the session. It automatically stores cookies set by the server and sends them with later requests.

import requests

session = requests.Session()

# Log in and receive cookies (example.com is a placeholder host)
login_data = {'username': 'your_username', 'password': 'your_password'}
response = session.post('https://example.com/login', data=login_data)

# Send further requests with the same session; stored cookies are included automatically
response = session.get('https://example.com/protected_page')
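
To see the cookie handling at work without a real login endpoint, the sketch below places a cookie in the session's jar by hand and then prepares (but does not send) a follow-up request. The cookie name, value, and domain are made up for illustration.

```python
import requests

session = requests.Session()

# Simulate a cookie that a login response would normally set.
# The cookie name, value, and domain are made up for illustration.
session.cookies.set("sessionid", "abc123", domain="example.com")

# Prepare (but do not send) a follow-up request through the same session
request = requests.Request("GET", "https://example.com/protected_page")
prepared = session.prepare_request(request)

# The session attached the stored cookie to the outgoing request
print(prepared.headers.get("Cookie"))
session.close()
```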

Error handling

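Check the HTTP status code to confirm that the request succeeded. For error responses (4xx and 5xx status codes), handle the error appropriately and record detailed error information for debugging.

import requests

try:
    # example.com is a placeholder host
    response = requests.get('https://example.com/nonexistent_page')
    response.raise_for_status()  # Raises HTTPError for 4xx and 5xx status codes
except requests.exceptions.HTTPError as errh:
    print("HTTP error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Connection error:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout error:", errt)
except requests.exceptions.RequestException as err:
    # Base class for all requests exceptions, so it must come last
    print("Unexpected request error:", err)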
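
It helps to know exactly which status codes make raise_for_status() raise. The following sketch builds Response objects by hand, so no request is sent; raises_http_error is a hypothetical helper for this illustration.

```python
import requests


def raises_http_error(status_code: int) -> bool:
    """Return True if raise_for_status() raises for the given status code."""
    response = requests.Response()  # built by hand; no request is sent
    response.status_code = status_code
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        return True
    return False


for code in (200, 201, 301, 404, 500):
    print(code, raises_http_error(code))
```

Only the 4xx and 5xx codes raise; other success codes such as 201 and redirect codes such as 301 pass through without an exception.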

Set timeout

To prevent requests from waiting indefinitely, a reasonable timeout should be set. This can be done by passing the timeout parameter in the request.

import requests
response = requests.get('https://example.com', timeout=5)  # 5-second timeout; example.com is a placeholder host
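
The timeout parameter also accepts a (connect, read) tuple when the two phases need different limits. The sketch below demonstrates a read timeout against a local socket that accepts connections but never replies; the socket setup and the timeout values exist only for this demonstration and assume loopback networking is available.

```python
import socket

import requests

# A local socket that accepts connections but never replies,
# standing in for a server that hangs.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment

try:
    # (connect timeout, read timeout) in seconds
    session.get(f"http://127.0.0.1:{port}/", timeout=(1.0, 0.5))
    outcome = "completed"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"
finally:
    session.close()
    server.close()

print(outcome)
```

Both exceptions are subclasses of requests.exceptions.Timeout, so a single except clause for Timeout is enough when the distinction does not matter.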

Using a proxy

Sometimes the network environment restricts or slows direct HTTP requests. In that case, you can route requests through a proxy to work around the restriction, which may also improve request speed.

import requests

proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:3128',
}

response = requests.get('https://example.com', proxies=proxies)  # example.com is a placeholder host
print(response.text)

Note that a proxy URL usually has the form protocol://address:port. If the proxy requires authentication, you can include the username and password in the URL, such as http://user:password@address:port.
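
Credentials embedded in a proxy URL need percent-encoding if they contain characters such as @ or :. The sketch below shows one way to build such a URL with the standard library; build_proxy_url and the sample credentials are hypothetical.

```python
from urllib.parse import quote


def build_proxy_url(user: str, password: str, address: str, port: int) -> str:
    """Build an authenticated proxy URL, escaping reserved characters."""
    # safe="" also escapes "/", ":" and "@", which would otherwise break the URL
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{address}:{port}"


proxy = build_proxy_url("user", "p@ss:word", "10.10.1.10", 3128)
proxies = {"http": proxy, "https": proxy}
print(proxy)
```

The resulting proxies dictionary can be passed to a request in the same way as the one shown earlier in this section.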

HTTP authentication

Some websites require HTTP Basic authentication. The requests library supports this through the HTTPBasicAuth class, a subclass of AuthBase.

import requests
from requests.auth import HTTPBasicAuth

url = 'https://example.com/protected'  # example.com is a placeholder host
username = 'your_username'
password = 'your_password'

response = requests.get(url, auth=HTTPBasicAuth(username, password))
print(response.text)
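
Under the hood, Basic authentication only adds an Authorization header. The sketch below prepares requests without sending them to show that header, and that a plain (username, password) tuple is shorthand for HTTPBasicAuth; the credentials and host are placeholders.

```python
import base64

import requests
from requests.auth import HTTPBasicAuth

url = "https://example.com/protected"  # placeholder host

# Prepare the requests without sending them, then inspect the header that auth adds
explicit = requests.Request("GET", url, auth=HTTPBasicAuth("user", "pass")).prepare()
shorthand = requests.Request("GET", url, auth=("user", "pass")).prepare()

# The header value is "Basic " followed by base64("username:password")
expected = "Basic " + base64.b64encode(b"user:pass").decode("ascii")
print(explicit.headers["Authorization"])
print(explicit.headers["Authorization"] == shorthand.headers["Authorization"] == expected)
```

Because the credentials are only base64-encoded, not encrypted, Basic authentication should be used over HTTPS.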

In addition, more complex authentication mechanisms such as OAuth can be used with requests, usually through third-party libraries such as requests-oauthlib.

5. Advanced functions and practices

Custom request header

In addition to common headers such as User-Agent and Accept-Encoding, you can send any custom request headers you need.

import requests

headers = {
    'User-Agent': 'Custom User Agent',
    'Custom-Header': 'CustomHeaderValue',
}

response = requests.get('https://example.com', headers=headers)  # example.com is a placeholder host
print(response.text)

File upload

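Uploading files with the requests library is simple: pass an open file object to the POST request through the files parameter.

import requests

url = 'https://example.com/upload'  # example.com is a placeholder host
# 'example.txt' is a placeholder file name; the with block closes the file after the upload
with open('example.txt', 'rb') as f:
    response = requests.post(url, files={'file': f})
print(response.text)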
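
Passing files= makes requests encode the body as multipart/form-data. The sketch below prepares an upload without sending it, using an in-memory file so that no file on disk is needed; the file name, content, and host are made up for illustration.

```python
import io

import requests

# An in-memory file stands in for a file on disk; name and content are made up
fake_file = io.BytesIO(b"hello upload")
files = {"file": ("report.txt", fake_file, "text/plain")}

prepared = requests.Request(
    "POST", "https://example.com/upload", files=files
).prepare()

# multipart/form-data with an automatically generated boundary
print(prepared.headers["Content-Type"])
# The encoded body carries both the file name and the file content
print(b'filename="report.txt"' in prepared.body)
print(b"hello upload" in prepared.body)
```

The three-element tuple (file name, file object, content type) is optional; it gives control over the name and type the server sees.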

Streaming response

For large files or long-running requests, you may want to process the response data in a stream to avoid loading the entire response into memory at once.

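import requests

# example.com is a placeholder host
response = requests.get('https://example.com/largefile', stream=True)
with open('largefile', 'wb') as f:
    # Write the body to disk in 8 KB chunks instead of loading it all into memory
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)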

Handle redirects

The requests library automatically handles HTTP redirects by default. But if you need to control the behavior of redirection, you can do it by setting the allow_redirects parameter.

import requests
response = requests.get('https://example.com/redirect', allow_redirects=False)  # placeholder host
print(response.status_code)  # May be 301 or 302
print(response.headers['Location'])  # Redirect target URL

SSL certificate verification

By default, the requests library verifies the server's SSL certificate. In some cases, such as a test environment, you may need to skip verification by setting the verify parameter to False. This is not recommended in production, because it removes the protection against man-in-the-middle attacks.

import requests

response = requests.get('https://example.com', verify=False)  # example.com is a placeholder host
print(response.text)

A better option is to point verify at a CA certificate file that can validate the server's SSL certificate.

import requests

response = requests.get('https://example.com', verify='/path/to/')  # path to your CA certificate file
print(response.text)

6. Summary

This article introduced how to use the requests library to send and process HTTP requests in Python. From the basics to advanced tips, it covered the common request methods (GET, POST, PUT, and DELETE) and how to handle HTTP responses, set request headers, manage cookies and sessions, handle errors, set timeouts, use proxies, and perform HTTP authentication, as well as file uploads, streaming responses, redirects, and SSL certificate verification.
