1. Selenium
Introduction
Selenium is a powerful browser automation tool that supports multiple browsers (such as Chrome, Firefox, and Edge). It can simulate user actions such as clicking buttons, filling in forms, and working with content that JavaScript renders dynamically.
Applicable scenarios
- The page requires complex interaction (such as clicking buttons or selecting from drop-down menus).
- Content is loaded dynamically by JavaScript.
- Cross-browser testing is required.
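Handling content that JavaScript loads after the initial page load usually comes down to waiting until an element exists before touching it. Selenium ships its own helpers for this (WebDriverWait together with expected_conditions); the sketch below is not that API, only a library-free illustration of the polling idea behind it. The names wait_until and FakePage are hypothetical, and FakePage merely stands in for a real page whose field shows up on the third lookup.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], Optional[T]],
               timeout: float = 5.0,
               poll_interval: float = 0.1) -> T:
    """Call `condition` repeatedly until it returns a truthy value.

    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose form field is added by a script."""

    def __init__(self, ready_after_calls: int) -> None:
        self._calls = 0
        self._ready_after_calls = ready_after_calls

    def find_username_field(self) -> Optional[str]:
        # Returns None until the simulated script has "rendered" the field.
        self._calls += 1
        if self._calls >= self._ready_after_calls:
            return "<input name='username'>"
        return None


page = FakePage(ready_after_calls=3)
field = wait_until(page.find_username_field, timeout=2.0, poll_interval=0.01)
print(field)
```

With a real browser, the condition would be a lookup against the live page instead of FakePage, and a fixed time.sleep call can usually be replaced by a wait of this kind.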
Sample code
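    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    import time

    # Start the browser (Selenium 4 takes the driver path through a Service object)
    driver = webdriver.Chrome(service=Service('path/to/chromedriver'))

    # Open the form page (replace with the full URL of the form)
    driver.get('/form')

    # Fill out the form; fields are located by their name attribute
    driver.find_element(By.NAME, 'username').send_keys('John Doe')
    driver.find_element(By.NAME, 'email').send_keys('johndoe@')
    driver.find_element(By.NAME, 'password').send_keys('securepassword123')

    # Submit the form
    driver.find_element(By.XPATH, '//button[@type="submit"]').click()

    # Pause briefly so the result is visible, then close the browser
    time.sleep(5)
    driver.quit()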
Advantages
- Supports multiple browsers.
- Powerful: handles complex interactions and dynamic content.
Disadvantages
- Requires a browser driver (such as ChromeDriver) to be installed.
- Slower than HTTP-based tools, because it drives a real browser.
2. Playwright
Introduction
Playwright is a modern browser automation tool that supports Chromium, Firefox, and WebKit. Compared with Selenium, it is more efficient and offers a richer API.
Applicable scenarios
- Complex JavaScript dynamic content needs to be handled.
- Cross-browser testing is required.
- Efficient automation is required.
Sample code
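    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Start the browser (Chromium here; p.firefox and p.webkit work the same way)
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()

        # Open the form page (replace with the full URL of the form)
        page.goto('/form')

        # Fill out the form
        page.fill('input[name="username"]', 'John Doe')
        page.fill('input[name="email"]', 'johndoe@')
        page.fill('input[name="password"]', 'securepassword123')

        # Submit the form
        page.click('input[type="submit"]')

        # Close the browser
        browser.close()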
Advantages
- Supports multiple browsers.
- Fast execution and a rich API.
- Handles complex interactions and dynamic content.
Disadvantages
- Requires browser binaries to be installed (with the playwright install command).
- Slightly steeper learning curve.
3. Requests + BeautifulSoup
Introduction
requests is an HTTP library for sending HTTP requests, and BeautifulSoup is an HTML parsing library for extracting data from web pages. Combining the two enables simple form submission.
Applicable scenarios
- Static web pages (no content loaded dynamically by JavaScript).
- The form is submitted through a plain HTTP POST or GET request.
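To make the second point concrete: a plain form submission is just a set of name/value pairs, including hidden ones such as a CSRF token, encoded into the request body. The following is a minimal sketch that uses only the standard library (html.parser as a stand-in for BeautifulSoup, urllib.parse.urlencode for the encoding). The class InputCollector, the SAMPLE_HTML markup, the token value, and the e-mail address are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class InputCollector(HTMLParser):
    """Collect the name/value pairs of named <input> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:  # inputs without a name are not part of the submitted data
            self.fields[name] = attributes.get("value") or ""


# Hypothetical form markup, standing in for a downloaded page.
SAMPLE_HTML = """
<form action="/submit" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username">
  <input type="email" name="email">
  <input type="submit" value="Send">
</form>
"""

collector = InputCollector()
collector.feed(SAMPLE_HTML)

# Start from the defaults found in the page (keeps the hidden token),
# then overwrite the fields a user would type into.
form_data = dict(collector.fields)
form_data["username"] = "John Doe"
form_data["email"] = "johndoe@example.com"  # placeholder address

# This string is what a form-encoded POST body looks like.
body = urlencode(form_data)
print(body)
```

Passing a dictionary like form_data as the data argument of a requests POST call produces the same kind of encoded body, which is why no browser is needed for forms of this type.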
Sample code
    import requests
    from bs4 import BeautifulSoup

    # Get the form page (replace the paths below with the site's full URLs)
    session = requests.Session()
    response = session.get('/form')
    soup = BeautifulSoup(response.text, 'html.parser')

    # Extract the CSRF token
    csrf_token = soup.find('input', {'name': 'csrf_token'})['value']

    # Construct the form data
    form_data = {
        'username': 'John Doe',
        'email': 'johndoe@',
        'password': 'securepassword123',
        'csrf_token': csrf_token
    }

    # Submit the form
    response = session.post('/submit', data=form_data)

    # Check the submission result
    if response.status_code == 200:
        print('The form was submitted successfully!')
    else:
        print('Form submission failed!')
Advantages
- Lightweight, no need to start a browser.
- Well suited to simple form submissions.
Disadvantages
- Cannot handle content rendered by JavaScript.
- Cannot simulate complex user interactions.
4. MechanicalSoup
Introduction
MechanicalSoup is a library built on requests and BeautifulSoup and designed for automating form submission. It is simpler and easier to use than plain requests.
Applicable scenarios
- Simple form submission task.
- No need to deal with JavaScript dynamic content.
Sample code
    import mechanicalsoup

    # Create a browser object
    browser = mechanicalsoup.Browser()

    # Open the form page (replace with the full URL of the form)
    page = browser.get('/form')
    form = page.soup.select_one('form')

    # Fill out the form
    form.select_one('input[name="username"]')['value'] = 'John Doe'
    form.select_one('input[name="email"]')['value'] = 'johndoe@'
    form.select_one('input[name="password"]')['value'] = 'securepassword123'

    # Submit the form
    response = browser.submit(form, page.url)

    # Check the submission result
    if response.status_code == 200:
        print('The form was submitted successfully!')
    else:
        print('Form submission failed!')
Advantages
- Simple and easy to use, suitable for quick form submission.
- No need to start a browser.
Disadvantages
- Cannot handle content rendered by JavaScript.
- Relatively limited functionality.
5. Pyppeteer
Introduction
Pyppeteer is a Python port of Puppeteer, used to control headless browsers (Headless Chrome). It is similar to Playwright but focuses on the Chromium browser.
Applicable scenarios
- Need to deal with complex JavaScript dynamic content.
- Needs headless browser support.
Sample code
    import asyncio
    from pyppeteer import launch

    async def fill_form():
        # Start the browser
        browser = await launch(headless=False)
        page = await browser.newPage()

        # Open the form page (replace with the full URL of the form)
        await page.goto('/form')

        # Fill out the form
        await page.type('input[name="username"]', 'John Doe')
        await page.type('input[name="email"]', 'johndoe@')
        await page.type('input[name="password"]', 'securepassword123')

        # Submit the form
        await page.click('input[type="submit"]')

        # Close the browser
        await browser.close()

    # Run the asynchronous task
    asyncio.run(fill_form())
Advantages
- Supports headless browsers.
- Handles complex interactions and dynamic content.
Disadvantages
- Requires asynchronous programming.
- Supports only the Chromium browser.
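The asynchronous style is a cost, but it also allows several form-filling tasks to run concurrently. The following is a minimal sketch of that pattern using only asyncio. The coroutine fill_form here is a hypothetical stand-in that sleeps instead of driving a browser, so it shows the structure of the code rather than real Pyppeteer calls.

```python
import asyncio


async def fill_form(name: str, delay: float) -> str:
    """Hypothetical stand-in for a browser task; it sleeps instead of driving a page."""
    # In real code this is where the awaited page operations would go.
    await asyncio.sleep(delay)
    return f"{name}: submitted"


async def main() -> list:
    # gather() runs the coroutines concurrently and returns the results
    # in the order the coroutines were passed in, not in completion order.
    return await asyncio.gather(
        fill_form("form-a", 0.02),
        fill_form("form-b", 0.01),
    )


results = asyncio.run(main())
print(results)
```

Every call that touches the browser has to be awaited inside a coroutine, which is the main adjustment compared with the synchronous samples for the other tools.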
6. RoboBrowser
Introduction
RoboBrowser is a simple library that combines requests and BeautifulSoup, suitable for quickly implementing form submission.
Applicable scenarios
- Simple form submission task.
- No need to deal with JavaScript dynamic content.
Sample code
    from robobrowser import RoboBrowser

    # Create a browser object (naming the parser avoids a BeautifulSoup warning)
    browser = RoboBrowser(parser='html.parser')

    # Open the form page (replace with the full URL of the form)
    browser.open('/form')

    # Get the form
    form = browser.get_form()

    # Fill out the form
    form['username'].value = 'John Doe'
    form['email'].value = 'johndoe@'
    form['password'].value = 'securepassword123'

    # Submit the form
    browser.submit_form(form)

    # Check the submission result
    if browser.response.status_code == 200:
        print('The form was submitted successfully!')
    else:
        print('Form submission failed!')
Advantages
- Simple and easy to use.
- No need to start a browser.
Disadvantages
- Cannot handle content rendered by JavaScript.
- Limited functionality.
Horizontal comparison
Tool | Requires a browser | Handles JavaScript | Multi-browser support | Learning curve | Applicable scenarios |
---|---|---|---|---|---|
Selenium | Yes | Yes | Yes | Medium | Complex interaction, cross-browser testing |
Playwright | Yes | Yes | Yes | Medium | Complex interaction, efficient automation |
Requests + BS4 | No | No | No | Low | Simple form submission |
MechanicalSoup | No | No | No | Low | Simple form submission |
Pyppeteer | Yes | Yes | No (Chromium only) | Medium | Complex interaction, headless browser support |
RoboBrowser | No | No | No | Low | Simple form submission |
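The comparison can also be read as a small decision procedure. The sketch below encodes the first three columns of the table as data and filters on them. The names Tool, TOOLS, and candidates are hypothetical, and the rule that prefers browser-free tools when JavaScript is not needed is an assumption made for this illustration, not something the table prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    needs_browser: bool
    handles_javascript: bool
    multi_browser: bool


# One entry per row of the comparison table.
TOOLS = [
    Tool("Selenium", True, True, True),
    Tool("Playwright", True, True, True),
    Tool("Requests + BS4", False, False, False),
    Tool("MechanicalSoup", False, False, False),
    Tool("Pyppeteer", True, True, False),
    Tool("RoboBrowser", False, False, False),
]


def candidates(needs_javascript: bool, needs_multi_browser: bool = False) -> list:
    """Return the names of tools from the table that fit the requirements."""
    matches = []
    for tool in TOOLS:
        if needs_javascript and not tool.handles_javascript:
            continue
        if needs_multi_browser and not tool.multi_browser:
            continue
        # Assumption: when JavaScript is not needed, skip browser-based tools
        # in favour of the lighter HTTP-only ones.
        if not needs_javascript and tool.needs_browser:
            continue
        matches.append(tool.name)
    return matches


print(candidates(needs_javascript=True, needs_multi_browser=True))
print(candidates(needs_javascript=True))
print(candidates(needs_javascript=False))
```

The learning-curve and scenario columns are left out because they are judgment calls rather than yes/no properties.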
Summary
- If you need to handle complex interactions and dynamic content, use Playwright or Selenium.
- If you only need a simple form submission, MechanicalSoup or RoboBrowser is enough.
- If you don't want to start a browser, use requests + BeautifulSoup.