SoFunction
Updated on 2025-04-08

Sharing of batch file processing and automation management skills in Python

In daily office or data processing, we often need to process large amounts of files, such as batch renaming, copying, deleting or sorting files by type. Performing these tasks manually is not only time consuming, but also error-prone. Fortunately, Python provides powerful modules such as os, shutil and pathlib, which can help us easily implement batch file processing and automated management. This article will use specific cases and combined with concise and clear code to introduce how to use Python for common tasks such as file operation, directory management, batch renaming, copying/moving files, and deleting files to improve your file management efficiency.

1. Basics of file operation

Read file content

Reading file content is one of the most basic file operations. Here is an example of reading the contents of a text file:

with open("", "r", encoding="utf-8") as file:
    content = ()
    print(content)

In this example, we use the open function to open a file named "" in read-only mode ("r") and read its contents. Using the with statement ensures that the file is closed correctly after the operation is completed.

Write to a file

Writing to a file is just as simple. Here is an example of writing strings to a text file:

with open("", "w", encoding="utf-8") as file:
    ("Hello, Python file processing!")

In this example, we use the open function to open the file in write mode ("w"). If the file does not exist, it will be created; if the file already exists, its contents will be emptied.

Append content to file

Sometimes we need to append content at the end of the file instead of overwriting the original content. At this time, you can use the append mode ("a"):

with open("", "a", encoding="utf-8") as file:
    ("\nThis is a new line of content!")

2. Directory Management

Get the current working directory

Use the getcwd function of the os module to get the current working directory:

import os
current_dir = ()
print(f"Current working directory: {current_dir}")

List all files and folders in the directory

Use the listdir function of the os module to list all files and folders in the specified directory:

folder_path = "your_folder_path"  # Replace with the target directoryfiles = (folder_path)
print(files)

Create a new directory

Use the makedirs function of the os module to create a new directory. If the directory already exists, you can avoid throwing an exception by setting exist_ok=True:

("new_folder", exist_ok=True)

3. Batch rename files

Batch renaming files is one of the common tasks in file processing. The following are several common batch renaming operations.

Unified rename files (add prefix)

Suppose we have files in a folder that need to be prefixed uniformly, we can use the following code:

import os
 
folder_path = "your_folder_path"  # Replace with the target directoryfiles = (folder_path)
 
for index, filename in enumerate(files, start=1):
    old_path = (folder_path, filename)
    new_filename = f"new_{index}.txt"  # Add prefix new_    new_path = (folder_path, new_filename)
    (old_path, new_path)
    print(f"{filename} -> {new_filename}")
 
print("Batch renaming is completed!")

In this example, we enumerate the file list using the enumerate function and prefix "new_" to each file. Then use the function to rename the file.

Modify file extension

Sometimes we need to batch modify the file's extension, such as changing the .txt file to a .md file. The following code can be used:

import os
 
folder_path = "your_folder_path"  # Replace with the target directoryfiles = (folder_path)
 
for filename in files:
    if (".txt"):  # Modify only .txt files        old_path = (folder_path, filename)
        new_path = (folder_path, (".txt", ".md"))
        (old_path, new_path)
        print(f"{filename} -> {new_path}")
 
print("The extension is modified!")

In this example, we use the endswith method to filter out all .txt files and modify the extension using the replace method of the string.

4. Batch copy and move files

Copy the file to another directory

Use the copy function of the shutil module to copy files to another directory:

import shutil
 
source_file = ""
destination_folder = "backup/"
(source_file, destination_folder)
print("File copy is complete!")

In this example, we copy the file named "" to a directory named "backup/".

Batch copy of the entire folder

Sometimes we need to copy the entire folder and its content in batches, and we can use the copytree function of the shutil module:

import shutil
 
source_folder = "my_folder"
destination_folder = "backup_folder"
(source_folder, destination_folder)
print("Folder copy is complete!")

In this example, we copy the folder named "my_folder" and its contents into a directory named "backup_folder".

Move files to another folder

Use the shutil module move function to move files to another folder:

import shutil
 
source_file = ""
destination_folder = "moved_folder/"
(source_file, destination_folder)
print("File move is complete!")

In this example, we move the file named "" to a directory named "moved_folder/".

5. Organize files by type

Suppose we have a folder that stores various types of files (PDF, pictures, Excel, Word documents, etc.), and we can use Python to automatically organize these files by type.

import os
import shutil
 
folder_path = "your_folder_path"  # Replace with the target directory# Define the category directorycategories = {
    "picture": [".jpg", ".png", ".jpeg", ".gif"],
    "document": [".pdf", ".docx", ".txt", ".xlsx"],
    "video": [".mp4", ".avi", ".mkv"],
    "music": [".mp3", ".wav"]
}
 
# traverse files in foldersfor filename in (folder_path):
    file_path = (folder_path, filename)
    if (file_path):  # Only process files, ignore folders        file_ext = (filename)[1].lower()  # Get file extension        # Match Category        for category, extensions in ():
            if file_ext in extensions:
                category_folder = (folder_path, category)
                # If the classification folder does not exist, create it                (category_folder, exist_ok=True)
                # Move files to the classification folder                (file_path, (category_folder, filename))
                break  # Find the matching classification and jump out of the loop

In this example, we first define a classification dictionary category, where the key is the category name and the value is the list of file extensions corresponding to the category. Then iterate through the files in the target folder and move them to the corresponding classification folder according to the file extension.

6. Practical cases: Automatically clean up duplicate files

There may be many duplicate files in your computer occupying storage space. Python can quickly find and delete duplicate files. First calculate the hash value of the file, and then compare the hash value to determine whether the file is duplicated.

import hashlib
import os
 
def get_file_hash(file_path):
    hash_object = hashlib.sha256()
    with open(file_path, 'rb') as f:
        while True:
            data = (8192)
            if not data:
                break
            hash_object.update(data)
    return hash_object.hexdigest()
 
folder_path = "your_folder_path"  # Replace with your folder pathfile_hashes = {}
duplicate_files = []
 
for root, dirs, files in (folder_path):
    for file in files:
        file_path = (root, file)
        file_hash = get_file_hash(file_path)
        if file_hash in file_hashes:
            duplicate_files.append(file_path)
        else:
            file_hashes[file_hash] = file_path
 
# Delete duplicate filesfor file in duplicate_files:
    (file)
    print(f"Deleted files: {file}")

In this practical case, we first define a get_file_hash function, which accepts a file path as a parameter and returns the SHA-256 hash value of the file. We then iterate through all files in the specified folder, calculate the hash value for each file, and use a dictionary file_hashes to store the mapping of hash value to the file path. If the same hash value already exists in the dictionary, it means that the duplicate file is found, and we add its path to the duplicate_files list. Finally, we loop through the duplicate_files list, delete all duplicate files, and print out the deleted file path.

7. Summary

This article introduces how to use Python for common tasks such as file operation, directory management, batch renaming, copy/move files, and delete files. By combining modules such as os, shutil and pathlib, we can easily realize batch processing and automated management of files, greatly improving file management efficiency.

The above is the detailed content shared by batch file processing and automation management techniques in Python. For more information about Python file processing, please pay attention to my other related articles!