In daily office or data processing, we often need to process large amounts of files, such as batch renaming, copying, deleting or sorting files by type. Performing these tasks manually is not only time consuming, but also error-prone. Fortunately, Python provides powerful modules such as os, shutil and pathlib, which can help us easily implement batch file processing and automated management. This article will use specific cases and combined with concise and clear code to introduce how to use Python for common tasks such as file operation, directory management, batch renaming, copying/moving files, and deleting files to improve your file management efficiency.
1. Basics of file operation
Read file content
Reading file content is one of the most basic file operations. Here is an example of reading the contents of a text file:
with open("", "r", encoding="utf-8") as file: content = () print(content)
In this example, we use the open function to open a file named "" in read-only mode ("r") and read its contents. Using the with statement ensures that the file is closed correctly after the operation is completed.
Write to a file
Writing to a file is just as simple. Here is an example of writing strings to a text file:
with open("", "w", encoding="utf-8") as file: ("Hello, Python file processing!")
In this example, we use the open function to open the file in write mode ("w"). If the file does not exist, it will be created; if the file already exists, its contents will be emptied.
Append content to file
Sometimes we need to append content at the end of the file instead of overwriting the original content. At this time, you can use the append mode ("a"):
with open("", "a", encoding="utf-8") as file: ("\nThis is a new line of content!")
2. Directory Management
Get the current working directory
Use the getcwd function of the os module to get the current working directory:
import os current_dir = () print(f"Current working directory: {current_dir}")
List all files and folders in the directory
Use the listdir function of the os module to list all files and folders in the specified directory:
folder_path = "your_folder_path" # Replace with the target directoryfiles = (folder_path) print(files)
Create a new directory
Use the makedirs function of the os module to create a new directory. If the directory already exists, you can avoid throwing an exception by setting exist_ok=True:
("new_folder", exist_ok=True)
3. Batch rename files
Batch renaming files is one of the common tasks in file processing. The following are several common batch renaming operations.
Unified rename files (add prefix)
Suppose we have files in a folder that need to be prefixed uniformly, we can use the following code:
import os folder_path = "your_folder_path" # Replace with the target directoryfiles = (folder_path) for index, filename in enumerate(files, start=1): old_path = (folder_path, filename) new_filename = f"new_{index}.txt" # Add prefix new_ new_path = (folder_path, new_filename) (old_path, new_path) print(f"{filename} -> {new_filename}") print("Batch renaming is completed!")
In this example, we enumerate the file list using the enumerate function and prefix "new_" to each file. Then use the function to rename the file.
Modify file extension
Sometimes we need to batch modify the file's extension, such as changing the .txt file to a .md file. The following code can be used:
import os folder_path = "your_folder_path" # Replace with the target directoryfiles = (folder_path) for filename in files: if (".txt"): # Modify only .txt files old_path = (folder_path, filename) new_path = (folder_path, (".txt", ".md")) (old_path, new_path) print(f"{filename} -> {new_path}") print("The extension is modified!")
In this example, we use the endswith method to filter out all .txt files and modify the extension using the replace method of the string.
4. Batch copy and move files
Copy the file to another directory
Use the copy function of the shutil module to copy files to another directory:
import shutil source_file = "" destination_folder = "backup/" (source_file, destination_folder) print("File copy is complete!")
In this example, we copy the file named "" to a directory named "backup/".
Batch copy of the entire folder
Sometimes we need to copy the entire folder and its content in batches, and we can use the copytree function of the shutil module:
import shutil source_folder = "my_folder" destination_folder = "backup_folder" (source_folder, destination_folder) print("Folder copy is complete!")
In this example, we copy the folder named "my_folder" and its contents into a directory named "backup_folder".
Move files to another folder
Use the shutil module move function to move files to another folder:
import shutil source_file = "" destination_folder = "moved_folder/" (source_file, destination_folder) print("File move is complete!")
In this example, we move the file named "" to a directory named "moved_folder/".
5. Organize files by type
Suppose we have a folder that stores various types of files (PDF, pictures, Excel, Word documents, etc.), and we can use Python to automatically organize these files by type.
import os import shutil folder_path = "your_folder_path" # Replace with the target directory# Define the category directorycategories = { "picture": [".jpg", ".png", ".jpeg", ".gif"], "document": [".pdf", ".docx", ".txt", ".xlsx"], "video": [".mp4", ".avi", ".mkv"], "music": [".mp3", ".wav"] } # traverse files in foldersfor filename in (folder_path): file_path = (folder_path, filename) if (file_path): # Only process files, ignore folders file_ext = (filename)[1].lower() # Get file extension # Match Category for category, extensions in (): if file_ext in extensions: category_folder = (folder_path, category) # If the classification folder does not exist, create it (category_folder, exist_ok=True) # Move files to the classification folder (file_path, (category_folder, filename)) break # Find the matching classification and jump out of the loop
In this example, we first define a classification dictionary category, where the key is the category name and the value is the list of file extensions corresponding to the category. Then iterate through the files in the target folder and move them to the corresponding classification folder according to the file extension.
6. Practical cases: Automatically clean up duplicate files
There may be many duplicate files in your computer occupying storage space. Python can quickly find and delete duplicate files. First calculate the hash value of the file, and then compare the hash value to determine whether the file is duplicated.
import hashlib import os def get_file_hash(file_path): hash_object = hashlib.sha256() with open(file_path, 'rb') as f: while True: data = (8192) if not data: break hash_object.update(data) return hash_object.hexdigest() folder_path = "your_folder_path" # Replace with your folder pathfile_hashes = {} duplicate_files = [] for root, dirs, files in (folder_path): for file in files: file_path = (root, file) file_hash = get_file_hash(file_path) if file_hash in file_hashes: duplicate_files.append(file_path) else: file_hashes[file_hash] = file_path # Delete duplicate filesfor file in duplicate_files: (file) print(f"Deleted files: {file}")
In this practical case, we first define a get_file_hash function, which accepts a file path as a parameter and returns the SHA-256 hash value of the file. We then iterate through all files in the specified folder, calculate the hash value for each file, and use a dictionary file_hashes to store the mapping of hash value to the file path. If the same hash value already exists in the dictionary, it means that the duplicate file is found, and we add its path to the duplicate_files list. Finally, we loop through the duplicate_files list, delete all duplicate files, and print out the deleted file path.
7. Summary
This article introduces how to use Python for common tasks such as file operation, directory management, batch renaming, copy/move files, and delete files. By combining modules such as os, shutil and pathlib, we can easily realize batch processing and automated management of files, greatly improving file management efficiency.
The above is the detailed content shared by batch file processing and automation management techniques in Python. For more information about Python file processing, please pay attention to my other related articles!