Automated organization of computer files
Automatic file categorization, quick finding of files and folders, cleaning of duplicate files, conversion of image formats and other common tasks are done through Python programming.
1. Automatic classification of documents
Categorize and organize files into different folders according to their extensions.
Using os and shutil modules
The os module provides a number of functions for manipulating files and folders, allowing you to create, delete, view attributes and find paths to files or folders.
The shutil module provides functions to move, copy, compress, etc. files or folders.
""" The os module provides a number of functions for manipulating files and folders, which allow you to create, delete, view attributes, and find paths to files or folders. The shutil module provides functions to move, copy, and compress files and folders. """ import os import shutil # Directory of source files src_dir = "Documents to be classified/" # Directory of the output file output_dir = 'Classified documents/' files = (src_dir) # List the names of all files and subfolders in the src_dir directory print(files) for f in files: # Get the path src_path = src_dir + f # Determine if it's a file if (src_path): # Get the file suffix, splice it with the output directory to form the output folder path output_path = output_dir + ('.')[-1] # Determine if the output folder exists, if not it needs to be created if not (output_path): (output_path) # Move the file to the folder corresponding to its extension in the output directory (src_path, output_path)
Test file:
Effect:
Using the pathlib module
from pathlib import Path # Directory of source files src_dir_name = "Documents to be classified/" # Directory of the output file output_dir_name = 'Classified documents/' # Use the Path() function to create path objects for the source and destination folders src_dir = Path(src_dir_name) output_dir = Path(output_dir_name) # Find files and subfolders in the source folder, * means return all files and subfolders (full path) files = src_dir.glob('*') for f in files: # Determine if a path represents a file if f.is_file(): # Get the output folder path output_path = output_dir / ('.') # Determine if the output folder exists if not output_path.exists(): # Create if not present, parents is True for creating multi-level folders output_path.mkdir(parents=True) # Rename a file path to a given path to enable file movement (output_path / )
2. Quick search of files and folders
Write fast find file and folder programs using python for exact and fuzzy find.
Find files and folders with precision
from pathlib import Path while True: folder = input("Please enter the path to the search directory (e.g., :D:\\\):") folder = Path(()) # Use the Path() function to create a path object # Determine if the input path exists and if it is a directory if () and folder.is_dir(): break else: print("The path entered was incorrect, please re-enter!") search_word = input("Please enter the name of the file or folder to be found:").strip() # Get the name of the file or folder entered, stripping out the leading and trailing spaces """ glob()functions andrglob()Difference between functions: glob()functions andrglob()functions can use wildcards to find files and subfolders in a given path.。 The difference is that: glob()function only performs as well as a lookup,(indicates contrast)rglob()The function does a multilevel lookup。 """ # Use the rglob() function to find files and folders with names identical to the specified keywords in the path entered by the user and convert the results into a list. results = list((pattern=search_word)) if len(results) != 0: print(f'exist【{folder}】The following results were found under:') for r in results: print(r) else: print(f'exist【{folder}】I did not find a file named【{search_word}】file or folder!')
Effect:
Fuzzy Find Files and Folders
# author:mlnt # createdate:2022/8/23 from pathlib import Path while True: folder = input("Please enter the path to the search directory (e.g., :D:\\\):") folder = Path(()) # Use the Path() function to create a path object # Determine if the input path exists and if it is a directory if () and folder.is_dir(): break else: print("The path entered was incorrect, please re-enter!") search_word = input("Please enter the name of the file or folder to be found:").strip() # Get the name of the file or folder entered, stripping out the leading and trailing spaces """ glob()functions andrglob()Difference between functions: glob()functions andrglob()functions can use wildcards to find files and subfolders in a given path.。 The difference is that: glob()function only performs as well as a lookup,(indicates contrast)rglob()The function does a multilevel lookup。 """ # Use the rglob() function to find files and folders with names identical to the specified keywords in the path entered by the user and convert the results into a list. results = list((pattern=f'*{search_word}*')) if len(results) == 0: print(f'exist【{folder}】No names were found under the【{search_word}】file or folder!') else: result_folders = [] # Keyword-related folders found result_files = [] # Documents related to keywords for r in results: if r.is_dir(): # If directory (folder), add to folder list result_folders.append(r) else: result_files.append(r) if len(result_folders) != 0: print(f'exist【{folder}】The keywords found under the{search_word}Related folders:') for f in result_folders: print(f) if len(result_files) != 0: print(f'exist【{folder}】The keywords found under the{search_word}The relevant documents are as follows:') for f in result_files: print(f)
Effect:
3. Automatic clean-up of duplicate documents
Steps to implement automatic file cleanup:
1. List all files in the specified folder;
2. Compare the contents of the documents two by two;
3. If the contents are the same, move one of the files to the specified folder
""" Steps to implement automatic file cleanup: 1. list all files under the specified folder; 2. compare the contents of the files two by two to see if they are the same; 3. if the content is the same, move one of the files to the specified folder """ # Import the Path() function from the pathlib module from pathlib import Path # Import the cmp() function from the filecmp module to do file comparisons from filecmp import cmp input_dir = 'Documents to be processed' output_dir = 'Duplicate documents' # Create the Path object src_folder = Path(input_dir) output_folder = Path(output_dir) # Determine if the output directory exists if not output_folder.exists(): # Create directories if they don't exist (multi-level creation) output_folder.mkdir(parents=True) results = list(src_folder.glob('*')) # List files and subfolders in a given directory file_list = [] for r in results: # Determine if a path points to a file if r.is_file(): # Yes add to file list file_list.append(r) # Iterate through the list of documents and compare for i in file_list: for j in file_list: if i != j and () and (): # Compare two files to see if they are the same if cmp(i, j): # If two files are the same, move one of them to the specified folder # Delete duplicate files() (output_folder / )
Test file:
Effect:
4. Batch conversion of image formats
from pathlib import Path from PIL import Image input_dir = 'input_images' output_dir = 'output_images' # Create the Path object src_folder = Path(input_dir) output_folder = Path(output_dir) # Determine if the output directory exists if not output_folder.exists(): # Create directories if they don't exist (multi-level creation) output_folder.mkdir(parents=True) file_list = list(src_folder.glob('*[.jpg|.jpeg]')) # Find images with a jpg or jpeg suffix for f in file_list: output_file = output_folder / # Replace the path extension output_file = output_file.with_suffix('.png') # Save the image to the specified path (f).save(output_file) print(f'{}-->Format conversion complete!')
Test file:
Effect:
5. Automatic sorting of pictures by date of shooting
The exifread module needs to be installed:
pip install exifread
Steps:
1. List all the pictures in the specified folder;
2. Read the EXIF (Exchangeable Image File Format) information of the picture and extract the shooting date;
3. Convert the shooting date to the desired format and then create a folder using the shooting date;
4. Move the pictures to the folder corresponding to the shooting date.
""" Steps: 1.List all the pictures under the specified folder; 2. read the EXIF (Exchangeable Image File Format) information of the pictures and extract the shooting date; pip install exifread 3. Convert the shooting date to the desired format, and then create a folder with the shooting date; 4. Move the pictures to the folder corresponding to the shooting date. """ from pathlib import Path from datetime import datetime from exifread import process_file input_dir = 'input_images' output_dir = 'output_dir' # Create the Path object src_folder = Path(input_dir) output_folder = Path(output_dir) # Determine if the output directory exists if not output_folder.exists(): # Create directories if they don't exist (multi-level creation) output_folder.mkdir(parents=True) # Find images with a jpg or jpeg suffix file_list = list(src_folder.glob('*[.jpg|.jpeg]')) for f in file_list: with open(f, 'rb') as fp: # Read the EXIF information of an image # process_file function returns the read EXIF information in dictionary format tags = process_file(fp, details=False) # Determine if a shooting date is in the dictionary if 'EXIF DateTimeOriginal' in (): dto = str(tags['EXIF DateTimeOriginal']) # Convert shooting date to desired format as folder name folder_name = (dto, '%Y:%m:%d %H:%M:%S').strftime('%Y-%m-%d') # Set the path to the output directory output_path = output_folder / folder_name if not output_path.exists(): output_path.mkdir(parents=True) # Move pictures to the folder corresponding to the date they were taken (output_path / )
Test file:
Effect:
Above is Python to achieve automation of organizing files of the sample code in detail, more information about Python organizing files please pay attention to my other related articles!