Preface
In daily data processing and analysis work, Excel files are one of the most common formats we deal with. With Python we can automate all kinds of operations on Excel files and improve efficiency.
This article shares 20 practical Excel automation scripts to help beginners master these skills more easily.
1. Batch fill Excel cells
import pandas as pd

# Batch fill the cells of the specified column
def fill_column(file_path, column_name, value):
    df = pd.read_excel(file_path)
    df[column_name] = value  # Fill all cells of the specified column with value
    df.to_excel(file_path, index=False)

fill_column('', 'Remark', 'Processed')  # the example path was omitted in the original; supply your own .xlsx file
print("The Remark column has been populated successfully!")
Explanation
This script fills every cell of the specified column (here, the 'Remark' column) with the value 'Processed'. When processing large amounts of data, it is common to need a whole column marked uniformly, and this script does exactly that.
2. Set row height and column width
from openpyxl import load_workbook

# Set the row height and column width of an Excel sheet
def set_row_column_size(file_path):
    wb = load_workbook(file_path)
    ws = wb.active
    # Set the first row's height and the first column's width
    ws.row_dimensions[1].height = 30      # Set the row height
    ws.column_dimensions['A'].width = 20  # Set the column width
    wb.save(file_path)

set_row_column_size('')  # the example path was omitted in the original
print("The row height and column width are set successfully!")
Explanation
This script sets the row height of the first row and the column width of the first column in an Excel file. Adjusting row heights and column widths appropriately improves the readability of a table, especially when cells hold long or complex content, and makes reports cleaner and easier to read.
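If you prefer widths that track the data instead of a fixed value, a small extension of the same idea (a sketch, not part of the original script) is to measure the longest value in each column and size the column accordingly; the 2-character padding below is an arbitrary choice.

from openpyxl import load_workbook
from openpyxl.utils import get_column_letter

def autofit_columns(file_path):
    wb = load_workbook(file_path)
    ws = wb.active
    for col_idx, column_cells in enumerate(ws.columns, start=1):
        # Width roughly follows the longest cell text in the column, plus padding
        longest = max((len(str(cell.value)) for cell in column_cells if cell.value is not None), default=0)
        ws.column_dimensions[get_column_letter(col_idx)].width = longest + 2
    wb.save(file_path)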
3. Delete rows based on a condition
import pandas as pd

# Delete rows in Excel according to a condition
def delete_rows_based_on_condition(file_path, column_name, condition):
    df = pd.read_excel(file_path)
    df = df[df[column_name] != condition]  # Keep only rows that do not match the condition
    df.to_excel(file_path, index=False)

delete_rows_based_on_condition('', 'state', 'invalid')  # the example path was omitted in the original
print("The rows that meet the condition have been deleted!")
Explanation
This script removes from the Excel file every row whose 'state' column equals 'invalid'. This operation is very common during data cleaning; it reduces noise in the data set and improves the accuracy of later analysis.
4. Create a new Excel worksheet
from openpyxl import load_workbook

# Create a new worksheet in an existing Excel file
def create_new_sheet(file_path, sheet_name):
    wb = load_workbook(file_path)
    wb.create_sheet(title=sheet_name)  # Create a new worksheet
    wb.save(file_path)

create_new_sheet('', 'New worksheet')  # the example path was omitted in the original
print("New worksheet was created successfully!")
Explanation
This script creates a new worksheet inside an existing Excel file. This is very useful for organizing data: separating data from different tasks or projects keeps the file structure clear.
5. Import CSV files to Excel
import pandas as pd

# Import a CSV file into an Excel worksheet
def import_csv_to_excel(csv_file, excel_file):
    df = pd.read_csv(csv_file)
    df.to_excel(excel_file, index=False)

import_csv_to_excel('', 'imported_data.xlsx')  # the example CSV path was omitted in the original
print("The CSV file was successfully imported into Excel!")
Explanation
This script imports a CSV file into Excel. Data is often delivered in CSV format, and this script converts it to Excel format in one call for subsequent analysis and processing.
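CSV files from different systems vary in delimiter and encoding, and pd.read_csv exposes parameters for both. A minimal sketch of that variation (the file name and parameter values are only examples):

import pandas as pd

df = pd.read_csv('export.csv', sep=';', encoding='utf-8-sig')  # semicolon-separated, Excel-style UTF-8 with BOM
df.to_excel('imported_data.xlsx', index=False)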
6. Pivot table generation
import pandas as pd

# Generate a pivot table and save it to a new Excel file
def generate_pivot_table(file_path, index_column, values_column, output_file):
    df = pd.read_excel(file_path)
    pivot_table = df.pivot_table(index=index_column, values=values_column, aggfunc='sum')  # Summarize by sum
    pivot_table.to_excel(output_file)

generate_pivot_table('sales_data.xlsx', 'area', 'Sales', 'pivot_output.xlsx')
print("Pivot table generated successfully!")
Explanation
This script builds a pivot table that sums the 'Sales' column grouped by the 'area' column and saves it to a new file. In business analysis, a pivot table quickly summarizes data along different dimensions.
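pivot_table also accepts several index columns and several aggregation functions at once, which is what summarizing "along different dimensions" means in practice. A hedged sketch, assuming the sales file also contains a 'product' column (that column name is illustrative, not from the original):

import pandas as pd

df = pd.read_excel('sales_data.xlsx')
pivot = df.pivot_table(index=['area', 'product'],   # two grouping dimensions
                       values='Sales',
                       aggfunc=['sum', 'mean'])     # total and average per group
pivot.to_excel('pivot_multi.xlsx')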
7. Format Excel
from openpyxl import load_workbook
from openpyxl.styles import Font, Color

# Set the font style of Excel cells
def format_cells(file_path):
    wb = load_workbook(file_path)
    ws = wb.active
    for cell in ws['A']:  # Traverse column A
        cell.font = Font(bold=True, color="FF0000")  # Set the font to bold and red
    wb.save(file_path)

format_cells('')  # the example path was omitted in the original
print("Cell formatting applied successfully!")
Explanation
This script sets the font of column A to bold red. This kind of formatting is often used to emphasize specific data and make a report more visually appealing.
8. Analyze and output descriptive statistics
import pandas as pd

# Output descriptive statistics to Excel
def descriptive_statistics(file_path, output_file):
    df = pd.read_excel(file_path)
    stats = df.describe()  # Calculate descriptive statistics
    stats.to_excel(output_file)

descriptive_statistics('', 'statistics_output.xlsx')  # the example input path was omitted in the original
print("Descriptive statistics output succeeded!")
Explanation
This script calculates descriptive statistics (mean, standard deviation, quartiles, and so on) for an Excel file and saves the results to a new Excel file. This is very useful for understanding the basic characteristics of the data, especially in the early stages of analysis.
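Note that df.describe() summarizes only numeric columns by default; passing include='all' also reports counts, unique values and the most frequent entry for text columns. A small variation of the script above (the input path is a placeholder):

import pandas as pd

df = pd.read_excel('data.xlsx')          # placeholder path
stats_all = df.describe(include='all')   # numeric stats plus count/unique/top/freq for text columns
stats_all.to_excel('statistics_all_columns.xlsx')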
9. Batch rename Excel files
import os

# Batch rename Excel files in the specified directory
def rename_excel_files(directory, prefix):
    for filename in os.listdir(directory):
        if filename.endswith('.xlsx'):
            new_name = f"{prefix}_{filename}"
            os.rename(os.path.join(directory, filename), os.path.join(directory, new_name))
            print(f"Renamed {filename} to {new_name}")

rename_excel_files('/path/to/excel/files', '2024')
Explanation
This script batch-renames all Excel files in the specified directory, adding a prefix to each file name. This kind of batch operation is very convenient when many Excel files have to be processed, for example naming files by year or project for easier management and archiving.
10. Automatically send emails containing Excel data
import os
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.application import MIMEApplication
from email.mime.text import MIMEText

# Automatically send an email with an Excel attachment
def send_email(to_address, subject, body, excel_file):
    from_address = "your_email@"   # the full sender address was omitted in the original
    password = "your_password"
    msg = MIMEMultipart()
    msg['From'] = from_address
    msg['To'] = to_address
    msg['Subject'] = subject
    # Add the message body
    msg.attach(MIMEText(body, 'plain'))
    # Add the Excel attachment
    with open(excel_file, "rb") as attachment:
        part = MIMEApplication(attachment.read(), Name=os.path.basename(excel_file))
        part['Content-Disposition'] = f'attachment; filename="{os.path.basename(excel_file)}"'
        msg.attach(part)
    # Send the email
    with smtplib.SMTP('', 587) as server:   # the SMTP server address was omitted in the original
        server.starttls()
        server.login(from_address, password)
        server.send_message(msg)

# recipient address and attachment path were also omitted in the original
send_email('recipient@', 'Monthly Report', 'Please find attached the monthly report.', '')
print("The email was sent successfully!")
Explanation
This script uses the SMTP protocol to automatically send an email with an Excel file attached. This is especially useful at work, for example for sending financial statements or performance reports to the relevant people every month. Automated mailing saves time and reduces human error.
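One practical caveat: hardcoding the password in the script is risky if the file is shared. A common alternative (a sketch, not part of the original script) is to read the credentials from environment variables; the variable names below are chosen arbitrarily.

import os

# Hypothetical variable names; set them in the shell before running the script
from_address = os.environ.get("REPORT_MAIL_USER")
password = os.environ.get("REPORT_MAIL_PASSWORD")
if not from_address or not password:
    raise RuntimeError("Mail credentials are not set in the environment")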
11. Merge multiple Excel files
import os
import pandas as pd

def merge_excel_files(folder_path, output_file):
    all_data = pd.DataFrame()
    for filename in os.listdir(folder_path):
        if filename.endswith('.xlsx'):
            file_path = os.path.join(folder_path, filename)
            df = pd.read_excel(file_path)
            all_data = pd.concat([all_data, df], ignore_index=True)
    all_data.to_excel(output_file, index=False)

merge_excel_files('your_folder_path', 'merged_file.xlsx')
print("Multiple Excel files merged successfully!")
Explanation
This script merges all Excel files in the specified folder into one file. When data is scattered across multiple files, this function pulls it together for unified analysis.
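When merged rows later need to be traced back to their original file, a small variation (a sketch under the same folder layout; the 'source_file' column name is my own) records the file name in an extra column before concatenating:

import os
import pandas as pd

def merge_with_source(folder_path, output_file):
    frames = []
    for filename in os.listdir(folder_path):
        if filename.endswith('.xlsx'):
            df = pd.read_excel(os.path.join(folder_path, filename))
            df['source_file'] = filename  # remember where each row came from
            frames.append(df)
    pd.concat(frames, ignore_index=True).to_excel(output_file, index=False)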
12. Split Excel file
import os
import pandas as pd

def split_excel_file(file_path, column_name, output_folder):
    df = pd.read_excel(file_path)
    unique_values = df[column_name].unique()
    for value in unique_values:
        sub_df = df[df[column_name] == value]
        output_file = os.path.join(output_folder, f'{value}.xlsx')
        sub_df.to_excel(output_file, index=False)

split_excel_file('', 'department', 'output_folder')  # the example input path was omitted in the original
print("Excel file split successfully!")
Explanation
This script splits an Excel file into multiple files based on the unique values of the specified column. For example, splitting the data by the 'department' column produces one file per department, so each department can view and process its own data independently.
13. Replace the cell content
import pandas as pd

def replace_cell_content(file_path, column_name, old_value, new_value):
    df = pd.read_excel(file_path)
    df[column_name] = df[column_name].replace(old_value, new_value)
    df.to_excel(file_path, index=False)

replace_cell_content('', 'Product Name', 'Old products', 'New Products')  # the example path was omitted in the original
print("Cell content replacement was successful!")
Explanation
This script replaces a specific value in the specified column with a new one. It is handy for correcting errors or updating outdated information in the data.
14. Sort the data
import pandas as pd

def sort_excel_data(file_path, column_name, ascending=True):
    df = pd.read_excel(file_path)
    df = df.sort_values(by=column_name, ascending=ascending)
    df.to_excel(file_path, index=False)

sort_excel_data('', 'Sales', ascending=False)  # the example path was omitted in the original
print("The data was sorted successfully!")
Explanation
This script sorts the data in an Excel file by the specified column, in either ascending or descending order, and saves the sorted data back to the original file. Sorting is very common in data processing and analysis; for example, sorting sales data in descending order by sales quickly surfaces the records with the highest sales.
15. Count the unique values in a specific column
import pandas as pd

def count_unique_values(file_path, column_name):
    df = pd.read_excel(file_path)
    unique_count = df[column_name].nunique()
    print(f"The number of unique values in the {column_name} column is: {unique_count}")

count_unique_values('', 'Customer number')  # the example path was omitted in the original
Explanation
This script counts the number of unique values in the specified column of an Excel file. In data analysis, knowing how many distinct values a column holds helps you quickly grasp the shape of the data; counting unique customer numbers, for example, tells you how many different customers there are.
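If you want the full distribution rather than just the count, pandas' value_counts() lists how often each distinct value occurs. A minimal sketch, assuming the same 'Customer number' column (the input path is a placeholder):

import pandas as pd

df = pd.read_excel('data.xlsx')                # placeholder path
counts = df['Customer number'].value_counts()  # occurrences per distinct value, most frequent first
counts.to_excel('customer_number_counts.xlsx')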
16. Extract specified columns to a new Excel file
import pandas as pd

def extract_columns(file_path, columns, output_file):
    df = pd.read_excel(file_path)
    new_df = df[columns]
    new_df.to_excel(output_file, index=False)

extract_columns('', ['Name', 'age'], 'extracted_columns.xlsx')  # the example input path was omitted in the original
print("The specified columns were extracted successfully!")
Explanation
This script extracts the specified columns from an Excel file and saves them to a new Excel file. When only part of the data is needed, this quickly filters out the required columns and avoids processing a large amount of irrelevant information.
17. Add borders to Excel tables
from openpyxl import load_workbook
from openpyxl.styles import Border, Side

def add_border_to_excel(file_path):
    wb = load_workbook(file_path)
    ws = wb.active
    thin_border = Border(left=Side(style='thin'), right=Side(style='thin'),
                         top=Side(style='thin'), bottom=Side(style='thin'))
    for row in ws.iter_rows():
        for cell in row:
            cell.border = thin_border
    wb.save(file_path)

add_border_to_excel('')  # the example path was omitted in the original
print("The table borders were added successfully!")
Explanation
This script adds a thin border to every cell in an Excel table. Borders make the table clearer and easier to read, especially when the data is printed or presented, and improve the look and professionalism of the table.
18. Check for blank rows in an Excel file and delete them
import pandas as pd

def remove_empty_rows(file_path):
    df = pd.read_excel(file_path)
    df = df.dropna(how='all')  # Drop rows whose columns are all empty
    df.to_excel(file_path, index=False)

remove_empty_rows('')  # the example path was omitted in the original
print("Blank rows deleted successfully!")
Explanation
This script checks the Excel file for rows in which every column is empty and deletes them. Blank rows can distort data processing and analysis; removing them helps ensure the integrity and accuracy of the data.
19. Filter data based on multiple conditions
import pandas as pd

def filter_data_by_multiple_conditions(file_path, conditions, output_file):
    df = pd.read_excel(file_path)
    query_str = ' & '.join([f'{col} {op} {val}' for col, op, val in conditions])
    filtered_df = df.query(query_str)
    filtered_df.to_excel(output_file, index=False)

# Example conditions: age greater than 25 and gender is female
conditions = [('age', '>', 25), ('gender', '==', "'female'")]
filter_data_by_multiple_conditions('', conditions, 'filtered_data.xlsx')  # the example input path was omitted in the original
print("Multi-condition filtering was successful!")
Explanation
This script filters Excel data by conditions on multiple columns and saves the result to a new file. In practice, data often has to satisfy several conditions at once, and this script makes such multi-condition filtering straightforward.
20. Format the date column in Excel
import pandas as pd

def format_date_column(file_path, column_name, date_format):
    df = pd.read_excel(file_path)
    df[column_name] = pd.to_datetime(df[column_name]).dt.strftime(date_format)
    df.to_excel(file_path, index=False)

format_date_column('', 'date', '%Y-%m-%d')  # the example path was omitted in the original
print("Date column formatting succeeded!")
Explanation
This script formats the specified date column of an Excel file. Different business requirements may call for different date formats; this script converts the date column into the required format, which simplifies subsequent analysis and presentation.
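If the column can contain badly formatted dates, pd.to_datetime raises an error by default; passing errors='coerce' turns unparseable values into NaT so the rest of the column still converts. A minimal sketch under that assumption (the file names are placeholders):

import pandas as pd

df = pd.read_excel('data.xlsx')                        # placeholder path
parsed = pd.to_datetime(df['date'], errors='coerce')   # invalid entries become NaT instead of raising
df['date'] = parsed.dt.strftime('%Y-%m-%d')
df.to_excel('data_formatted.xlsx', index=False)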
Summary
That concludes this article on 20 practical Python Excel automation scripts. For more on Python Excel automation, please search my previous articles or continue browsing the related articles below. I hope you will keep supporting me!