This document summarizes the key Python concepts used in the script. The script extracts data in a specific format from multiple Excel files and writes it to a new Excel file.
Import libraries
The script uses the following main libraries:
- tkinter: Used to create a graphical user interface.
- pandas: Used to process Excel data.
- os: Used to work with file and directory paths.
import tkinter as tk
from tkinter import filedialog, messagebox
import pandas as pd
import os
Pandas Data Processing
Read Excel files
Use the pd.read_excel function to read Excel files, passing sheet_name=None to read all worksheets. With sheet_name=None the return value is a dictionary that maps each worksheet name to its DataFrame. Passing index_col=None ensures that the first column is not automatically set as the index column.
source_df = pd.read_excel(file_path, sheet_name=None, index_col=None)
source_data = source_df['One case']
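To make the dictionary return value concrete, here is a minimal sketch that writes a small two-sheet workbook to a temporary folder and reads it back. The second sheet name, the cell values and the temporary path are made up for illustration, and writing .xlsx files assumes an Excel engine such as openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Build a small two-sheet workbook so the example is self-contained.
# The sheet contents are hypothetical, not taken from the real files.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "example.xlsx")
with pd.ExcelWriter(file_path) as writer:
    pd.DataFrame({"A": [1, 2]}).to_excel(writer, sheet_name="One case", index=False)
    pd.DataFrame({"B": [3]}).to_excel(writer, sheet_name="Other", index=False)

# sheet_name=None returns a dict: {worksheet name: DataFrame}
sheets = pd.read_excel(file_path, sheet_name=None, index_col=None)
sheet_names = list(sheets)

# Pick one worksheet by name; a name that does not exist raises KeyError
source_data = sheets["One case"]
```

Because the lookup is by worksheet name, a workbook without a sheet called 'One case' will fail at this step, so it can be worth checking `sheet_names` first.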
Data Extraction
Use the pandas iloc indexer to extract specific values by row and column position.
result_data = {
    'Grid Number': source_data.iloc[1, 1],
    'Responsibility Section': source_data.iloc[1, 3],
    ...
}
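Positions passed to iloc are zero-based and are counted within the DataFrame, not within the spreadsheet. With the default header handling of pd.read_excel, the first spreadsheet row becomes the column labels, so iloc row 0 is the second spreadsheet row. The sketch below uses a hand-built DataFrame with hypothetical cell values to show the lookup itself.

```python
import pandas as pd

# Hypothetical sheet contents laid out as label/value pairs in one row.
source_data = pd.DataFrame([
    ["Title", None, None, None],
    ["Grid Number", "G-001", "Responsibility Section", "Section A"],
])

# iloc[row, column] is zero-based: [1, 1] is the second row, second column
result_data = {
    "Grid Number": source_data.iloc[1, 1],
    "Responsibility Section": source_data.iloc[1, 3],
}
```

Since the positions are hard-coded, this approach only works when every source file follows exactly the same layout.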
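Process merged cell data. The non-empty values in rows 9 to 18 of the second column are joined into one string, one value per line: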
risk_check_path = "\n".join(source_data.iloc[9:19, 1].dropna().astype(str))
result_data['5. Risk item point inspection path'] = risk_check_path
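The chain of dropna, astype(str) and join can be tried on a small Series. The values below are hypothetical stand-ins for a column slice in which some cells are empty.

```python
import pandas as pd

# Hypothetical column slice: empty cells are read as missing values
column = pd.Series(["Gate", None, "Pump room", None, 3])

# dropna() removes the empty cells, astype(str) turns every remaining
# value into a string so that str.join accepts it, and "\n" puts each
# entry on its own line.
joined = "\n".join(column.dropna().astype(str))
```

Without astype(str), a numeric cell such as the 3 above would make str.join raise a TypeError.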
Create a DataFrame and export it as an Excel file
Put all the extracted data into a DataFrame and use the to_excel method to export it as an Excel file.
result_df = pd.DataFrame(all_data)
result_df.to_excel(output_file_path, index=False)
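As a small illustration of how a list of dictionaries becomes a table, the sketch below builds a DataFrame from two hypothetical result dictionaries. Each dictionary becomes one row and the keys become the column names.

```python
import pandas as pd

# Hypothetical per-file results; each dict becomes one row
all_data = [
    {"Grid Number": "G-001", "Responsibility Section": "Section A"},
    {"Grid Number": "G-002"},  # a missing key becomes NaN in that column
]

result_df = pd.DataFrame(all_data)
columns = list(result_df.columns)
```

When the DataFrame is exported, index=False keeps the automatically generated row numbers out of the output file.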
Tkinter GUI
Create the main window
Use tk.Tk() to create the main window and set the window title, size and position.
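root = tk.Tk()
root.title("Excel Conversion Tool")
# window_width, window_height, position_right and position_top are defined elsewhere in the script
root.geometry(f'{window_width}x{window_height}+{position_right}+{position_top}')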
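The snippet does not show how the width, height and position values are computed. One common approach, assumed here rather than taken from the script, is to center the window on the screen. The helper name centered_geometry and the sample screen size are hypothetical.

```python
def centered_geometry(screen_width, screen_height, window_width, window_height):
    """Return a Tk geometry string 'WxH+X+Y' that centers the window."""
    position_right = (screen_width - window_width) // 2
    position_top = (screen_height - window_height) // 2
    return f"{window_width}x{window_height}+{position_right}+{position_top}"


# Example: a 400x300 window on a 1920x1080 screen
geometry = centered_geometry(1920, 1080, 400, 300)
```

In a running application the screen size would typically come from root.winfo_screenwidth() and root.winfo_screenheight(), and the resulting string would be passed to root.geometry.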
Create buttons and labels
Use tk.Label and tk.Button to create labels and buttons, and set their properties and layout.
title_label = tk.Label(root, text="Excel Conversion Tool", font=("Arial", 18))
title_label.pack(pady=20)
# command=select_files runs the conversion when the button is clicked
select_button = tk.Button(root, text="Select Excel File", command=select_files, font=("Arial", 12))
select_button.pack(pady=10)
File Operation
File dialog box
Use filedialog.askopenfilenames to open the file selection dialog box, which lets users select multiple Excel files. Use filedialog.asksaveasfilename to open the file save dialog box so that the user can choose the save path.
file_paths = filedialog.askopenfilenames(filetypes=[("Excel File", "*.xlsx")])
output_file_path = filedialog.asksaveasfilename(defaultextension=".xlsx", filetypes=[("Excel File", "*.xlsx")])
Main function explanation
transform_to_result_format_specific
This function extracts specific fields from the source data and returns the result as a dictionary.
def transform_to_result_format_specific(source_data, source_file_path):
    # Join the non-empty cells of rows 9 to 18 in the second column, one per line
    risk_check_path = "\n".join(source_data.iloc[9:19, 1].dropna().astype(str))
    result_data = { ... }
    return result_data
select_files
This function handles the main logic of file selection, data conversion and result saving.
def select_files():
    # Let the user pick one or more source workbooks
    file_paths = filedialog.askopenfilenames(filetypes=[("Excel File", "*.xlsx")])
    all_data = []
    for file_path in file_paths:
        # sheet_name=None loads every worksheet into a dict keyed by sheet name
        source_df = pd.read_excel(file_path, sheet_name=None, index_col=None)
        source_data = source_df['One case']
        transformed_data = transform_to_result_format_specific(source_data, file_path)
        all_data.append(transformed_data)
    result_df = pd.DataFrame(all_data)
    # Ask where to save; a cancelled dialog returns an empty value
    output_file_path = filedialog.asksaveasfilename(defaultextension=".xlsx", filetypes=[("Excel File", "*.xlsx")])
    if output_file_path:
        result_df.to_excel(output_file_path, index=False)
        messagebox.showinfo("success", "The file has been successfully converted and saved.")
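The function above mixes dialog handling with data processing, which makes it hard to try out without a GUI. As a sketch of one possible restructuring (not the script's own design), the loop can be moved into a plain function that also skips workbooks lacking the expected sheet. The names load_all_sheets, collect_rows, the "Source" key and the fake workbooks below are hypothetical.

```python
import pandas as pd


def load_all_sheets(file_path):
    """Read every worksheet of one workbook into a dict of DataFrames."""
    return pd.read_excel(file_path, sheet_name=None, index_col=None)


def collect_rows(file_paths, transform, sheet_name="One case", load=load_all_sheets):
    """Apply `transform` to the named sheet of each file.

    Returns (rows, skipped): the transformed dicts, and the paths of
    workbooks that do not contain the sheet.
    """
    rows, skipped = [], []
    for file_path in file_paths:
        sheets = load(file_path)
        if sheet_name not in sheets:
            skipped.append(file_path)
            continue
        rows.append(transform(sheets[sheet_name], file_path))
    return rows, skipped


# Usage with an in-memory stand-in for the loader (no real files needed)
fake_books = {
    "a.xlsx": {"One case": pd.DataFrame([["x", "G-001"]])},
    "b.xlsx": {"Other": pd.DataFrame([["y", "G-002"]])},
}
rows, skipped = collect_rows(
    fake_books,
    transform=lambda df, path: {"Grid Number": df.iloc[0, 1], "Source": path},
    load=fake_books.get,
)
```

Inside the GUI callback, the selected paths and transform_to_result_format_specific would be passed in, and the skipped list could be reported to the user in a message box.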
Summary
With this script, we learned how to use Pandas to read and process Excel data, how to create a graphical user interface using Tkinter, and how to handle file dialogs and file operations. These knowledge points are very useful in daily Python development, especially in projects involving data processing and user interfaces.