SoFunction
Updated on 2025-03-03

How to extract data in a specific format in Python

This document summarizes the key Python concepts used in the script. The script extracts data in a specific format from multiple Excel files and writes it to a new Excel file.

Import libraries

The script uses the following main libraries:

  • tkinter: Used to create a graphical user interface.
  • pandas: Used to process Excel data.
  • os: Used to process file and directory paths.

import tkinter as tk
from tkinter import filedialog, messagebox
import pandas as pd
import os

Pandas Data Processing

Read Excel files

Use pd.read_excel to read an Excel file. Passing sheet_name=None reads all worksheets and returns a dictionary that maps each sheet name to a DataFrame, so the result (named source_df in the script) is indexed by sheet name. Passing index_col=None ensures that the first column is not used as the index.

source_df = pd.read_excel(file_path, sheet_name=None, index_col=None)
source_data = source_df['One case']
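
To make the dictionary return value concrete, here is a minimal sketch that builds a two-sheet workbook in memory and reads it back. The sheet contents and the second sheet name "Notes" are made up for illustration, and the example assumes the openpyxl engine is installed.

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory so no file on disk is needed.
# The data and the "Notes" sheet are hypothetical.
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"A": [1, 2], "B": ["x", "y"]}).to_excel(
        writer, sheet_name="One case", index=False
    )
    pd.DataFrame({"C": [3]}).to_excel(writer, sheet_name="Notes", index=False)
buffer.seek(0)

# sheet_name=None returns a dict that maps each sheet name to its own DataFrame.
sheets = pd.read_excel(buffer, sheet_name=None, index_col=None)

sheet_names = list(sheets)
one_case = sheets["One case"]
print(sheet_names)
print(one_case)
```

Looking up a sheet that does not exist raises a KeyError, so it can be worth checking the available names in sheet_names first.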

Data Extraction

Use the pandas iloc indexer to extract specific values by row and column position.

result_data = {
    'Grid Number': source_data.iloc[1, 1],
    'Responsibility Section': source_data.iloc[1, 3],
    ...
}
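
As a small illustration of positional indexing, the sketch below applies iloc to a hand-built DataFrame. The cell values are hypothetical and only mimic a form-like layout in which labels sit next to their values.

```python
import pandas as pd

# Hypothetical form-like sheet: labels and values alternate across a row.
form = pd.DataFrame(
    [
        ["Title", None, None, None],
        ["Grid Number", "G-017", "Responsibility Section", "North"],
    ]
)

# iloc[row, column] is purely positional and zero-based.
grid_number = form.iloc[1, 1]
section = form.iloc[1, 3]
print(grid_number, section)
```

Keep in mind that with the default header=0, read_excel turns the first worksheet row into column labels, so iloc[1, 1] on such a DataFrame refers to the third worksheet row.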

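Handle merged-cell data by joining the non-empty values at row positions 9 to 18 of column position 1 into a single newline-separated string: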

risk_check_path = "\n".join(source_data.iloc[9:19, 1].dropna().astype(str))
result_data['5. Risk item point inspection path'] = risk_check_path
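
The same dropna and join pattern can be tried on a small Series. This is a sketch with invented values; it assumes that empty or merged-away cells arrive as NaN, which is what pandas typically produces for them.

```python
import pandas as pd

# Hypothetical column slice: NaN stands in for empty or merged-away cells.
column = pd.Series(["Gate A", float("nan"), "Pump room", float("nan"), 3])

# Drop the missing cells, convert the rest to text, and join with newlines.
joined = "\n".join(column.dropna().astype(str))
print(joined)
```

The astype(str) step matters because str.join only accepts strings, and a cell may hold a number.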

Create a DataFrame and export it as an Excel file

Put all the extracted data into a DataFrame and use the to_excel method to export it as an Excel file.

result_df = pd.DataFrame(all_data)
result_df.to_excel(output_file_path, index=False)
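
The following sketch shows how a list of dictionaries becomes a table, one row per dictionary. The records are invented for illustration; the third one deliberately omits a key to show what happens when a field is missing.

```python
import pandas as pd

# Hypothetical extracted records, one dict per source file.
all_data = [
    {"Grid Number": "G-017", "Responsibility Section": "North"},
    {"Grid Number": "G-018", "Responsibility Section": "South"},
    {"Grid Number": "G-019"},  # missing key becomes NaN in the table
]

# Dictionary keys become column names; each dict becomes one row.
result_df = pd.DataFrame(all_data)
print(result_df)
```

Passing index=False to to_excel, as the script does, keeps the automatic row index out of the output file.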

Tkinter GUI

Create the main window

Use tk.Tk() to create the main window, then set the window title, size and position.

root = tk.Tk()
root.title("Excel Conversion Tool")
root.geometry(f'{window_width}x{window_height}+{position_right}+{position_top}')
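
The article does not show how window_width, window_height, position_right and position_top are computed. One common choice is to center the window on the screen; the sketch below assumes that intent. The helper names and the 400x300 default size are hypothetical.

```python
def centered_geometry(screen_width, screen_height, window_width, window_height):
    """Return a Tk geometry string that centers the window on the screen."""
    position_right = (screen_width - window_width) // 2
    position_top = (screen_height - window_height) // 2
    return f"{window_width}x{window_height}+{position_right}+{position_top}"


def create_centered_window(window_width=400, window_height=300):
    """Create the main window centered on screen (needs a display to run)."""
    # Imported here so the geometry helper stays usable without a display.
    import tkinter as tk

    root = tk.Tk()
    root.title("Excel Conversion Tool")
    root.geometry(
        centered_geometry(
            root.winfo_screenwidth(),
            root.winfo_screenheight(),
            window_width,
            window_height,
        )
    )
    return root
```

For example, centered_geometry(1920, 1080, 400, 300) gives "400x300+760+390". Calling create_centered_window().mainloop() would then show the window.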

Create buttons and labels

Use tk.Button and tk.Label to create buttons and labels, and set their properties and layout.

title_label = tk.Label(root, text="Excel Conversion Tool", font=("Arial", 18))
title_label.pack(pady=20)
select_button = tk.Button(root, text="Select Excel File", command=select_files, font=("Arial", 12))
select_button.pack(pady=10)

File Operation

File dialog box

Use filedialog.askopenfilenames to open the file selection dialog box, which lets the user select multiple Excel files. Use filedialog.asksaveasfilename to open the file save dialog box, which lets the user choose the save path.

file_paths = filedialog.askopenfilenames(filetypes=[("Excel File", "*.xlsx")])
output_file_path = filedialog.asksaveasfilename(defaultextension=".xlsx", filetypes=[("Excel File", "*.xlsx")])

Main function explanation

transform_to_result_format_specific

This function extracts specific fields from the source data and returns the result as a dictionary.

def transform_to_result_format_specific(source_data, source_file_path):
    risk_check_path = "\n".join(source_data.iloc[9:19, 1].dropna().astype(str))
    result_data = { ... }
    return result_data

select_files

This function handles the main logic of file selection, data conversion and result saving.

def select_files():
    file_paths = filedialog.askopenfilenames(filetypes=[("Excel File", "*.xlsx")])
    all_data = []
    for file_path in file_paths:
        source_df = pd.read_excel(file_path, sheet_name=None, index_col=None)
        source_data = source_df['One case']
        transformed_data = transform_to_result_format_specific(source_data, file_path)
        all_data.append(transformed_data)
    result_df = pd.DataFrame(all_data)
    output_file_path = filedialog.asksaveasfilename(defaultextension=".xlsx", filetypes=[("Excel File", "*.xlsx")])
    if output_file_path:
        result_df.to_excel(output_file_path, index=False)
        messagebox.showinfo("success", "The file has been successfully converted and saved.")
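
The function above assumes that every selected workbook contains a sheet named 'One case'; if one does not, the dictionary lookup raises a KeyError. The sketch below is one possible way to separate the conversion logic from the GUI and skip such files. The names collect_rows and demo_transform and the sample data are hypothetical and not part of the original script.

```python
import pandas as pd

SHEET_NAME = "One case"  # sheet name used in the article's script


def collect_rows(workbooks, transform):
    """Build the result table from already-loaded workbooks.

    workbooks maps a file path to the dict returned by
    pd.read_excel(path, sheet_name=None); transform turns one sheet
    into a dict of extracted fields.
    """
    rows, skipped = [], []
    for path, sheets in workbooks.items():
        sheet = sheets.get(SHEET_NAME)
        if sheet is None:
            skipped.append(path)  # remember files that lack the sheet
            continue
        rows.append(transform(sheet, path))
    return pd.DataFrame(rows), skipped


def demo_transform(sheet, path):
    # Stand-in for transform_to_result_format_specific.
    return {"value": sheet.iloc[1, 1], "file": path}


# Hypothetical in-memory workbooks: one valid, one without the sheet.
workbooks = {
    "a.xlsx": {"One case": pd.DataFrame([[0, "a"], [1, "b"]])},
    "b.xlsx": {"Other": pd.DataFrame([[0]])},
}
result_df, skipped = collect_rows(workbooks, demo_transform)
print(result_df)
print(skipped)
```

Keeping the loop free of dialog calls also makes it easy to test without opening a window; the list of skipped files could then be reported with a message box.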

Summary

This script shows how to read and process Excel data with pandas, how to build a graphical user interface with Tkinter, and how to use file dialogs to choose input files and an output path. These techniques are useful in everyday Python development, especially in projects that combine data processing with a user interface.
