Python implements reading Excel table data and converting it to JSON format file

Python reads Excel table data and writes it into JSON format file (lite version)

1. Steps

Step 1: Install the necessary libraries

Make sure you have pandas and openpyxl installed (for reading Excel files). You can use pip to install these libraries:

pip install pandas openpyxl

Step 2: Read Excel file

Use pandas' read_excel function to read Excel files.

Step 3: Convert data to JSON format

Use pandas' to_json method to convert DataFrame to JSON format.

Step 4: Save the JSON file

Write the converted JSON data to the file.

2. Sample code

The code is as follows

import pandas as pd
 
# Step 2: Read Excel filefile_path = 'path_to_your_excel_file.xlsx'  # Replace with your Excel file pathdf = pd.read_excel(file_path)
 
# Step 3: Convert DataFrame to JSON format# orient parameter can control the output format of JSON, such as 'records', 'index', 'columns', 'values', 'split', etc.# Use the 'records' format here, which will take each row of data as an object.json_data = df.to_json(orient='records')
 
# Step 4: Save the JSON fileoutput_file_path = ''  # Specify the path and name of the output JSON filewith open(output_file_path, 'w', encoding='utf-8') as f:
    (json_data)
 
print("Excel data has been successfully converted to JSON and saved to file.")

3. Things to note

Make sure your Excel file path is correct.

Choose the appropriate orient parameter value according to your needs. For example, if you want to use a column as a key to a JSON object, you can use ORient='columns'. Different options affect the structure of JSON.

Use encoding='utf-8' to ensure that Unicode characters are processed correctly when writing to a file.

This way, you can read and save Excel table data into a JSON file.

Python reads Excel table data into JSON format file (detailed optimization version)

1. Detailed optimization part

This part is optimized and modified based on the above streamlined code

1.1 Download pandas

Use the pip command to download the pandas library

pip install pandas openpyxl

1.2 Read the table file (you can specify the sheet page)

Read the excel file, you can add the parameter sheet_name="sheet name" to specify the Sheet page. If not specified, the default first sheet page is

If you want to read the name Sheet2

df = pd.read_excel(file_path, sheet_name='Sheet2')

1.3 Library date processing

Since the date in the table is converted to json and becomes a string of numbers, how to solve this problem can be used to use the to_datetime function

# Assume that the date column is named 'Date' and the data type is an integer (Excel sequence number)df['Date'] = pd.to_datetime(df['Date'], origin='unix', unit='D')  # Convert Excel serial number to datetime objectdf['Date'] = df['Date'].('%Y-%m-%d')  # Convert datetime object to string format 'Year-Month-Day'

1.4 Boolean conversion

The true and false in the table will not be processed. If json is to be imported into the mysql table in the future, the mysql table will fail to recognize true and false, and only 0 and 1 will be recognized.
You can use the following code to convert true and false to 1 and 0

# Modify the Boolean value in the table to meet the 0-1 requirement in the table, instead of false true, otherwise the import will report an error.df['Whether to install the emergency stop switch'] = df['Whether to install the emergency stop switch'].map({True: 1, False: 0})

1.5 Processing of null value Na

If a null value appears, you can use the fillna function to replace it as follows

# Replace Na with a null characterdf = ('')

1.6 Column name replacement

If you want to modify the column name you read, you can use the following code to modify the column name to modify the value

If the column names in the excel table are Chinese, in order to make it easier to import the mysql table, the Chinese column names are modified to the corresponding table field (not the field in the table, but a small camel)

column_map ={
    'Task order number': 'taskTicketId',
        'Work number': 'employeeId'
        }

print()
col_names = .to_list()
col_names = [column_map[s] for s in col_names]
print(col_names)
 = col_names

2. Detailed optimized complete code

The complete code is as follows

# Step 1: Introduce the pandas libraryimport pandas as pd
 
# Step 2: Read Excel file# file_path = 'path_to_your_excel_file.xlsx' # Replace with your Excel file pathfile_path = 'Vehicle Renovation Single Database.xlsm'  # Replace with your Excel file pathdf = pd.read_excel(file_path, sheet_name='Sheet2')
# print(df)


# Assume that the date column is named 'Date' and the data type is an integer (Excel sequence number)# df['Date'] = pd.to_datetime(df['Date'], origin='unix', unit='D') # Convert Excel serial number to datetime object# df['Date'] = df['Date'].('%Y-%m-%d') # Convert datetime object to string format 'Year-Month-D'
list = ['Create date','Sample delivery date','Required Completion Date','Actual completion date']

for i in list:
    df[i] = pd.to_datetime(df[i], origin='unix', unit='D')  # Convert Excel serial number to datetime object    df[i] = df[i].('%Y-%m-%d')  # Convert datetime object to string format 'Year-Month-Day'

# df['Create date'] = pd.to_datetime(df['Create date'], origin='unix', unit='D')  # Convert Excel serial number to datetime object# df['Create date'] = df['Create date'].('%Y-%m-%d')  # Convert datetime object to string format 'Year-Month-Day' 

print(df['Whether to install the emergency stop switch'])
# Modify the Boolean value in the table to meet the 0-1 requirement in the table, instead of false true, otherwise the import will report an error.df['Whether to install the emergency stop switch'] = df['Whether to install the emergency stop switch'].map({True: 1, False: 0})

# df = ('')

column_map ={
    'Task order number': 'taskTicketId',
'Work number': 'employeeId',
'Customer': 'requester',
'Create date': 'creationDate',
'Sample delivery date': 'sampleDeliveryDate',
'Required Completion Date': 'requiredCompletionDate',
'Actual completion date': 'completionDate'
}

print()
col_names = .to_list()
col_names = [column_map[s] for s in col_names]
print(col_names)
 = col_names

# Step 3: Convert DataFrame to JSON format# orient parameter can control the output format of JSON, such as 'records', 'index', 'columns', 'values', 'split', etc.# Use the 'records' format here, which will take each row of data as an object.  ORient='records' ensures that JSON output is a record list.# force_ascii=False ensures that the output JSON contains non-ASCII characters, such as Chinese.# date_format='iso' ensures that the date is output in ISO format.  If you need other date formats, you can adjust or remove this parameter accordingly.json_data = df.to_json(orient='records', indent=4, force_ascii=False, date_format='iso',)
 
# Step 4: Save the JSON file# output_file_path = '' # Specify the path and name of the output JSON fileoutput_file_path = 'Vehicle Renovation Single Database.json'  # Specify the path and name of the output JSON file
with open(output_file_path, 'w', encoding='utf-8') as f:
    (json_data)
 
print("Excel data has been successfully converted to JSON and saved to file.")

The above is the detailed content of Python's implementation of reading Excel table data and converting it to JSON format file. For more information about Python's reading Excel and converting it to JSON, please pay attention to my other related articles!