introduction
Python provides a variety of libraries and methods to manipulate Excel files, each with its unique advantages and applicable scenarios. This article will introduce these 11 methods in detail, including their advantages and disadvantages, applicable scenarios, and detailed usage and code examples, to help readers fully understand these tools. This article not only covers basic read and write operations, but also explores advanced features and best practices in depth.
1. Use the pandas library
advantage:
- Powerful, supporting data cleaning, conversion and analysis.
- Supports multiple file formats, including
.xls
and.xlsx
。 - Provides rich data operation methods, such as filtering, sorting, grouping, etc.
shortcoming:
- The learning curve is high and requires a certain programming foundation.
- It may take up a lot of memory when processing large-scale data.
Applicable scenarios:
- Data analysis and processing tasks.
- Scenarios where complex data operations and analysis are required.
Detailed usage:
-
Install
pandas
:
pip install pandas
- Read Excel files:
import pandas as pd # Read Excel filesdf = pd.read_excel('') print(()) # Show the first five elements data # Read a specific worksheetdf = pd.read_excel('', sheet_name='Sheet1') print(())
- Write to Excel file:
import pandas as pd # Create datadata = { 'Name': ['Tom', 'Jerry'], 'Age': [20, 21] } df = (data) # Write to Excel filedf.to_excel('', index=False)
Data operation:
- Select a specific column:
df = pd.read_excel('', usecols=['Name', 'Age']) print(())
- Filter data:
df = pd.read_excel('') filtered_df = df[df['Age'] > 20] print(filtered_df)
- Data grouping:
df = pd.read_excel('') grouped_df = ('Age').mean() print(grouped_df)
- Add a new column:
df = pd.read_excel('') df['NewColumn'] = df['Age'] * 2 print(df)
- Update cell:
df = pd.read_excel('') [0, 'Age'] = 25 print(df)
- Delete columns:
df = pd.read_excel('') del df['Age'] print(df)
- Merge multiple Excel files:
df1 = pd.read_excel('') df2 = pd.read_excel('') merged_df = ([df1, df2], ignore_index=True) print(merged_df)
- Pivot Vision Table:
df = pd.read_excel('') pivot_table = pd.pivot_table(df, values='Age', index=['Name'], aggfunc='sum') print(pivot_table)
2. Use the openpyxl library
advantage:
- Focus on
.xlsx
File format. - Provides fine-grained operations such as cell formats, charts, etc.
- Supports reading, writing, and modifying Excel files.
shortcoming:
- Not supported
.xls
File format. - The function is relatively single and is not suitable for complex data analysis tasks.
Applicable scenarios:
- Scenarios where fine-grained operations on Excel files are required.
- deal with
.xlsx
File format tasks.
Detailed usage:
-
Install
openpyxl
:
pip install openpyxl
- Read Excel files:
from openpyxl import load_workbook # Load Excel fileswb = load_workbook('') sheet = # Read datafor row in sheet.iter_rows(values_only=True): print(row)
- Write to Excel file:
from openpyxl import Workbook # Create a new workbookwb = Workbook() sheet = # Write data(['Name', 'Age']) (['Tom', 20]) (['Jerry', 21]) # Save the file('')
- Modify Excel files:
from openpyxl import load_workbook # Load Excel fileswb = load_workbook('') sheet = # Modify cell datasheet['A1'] = 'New Name' sheet['B1'] = 25 # Save the modified Excel file('modified_example.xlsx')
- Format cell:
from openpyxl import Workbook from import Font, Alignment # Create a new workbookwb = Workbook() sheet = # Set cell formatcell = sheet['A1'] = 'Hello, World!' = Font(bold=True, color='FF0000') = Alignment(horizontal='center', vertical='center') # Save the file('formatted_output.xlsx')
3. Use xlrd and xlwt libraries
advantage:
- support
.xls
File format. -
xlrd
Used to read Excel files,xlwt
Used to write to Excel files. - Lightweight, suitable for simple data manipulation tasks.
shortcoming:
- Not supported
.xlsx
File format. - The functions are relatively limited and are not suitable for complex operations.
Applicable scenarios:
- deal with
.xls
File format tasks. - Scenarios that require simple data operations.
Detailed usage:
-
Install
xlrd
andxlwt
:
pip install xlrd xlwt
- Read Excel files:
import xlrd # Open Excel fileworkbook = xlrd.open_workbook('') sheet = workbook.sheet_by_index(0) # Read datafor row_idx in range(): row = sheet.row_values(row_idx) print(row)
- Write to Excel file:
import xlwt # Create a new workbookworkbook = () sheet = workbook.add_sheet('Sheet1') # Write data(0, 0, 'Name') (0, 1, 'Age') (1, 0, 'Tom') (1, 1, 20) (2, 0, 'Jerry') (2, 1, 21) # Save the file('')
4. Use the xlwings library
advantage:
- support
.xls
and.xlsx
File format. - You can read and write Excel files and modify cell formats.
- Provides functionality to interact with Excel applications.
shortcoming:
- The Excel application is required.
- The functions are relatively complex and the learning curve is high.
Applicable scenarios:
- Scenarios that require interaction with Excel applications.
- Tasks that handle complex data operations and formatting.
Detailed usage:
-
Install
xlwings
:
pip install xlwings
- Read Excel files:
import xlwings as xw # Create Excel application objectapp = (visible=True, add_book=False) # Open the workbookwb = ('') sheet = [0] # Read datadata = ('A1:B7').value print(data) # Close workbooks and applications() ()
- Write to Excel file:
import xlwings as xw # Create Excel application objectapp = (visible=True, add_book=False) # Create a new workbookwb = () sheet = [0] # Write data('A1').value = [['Name', 'Age'], ['Tom', 20], ['Jerry', 21]] # Save the file('') # Close workbooks and applications() ()
- Format cell:
import xlwings as xw # Create Excel application objectapp = (visible=True, add_book=False) # Create a new workbookwb = () sheet = [0] # Write data('A1').value = [['Name', 'Age'], ['Tom', 20], ['Jerry', 21]] # Set cell formatcell = ('A1') = True = -4108 # horizontal center = -4108 # Vertical center # Save the file('formatted_output.xlsx') # Close workbooks and applications() ()
5. Use the XlsxWriter library
advantage:
- Focus on writing
.xlsx
document. - Supports writing of text, numbers, formulas, etc.
- Provides rich cell formatting functions.
shortcoming:
- Reading of Excel files is not supported.
- The function is relatively single and is not suitable for complex data operation tasks.
Applicable scenarios:
- Need to write
.xlsx
File tasks. - Scenarios that require fine formatting.
Detailed usage:
-
Install
XlsxWriter
:
pip install XlsxWriter
- Write to Excel file:
import xlsxwriter # Create a new Excel fileworkbook = ('') worksheet = workbook.add_worksheet() # Write datadata = [['Name', 'Age'], ['Tom', 20], ['Jerry', 21]] for row_num, row_data in enumerate(data): worksheet.write_row(row_num, 0, row_data) # Set cell formatbold = workbook.add_format({'bold': True}) ('A1', 'Name', bold) ('B1', 'Age', bold) # Insert chartchart = workbook.add_chart({'type': 'column'}) chart.add_series({ 'categories': '=Sheet1!$A$2:$A$3', 'values': '=Sheet1!$B$2:$B$3', }) worksheet.insert_chart('D2', chart) # Save the file()
6. Use the pyexcel library
advantage:
- Supports multiple Excel file formats, including
.xls
and.xlsx
。 - Provides a consistent interface to read and write these files.
- Lightweight and easy to use.
shortcoming:
- The functions are relatively limited and are not suitable for complex operations.
- Not as good as
pandas
andopenpyxl
Rich features.
Applicable scenarios:
- Handle tasks in multiple Excel file formats.
- Scenarios that require simple data operations.
Detailed usage:
-
Install
pyexcel
:
pip install pyexcel pyexcel-xls pyexcel-xlsx
- Read Excel files:
import pyexcel # Read Excel filessheet = pyexcel.get_sheet(file_name="") print(sheet) # Read a specific worksheetsheet = pyexcel.get_sheet(file_name="", sheet_name="Sheet1") print(sheet)
- Write to Excel file:
import pyexcel # Create datadata = [['Name', 'Age'], ['Tom', 20], ['Jerry', 21]] # Write to Excel filesheet = (data) sheet.save_as("")
7. Use the et_xmlfile library
advantage:
- XML content used to process Excel files.
- Suitable for advanced users, you can directly manipulate the internal structure of Excel files.
shortcoming:
- The learning curve is high and you need to understand the XML structure.
- The functions are relatively complex and not suitable for beginners.
Applicable scenarios:
- Scenarios that require direct manipulation of the internal structure of Excel file.
- Tasks that handle complex Excel files.
Detailed usage:
-
Install
et_xmlfile
:
pip install et_xmlfile
- Read XML content of Excel files:
from et_xmlfile import xmlfile # Read XML content of Excel filewith ('') as xf: for event, elem in (): print(event, )
8. Use the library
advantage:
- Operate Excel files through the COM interface.
- Supports multiple Excel file formats.
- Provides functionality to interact with Excel applications.
shortcoming:
- The Excel application is required.
- The functions are relatively complex and the learning curve is high.
- Applicable to Windows environments only.
Applicable scenarios:
- Scenarios that require interaction with Excel applications.
- Tasks that handle complex data operations and formatting.
Detailed usage:
-
Install
pywin32
:
pip install pywin32
- Read Excel files:
import # Create Excel application objectexcel = ("") = True # Open the workbookworkbook = (r'C:\path\to\') sheet = (1) # Read datacell_value = (1, 1).Value print(cell_value) # Close workbooks and applications() ()
- Write to Excel file:
import # Create Excel application objectexcel = ("") = True # Create a new workbookworkbook = () sheet = (1) # Write data(1, 1).Value = 'Name' (1, 2).Value = 'Age' (2, 1).Value = 'Tom' (2, 2).Value = 20 (3, 1).Value = 'Jerry' (3, 2).Value = 21 # Save the file(r'C:\path\to\') # Close workbooks and applications() ()
9. Use the tablib library
advantage:
- Supports a variety of data formats, including Excel.
- Provides a consistent interface to process data in different formats.
- Lightweight and easy to use.
shortcoming:
- The functions are relatively limited and are not suitable for complex operations.
- Not as good as
pandas
andopenpyxl
Rich features.
Applicable scenarios:
- Handle tasks in multiple data formats.
- Scenarios that require simple data operations.
Detailed usage:
-
Install
tablib
:
pip install tablib
- Write to Excel file:
import tablib # Create a datasetdata = () = ['Name', 'Age'] (['Tom', 20]) (['Jerry', 21]) # Export as Excel filewith open('', 'wb') as f: (('xlsx'))
10. Use the odfpy library
advantage:
- Used to process OpenDocument format files, including
.ods
document. - Provide read and write
.ods
File function. - Lightweight and easy to use.
shortcoming:
- Not supported
.xls
and.xlsx
File format. - The functions are relatively limited and are not suitable for complex operations.
Applicable scenarios:
- deal with
.ods
File format tasks. - Scenarios that require simple data operations.
Detailed usage:
-
Install
odfpy
:
pip install odfpy
- Read ODS files:
from import load from import TableRow, TableCell from import P # Read ODS filedoc = load('') table = (Table)[0] # traverse the data in the tablefor row in (TableRow): cells = (TableCell) row_data = [(P)[0].text for cell in cells] print(row_data)
- Write to ODS file:
from import OpenDocumentSpreadsheet from import Table, TableRow, TableCell from import P # Create a new ODS filedoc = OpenDocumentSpreadsheet() table = Table(name="Sheet1") (table) # Add a new linenew_row = TableRow() new_row.addElement(TableCell(text=P(text='Name'))) new_row.addElement(TableCell(text=P(text='Age'))) (new_row) # Add more linesnew_row = TableRow() new_row.addElement(TableCell(text=P(text='Tom'))) new_row.addElement(TableCell(text=P(text='20'))) (new_row) new_row = TableRow() new_row.addElement(TableCell(text=P(text='Jerry'))) new_row.addElement(TableCell(text=P(text='21'))) (new_row) # Save the file('')
11. Using pyexcel-ods3 library
advantage:
- support
.ods
File format. - Provides consistent interfaces to read and write
.ods
document. - Lightweight and easy to use.
shortcoming:
- Not supported
.xls
and.xlsx
File format. - The functions are relatively limited and are not suitable for complex operations.
Applicable scenarios:
- deal with
.ods
File format tasks. - Scenarios that require simple data operations.
Detailed usage:
-
Install
pyexcel-ods3
:
pip install pyexcel-ods3
- Read ODS files:
import pyexcel_ods3 # Read ODS filedata = pyexcel_ods3.get_data('') print(data)
- Write to ODS file:
import pyexcel_ods3 # Create datadata = { 'Sheet1': [ ['Name', 'Age'], ['Tom', 20], ['Jerry', 21] ] } # Write to ODS filepyexcel_ods3.save_data('', data)
Summarize
Python provides a variety of libraries and methods to manipulate Excel files, each with its unique advantages and disadvantages and applicable scenarios. Choosing the right library can improve development efficiency and code quality. Here is a brief summary of each method:
-
pandas
: Powerful, supports data cleaning, conversion and analysis, suitable for data analysis and processing tasks. -
openpyxl
: Focus on.xlsx
File format, providing fine-grained operations, suitable for scenarios where fine-grained operations on Excel files are required. -
xlrd
andxlwt
:support.xls
File format, lightweight, suitable for processing.xls
File format tasks. -
xlwings
:support.xls
and.xlsx
File format, providing the function of interacting with Excel applications, suitable for scenarios where interaction with Excel applications is required. -
XlsxWriter
: Focus on writing.xlsx
File, providing rich cell formatting functions, suitable for writing.xlsx
File tasks. -
pyexcel
: Supports multiple Excel file formats and provides a consistent interface, suitable for handling tasks in multiple Excel file formats. -
et_xmlfile
: Used to process XML content of Excel files, suitable for scenarios where you need to directly manipulate the internal structure of Excel files. -
: Operate Excel files through the COM interface, suitable for scenarios where you need to interact with Excel applications.
-
tablib
: Supports multiple data formats and provides a consistent interface, suitable for handling tasks in multiple data formats. -
odfpy
: Used to process OpenDocument format files, including.ods
File, suitable for processing.ods
File format tasks. -
pyexcel-ods3
:support.ods
File format, provides a consistent interface, suitable for processing.ods
File format tasks.
I hope this article can help you fully understand the various methods of operating Excel files in Python.
The above is the detailed content of 11 methods for Python to operate Excel files (the most complete network). For more information about Python to operate Excel files, please pay attention to my other related articles!