SoFunction
Updated on 2025-04-14

How to extract data from an Excel file using Pandas library

Easily get Excel data extraction with Pandas: Python skills that novices can learn

Preparation: Install the necessary libraries

First, we need to make sure that the pandas library is installed on the computer. If you haven't installed it, just type in the command line:

pip install pandas openpyxl

hereopenpyxlIt is a dependent library used to process Excel files.

Code analysis: look at one line by one

Let's take a closer look at the code in the picture, and I will explain the function of each line in the most common way:

# Import pandas libraryimport pandas as pd

This line of code is like saying to Python: "Hey, I'm going to use the pandas toolkit, prepare it for me!" pandas is a powerful tool for processing table data in Python.

# Compact displaypd.set_option('display.max_rows', 10)

This line of code sets the display option, which means "When there is too much data, only 10 lines will be displayed at most, and the others will be represented by ellipsis". In this way, our output will not flood the console and will look refreshing.

# Read file datadf = pd.read_excel('', index_col='Student number')

This is the most critical step: •pd.read_excelIt is a function that reads Excel files •''It's our file name •index_col='Student number'Indicates that the column "student number" is used as the identification (index) of each row

Imagine that it is like setting the "student number" column as the first column in Excel, so that we can search.

name = df['Name']

This line of code extracts the "name" column from the data, which is equivalent to selecting the entire name column in Excel.

# Output name columnprint(name)

Finally, we print out the name list and we can see a neat list of students!

Actual effect

After running this code, you will see an output like this:

Student ID
2301    Zhang San
2302    Li Si
2303    Wang Wu
...
2418    Zhao Liu
Name: Name, dtype: object

The "..." in the middle means that some lines are omitted, which is exactly the compact display effect we set.

FAQ

Q: Why is my code reported an error?A: Possible reasons are:

  • The file is not in the current directory (Solution: Use the full file path)
  • File name typo (check case and extension)
  • The column names do not match (confirm that there are indeed "student number" and "name" columns in Excel)

Q: How to display more/less rows?A: Modifydisplay.max_rowsThe value of , for example, set to 5 will only display 5 lines.

Q: Can multiple columns be displayed at the same time?A: Of course! For example, to display the student number and name:

print(df[['Student number', 'Name']])

Tips sharing

  • Run this code in Jupyter Notebook for more intuitive results
  • Can be used()Quickly view the first few lines of data
  • Add toencoding='utf-8'Parameters can avoid the problem of garbled Chinese

Summarize

With this simple example, we learned:

  • Read Excel files with pandas
  • Set up a compact display
  • Extract specific column data
  • Output the sorted results

This skill is like a "Swiss Army Knife" for data processing, simple but super practical. With it mastered, you can quickly extract the required information from various forms without having to manually copy and paste it!

The above is the detailed content of how Python uses the Pandas library to extract data from Excel files. For more information about Python Pandas extracting Excel data, please follow my other related articles!