1. Review of basic knowledge
Before we start, let's review the basics of the Pandas DataFrame. The DataFrame is the core data structure in Pandas. It can be thought of as a table of rows and columns in which each column can hold a different type of data. For example:
import pandas as pd

# Create a simple DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
2. Filter rows with specific values
In Pandas, we can use boolean indexing to filter rows with specific values. A boolean index is a Series of True/False values that records whether each row satisfies a condition; passing it to the DataFrame keeps only the rows marked True.
# Keep only the rows where Age is greater than 30
df_filtered = df[df['Age'] > 30]
print(df_filtered)
The code above keeps the rows where Age is greater than 30 and returns a new DataFrame:
      Name  Age     City
2  Charlie   35  Chicago
3    David   40  Houston
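The boolean index can also be stored in a variable and passed to `.loc`, which makes it possible to pick a subset of columns in the same step. The following is a minimal sketch that rebuilds the sample data from section 1; the variable names `mask` and `names_over_30` are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# The comparison yields a boolean Series aligned with df's index
mask = df['Age'] > 30

# .loc takes the mask for the rows and a list of labels for the columns
names_over_30 = df.loc[mask, ['Name', 'City']]
print(names_over_30)
```

Keeping the mask in its own variable is mostly a readability choice; it also lets the same condition be reused later, for example to count matches with `mask.sum()`.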
3. Delete rows with specific values
If we want to remove the rows that meet a certain condition, we can pass their index labels to the drop method.
# Delete the rows where Age is greater than 30
df_dropped = df.drop(df[df['Age'] > 30].index)
print(df[df['Age'] > 30].index)
print("*" * 30)
print(df_dropped)
The code above first prints the index labels of the rows where Age is greater than 30, then prints the new DataFrame with those rows removed:
Index([2, 3], dtype='int64')
******************************
    Name  Age         City
0  Alice   25     New York
1    Bob   30  Los Angeles
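An alternative that avoids looking up index labels is to invert the boolean index with `~` and keep the rows that do not match. The sketch below rebuilds the sample data from section 1 and compares both approaches; the names `mask` and `df_kept` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

mask = df['Age'] > 30

# Approach from the text: find the matching index labels and drop them
df_dropped = df.drop(df[mask].index)

# Alternative: invert the mask with ~ and keep the remaining rows
df_kept = df[~mask]

# Check whether both approaches produce the same DataFrame
print(df_kept.equals(df_dropped))
```

The two agree here because the index is unique. If an index contains duplicate labels, drop removes every row that shares a dropped label, so inverting the mask tends to be the more predictable choice in that situation.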
4. Filter rows by the value in a column
Similarly, we can filter rows by comparing a column against a specific value.
# Keep only the rows where City is "Chicago"
df_filtered_columns = df[df['City'] == 'Chicago']
print(df['City'] == 'Chicago')
print("*" * 30)
print(df_filtered_columns)
The code above first prints the boolean index for the condition, then prints the new DataFrame containing only the rows whose City is "Chicago":
0    False
1    False
2     True
3    False
Name: City, dtype: bool
******************************
      Name  Age     City
2  Charlie   35  Chicago
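When several values should match, chaining `==` comparisons gets verbose. One possible approach is `Series.isin`, sketched below with the sample data from section 1; the list `cities` and the name `df_in_cities` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Hypothetical list of cities to keep
cities = ['Chicago', 'Houston']

# isin returns True for every row whose City appears in the list
df_in_cities = df[df['City'].isin(cities)]
print(df_in_cities)
```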
5. Delete specific columns
To delete a column, we can use the drop method and name the column in the columns parameter.
# Delete the City column
df_dropped_columns = df.drop(columns=['City'])
print(df_dropped_columns)
The code above removes the City column and returns a new DataFrame:
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
3    David   40
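The example above removes a column by name. If the goal is instead to drop every column that contains a particular value, one possible approach is to build a per-column boolean with `DataFrame.isin` and `any`, then invert it. This is a sketch under that assumption, using the sample data from section 1; the names `contains_value` and `df_without_value` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# For each column, check whether any cell equals "Chicago"
contains_value = df.isin(['Chicago']).any()

# Keep only the columns where the value never appears
df_without_value = df.loc[:, ~contains_value]
print(df_without_value)
```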
Note: Filtering and drop both return a new DataFrame by default; the original DataFrame is not changed.
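A short check can make this behaviour concrete. The sketch below drops a column, confirms the original DataFrame still has it, and then reassigns the result to make the change stick; the variable `columns_after_drop` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# drop returns a new DataFrame and leaves df untouched
df_dropped_columns = df.drop(columns=['City'])
columns_after_drop = list(df.columns)
print(columns_after_drop)  # City is still present in df

# To keep the change, assign the result back to the same name
df = df.drop(columns=['City'])
print(list(df.columns))
```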
6. Practical drill
Suppose we have a DataFrame containing student information, and we want to select the students who are older than 15 and whose city is "New York".
import pandas as pd

# Create a DataFrame containing student information
student_data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Age': [22, 25, 18, 28, 21, 27],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'New York', 'San Francisco']
}
student_df = pd.DataFrame(student_data)
print("Original DataFrame:")
print(student_df)

# Keep the students who are older than 15 and whose city is "New York".
# Each condition needs its own parentheses because & binds tighter than > and ==.
filtered_students = student_df[(student_df['Age'] > 15) & (student_df['City'] == 'New York')]
print("\nFiltered DataFrame:")
print(filtered_students)
The code above keeps the students who are older than 15 and whose city is "New York", then prints the filtered DataFrame:
Original DataFrame:
      Name  Age           City
0    Alice   22       New York
1      Bob   25    Los Angeles
2  Charlie   18        Chicago
3    David   28        Houston
4      Eve   21       New York
5    Frank   27  San Francisco

Filtered DataFrame:
    Name  Age      City
0  Alice   22  New York
4    Eve   21  New York
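The same selection can also be written with `DataFrame.query`, which takes the condition as a string and can be easier to read when several conditions are combined. This is an alternative sketch, not the method used in the drill above; the names `filtered_mask` and `filtered_query` are illustrative.

```python
import pandas as pd

student_df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'],
    'Age': [22, 25, 18, 28, 21, 27],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'New York', 'San Francisco'],
})

# Boolean indexing, as in the drill above
filtered_mask = student_df[(student_df['Age'] > 15) & (student_df['City'] == 'New York')]

# Equivalent selection expressed as a query string
filtered_query = student_df.query("Age > 15 and City == 'New York'")

print(filtered_query)
```

Inside the query string, column names are written without quotes, so this form is most convenient when the column names are valid Python identifiers.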
7. Summary
This article covered how to filter and delete rows and columns with specific values in a DataFrame using Python's Pandas library: boolean indexing to keep matching rows, and the drop method to remove rows by index label or columns by name.