Details of DataFrame data deletion in Pandas

This article introduces how to delete data from a DataFrame in Pandas, mainly using the drop method and the del statement.

# Parameter explanation of the drop function
drop(
        self,
        labels=None,     # labels of the rows or columns to delete, given as a list
        axis=0,          # axis to delete from: 0 is rows (default), 1 is columns
        index=None,      # one or more row labels to delete
        columns=None,    # one or more column labels to delete
        level=None,      # for a MultiIndex, the level the labels belong to
        inplace=False,   # whether to modify the original DataFrame in place
        errors="raise",  # "raise" raises an error for missing labels; "ignore" skips them
)
Rows and columns can be specified either with labels plus axis (axis=0 or axis=1) or with index / columns. Only one of the two forms is needed.
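As a minimal sketch of the two calling styles, the example below checks that labels with axis and the index / columns keywords remove the same data. The small frame and its labels (r0, x, and so on) are made up for illustration and are not part of the article's sample data.

```python
import pandas as pd

# Hypothetical 3x3 frame used only for this illustration
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["r0", "r1", "r2"],
    columns=["x", "y", "z"],
)

# Rows: labels + axis=0 gives the same result as index=
rows_a = df.drop(labels=["r1"], axis=0)
rows_b = df.drop(index=["r1"])

# Columns: labels + axis=1 gives the same result as columns=
cols_a = df.drop(labels=["y"], axis=1)
cols_b = df.drop(columns=["y"])

print(rows_a.equals(rows_b))  # True
print(cols_a.equals(cols_b))  # True
```

The keyword form tends to read more clearly because the call itself says whether rows or columns are affected.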

1. Operations based on the default row and column indexes

Sample data

import numpy as np
import pandas as pd
# Generate a random array with 5 rows and 5 columns
df = pd.DataFrame(np.random.rand(5, 5))
print(df)

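Data display (values are random, so they differ on every run; each example below starts from a freshly generated df)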

          0         1         2         3         4
0  0.760489  0.074633  0.788416  0.087612  0.560539
1  0.758450  0.599777  0.384075  0.525483  0.628910
2  0.386808  0.148106  0.742207  0.452627  0.775963
3  0.662909  0.134640  0.186186  0.735429  0.459556
4  0.328694  0.269088  0.331404  0.835388  0.899107

1.1 Row deletion

[1] Delete a single row

# Delete a single row: the second row
df.drop(df.index[1], inplace=True)  # inplace=True modifies df in place
print(df)

Execution results:

          0         1         2         3         4
0  0.605764  0.234973  0.566346  0.598105  0.478153
2  0.383230  0.822174  0.228855  0.743258  0.076701
3  0.875287  0.576668  0.176982  0.341827  0.112582
4  0.205425  0.898544  0.799174  0.000905  0.377990
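The inplace flag decides whether drop changes df or hands back a new object. The following sketch illustrates the difference on the same kind of random 5x5 frame; the variable names trimmed and result are only for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 5))

# Without inplace, drop returns a new DataFrame and leaves df untouched
trimmed = df.drop(df.index[1])
print(df.shape)       # (5, 5)
print(trimmed.shape)  # (4, 5)

# With inplace=True, df itself is modified and the call returns None
result = df.drop(df.index[1], inplace=True)
print(result)    # None
print(df.shape)  # (4, 5)
```

Because the in-place call returns None, writing df = df.drop(..., inplace=True) would replace df with None, so the two styles should not be mixed.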

[2] Delete multiple non-consecutive rows

# Delete multiple non-consecutive rows: the second and fourth rows
df.drop(df.index[[1, 3]], inplace=True)
print(df)

Execution results:

          0         1         2         3         4
0  0.978612  0.556539  0.781362  0.547527  0.706686
2  0.845822  0.321716  0.444176  0.053915  0.296631
4  0.617735  0.040859  0.129235  0.525116  0.005357

[3] Delete multiple consecutive rows

# Delete multiple consecutive rows
df.drop(df.index[1:3], inplace=True)  # half-open interval: the end position is not included
print(df)

Execution results:

          0         1         2         3         4
0  0.072891  0.926297  0.882265  0.971368  0.567840
3  0.163212  0.546069  0.360990  0.494274  0.065744
4  0.752917  0.242112  0.526675  0.918713  0.320725
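Slicing df.index works by position, not by label, which is why the end of the slice is excluded. The sketch below makes the selected labels visible before dropping them; to_drop and remaining are names chosen for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 5))

# df.index[1:3] holds the labels found at positions 1 and 2
to_drop = df.index[1:3]
print(list(to_drop))  # [1, 2]

remaining = df.drop(to_drop)
print(list(remaining.index))  # [0, 3, 4]

# After a deletion, positions and labels no longer coincide
print(remaining.index[1])  # 3
```

This matters when several deletions are chained on the same frame: position 1 refers to whatever label currently sits second, not to the label 1.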

1.2 Column deletion

Columns can be deleted in two ways: del and drop. For example, del df[1] deletes the second column, and this deletion happens in place. This article focuses on deletion with the drop function.
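To make the contrast between the two approaches concrete, here is a small sketch on a random 5x5 frame with default column labels. The name smaller is only for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 5))

# del removes the column labelled 1 from df itself
del df[1]
print(list(df.columns))  # [0, 2, 3, 4]

# drop returns a new DataFrame unless inplace=True is passed
smaller = df.drop(columns=[2])
print(list(df.columns))       # [0, 2, 3, 4]
print(list(smaller.columns))  # [0, 3, 4]
```

del works on one column label per statement, so drop is usually the more convenient choice when several columns go at once.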

[1] Delete the specified column

df.drop([1, 3], axis=1, inplace=True)  # specify the axis as columns
# df.drop(columns=[1, 3], inplace=True)  # or specify the columns directly
print(df)

Execution results:

          0         2         4
0  0.592869  0.123369  0.815126
1  0.127064  0.093994  0.332790
2  0.411560  0.118753  0.143854
3  0.965317  0.267740  0.349927
4  0.688604  0.699658  0.932645

[2] Delete consecutive columns

df.drop(df.columns[1:3], axis=1, inplace=True)  # specify the axis
# df.drop(columns=df.columns[1:3], inplace=True)  # or specify the columns
print(df)

Execution results:

          0         3         4
0  0.309674  0.974694  0.660285
1  0.677328  0.969440  0.953452
2  0.954114  0.953569  0.959771
3  0.365643  0.417065  0.951372
4  0.733081  0.880914  0.804032
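The same positional slice works whatever the column labels are. As a sketch, assume a frame whose columns are named a to e (hypothetical labels, chosen so that positions and labels are easy to tell apart):

```python
import numpy as np
import pandas as pd

# Hypothetical column labels, used only for this illustration
df = pd.DataFrame(np.random.rand(3, 5), columns=list("abcde"))

# Positions 1 and 2 correspond to the second and third columns
middle = df.columns[1:3]
print(list(middle))  # ['b', 'c']

result = df.drop(columns=middle)
print(list(result.columns))  # ['a', 'd', 'e']
```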

2. Operations based on custom row and column indexes

Sample data

df = pd.DataFrame(data=np.random.rand(5, 5))
df.index = list('abcde')
df.columns = ['one', 'two', 'three', 'Four', 'five']
print(df)

Data display

        one       two     three      Four      five
a  0.188495  0.574422  0.530326  0.842489  0.474946
b  0.912522  0.982093  0.964031  0.498638  0.826693
c  0.580789  0.013957  0.515229  0.795052  0.859267
d  0.540641  0.865602  0.305256  0.552566  0.754791
e  0.375407  0.236118  0.129210  0.711744  0.067356

2.1 Row deletion

[1] Delete a single row

df.drop(['b'], inplace=True)
print(df)

Execution results:

        one       two     three      Four      five
a  0.306350  0.622067  0.030573  0.490563  0.009987
c  0.672423  0.071661  0.274529  0.400086  0.263024
d  0.654204  0.809087  0.066099  0.167290  0.534452
e  0.628917  0.232629  0.070167  0.469962  0.957898

[2] Delete multiple rows

df.drop(['b', 'd'], inplace=True)
print(df)

Execution results:

        one       two     three      Four      five
a  0.391583  0.509862  0.924634  0.466563  0.058414
c  0.802016  0.621347  0.659215  0.575728  0.935811
e  0.223372  0.286116  0.130587  0.113544  0.910859
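When labels are typed by hand, one of them may not exist in the index. The sketch below shows how the errors parameter from the signature affects that case; the label 'z' is deliberately absent and is only an illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 5), index=list("abcde"))

# 'z' is not in the index; with the default errors="raise" this raises KeyError
try:
    df.drop(["b", "z"])
except KeyError as exc:
    print("KeyError:", exc)

# errors="ignore" drops the labels that exist and skips the rest
kept = df.drop(["b", "z"], errors="ignore")
print(list(kept.index))  # ['a', 'c', 'd', 'e']
```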

2.2 column deletion

[1] Delete a single column

df.drop(['two'], axis=1, inplace=True)  # delete a single column
print(df)

Execution results:

        one     three      Four      five
a  0.276147  0.797404  0.184472  0.081162
b  0.630190  0.328055  0.428668  0.168491
c  0.979958  0.029032  0.934626  0.106805
d  0.762995  0.003134  0.136252  0.317423
e  0.137211  0.116607  0.367742  0.840080

[2] Delete multiple columns

df.drop(['two', 'Four'], axis=1, inplace=True)  # delete multiple columns
# df.drop(columns=['two', 'Four'], inplace=True)  # delete multiple columns
print(df)

Execution results:

        one     three      five
a  0.665647  0.709243  0.019711
b  0.920729  0.995913  0.490998
c  0.352816  0.185802  0.406174
d  0.136414  0.563546  0.762806
e  0.259710  0.775422  0.794880
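Row and column deletion do not have to be separate calls. As a sketch using the same custom labels as the sample data, index and columns can be passed together:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.rand(5, 5),
    index=list("abcde"),
    columns=["one", "two", "three", "Four", "five"],
)

# index= and columns= can be combined to remove rows and columns in one call
result = df.drop(index=["b", "d"], columns=["two", "Four"])
print(list(result.index))    # ['a', 'c', 'e']
print(list(result.columns))  # ['one', 'three', 'five']
```

This version leaves df unchanged and returns the reduced frame, which keeps the original data available for later steps.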
