SoFunction
Updated on 2024-10-28

Deleting specified rows and columns from a pandas DataFrame with drop()

Rows and columns of a pandas DataFrame are deleted with the drop() method.

In pandas versions before 0.21.0, rows and columns are specified with the parameters labels and axis. From 0.21.0 onward, the parameters index and columns can be used instead.

This article covers the following topics.

  • DataFrame Specified Row Deletion
    • Specify by row name (row label)
    • Specify by row number
    • Notes on unset row names
  • DataFrame Specified Column Deletion
    • Specify by column name (column label)
    • Specify by column number
  • Deletion of multiple rows and columns

For removing missing values (NaN) and removing rows with duplicate elements, see the following article.

Removing, replacing, and extracting missing values (NaN) in pandas (dropna, fillna, isnull)

The following data is used as an example in the sample code.

import pandas as pd

df = pd.read_csv('./data/12/sample_pandas_normal.csv', index_col=0)
print(df)
#          age state  point
# name
# Alice     24    NY     64
# Bob       42    CA     92
# Charlie   18    CA     70
# Dave      68    TX     70
# Ellen     24    CA     88
# Frank     30    NY     57
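
If the CSV file is not at hand, an equivalent DataFrame can be built directly. This is a minimal sketch that assumes the file contains exactly the six rows printed above; the variable name data is only illustrative.

```python
import pandas as pd

# Rebuild the sample data shown above without reading the CSV file.
# Assumption: the CSV holds exactly these six rows and four fields.
data = {
    'name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24, 42, 18, 68, 24, 30],
    'state': ['NY', 'CA', 'CA', 'TX', 'CA', 'NY'],
    'point': [64, 92, 70, 70, 88, 57],
}

# set_index('name') plays the role of index_col=0 in read_csv().
df = pd.DataFrame(data).set_index('name')
print(df)
```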

DataFrame Specified Row Deletion

Specify by row name (row label)

Rows are specified with the first argument, labels, and the second argument, axis. For rows, set axis=0.

print(df.drop('Charlie', axis=0))
#        age state  point
# name                   
# Alice   24    NY     64
# Bob     42    CA     92
# Dave    68    TX     70
# Ellen   24    CA     88
# Frank   30    NY     57

The default is axis=0, so axis can be omitted.

print(df.drop('Charlie'))
#        age state  point
# name                   
# Alice   24    NY     64
# Bob     42    CA     92
# Dave    68    TX     70
# Ellen   24    CA     88
# Frank   30    NY     57

From version 0.21.0 onward, rows can also be specified with the parameter index.

print(df.drop(index='Charlie'))
#        age state  point
# name                   
# Alice   24    NY     64
# Bob     42    CA     92
# Dave    68    TX     70
# Ellen   24    CA     88
# Frank   30    NY     57

To delete multiple rows at once, pass a list.

print(df.drop(['Bob', 'Dave', 'Frank']))
#          age state  point
# name                     
# Alice     24    NY     64
# Charlie   18    CA     70
# Ellen     24    CA     88

print(df.drop(index=['Bob', 'Dave', 'Frank']))
#          age state  point
# name                     
# Alice     24    NY     64
# Charlie   18    CA     70
# Ellen     24    CA     88

By default, the original DataFrame is left unchanged and a new DataFrame is returned. If the parameter inplace is set to True, the original DataFrame is modified in place; in that case no new DataFrame is returned and the return value is None.
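
The difference between the two modes can be checked with a short sketch. The small DataFrame and the names df_new, df_copy and result are made up for illustration and are not part of the sample data.

```python
import pandas as pd

# Small illustrative frame (not the article's sample data).
df = pd.DataFrame(
    {'age': [24, 42, 18], 'state': ['NY', 'CA', 'CA']},
    index=['Alice', 'Bob', 'Charlie'],
)

# Default: df is untouched and a new DataFrame is returned.
df_new = df.drop('Bob')
print(df.shape, df_new.shape)
# (3, 2) (2, 2)

# inplace=True: the object itself is modified and drop() returns None.
df_copy = df.copy()
result = df_copy.drop('Bob', inplace=True)
print(result)
# None
print(df_copy.index.tolist())
# ['Alice', 'Charlie']
```

Because the return value is None in the second case, writing df = df.drop(..., inplace=True) would replace the DataFrame with None.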

Specify by row number

To specify rows by number, use the index attribute of the DataFrame.

Indexing the index attribute with a row number in [] returns the corresponding row name. Multiple row numbers can be given as a list.

print(df.index[[1, 3, 5]])
# Index(['Bob', 'Dave', 'Frank'], dtype='object', name='name')

Pass the result to drop() as the first argument labels, or as the parameter index.

print(df.drop(df.index[[1, 3, 5]]))
#          age state  point
# name                     
# Alice     24    NY     64
# Charlie   18    CA     70
# Ellen     24    CA     88

print(df.drop(index=df.index[[1, 3, 5]]))
#          age state  point
# name                     
# Alice     24    NY     64
# Charlie   18    CA     70
# Ellen     24    CA     88

Notes on unset row names

If no row names are set, the index defaults to sequential integers. Be careful when the index consists of numbers rather than strings.

df_noindex = pd.read_csv('./data/12/sample_pandas_normal.csv')
print(df_noindex)
#       name  age state  point
# 0    Alice   24    NY     64
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 3     Dave   68    TX     70
# 4    Ellen   24    CA     88
# 5    Frank   30    NY     57

print(df_noindex.index)
# RangeIndex(start=0, stop=6, step=1)

If the index is a run of sequential integers starting at 0, the result is the same whether you pass the numbers directly or go through the index attribute.

print(df_noindex.drop([1, 3, 5]))
#       name  age state  point
# 0    Alice   24    NY     64
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88

print(df_noindex.drop(df_noindex.index[[1, 3, 5]]))
#       name  age state  point
# 0    Alice   24    NY     64
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88

If the index is no longer in sequence, for example after sorting, the results differ. When numbers are passed directly, the rows whose labels match those numbers are deleted; when the index attribute is used, the rows at those positions are deleted.

df_noindex_sort = df_noindex.sort_values('state')
print(df_noindex_sort)
#       name  age state  point
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 0    Alice   24    NY     64
# 5    Frank   30    NY     57
# 3     Dave   68    TX     70

print(df_noindex_sort.index)
# Int64Index([1, 2, 4, 0, 5, 3], dtype='int64')
# (pandas 2.0 and later print this as Index([1, 2, 4, 0, 5, 3], dtype='int64'))

print(df_noindex_sort.drop([1, 3, 5]))
#       name  age state  point
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
# 0    Alice   24    NY     64

print(df_noindex_sort.drop(df_noindex_sort.index[[1, 3, 5]]))
#     name  age state  point
# 1    Bob   42    CA     92
# 4  Ellen   24    CA     88
# 5  Frank   30    NY     57
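
The label/position distinction above can be reproduced without the CSV file. The sketch below builds a frame whose integer index is already out of order, and then shows one possible remedy, reset_index(drop=True), which is not part of the original example; the names df_s and df_r are illustrative.

```python
import pandas as pd

# Same row order and integer labels as the sorted frame above.
df_s = pd.DataFrame(
    {'name': ['Bob', 'Charlie', 'Ellen', 'Alice', 'Frank', 'Dave']},
    index=[1, 2, 4, 0, 5, 3],
)

# Passed directly, 1, 3, 5 are treated as row labels.
by_label = df_s.drop([1, 3, 5])
print(by_label['name'].tolist())
# ['Charlie', 'Ellen', 'Alice']

# Through the index attribute, 1, 3, 5 are treated as row positions.
by_position = df_s.drop(df_s.index[[1, 3, 5]])
print(by_position['name'].tolist())
# ['Bob', 'Ellen', 'Frank']

# After renumbering the index, labels and positions coincide again.
df_r = df_s.reset_index(drop=True)
same = df_r.drop([1, 3, 5]).equals(df_r.drop(df_r.index[[1, 3, 5]]))
print(same)
# True
```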

DataFrame Specified Column Deletion

Specify by column name (column label)

Columns are specified with the first argument, labels, and the second argument, axis. For columns, set axis=1.

print(df.drop('state', axis=1))
#          age  point
# name               
# Alice     24     64
# Bob       42     92
# Charlie   18     70
# Dave      68     70
# Ellen     24     88
# Frank     30     57

From version 0.21.0 onward, columns can also be specified with the parameter columns.

print(df.drop(columns='state'))
#          age  point
# name               
# Alice     24     64
# Bob       42     92
# Charlie   18     70
# Dave      68     70
# Ellen     24     88
# Frank     30     57

To delete multiple columns at once, pass a list.

print(df.drop(['state', 'point'], axis=1))
#          age
# name        
# Alice     24
# Bob       42
# Charlie   18
# Dave      68
# Ellen     24
# Frank     30

print(df.drop(columns=['state', 'point']))
#          age
# name        
# Alice     24
# Bob       42
# Charlie   18
# Dave      68
# Ellen     24
# Frank     30

The parameter inplace works the same way as for rows.

df_org = df.copy()
df_org.drop(columns=['state', 'point'], inplace=True)
print(df_org)
#          age
# name        
# Alice     24
# Bob       42
# Charlie   18
# Dave      68
# Ellen     24
# Frank     30

Specify by column number

To specify columns by number, use the columns attribute of the DataFrame.

print(df.columns[[1, 2]])
# Index(['state', 'point'], dtype='object')

print(df.drop(df.columns[[1, 2]], axis=1))
#          age
# name        
# Alice     24
# Bob       42
# Charlie   18
# Dave      68
# Ellen     24
# Frank     30

print(df.drop(columns=df.columns[[1, 2]]))
#          age
# name        
# Alice     24
# Bob       42
# Charlie   18
# Dave      68
# Ellen     24
# Frank     30

If the column labels are integers, the same caution applies as for rows above: numbers passed directly are treated as labels, not positions.
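
A minimal sketch of this pitfall, using a hypothetical DataFrame df_int whose integer column labels are deliberately out of order:

```python
import pandas as pd

# Hypothetical frame: the column labels are the integers 2, 0, 1.
df_int = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=[2, 0, 1])

# Passed directly, 1 and 2 are column labels.
left_by_label = df_int.drop(columns=[1, 2]).columns.tolist()
print(left_by_label)
# [0]

# Through the columns attribute, 1 and 2 are column positions,
# which here correspond to the labels 0 and 1.
left_by_position = df_int.drop(columns=df_int.columns[[1, 2]]).columns.tolist()
print(left_by_position)
# [2]
```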

Deletion of multiple rows and columns

From version 0.21.0 onward, multiple rows and columns can be deleted in one call by specifying the parameters index and columns together.

Row and column numbers can be used here as well, and so can the parameter inplace.

print(df.drop(index=['Bob', 'Dave', 'Frank'],
              columns=['state', 'point']))
#          age
# name        
# Alice     24
# Charlie   18
# Ellen     24

print(df.drop(index=df.index[[1, 3, 5]],
              columns=df.columns[[1, 2]]))
#          age
# name        
# Alice     24
# Charlie   18
# Ellen     24
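
The combination with inplace mentioned above is not shown in the examples, so here is a hedged sketch. It rebuilds the sample data inline (assuming the values printed at the top of the article) and works on a copy named df_work, an illustrative name, so that the original is preserved.

```python
import pandas as pd

# Sample data rebuilt inline; assumed to match the CSV used above.
df = pd.DataFrame(
    {'age': [24, 42, 18, 68, 24, 30],
     'state': ['NY', 'CA', 'CA', 'TX', 'CA', 'NY'],
     'point': [64, 92, 70, 70, 88, 57]},
    index=['Alice', 'Bob', 'Charlie', 'Dave', 'Ellen', 'Frank'],
)

# Delete rows and columns by position, modifying the copy in place.
df_work = df.copy()
df_work.drop(index=df_work.index[[1, 3, 5]],
             columns=df_work.columns[[1, 2]],
             inplace=True)
print(df_work)
```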
