SoFunction
Updated on 2025-04-16

Iterating over a Pandas DataFrame: iterrows, itertuples, and items explained

Pandas DataFrame Iteration

Iterating over a data frame is generally discouraged because pandas is designed around vectorized operations, which are far more efficient. In some cases, however, you may still need to walk through the data row by row or column by column.

Keep in mind that iterating over rows or columns can degrade performance, especially on large data sets. Wherever possible, prefer vectorized operations.
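
To make the contrast concrete, here is a minimal sketch that computes the same column sum twice, once with a row loop and once with a vectorized expression. The frame and the variable names (loop_result, vectorized_result) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for this comparison.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Row-by-row approach: build the result one row at a time.
loop_result = []
for _, row in df.iterrows():
    loop_result.append(int(row['A'] + row['B']))

# Vectorized approach: one expression applied to whole columns.
vectorized_result = df['A'] + df['B']

print(loop_result)
print(vectorized_result.tolist())
```

Both approaches give the same values here; on large frames the vectorized form is typically much faster because the work happens inside pandas rather than in a Python loop.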

Iterating rows

1. Use .iterrows()

  • .iterrows() is a generator that iterates over the rows of a data frame, yielding each row together with its index.
  • For each row, it yields an (index, row) tuple, where the row's data is a pandas Series.
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

# Each iteration yields the row's index label and the row as a Series
for index, row in df.iterrows():
    print(f"Index: {index}")
    print(row)
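
One consequence of each row arriving as a single Series is that mixed column types are converted to a common dtype within the row. The sketch below illustrates this with a hypothetical two-column frame (the column names qty and price are invented for the example).

```python
import pandas as pd

# Hypothetical frame mixing an integer column and a float column.
df = pd.DataFrame({'qty': [1, 2], 'price': [1.5, 2.5]})

# Take the first (index, row) pair produced by .iterrows().
index, row = next(df.iterrows())

print(df['qty'].dtype)  # dtype of the column inside the DataFrame (integer)
print(row.dtype)        # dtype of the row Series (float64)
print(row['qty'])       # the integer 1 is seen as a float through the row
```

If the original types matter, .itertuples() is often the safer choice, since it does not pack each row into a Series.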

2. Use .itertuples()

  • .itertuples() is usually faster than .iterrows() because it yields lightweight named tuples instead of building a Series for every row.
  • By default, the first element of each tuple is the row's index, and the remaining elements are the row's values.
  • Because each row is a named tuple, values can be accessed by attribute name, e.g. row.A, row.B, and so on.
import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30],
    'B': [40, 50, 60],
    'C': ['p', 'q', 'r']
})

# Each row is a named tuple whose first field is the index
for row in df.itertuples():
    print(row)

'''
Pandas(Index=0, A=10, B=40, C='p')
Pandas(Index=1, A=20, B=50, C='q')
Pandas(Index=2, A=30, B=60, C='r')
'''
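
As a short sketch of attribute access, the example below reads fields by name and also shows the index and name parameters of .itertuples(). The variable names labels and plain are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30], 'C': ['p', 'q', 'r']})

# Access fields by attribute name; row.Index holds the row's index label.
labels = [f"{row.Index}:{row.A}{row.C}" for row in df.itertuples()]
print(labels)  # ['0:10p', '1:20q', '2:30r']

# index=False drops the index field; name=None yields plain tuples.
plain = list(df.itertuples(index=False, name=None))
print(plain)   # [(10, 'p'), (20, 'q'), (30, 'r')]
```

Plain tuples can be handy when the rows are simply being passed on to other code and attribute access is not needed.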

Iterating columns

Iterating over columns is usually simpler, because you can iterate over the data frame's column names directly and use each name to access that column's data.

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

# Method 1: Iterate over the column names
for column in df.columns:
    print(f"Column Name: {column}")
    # Access column data through the column name
    print(df[column])

# Method 2: Iterate directly over the DataFrame object (iterates column names by default)
for column in df:
    print(f"Column Name: {column}")
    print(df[column])

# Method 3: Use .items() to iterate over column names and data simultaneously
for column, data in df.items():
    print(f"Column Name: {column}")
    # The data variable already holds the column data, so df[column] is not needed
    print(data)
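
Since .items() hands back each column as a Series, per-column work can be written compactly. The following is a small sketch, with column_totals as an illustrative name, that sums every column in one pass.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Each iteration yields (column name, column data as a Series).
column_totals = {name: int(data.sum()) for name, data in df.items()}
print(column_totals)  # {'A': 6, 'B': 15, 'C': 24}
```

For a simple aggregate like this, df.sum() gives the same totals without an explicit loop; the loop form is mainly useful when each column needs different handling.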

Summary

The above is based on personal experience. I hope it serves as a useful reference, and thank you for your support.