Updated on 2025-03-02

Usage of the merge() function in pandas

The merge() function in pandas combines two DataFrames, much like a JOIN operation in SQL. It joins rows based on specified columns or indexes. Here are some examples of how to use merge():

Basic usage of the merge() function

Basic usage of merge() is simple and relies mainly on the following parameters:

  • left: the left DataFrame to merge;
  • right: the right DataFrame to merge;
  • how: the type of merge, one of 'inner' (the default), 'left', 'right', 'outer', or 'cross';
  • on: the column name (or list of column names) to merge on. If it is not specified and no other key parameters are given, the columns common to both DataFrames are used.
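To make the defaults concrete, here is a minimal sketch. The DataFrames `left` and `right` and their contents are made up for illustration only. It shows that leaving out both `how` and `on` gives the same result as writing `how='inner'` and `on='ID'` explicitly, because 'ID' is the only column the two frames share.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
left = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
right = pd.DataFrame({'ID': [2, 3, 4], 'Age': [25, 30, 35]})

# No `how` and no `on`: pandas performs an inner join on the
# columns the two frames have in common (here, only 'ID').
implicit = pd.merge(left, right)

# The same merge with both parameters spelled out.
explicit = pd.merge(left, right, how='inner', on='ID')

print(implicit)
print(implicit.equals(explicit))  # True
```

Spelling out `on` is usually the safer habit: if the two frames later gain another column with the same name, the implicit version would silently start joining on that column as well.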

Several commonly used parameters

  • left_on, right_on: the column names to merge on in the left and right DataFrames respectively, for cases where the key columns have different names in the two DataFrames;
  • suffixes: the suffixes appended to overlapping column names to tell them apart, default is ('_x', '_y');
  • indicator: if True, adds a column named _merge to the result that records where each row came from ('left_only', 'right_only', or 'both'); the default is False;
  • validate: checks that the merge is of the specified type ('one_to_one', 'one_to_many', 'many_to_one', or 'many_to_many') and raises an error if it is not; the default is None, which skips the check.
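The following sketch exercises `left_on`/`right_on`, `suffixes`, and `validate` together. The tables `students` and `retakes`, their column names, and the suffix strings are hypothetical and chosen only to trigger each parameter; the `indicator` parameter is left out here.

```python
import pandas as pd

# Hypothetical tables: the key column has a different name in each,
# and both contain a 'Score' column.
students = pd.DataFrame({'StudentID': [1, 2, 3], 'Score': [90, 85, 70]})
retakes = pd.DataFrame({'SID': [2, 3, 3], 'Score': [88, 75, 80]})

merged = pd.merge(
    students, retakes,
    left_on='StudentID', right_on='SID',  # keys are named differently
    how='left',
    suffixes=('_first', '_retake'),       # disambiguate the two 'Score' columns
    validate='one_to_many',               # left keys must be unique
)
print(merged)

# `validate` raises pandas.errors.MergeError when the claimed
# relationship does not hold: SID 3 appears twice in `retakes`,
# so the merge is not one-to-one.
try:
    pd.merge(students, retakes,
             left_on='StudentID', right_on='SID',
             validate='one_to_one')
    one_to_one_ok = True
except pd.errors.MergeError:
    one_to_one_ok = False
print(one_to_one_ok)  # False
```

Note that when `left_on` and `right_on` differ, both key columns ('StudentID' and 'SID') are kept in the result.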

Suppose we have two DataFrames, df1 and df2:

import pandas as pd

# Create DataFrame df1
data1 = {'ID': [1, 2, 3, 4],
         'Name': ['Alice', 'Bob', 'Charlie', 'David']}
df1 = pd.DataFrame(data1)

# Create DataFrame df2
data2 = {'ID': [2, 3, 4, 5],
         'Age': [25, 30, 35, 40]}
df2 = pd.DataFrame(data2)

1. Inner Join:

merged_inner = pd.merge(df1, df2, on='ID', how='inner')
print(merged_inner)

This performs an inner join of the two DataFrames on their common ID column. The result contains only the rows whose ID appears in both DataFrames (here, IDs 2, 3, and 4).

2. Left Join:

merged_left = pd.merge(df1, df2, on='ID', how='left')
print(merged_left)

This performs a left join of df1 with df2 on the ID column: all rows of df1 are kept and combined with the matching rows from df2. Rows of df1 with no match in df2 get NaN in the columns that come from df2.

3. Right Join:

merged_right = pd.merge(df1, df2, on='ID', how='right')
print(merged_right)

This performs a right join on the ID column: all rows of df2 are kept and combined with the matching rows from df1. Rows of df2 with no match in df1 get NaN in the columns that come from df1.

4. Outer Join:

merged_outer = pd.merge(df1, df2, on='ID', how='outer')
print(merged_outer)

This performs an outer join of the two DataFrames on their common ID column: all rows from both DataFrames are kept, matching rows are merged, and values missing on either side are filled with NaN.
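As a quick way to compare the four join types, the sketch below rebuilds the two sample DataFrames so that it runs on its own, counts the rows each `how` value produces, and then uses `indicator=True` on the outer join to show where each row came from. The helper names `row_counts` and `with_source` are introduced here for illustration only.

```python
import pandas as pd

# Same sample data as in the article, rebuilt so the block is self-contained.
df1 = pd.DataFrame({'ID': [1, 2, 3, 4],
                    'Name': ['Alice', 'Bob', 'Charlie', 'David']})
df2 = pd.DataFrame({'ID': [2, 3, 4, 5],
                    'Age': [25, 30, 35, 40]})

# Number of rows produced by each join type.
row_counts = {
    how: len(pd.merge(df1, df2, on='ID', how=how))
    for how in ('inner', 'left', 'right', 'outer')
}
print(row_counts)  # {'inner': 3, 'left': 4, 'right': 4, 'outer': 5}

# indicator=True adds a '_merge' column recording each row's source:
# 'left_only', 'right_only', or 'both'.
with_source = pd.merge(df1, df2, on='ID', how='outer', indicator=True)
print(with_source[['ID', '_merge']])
```

ID 1 exists only in df1 and ID 5 only in df2, so those are the rows marked 'left_only' and 'right_only'; IDs 2, 3, and 4 are marked 'both', and they are exactly the rows the inner join returns.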
