SoFunction
Updated on 2024-10-29

Python Pandas implementation of DataFrame merge graphic tutorials

I. The syntax of merge:

(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=True, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
Parameter Description:
left, right: the dataframe to be merged or the series with a name.
how: join type, 'left', 'right', 'outer', 'inner'
on: the key of the join, both left and right need to have this key
left_on: key of left's df or series
right_on: df of right or key of seires
left_index, right_index: use index instead of normal column for join
suffixes: the suffixes of the two elements, if the columns are renamed, the suffixes are added automatically, default is ('_x', '_y')

II. Merging two dataframes with key columns

You can see that only the left and right rows with key1=y are preserved, i.e., only rows with common column entries and equal values (i.e., intersections) are preserved after the default merge.

In this case there are 2 k1=y for left and right respectively, which ultimately constitute 2*2=4 rows

import pandas as pd
left = ({'A': ['a0', 'a1', 'a2', 'a3'],
                     'B': ['b0', 'b1', 'b2', 'b3'],
                     'k1': ['x', 'x', 'y', 'y']})
right = ({'C': ['c1', 'c2', 'c3', 'c4'],
                      'D': ['d1', 'd2', 'd3', 'd4'],
                      'k1': ['y', 'y', 'z', 'z']})

left

left

right

(left, right, on=‘k1’)

III. Understanding the alignment of quantities when merging

one-to-one: one-to-one relationship, the associated keys are all unique
For example (school number, name) merge (school number, age)
The number of result entries is: 1*1
one-to-many: one-to-many relationship, unique key on the left, not unique key on the right
For example (school number, name) merge (school number, [language grade, math grade, English grade])
The number of result entries is: 1*N
many-to-many: many-to-many relationship, neither the left nor the right side is unique
For example (school number, [language grade, math grade, English grade]) merge (school number, [basketball, soccer, table tennis])
The number of results is: M*N

1、one-to-one One-to-one relationship merge

left = ({'sno': [11, 12, 13, 14],
                      'name': ['name_a', 'name_b', 'name_c', 'name_d']
                    })
right = ({'sno': [11, 12, 13, 14],
                      'age': ['21', '22', '23', '24']
                    })

left

right

# One-to-one relationships with 4 results
(left, right, on='sno')

2、one-to-many One-to-many relationship of merge

Note: Data will be copied

left = ({'sno': [11, 12, 13, 14],
                      'name': ['name_a', 'name_b', 'name_c', 'name_d']
                    })
right = ({'sno': [11, 11, 11, 12, 12, 13],
                       'grade': ['Language 88', 'Math 90', 'English 75','Language 66', 'Math 55', 'English 29']
                     })

left

right

# Whichever is greater
(left, right, on='sno')

3、many-to-many many-to-many relationship merge

Note: the number of results will appear to multiply

left = ({'sno': [11, 11, 12, 12,12],
                      'Hobbies': ['Basketball', 'Badminton', 'Ping Pong', 'Basketball', "Football."]
                    })
right = ({'sno': [11, 11, 11, 12, 12, 13],
                       'grade': ['Language 88', 'Math 90', 'English 75','Language 66', 'Math 55', 'English 29']
                     })

left

right

(left, right, on=‘sno’)

IV. Understanding the difference between left join, right join, inner join, and outer join

left = ({'key': ['K0', 'K1', 'K2', 'K3'],
                      'A': ['A0', 'A1', 'A2', 'A3'],
                      'B': ['B0', 'B1', 'B2', 'B3']})

right = ({'key': ['K0', 'K1', 'K4', 'K5'],
                      'C': ['C0', 'C1', 'C4', 'C5'],
                      'D': ['D0', 'D1', 'D4', 'D5']})

left

right

1. inner join, default

Both the left and right keys are present to appear in the result

(left, right, how='inner')

2、left join

The ones on the left appear in the result, the ones on the right are Null if they cannot be matched.

(left, right, how='left')

3、 right join

The ones on the right appear in the results, the ones on the left are Null if they cannot be matched.

(left, right, how='right')

4、 outer join

The left and right ones will appear in the result, or Null if they cannot be matched

(left, right, how='outer')

V. What to do if a non-Key field is renamed

left = ({'key': ['K0', 'K1', 'K2', 'K3'],
                      'A': ['A0', 'A1', 'A2', 'A3'],
                      'B': ['B0', 'B1', 'B2', 'B3']})

right = ({'key': ['K0', 'K1', 'K4', 'K5'],
                      'A': ['A10', 'A11', 'A12', 'A13'],
                      'D': ['D0', 'D1', 'D4', 'D5']})

left

right

(left, right, on='key')

(left, right, on='key', suffixes=('_left', '_right'))

summarize

To this article on the Pandas DataFrame merger article is introduced to this, more related Pandas DataFrame merger content please search for my previous articles or continue to browse the following related articles I hope you will support me in the future!