I. The syntax of merge:
(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=True, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
Parameter Description:
left, right: the dataframe to be merged or the series with a name.
how: join type, 'left', 'right', 'outer', 'inner'
on: the key of the join, both left and right need to have this key
left_on: key of left's df or series
right_on: df of right or key of seires
left_index, right_index: use index instead of normal column for join
suffixes: the suffixes of the two elements, if the columns are renamed, the suffixes are added automatically, default is ('_x', '_y')
II. Merging two dataframes with key columns
You can see that only the left and right rows with key1=y are preserved, i.e., only rows with common column entries and equal values (i.e., intersections) are preserved after the default merge.
In this case there are 2 k1=y for left and right respectively, which ultimately constitute 2*2=4 rows
import pandas as pd left = ({'A': ['a0', 'a1', 'a2', 'a3'], 'B': ['b0', 'b1', 'b2', 'b3'], 'k1': ['x', 'x', 'y', 'y']}) right = ({'C': ['c1', 'c2', 'c3', 'c4'], 'D': ['d1', 'd2', 'd3', 'd4'], 'k1': ['y', 'y', 'z', 'z']})
left
right
(left, right, on=‘k1’)
III. Understanding the alignment of quantities when merging
one-to-one: one-to-one relationship, the associated keys are all unique
For example (school number, name) merge (school number, age)
The number of result entries is: 1*1
one-to-many: one-to-many relationship, unique key on the left, not unique key on the right
For example (school number, name) merge (school number, [language grade, math grade, English grade])
The number of result entries is: 1*N
many-to-many: many-to-many relationship, neither the left nor the right side is unique
For example (school number, [language grade, math grade, English grade]) merge (school number, [basketball, soccer, table tennis])
The number of results is: M*N
1、one-to-one One-to-one relationship merge
left = ({'sno': [11, 12, 13, 14], 'name': ['name_a', 'name_b', 'name_c', 'name_d'] }) right = ({'sno': [11, 12, 13, 14], 'age': ['21', '22', '23', '24'] })
left
right
# One-to-one relationships with 4 results (left, right, on='sno')
2、one-to-many One-to-many relationship of merge
Note: Data will be copied
left = ({'sno': [11, 12, 13, 14], 'name': ['name_a', 'name_b', 'name_c', 'name_d'] }) right = ({'sno': [11, 11, 11, 12, 12, 13], 'grade': ['Language 88', 'Math 90', 'English 75','Language 66', 'Math 55', 'English 29'] })
left
right
# Whichever is greater (left, right, on='sno')
3、many-to-many many-to-many relationship merge
Note: the number of results will appear to multiply
left = ({'sno': [11, 11, 12, 12,12], 'Hobbies': ['Basketball', 'Badminton', 'Ping Pong', 'Basketball', "Football."] }) right = ({'sno': [11, 11, 11, 12, 12, 13], 'grade': ['Language 88', 'Math 90', 'English 75','Language 66', 'Math 55', 'English 29'] })
left
right
(left, right, on=‘sno’)
IV. Understanding the difference between left join, right join, inner join, and outer join
left = ({'key': ['K0', 'K1', 'K2', 'K3'], 'A': ['A0', 'A1', 'A2', 'A3'], 'B': ['B0', 'B1', 'B2', 'B3']}) right = ({'key': ['K0', 'K1', 'K4', 'K5'], 'C': ['C0', 'C1', 'C4', 'C5'], 'D': ['D0', 'D1', 'D4', 'D5']})
left
right
1. inner join, default
Both the left and right keys are present to appear in the result
(left, right, how='inner')
2、left join
The ones on the left appear in the result, the ones on the right are Null if they cannot be matched.
(left, right, how='left')
3、 right join
The ones on the right appear in the results, the ones on the left are Null if they cannot be matched.
(left, right, how='right')
4、 outer join
The left and right ones will appear in the result, or Null if they cannot be matched
(left, right, how='outer')
V. What to do if a non-Key field is renamed
left = ({'key': ['K0', 'K1', 'K2', 'K3'], 'A': ['A0', 'A1', 'A2', 'A3'], 'B': ['B0', 'B1', 'B2', 'B3']}) right = ({'key': ['K0', 'K1', 'K4', 'K5'], 'A': ['A10', 'A11', 'A12', 'A13'], 'D': ['D0', 'D1', 'D4', 'D5']})
left
right
(left, right, on='key')
(left, right, on='key', suffixes=('_left', '_right'))
summarize
To this article on the Pandas DataFrame merger article is introduced to this, more related Pandas DataFrame merger content please search for my previous articles or continue to browse the following related articles I hope you will support me in the future!