In MySQL, the JOIN operation combines data from multiple tables. To execute JOIN operations efficiently, MySQL implements several JOIN algorithms. The sections below explain the principles behind the most common ones.
1. Nested Loop Join (NLJ)
Principle
The nested loop join is the most basic JOIN algorithm. It joins tables using two or more levels of nested loops. Given two tables A and B, the basic steps of the NLJ algorithm are:
- The outer loop iterates over every row in table A.
- For each row in table A, the inner loop iterates over every row in table B and checks whether the pair of rows satisfies the JOIN condition. If it does, the two rows are combined and added to the result set.
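The two loops can be sketched in a few lines of Python. This is only an illustration of the idea, not MySQL's implementation: rows are modeled as dictionaries, and the column names `a_id` and `a_ref` as well as the function name are assumptions made for the example.

```python
def nested_loop_join(table_a, table_b, condition):
    """Naive nested loop join: compare every row of A with every row of B."""
    result = []
    for row_a in table_a:          # outer loop over table A
        for row_b in table_b:      # inner loop: one full scan of B per row of A
            if condition(row_a, row_b):
                result.append({**row_a, **row_b})  # combine the matching pair
    return result


# Hypothetical sample data; the column names are illustrative only.
table_a = [{"a_id": 1, "name": "x"}, {"a_id": 2, "name": "y"}]
table_b = [
    {"b_id": 10, "a_ref": 1},
    {"b_id": 11, "a_ref": 1},
    {"b_id": 12, "a_ref": 3},
]

nlj_rows = nested_loop_join(table_a, table_b, lambda a, b: a["a_id"] == b["a_ref"])
```

With 2 rows in A and 3 rows in B, the condition is evaluated 6 times, which is why the cost grows with the product of the two row counts.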
Sample code explanation
SELECT * FROM tableA JOIN tableB ON = ;
In this query, MySQL may use the nested loop join algorithm. It first takes a row from tableA, then scans tableB row by row for records that satisfy the join condition, combines each matching pair, and outputs it.
Complexity analysis
- Time complexity: O(M × N), where M is the number of rows in table A and N is the number of rows in table B. This algorithm is inefficient for large tables.
2. Index Nested Loop Join (INLJ)
Principle
The index nested loop join is an optimized version of the nested loop join. When the driven table (usually the inner-loop table) has an index on the columns used in the JOIN condition, MySQL uses that index to find matching records instead of performing a full table scan. The basic steps are:
- The outer loop iterates over every row in the driving table (usually the table with fewer rows).
- For each row in the driving table, the index on the driven table is used to quickly locate the records that satisfy the JOIN condition, without scanning the driven table row by row.
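A rough Python sketch of these steps follows. It is a simplified model rather than MySQL's code: a sorted list searched with `bisect` stands in for a B-tree index, and the function names and the `id` column on the driving table are assumptions made for the example.

```python
import bisect


def build_sorted_index(rows, column):
    """Stand-in for a B-tree index: (key, row position) pairs sorted by key."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))


def index_nested_loop_join(driving, driven, driving_col, index):
    """For each driving row, look up matches through the index, not a full scan."""
    result = []
    for outer in driving:
        key = outer[driving_col]
        # Binary search: first index entry with this key, found in O(log N).
        i = bisect.bisect_left(index, (key, -1))
        while i < len(index) and index[i][0] == key:
            result.append({**outer, **driven[index[i][1]]})
            i += 1
    return result


# Hypothetical sample data; "id" on the driving table is an assumed column name.
table_a = [{"id": 1, "name": "x"}, {"id": 2, "name": "y"}]
table_b = [{"b_id": 10, "a_id": 2}, {"b_id": 11, "a_id": 1}, {"b_id": 12, "a_id": 2}]

a_id_index = build_sorted_index(table_b, "a_id")
inlj_rows = index_nested_loop_join(table_a, table_b, "id", a_id_index)
```

Each driving row costs one binary search plus one step per match, instead of a pass over every row of the driven table.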
Sample code explanation
SELECT * FROM tableA JOIN tableB ON = tableB.a_id;
If the a_id column of tableB is indexed, MySQL can use the index nested loop join algorithm. It first takes a row from tableA, then uses the index on tableB.a_id to quickly find the records that satisfy the join condition.
Complexity analysis
- Time complexity: O(M × log N), where M is the number of rows in the driving table and N is the number of rows in the driven table. Because each lookup goes through the index instead of a full scan, search efficiency improves significantly.
3. Block Nested Loop Join (BNLJ)
Principle
When the driven table has no usable index, MySQL uses the block nested loop join algorithm to reduce the number of times the inner loop runs. The basic idea is to split the driving table's data into blocks, load one block at a time into an in-memory buffer, and then scan the driven table row by row, checking each driven-table row against the rows in the buffer for the JOIN condition. The basic steps are:
- Split the driving table's data into blocks; the size of each block is controlled by the join_buffer_size parameter.
- Load one block at a time into the join buffer.
- Scan the driven table row by row and, for each row, check whether it satisfies the JOIN condition with any row in the join buffer.
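The blocking idea can be sketched in Python as below. This is an illustration under simplifying assumptions: `block_size` counts rows, whereas the real join_buffer_size is measured in bytes, and the function name, the scan counter, and the sample columns are all hypothetical.

```python
def block_nested_loop_join(driving, driven, condition, block_size):
    """Join block by block; block_size (in rows) stands in for join_buffer_size."""
    result = []
    driven_scans = 0
    for start in range(0, len(driving), block_size):
        join_buffer = driving[start:start + block_size]  # load one block
        driven_scans += 1
        for inner in driven:              # one scan of the driven table per block
            for outer in join_buffer:     # compare against every buffered row
                if condition(outer, inner):
                    result.append({**outer, **inner})
    return result, driven_scans


# Hypothetical sample data: 5 driving rows, 3 driven rows.
table_a = [{"a_id": n, "some_column": n} for n in range(1, 6)]
table_b = [
    {"b_id": 100, "some_column": 2},
    {"b_id": 101, "some_column": 5},
    {"b_id": 102, "some_column": 9},
]

bnl_rows, scans = block_nested_loop_join(
    table_a, table_b,
    lambda a, b: a["some_column"] == b["some_column"],
    block_size=2,
)
```

With 5 driving rows and a block size of 2, the driven table is scanned 3 times rather than 5. The number of row comparisons is unchanged; the saving comes from fewer passes over the driven table.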
Sample code explanation
SELECT * FROM tableA JOIN tableB ON tableA.some_column = tableB.some_column;
If tableB has no index on the columns used in the JOIN condition, MySQL may use the block nested loop join algorithm. It first splits tableA's data into blocks and loads a block into the join buffer, then scans tableB and checks whether each of its rows matches any record in the join buffer.
Complexity analysis
- Time complexity: O(M × N) row comparisons, the same as the nested loop join. Performance still improves to some extent, because the driven table is scanned once per block of driving rows instead of once per driving row.
4. Hash Join
Principle
The hash join is a JOIN algorithm suited to large data sets. It is available for JOIN operations in MySQL 8.0.18 and later, and from MySQL 8.0.20 it is used in the cases where a block nested loop join would previously have been chosen. Its basic steps are:
- Build phase: choose the smaller table as the build table, iterate over each of its rows, compute a hash value from the columns in the JOIN condition, and insert the row into the corresponding hash bucket.
- Probe phase: iterate over each row of the larger table (the probe table), compute the hash value from the same JOIN condition columns, and look up matching records in the hash table.
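The build and probe phases can be sketched in Python, using a dictionary as the in-memory hash table. This is a minimal model of the technique, not MySQL's executor; the function name and the `label` and `v` columns are assumptions made for the example.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_col, probe_col):
    """In-memory hash join: build a hash table on one input, probe with the other."""
    # Build phase: bucket the (smaller) build table by its join column.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[build_col]].append(row)

    # Probe phase: look up each probe row's key in the hash table.
    result = []
    for row in probe_rows:
        for match in buckets.get(row[probe_col], []):
            result.append({**row, **match})
    return result


# Hypothetical sample data mirroring small_table and large_table.
small_table = [{"key": 1, "label": "a"}, {"key": 2, "label": "b"}]
large_table = [
    {"key": 2, "v": 10},
    {"key": 3, "v": 20},
    {"key": 1, "v": 30},
    {"key": 2, "v": 40},
]

hj_rows = hash_join(small_table, large_table, "key", "key")
```

Each table is read once, and every probe is an average constant-time lookup, which is where the linear cost comes from.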
Sample code explanation
SELECT * FROM large_table JOIN small_table ON large_table.key = small_table.key;
In this query, if small_table is the smaller table, MySQL uses small_table as the build table to construct the hash table, then iterates over large_table to probe it for matching records.
Complexity analysis
- Time complexity: O(M + N), where M and N are the row counts of the two tables. Hash joins are highly efficient for large data sets.