Why subqueries and JOINs are not recommended
In MySQL, subqueries and JOINs are often discouraged for the following reasons:
- Performance issues: To execute a subquery, MySQL may need to create a temporary table to hold the inner query's results and drop it once the query finishes. This adds CPU and I/O cost and can easily lead to slow queries. JOINs can also be inefficient, and with large data volumes their performance is hard to guarantee.
- Index failure: A subquery may prevent indexes from being used. MySQL may convert the query into a join, so the subquery cannot be executed first; if the outer table is large, performance suffers.
- Query optimizer complexity: Subqueries make the optimizer's job harder and can lead to a poorly optimized execution plan. By comparison, join queries are easier for the optimizer to understand and process.
- Data transfer overhead: Subqueries may cause a large amount of unnecessary data transfer, since each subquery has to return its result to the main query. A join query can return all required data in one query, reducing that overhead.
- Maintenance cost: SQL written with JOINs is more complex and more expensive to change when table schemas change; in a large system it is hard to maintain.
Solutions
To address these problems, the following approaches can be used:
- Application-layer association: Query a single table from the business layer, then use the result as the condition of the next single-table query, reducing the load on the database layer.
- Use IN instead of a subquery: If the subquery result set is small, you can pass it to the IN operator directly; with little data, the query is efficient.
- Use WHERE EXISTS: WHERE EXISTS is often a better choice than IN. It only checks whether the subquery returns any row, which can noticeably improve query speed.
- Rewrite as a JOIN: Use a JOIN instead of a subquery. No temporary table needs to be created, so it is fast, and performance is better still when the query can use indexes.
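As a minimal sketch of application-layer association, the two statements below stand in for one subquery. The customers and orders tables are borrowed from the cases in this article, and the literal IDs in the second statement are hypothetical values that the application would fill in from the first result.

```sql
-- Step 1: the application fetches the matching keys from a single table.
SELECT customer_id FROM customers WHERE country = 'USA';

-- Step 2: the application passes those keys to the next single-table query.
-- The IDs 101, 102, 103 are placeholders for whatever step 1 returned.
SELECT * FROM orders WHERE customer_id IN (101, 102, 103);
```

This keeps each statement simple at the cost of an extra round trip, so it fits best when the first result set is small.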
Optimization cases
Case 1: Query information for all products in stock
Original query (using subquery):
SELECT * FROM products WHERE id IN (SELECT product_id FROM inventory WHERE stock > 0);
This query can be slow, which hurts the user experience.
Optimization solution (using EXISTS):
SELECT * FROM products WHERE EXISTS (SELECT 1 FROM inventory WHERE inventory.product_id = products.id AND inventory.stock > 0);
This rewrite can greatly improve query speed and, with it, the user experience.
Case 2: Optimizing subquery using EXISTS
Original query:
SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE country = 'USA');
Using EXISTS instead of the IN subquery can reduce the number of back-to-table lookups and improve query efficiency.
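The rewritten statement is not shown for this case. One possible form, assuming customer_id is the matching key in both tables as in the original query, is:

```sql
-- EXISTS only needs to find one matching customer row per order.
SELECT *
FROM orders o
WHERE EXISTS (
    SELECT 1
    FROM customers c
    WHERE c.customer_id = o.customer_id
      AND c.country = 'USA'
);
```

Whether this is faster than IN depends on the MySQL version and the available indexes, so it is worth comparing both plans with EXPLAIN.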
Case 3: Use JOIN instead of subquery
Original query:
SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE country = 'USA');
Using a JOIN instead of the subquery reduces subquery overhead and makes it easier to use indexes.
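A sketch of the JOIN form, assuming customers.customer_id is unique (for example, the primary key) so that the join does not duplicate order rows:

```sql
-- Inner join keeps only orders whose customer is in the USA.
-- If customer_id were not unique in customers, rows from orders could repeat.
SELECT o.*
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE c.country = 'USA';
```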
Case 4: Optimize subqueries to reduce data volume
Original query:
SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers);
Optimization solution:
SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE active = 1);
Limiting the amount of data returned by the subquery reduces the number of rows the main query has to check and improves query efficiency.
Case 5: Use a covering index
Original query:
SELECT customer_id FROM customers WHERE country = 'USA';
Optimization solution:
CREATE INDEX idx_country ON customers(country);
SELECT customer_id FROM customers WHERE country = 'USA';
Create an index on the country field so that the subquery can be answered directly from the index, avoiding back-to-table lookups. In InnoDB a secondary index also stores the primary key, so this works when customer_id is the primary key.
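To check whether an index actually covers the query, EXPLAIN can be run on it. The composite index below is an assumption for tables where customer_id is not the primary key, and idx_country_customer is a hypothetical name.

```sql
-- Put both the filter column and the selected column in the index,
-- so the query can be answered without reading the table rows.
CREATE INDEX idx_country_customer ON customers(country, customer_id);

-- When the chosen index covers the query, the Extra column of the plan
-- typically shows "Using index".
EXPLAIN SELECT customer_id FROM customers WHERE country = 'USA';
```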
Case 6: Optimizing complex queries using temporary tables
Original query:
SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE last_order_date > '2023-01-01');
Optimization solution:
CREATE TEMPORARY TABLE temp_customers AS SELECT customer_id FROM customers WHERE last_order_date > '2023-01-01';
SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM temp_customers);
For complex subqueries, temporary tables are used to store intermediate results, simplifying queries and improving performance.
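A table created with CREATE TEMPORARY TABLE ... AS SELECT has no indexes of its own. As a sketch, assuming customer_id is the primary key of customers (so the copied values are unique and not NULL), a key can be added before the table is used, and the table dropped afterwards:

```sql
-- Index the intermediate result so lookups against it do not scan it.
ALTER TABLE temp_customers ADD PRIMARY KEY (customer_id);

-- Join against the indexed temporary table.
SELECT o.*
FROM orders o
JOIN temp_customers t ON t.customer_id = o.customer_id;

-- Temporary tables disappear when the session ends; dropping early frees resources.
DROP TEMPORARY TABLE IF EXISTS temp_customers;
```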
Case 7: Use window functions instead of a subquery
Original query:
SELECT employee_id, salary, (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) AS avg_salary FROM employees e;
Optimization solution:
SELECT employee_id, salary, AVG(salary) OVER (PARTITION BY department_id) AS avg_salary FROM employees;
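Replacing the correlated subquery with a window function avoids re-evaluating the department average for every row, which improves query efficiency. Window functions require MySQL 8.0 or later.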
Case 8: Optimize subqueries to avoid full table scanning
Original query:
SELECT * FROM users WHERE username IN (SELECT username FROM orders WHERE order_date = '2024-01-01');
Optimization solution:
CREATE INDEX idx_order_date ON orders(order_date);
SELECT * FROM users WHERE username IN (SELECT username FROM orders WHERE order_date = '2024-01-01');
Create an index on the order_date field to avoid a full table scan and improve subquery efficiency.
Case 9: Use the LIMIT clause to limit the amount of data returned by subqueries
Original query:
SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE country = 'USA');
Optimization solution:
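SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM (SELECT customer_id FROM customers WHERE country = 'USA' LIMIT 100) AS c);
Use the LIMIT clause to cap the amount of data returned by the subquery, which reduces the data the main query has to process and improves query efficiency. MySQL rejects LIMIT placed directly inside an IN subquery, so the limit is applied in a derived table. Note that this changes the result: only orders for at most 100 customers are returned, so it applies only when a partial result is acceptable.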
Case 10: Use JOIN instead of subquery to take advantage of indexes
Original query:
SELECT * FROM transactions WHERE product_id IN (SELECT product_id FROM products WHERE category = 'Equity');
Optimization solution:
SELECT t.* FROM transactions t JOIN products p ON t.product_id = p.product_id WHERE p.category = 'Equity';
Replacing the subquery with a JOIN makes it easier to use the category index on the products table.
Summary
These cases show how to improve MySQL query performance through different optimization strategies, especially when processing subqueries. Here are some additional optimization suggestions:
- Create the right indexes: Fields that are often used in WHERE and JOIN conditions should be indexed. Avoid indexing low-selectivity fields (such as a gender field).
- Avoid index failure: A field wrapped in a function cannot use an ordinary index on that field. For example, SELECT * FROM orders WHERE YEAR(order_date) = 2023; should be rewritten as SELECT * FROM orders WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';
- Leftmost prefix rule for composite indexes: Make sure the query conditions start from the leftmost column of the composite index.
- Use EXPLAIN to analyze execution plans: The EXPLAIN keyword shows the execution plan of a query and helps identify performance bottlenecks.
- Optimize query statements: Avoid SELECT *, use LIMIT to restrict the number of rows returned, and rewrite subqueries as JOINs.
- Tune the join buffer sensibly: When no index exists or none can be used, the join buffer is the key to optimizing Block Nested-Loop Join; its size directly affects how many outer-table rows are loaded at once and how efficiently the inner table is scanned.
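The leftmost prefix rule, EXPLAIN, and the join buffer setting can be illustrated together. In this sketch the index name idx_customer_date, its column pairing, the customer ID 42, and the 4 MB buffer size are all assumptions chosen for illustration.

```sql
-- Hypothetical composite index on orders.
CREATE INDEX idx_customer_date ON orders(customer_id, order_date);

-- Can use the composite index: the conditions start from its leftmost column.
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42 AND order_date >= '2023-01-01';

-- Skips the leftmost column, so the composite index generally cannot be
-- used for an efficient lookup (a separate index on order_date could be).
EXPLAIN SELECT * FROM orders
WHERE order_date >= '2023-01-01';

-- Inspect the join buffer size, then raise it for the current session only.
SHOW VARIABLES LIKE 'join_buffer_size';
SET SESSION join_buffer_size = 4194304;
```

Comparing the key and rows columns of the two EXPLAIN outputs shows whether the optimizer picked the index in each case.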
With these optimization strategies, MySQL query performance, and with it the user experience, can be significantly improved.