SoFunction
Updated on 2025-03-04

Detailed explanation of nested query instances in MySQL database

1. Definition of nested queries

Nested QueryRefers to embed a subquery in a part of a query statement.

The execution process of nested queries follows "Sub-query first, outer-query later"Logistics of ". First, the subquery executes and returns a result set, which may be a value, one row or more rows of data. Then, the outer query continues to filter or process the data using the results of the subquery. In this way, nested queries can handle more complex logic, such as multi-layer conditional filtering, data comparison, etc.

  • Subquery: First execute, return the result that meets the conditions.
  • External query: Use the results returned by the subquery to perform filtering or other logical operations, and finally return the result.

2. Syntax of nested queries

2.1 The basic structure of nested queries

SELECT List name
FROM Table name
WHERE List name Comparison operator (Subquery);

First return the result through the subquery, and then use the comparison operator to determine whether the result returned by the subquery meets the condition. The record of the field that meets the condition will be displayed.

Example

SELECT column_name1 
FROM table_name1 
WHERE column_name2 Comparison operator (SELECT column_name3 FROM table_name2 WHERE condition);
  • The “(SELECT column_name3 FROM table_name2 WHERE condition)” after the comparison operator is used as a subquery when executing SQL statements,The column_name3 field of the record that meets the condition in table table_name2 will be selected for the outer layer query.
  • When the column_name3 field returned by the subquery is obtained,The outer query will first compare the column_name2 fields of all records of table table_name1 with the comparison operator with the returned column_name3 fields.
  • For records that satisfy the comparison operator rules, the column_name1 field of these records will be returned.

2.2 Common comparison operators

  • =: Used to check whether the value of a column in the outer query is equal to the value returned by the subquery.
  • ><>=<=: Used to compare the size relationship between the column value of an outer query and the subquery result.
  • IN: Used to check whether a column value of an outer query is in a set of results returned by the subquery.
  • ANY / SOME: Used to check whether the column value of the outer query satisfies the subquery return resultAny oneThe condition of the value.
  • ALL: Used to check whether the column value of the outer query satisfies the subquery return resultAll valuesconditions.

Notice:ANY / SOME and ALLGenerally, it needs to be combined><>=<=To use

4. ANY / SOMEOperator

SELECT name 
FROM employees 
WHERE salary > ANY (SELECT salary FROM employees WHERE department_id = 1);
  • The outer query filters out those employees whose salary is greater than any employee in Department 1.
  • As long as the employee with the lowest salary in department 1 is eligible.

5. ALL operator

SELECT name 
FROM employees 
WHERE salary > ALL (SELECT salary FROM employees WHERE department_id = 1);
  • The outer query filters out those employees whose salary is greater than all employees in Department 1.
  • The highest-paid employee in Department 1 must be greater than that in order to meet the criteria.

3. Classification based on subquery behavior

  • in accordance with: This category is based onThe form of the return result of the subqueryand between subqueries and outer queriesDependencies
  • Focus: This category focuses more onThe nature of the subquery itself, that is, how much data is returned by the subquery (single row or multiple rows, multiple columns), whether it depends on the outer query, and whether the subquery is executed at one time or re-executes per row.

3.1 Single-line subquery

  • definition: Single-line subquery refers to subqueryReturn a value of a field. This value can be a specific number, date, text, etc. A subquery returns only one result, and the outer query will use this result for comparison, filtering, or calculation.
  • Features: The subquery returnsA single value for a field. Outer query uses this value to make conditional judgments or filters.
  • Application scenarios: When you need to get a specific value (such as maximum, minimum, average, etc.) from the subquery, then the outer query uses this value for comparison.

Syntax Structure

SELECT List name 
FROM Table name 
WHERE List name Comparison operator (Subquery);

Example

SELECT Name, salary 
FROM staff 
WHERE salary = (SELECT MAX(salary) FROM staff);

explain

  • Subquery (SELECT MAX (Salary) FROM Employee)Return to the employee tableMaximum salary
  • External queryAccording to this maximum salary, employees who meet the criteria (i.e. employees whose salary is equal to the maximum salary) are selected.

3.2 Multi-line subquery

  • definition: Multi-line subquery refers to subqueryReturns one or more values ​​of one or more fields. The outer query passes the set operator (e.g.INANYALLetc.) to compare and match the fields of the outer query with multiple results returned by the subquery.
  • Features: The subquery returnsA set of values(can be one or more fields), outer queries use these values ​​for filtering or comparison.
  • Application scenarios: When a subquery returns multiple values, the outer query matches these results through a collection operator. For example, find out whether the value of a field exists in a set of return values.

Syntax Structure

SELECT List name 
FROM Table name 
WHERE List name IN (Subquery);

Example

SELECT Name 
FROM staff 
WHERE Department number IN (SELECT Department number FROM department WHERE City = 'Shanghai');

explain

  • Subquery (SELECT Department No. FROM Department WHERE City = 'Shanghai')Returns all department numbers located in Shanghai, which is a set of values.
  • External queryThese department numbers returned by the subquery are used to filter out employees belonging to these departments.

3.3 Related subqueries

  • definition: Related subqueries refer to subqueriesEach record that depends on the outer query,thereforeEvery time the outer layer query processes a new record, the subquery will be executed again and the result will be calculated based on the data in the current record. This means that the subquery will generate results dynamically based on changes in the outer query.
  • Features: Whenever an outer query processes a new record, the relevant subquery will be re-execute based on the value of the record and return a new result. The results of subqueries change dynamically due to records of outer query.
  • Application scenarios: Used when the result of the subquery depends on the outer query that is currently being processed. For example, calculate data or values ​​related to each record.

Syntax Structure

SELECT List name 
FROM Table name AS Outer table 
WHERE List name Comparison operator (SELECT List name FROM Subquery table WHERE Subquery table.List name = Outer table.List name);

Example

SELECT House number, price 
FROM houses AS 可买houses
WHERE price &gt; (SELECT AVG(price) 
              FROM houses AS 出售houses 
              WHERE 出售houses.City = 可买houses.City);

explain

Outer query:

  • SELECT House Number, Price FROM House AS Home Buy
    Outer query fromhousesThe number and price of each house are retrieved in the table andhousesTable aliasBuy a house. The purpose of this alias is to help distinguish the same table names in outer queries and subqueries for comparison. The purpose of this step is to find a home that meets certain criteria from all the homes.

Subquery

  • (SELECT AVG(Price) FROM House AS House For Sale WHERE House For Sale. City = House available for purchase. City)
    The task of subquery is to calculate the city where the current house is locatedAverage housing price. Subquery is also usedhousesTable, but given an aliasHouses for sale, used to avoid external queryBuy a houseConfused. SubqueryWHEREThe clause specifies that only house records in the same city as the current outer query will participate in the calculation of the average price.

    • Execution process: Every time an outer query processes a new house record, the subquery will dynamically calculate the average price of all houses in the city based on the city recorded by the house. That is, the subquery depends on the housing city information in the outer query. Therefore, when the outer query traverses a house, the subquery will be executed once to calculate the average house price in the city where the house is located.

    • Results: The sub-query returns the average housing price in the city. This value will be used for outer queryWHEREIn the clause, help determine whether the current house price is higher than the average house price in the city where it is located.

Overall logic

  • Dynamic execution of subqueries: For each record of the outer layer query (i.e., each house), the subquery recalculates the average housing price of the city based on the city of the house. For example, when an outer query is processing a house in Beijing, the subquery will retrieve all houses located in Beijing and calculate the average price of those houses.

  • Condition comparison: External queryWHEREThe clause compares the current house price with the average city price returned by the subquery. If the current house price is higher than the average house price in the city, the number and price of the house are returned.

3.4 Non-related subquery

  • definition: Non-related subqueries refer to subqueries and outer queryNo direct dependencies. The subquery is executed independently and returns a result. Then, no matter how many records are processed, the outer query will only be compared with this result.
  • Features: The subquery is executed independently before the outer query, returning a fixed value. This value is used for comparison or filtering in every record in the outer layer query.
  • Application scenarios: Used when the result of a subquery is not related to the outer query, such as a global value or fixed result used in the query.

Syntax Structure

SELECT List name 
FROM Table name 
WHERE List name = (Subquery);

Example

SELECT Name 
FROM staff 
WHERE Department number = (SELECT Department number FROM department WHERE department名称 = 'Market Department');

explain

  • Subquery (SELECT Department No. FROM Department WHERE Department Name = 'Market Department')Execute independently once and return to the department number of the marketing department.
  • External queryNo matter how many employee records are processed, they are always compared with the marketing department number.

3.5 Summary

  • Single-line subquery: Returns a value of a field, and the outer query is compared or filtered.
  • Multi-line subquery: Returns multiple values ​​of one or more fields, and outer queries use set operators to match these values.
  • Related subqueries: Every time the outer layer query processes a new record, the subquery will be re-execute based on the value of the record to generate new results.
  • Non-related subquery: The subquery is only executed once and returns a fixed result. The outer query is compared with the fixed result no matter how many records it processes.

4. Classification based on subquery location

  • in accordance with: This classification is based on subqueryWhere is in SQL statements, that is, the subquery appears inWHEREFROMSELECTstillHAVINGin clause.
  • Focus: This category focuses onThe purpose and function of subqueries in outer query, that is, how subqueries are combined with outer query to achieve specific data processing.

4.1 Nested queries in WHERE clause

grammar

SELECT List name
FROM Table name
WHERE List name Comparison operator (Subquery);

Use scenarios

  • Nested queries in WHERE clauses are often used for conditional filtering. Typically, a nested query returns a single value or set of values, which is then used as a filter condition in the WHERE clause of the external query.
  • For example: filter out records that meet certain specific conditions, such as selecting the highest-paid employees or obtaining customers with a specific status.

Example

SELECT employee_id, first_name, last_name
FROM employees
WHERE department_id = (
    SELECT department_id
    FROM departments
    WHERE department_name = 'Sales'
);
  • Subquery

    • SELECT AVG(salary) FROM employees WHERE department_id = employees.department_idUsed to calculate the average salary of each employee's department.
    • SubqueryWHERE department_id = employees.department_idIt is the key part, which means that the sub-query is used to calculate the average salary of the department for each employee’s department.
    • The subquery is performed once for each employee, returning the average salary for the employee's department.
  • External query

    • Check the employee's name and salarySELECT name, salary FROM employees
    • WHEREThe clause determines the filtering conditions, requiring employees’salaryGreater than the value returned by the subquery.
  • Execution process

    • The outer query traverses each employee line by line.
    • For each employee, nested queries calculate the average salary for their department.
    • Then compare whether the employee's salary is higher than the department's average salary, and only employees who meet the conditions will be returned.

4.2 Nested queries in FROM clauses

grammar

SELECT List name
FROM (Subquery) AS Temporary table

Use scenarios

  • Nested queries in the FROM clause are called derivative tables, similar to creating a temporary table. It can simplify complex aggregation operations, especially when the same intermediate results need to be used multiple times.
  • This method is often used to further query intermediate results, such as filtering or computing a data set that has been aggregated again.

Example

SELECT department_name, avg_salary
FROM (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) AS avg_department_salaries
JOIN departments ON avg_department_salaries.department_id = departments.department_id;
  • Subquery

    • SubquerySELECT department_id, salary FROM employeesExtract the department ID and salary of all employees to form a virtual tabledept_employees
    • This virtual table is passed as a data source to the outer query, just like a normal table.
  • External query

    • External querySELECT department_id, AVG(salary) AS average_salary, COUNT(*) AS num_employeesResponsible fordept_employeesThe data is processed.
    • AVG(salary)Calculate the average salary for each department.
    • COUNT(*)Calculate the number of employees in each department.
  • Execution process

    • Subquery generates a subquery that contains onlydepartment_idandsalaryTemporary table of columnsdept_employees
    • The outer layer query groups the data from this temporary table and calculates the number of employees and average salary for each department.

4.3 Nested queries in SELECT clauses

grammar

SELECT (Subquery) AS List name
FROM Table name;

Use scenarios

  • Nested queries in SELECT clauses are often used to dynamically generate new columns. These subqueries usually return a calculation for each row to enrich the original dataset.
  • This method is often used for statistical calculations, data conversion, or to extract additional information from other tables.

Example

SELECT employee_id, first_name, last_name,
    (SELECT department_name 
     FROM departments 
     WHERE departments.department_id = employees.department_id) AS department_name
FROM employees;
  • Subquery

    • SubquerySELECT MAX(degree) FROM education WHERE education.employee_id = Used to find the highest educational qualification corresponding to the current employee.
    • WHERE education.employee_id = Make sure that the subquery matches every employee in the outer query.
    • Subquery returns the highest educational qualification of the current employee.
  • External query

    • Outer querySELECT name, ... FROM employeesSearch the names of all employees.
    • highest_degreeColumns use the results of subquery as additional information.
  • Execution process

    • For each employee in the outer query, the subquery will look for their highest education and use it as the result column of the outer query.highest_degree
    • Each row performs a sub-query, so the highest education of each employee is returned with the employee's name.

4.4 Nested Query in HAVING Clause

grammar

SELECT List name
FROM Table name
GROUP BY List name
HAVING Aggregation function Comparison operator (Subquery);

Use scenarios

  • Nested queries in HAVING clauses are usually used to perform complex filtering based on grouped data. HAVING is a filtering of aggregated results (such as SUM, COUNT, AVG, etc.), and nested queries can be used to compare the results of each group with the data of other tables or specific computed values.
  • A common scenario is when we need to fine-grained filtering of the aggregated result set, for example, filtering out groups that meet a certain condition.

Example

SELECT department_id
FROM employees
GROUP BY department_id
HAVING COUNT(*) > (SELECT AVG(employee_count) FROM (SELECT department_id, COUNT(*) AS employee_count FROM employees GROUP BY department_id) AS dept_counts);
  • Subquery

    • SubquerySELECT AVG(employee_count) FROM (SELECT department_id, COUNT(*) AS employee_count FROM employees GROUP BY department_id) AS dept_countsFirst calculate the number of employees in each department, and then calculate the average of these numbers.
    • This is actually a double nested query. The first layer subquery calculates the number of employees in each department, and the second layer subquery calculates the average number of these employees.
  • External query

    • SELECT department_id FROM employees GROUP BY department_idEmployees are grouped by department.
    • HAVING COUNT(*) > ...It is a filtering condition used to filter out departments whose employees do not meet the conditions.
  • Execution process

    • The outer query is first grouped by department and calculates the number of employees in each department.
    • The subquery calculates the average number of employees across all departments.
    • HAVINGThe clause ensures that only those departments whose employees are larger than the average.

4.5 Summary

  • WHERENested queries in clauses: used to dynamically filter rows of outer queries and compare based on the results of subqueries.
  • FROMNested queries in clauses: Create temporary tables that allow further operations using simplified datasets in outer queries.
  • SELECTNested queries in clauses: generate dynamic column values, and apply the results of the subquery directly to each row of the outer query.
  • HAVINGNested queries in clauses: used to filter after grouping based on aggregate results, and are an effective means to deal with complex grouping statistical scenarios.

5. Performance considerations for nested queries

5.1 Factors influencing performance of nested queries

Nested queries, especially complex nested queries, may have a significant impact on database performance. The following are the main factors that affect performance:

  • Repeated execution of query

    • For related subqueries, the subquery will be executed once for each record that handles an outer query. In this way, the number of executions of subqueries is proportional to the number of records of outer query, resulting in performance degradation, especially when processing large amounts of data.
    • Non-correlated subqueries are relatively faster because the subqueries are only executed once and the results are stored for use by the outer query. However, if the subqueries themselves are complex or the data volume is large, performance will also be affected.
  • Index utilization

    • In nested queries, if subqueries and outer queries do not use the index correctly, the database may require a large number of full table scans (Full Table Scan). This can lead to a large I/O burden, which in turn affects performance.
  • The complexity of the query

    • Nested queries often involve operations such as joining, grouping, sorting of multiple tables. Complex logic may cause the database query plan to become more complex and increase the processing time of query.

5.2 Methods for optimizing nested queries

  • Use JOIN to replace subquery

    • If possible, convert nested queries toJOINQuery.JOINQueries are usually easier to optimize, and the database can handle connection operations more efficiently. For example, a subquery can be rewritten asINNER JOINorLEFT JOIN, which usually significantly improves performance.
  • Index optimization

    • Create indexes on fields participating in nested queries, especially fields used for filtering conditions. For example, inWHEREIf the fields appearing in the clause have appropriate indexes, they can significantly improve the query speed.
  • Avoid related subqueries

    • Avoid using related subqueries as much as possible, because related subqueries will perform subqueries once for each record of the outer query, which is less efficient. You can try to rewrite the related subquery asNon-related subqueryOr connect to query.
  • Decompose complex queries

    • Decompose complex nested queries into multiple simple queries, using temporary tables or views to save intermediate results. This approach can sometimes simplify the processing logic of the database and improve performance.

6. Advantages and disadvantages of nested queries

Advantages

  • Powerful: Nested queries can handle complex business logic, especially when multiple independent calculations are required in the same query (such as calculating aggregate values, conditional filtering, etc.).
  • Concise code structure: In some cases, nested queries can avoid using intermediate results or multiple queries, and the code logic is more compact.
  • Dynamic filtering: Nested queries can calculate results dynamically, suitable for processing data operations that rely on other results.

Disadvantages

  • Performance issues: Nested queries, especially related subqueries, may cause performance problems, especially low efficiency when running on large data sets.
  • Poor readability: Multi-layer nested queries will make query logic complex, difficult to understand, and difficult to maintain and debug.
  • Database support differences: Different databases have different optimization degrees of nested queries. In some databases, nested queries perform poorly and may need to be converted to other query methods.

Summarize

This is the end of this article about nested queries in MySQL database. For more related contents of nested queries in MySQL, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!