SoFunction
Updated on 2025-03-04

Summary of the method of deduplication processing in MySQL

In MySQL, deduplication usually refers to removing duplicate rows from the query result. This can be achieved in a number of ways, depending on your needs. Here are some common methods of deduplication:

1. DISTINCT keywords:

If you just want to simply deduplicate and don't care about grouping, then using DISTINCT might be more appropriate:
useDISTINCTKeywords can return uniquely different values.

SELECT DISTINCT name  FROM employees;

This query returns all non-duplicate employee names.

SELECT DISTINCT column_name FROM table_name;

2. GROUP BY clause:

useGROUP BYThe clause can group the result sets, and each group only returns one record.

SELECT column_name, COUNT(*)
FROM table_name
GROUP BY column_name;
SELECT department_id, name
FROM employees
GROUP BY department_id, name;

This query returns the unique employee name for each department, but please note that if there are multiple people in the department with the same name, they will be listed.

4. Combining GROUP BY and DISTINCT

If you want to group a field first and then deduplicate it within the group, you can do this:

SELECT department_id, DISTINCT name
FROM employees
GROUP BY department_id;

However, this query is invalid in most SQL databases, because DISTINCT should act on the entire SELECT list, not a separate column.

3. Aggregation function:

Use aggregate functions (such asMIN()MAX()SUM()AVG()etc.) The deduplication effect can also be achieved because the aggregate function returns a value for each group.

SELECT MIN(column_name) AS unique_column
FROM table_name
GROUP BY another_column;

4. Subquery:

Use a subquery to create a temporary table with unique records.

SELECT *
FROM (
    SELECT DISTINCT column_name
    FROM table_name
) AS subquery;

If you need to deduplicate after grouping, you can use subqueries:

SELECT department_id, MIN(name) AS name
FROM employees
GROUP BY department_id;

This query returns the employee name of each department, but each department only returns one name (the MIN() function is used here, and in fact, MIN() and MAX() work the same for this purpose, because the name is text, not a numeric value).

5. Temporary table:

Create a temporary table to store the results after deduplication.

CREATE TEMPORARY TABLE temp_table AS
SELECT DISTINCT column_name
FROM table_name;

SELECT * FROM temp_table;
DROP TEMPORARY TABLE temp_table;

6. Window Functions (MySQL 8.0+):

Use window functionsROW_NUMBER()RANK()DENSE_RANK()etc. can assign a unique line number to each group.

SELECT column_name
FROM (
    SELECT column_name, ROW_NUMBER() OVER (PARTITION BY column_name ORDER BY another_column) AS rn
    FROM table_name
) AS subquery
WHERE rn = 1;

In databases that support window functions, you can use ROW_NUMBER() or RANK() to implement more complex deduplication logic:

SELECT department_id, name
FROM (
    SELECT department_id, name,
           ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY name) AS rn
    FROM employees
) t
WHERE  = 1;

This query sorts employees in each department by name and assigns a line number to each name in each department. The outer layer query then selects the first name of each department to achieve deduplication.

7. Use GROUP_CONCAT():

If you want to merge duplicate lines into a string, you can useGROUP_CONCAT()

SELECT column_name, GROUP_CONCAT(another_column SEPARATOR ', ') AS concatenated_values
FROM table_name
GROUP BY column_name;

8. Use DISTINCT ON (MySQL does not support it, but it is suitable for other databases such as PostgreSQL):

No in MySQLDISTINCT ONSyntax, but this is a way to deduplicate in other databases.

Which method to choose depends on your specific needs, such as which columns you want to keep, whether you need to consider sorting, etc. In practical applications, a combination of methods may be required to achieve the desired effect.

This is the article about the summary of the method of deduplication in MySQL. For more related content on MySQL, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!