To delete duplicate rows in MySQL and keep only one row from each group, you can use one of the following methods. Copy the statements and adjust the table names and conditions to fit your case.
Method 1: Use LEFT JOIN + subquery to delete duplicate data (recommended)
Note: on a table of about 5 million rows, this statement takes roughly 15 to 30 seconds to run.
LEFT JOIN is the recommended way to remove existing duplicates before adding a composite unique index, and it remains usable on large data volumes.
    -- Delete the historical duplicates first, before adding the composite unique index.
    -- The subquery keeps the smallest id of each group; rows without a match in u2 are duplicates.
    DELETE u1
    FROM uf_cs_record_batch_detail u1
    LEFT JOIN (
        SELECT MIN(id) AS min_id
        FROM uf_cs_record_batch_detail
        GROUP BY cs_contact_name, cs_safe_remark
    ) u2 ON u1.id = u2.min_id
    WHERE u2.min_id IS NULL;
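Before running a DELETE like the one above, it can help to preview how many rows each duplicated group would lose. The following is a minimal sketch on a hypothetical demo table (`dedupe_demo` and its sample rows are assumptions for illustration; only the two column names come from the article):

```sql
-- Hypothetical demo table; the column names mirror the article's example.
CREATE TABLE dedupe_demo (
    id INT PRIMARY KEY,
    cs_contact_name VARCHAR(64),
    cs_safe_remark VARCHAR(64)
);

INSERT INTO dedupe_demo (id, cs_contact_name, cs_safe_remark) VALUES
    (1, 'alice', 'ok'),
    (2, 'alice', 'ok'),
    (3, 'bob',   'ok'),
    (4, 'alice', 'ok');

-- List each duplicated combination, the id that would be kept,
-- and how many rows the DELETE would remove for that combination.
SELECT cs_contact_name,
       cs_safe_remark,
       MIN(id)      AS kept_id,
       COUNT(*) - 1 AS rows_to_delete
FROM dedupe_demo
GROUP BY cs_contact_name, cs_safe_remark
HAVING COUNT(*) > 1;
-- For the sample rows: alice / ok, kept_id = 1, rows_to_delete = 2
```

If the query returns no rows, the table has no duplicates on those columns and the DELETE would remove nothing.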
Once the duplicates are gone, add the composite unique index so that new duplicates cannot be inserted (very practical):
    -- Add a composite unique index
    ALTER TABLE uf_cs_record_batch_detail
        ADD UNIQUE KEY idx_uni_contact_safe_stat (cs_contact_name, cs_safe_remark);
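To illustrate what the unique index changes for later writes, here is a small sketch on a hypothetical table (`unique_demo` and its index name are assumptions, not part of the original schema). It shows that a repeated combination is rejected, and that `INSERT IGNORE` skips the conflicting row instead of raising an error:

```sql
-- Hypothetical demo table with the same kind of composite unique key.
CREATE TABLE unique_demo (
    id INT AUTO_INCREMENT PRIMARY KEY,
    cs_contact_name VARCHAR(64) NOT NULL,
    cs_safe_remark VARCHAR(64) NOT NULL,
    UNIQUE KEY idx_uni_demo (cs_contact_name, cs_safe_remark)
);

INSERT INTO unique_demo (cs_contact_name, cs_safe_remark) VALUES ('alice', 'ok');

-- A plain INSERT of the same pair would now fail with a duplicate-key error (1062).
-- INSERT IGNORE turns that error into a warning and skips the row.
INSERT IGNORE INTO unique_demo (cs_contact_name, cs_safe_remark) VALUES ('alice', 'ok');

SELECT COUNT(*) AS row_count FROM unique_demo;
-- row_count = 1
```

Two points worth keeping in mind: the `ALTER TABLE ... ADD UNIQUE KEY` statement itself fails if duplicates are still present, and a MySQL unique index allows multiple rows whose indexed columns contain NULL, so nullable columns are not fully protected by it.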
Method 2: Create a temporary table (executed in several steps; the logic is clear, but ID values may change)
This method assumes that you have a table your_table and want to decide which rows are duplicates based on certain columns.
For example, to delete the records that repeat the same column1 and column2 values while keeping only one of them, follow these steps:
- Use a CREATE TABLE statement to create a temporary table that stores the unique records.
- Use an INSERT INTO ... SELECT statement to insert the unique records into the temporary table.
- Delete all records in the original table.
- Use an INSERT INTO ... SELECT statement to insert the records from the temporary table back into the original table.
- Drop the temporary table.
Here is a complete SQL example:
    -- Create an empty temporary table with the same structure as the source table
    CREATE TABLE temp_table LIKE your_table;

    -- Copy one row per (column1, column2) group, the one with the smallest id, into the temporary table
    INSERT INTO temp_table
    SELECT *
    FROM your_table t1
    WHERE t1.id = (
        SELECT MIN(t2.id)
        FROM your_table t2
        WHERE t1.column1 = t2.column1
          AND t1.column2 = t2.column2
    );

    -- Then delete the data in the source table
    DELETE FROM your_table;

    -- Write the de-duplicated rows from the temporary table back into the source table
    INSERT INTO your_table SELECT * FROM temp_table;

    -- Finally, drop the temporary table
    DROP TABLE temp_table;
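In this way, you have deleted the duplicate records in the original table, leaving only one record per group.
Note: this method empties and refills the original table, so the original IDs change whenever the rows are re-inserted without their id values (for example, if the id column is left out so that AUTO_INCREMENT assigns new ones). Choose this method according to the situation.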
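Because this method empties the original table, it is worth checking the temporary table before the delete step. The sketch below uses hypothetical tables (`demo_table`, `demo_temp`, and the sample rows are assumptions) and compares the number of copied rows with the number of distinct (column1, column2) groups. It also shows a case where they differ: a row with NULL in a comparison column is never matched by `=`, so it is not copied.

```sql
-- Hypothetical demo tables; names and sample rows are illustrative only.
CREATE TABLE demo_table (
    id INT PRIMARY KEY,
    column1 VARCHAR(32),
    column2 VARCHAR(32)
);

INSERT INTO demo_table (id, column1, column2) VALUES
    (1, 'a', 'x'),
    (2, 'a', 'x'),
    (3, 'b', 'y'),
    (4, NULL, 'z');   -- NULL in a comparison column

CREATE TABLE demo_temp LIKE demo_table;

-- Same copy step as in the example: keep the smallest id per (column1, column2).
INSERT INTO demo_temp
SELECT *
FROM demo_table t1
WHERE t1.id = (
    SELECT MIN(t2.id)
    FROM demo_table t2
    WHERE t1.column1 = t2.column1
      AND t1.column2 = t2.column2
);

-- Compare the copied row count with the number of distinct groups.
SELECT
    (SELECT COUNT(*) FROM demo_temp) AS temp_rows,
    (SELECT COUNT(*)
     FROM (SELECT column1, column2
           FROM demo_table
           GROUP BY column1, column2) AS g) AS distinct_groups;
-- For the sample rows: temp_rows = 2, distinct_groups = 3 (row 4 was not copied)
```

If the two numbers differ, do not empty the source table yet. One possible fix, assuming NULLs should count as equal, is to compare with MySQL's null-safe operator `<=>` instead of `=` in the copy step.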
Method 3: Use a self-join (pay attention to performance)
To avoid changing the original IDs, we can take a different approach: join the table to itself to identify the duplicate rows and delete the ones that are not needed. This method deletes duplicate records while retaining the original IDs.
Suppose your table structure is as follows:
- Table name: your_table
- Columns: id (primary key), column1, column2, and other columns.
You can use the following SQL to delete duplicate records, keeping only one (usually the one with the smallest ID):
    -- Step 1: Delete the duplicate records, keeping the row with the smallest id in each group
    DELETE t1
    FROM your_table t1
    INNER JOIN your_table t2
    WHERE t1.id > t2.id
      AND t1.column1 = t2.column1
      AND t1.column2 = t2.column2;

    -- Step 2: Confirm that the deletion succeeded by viewing the remaining data
    SELECT * FROM your_table;
Explanation:
- Delete the duplicate records: the self-join (INNER JOIN) pairs each row with the other rows that have the same column1 and column2, and the condition t1.id > t2.id ensures that only rows with a larger id are deleted, so the row with the smallest id in each group is kept.
- Confirm that the deletion succeeded: view the remaining data with the SELECT statement to check that the deletion is correct.
The advantages of this method are that the IDs of the original data are not changed, the record with the smallest ID in each set of duplicates is kept, and the statement is simple. The self-join can be slow on large tables, however.
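The self-join condition can be tried as a SELECT first, so the rows that would be deleted are visible before anything is removed. This is a minimal sketch on a hypothetical table (`join_demo` and its rows are assumptions for illustration):

```sql
-- Hypothetical demo table with the same column layout as your_table.
CREATE TABLE join_demo (
    id INT PRIMARY KEY,
    column1 VARCHAR(32),
    column2 VARCHAR(32)
);

INSERT INTO join_demo (id, column1, column2) VALUES
    (1, 'a', 'x'),
    (2, 'a', 'x'),
    (3, 'b', 'y'),
    (4, 'a', 'x');

-- Preview the rows the self-join DELETE would remove.
-- DISTINCT is needed because row 4 pairs with both row 1 and row 2.
SELECT DISTINCT t1.id, t1.column1, t1.column2
FROM join_demo t1
INNER JOIN join_demo t2
    ON t1.column1 = t2.column1
   AND t1.column2 = t2.column2
   AND t1.id > t2.id
ORDER BY t1.id;
-- For the sample rows: ids 2 and 4
```

On large tables, an ordinary index on (column1, column2) will usually help this join, since without one each row may be compared against many others.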
A brief summary
- LEFT JOIN delete (recommended): suitable for large data volumes, with acceptable performance.
- Temporary table: suitable when the table needs to be rebuilt and the data volume is moderate, but it takes more steps.
- Self-join: keeps the row with the smallest ID in each group and deletes duplicates without changing IDs.