Practical records of using SQL to quickly delete millions of rows of data

describe

Delete large batches of data on tables, which is a relatively few event. However, this will also occur in actual business development or data testing. For example, you can delete millions of data records from the log table regularly; there are many ways to delete table data, and it is very simple to operate. But there is a problem here, deleting a large number of rows can be slow. And it may take longer because another session has locked the data you want to delete.

According to what we are familiar with, there are three ways to delete data using SQL:
1: DELETE, you can add where conditions, the speed is slow, and the table is locked
2: truncate will delete all data in the table, and the speed is fast
3: drop, delete data and table structure, use with caution

practice

[1] The truncate and drop are not within the scope of this discussion. Although these two methods are very fast, they are too destructive. Note that all deletion operations in daily development must be added conditions.

[2] It is not recommended to use DELETE FROM TABLE WHERE for deleting data of more than hundreds of thousands of yuan. This operation is very time-consuming and has poor efficiency.

[3] The implementation of the deletion requirement for large-scale data can be processed through Create-Table-as-Select method, inserting rows into tables is faster than deleting them. Use create-table-as-select (CTAS) to load data into new tables faster.

create table table_name_temp
  select * from source_table where XX=?

Through CTAS, the undeleted data is retained in a temporary table, and then the temporary table is used as the original table through SWAP. Large-scale data deletion is completed in this way

[4] I do not recommend the above method to build tables. The above method to build tables will not copy the index structure of the original table. If this is a large table, it is also a problem to add an index separately afterwards. It is recommended to use CREATE TABLE XXX (LIKE XXX); to create tables, which will copy the relevant index structure data.

【5】Specific operation steps

-- Copy the table structure 
CREATE  TABLE  tableB   (LIKE tableA);
-- Insert filter data
INSERT into  tableB    SELECT *  from tableA  where XXX = ?;
-- Rename，replace
rename  table  tableA to tableC;
rename  table  tableB to tableA;  
-- Delete old table  
DROP TABLE tableC;

Note: The two renames can be dropped first and then rename once, but considering the data security, after all, it is a large amount of data deletion, or one more step to operate, check it yourself after replacing, and then delete the old table. It is safer

[6] It takes time to delete millions of data through delete and it takes not to be clear. Anyway, I have waited for more than ten minutes but no results have been achieved. Select * from WHERE conn_id!=connection_id();
The query is being executed. The above method has a 5 million data of less than 1 minute, which is relatively fast.

[7] Tips: If your large table has incremental ID, and the deleted or retained data can be divided by ID, then the select condition can be optimized here, and the operation efficiency will be faster.

【8】If it is an oracle, you can also use alter table … move to change the tablespace of the stored rows

alter table tableName   move including rows where XXX=？

This is the end of this article about how to use SQL to quickly delete millions of rows of data. For more related SQL to delete millions of rows of data, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!