MySQL processing solution for processing duplicate data insertion

1. Introduction

In database operations, handling duplicate data insertion is a common requirement. Especially when inserting data in batches, you may encounter primary key conflicts or unique key conflicts. How to gracefully catch such exceptions and execute corresponding business logic is the key to improving code robustness.

This article will use an actual Python MySQL database operation as an example to analyze how to optimize exception handling logic so that when a Duplicate entry error occurs, specific business methods (such as updating record status) can be executed. At the same time, we will also compare similar processing methods in Java to help readers better understand the exception handling mechanism in different languages.

2. Problem background

2.1 Original code analysis

The function of the original code is to batch insert mobile phone number data into the MySQL database, and its core logic is as follows:

def insert_into_mysql(phone_numbers, prefix, province, city):
    try:
        connection = get_db_connection()
        cursor = ()
        
        data_to_insert = []
        for phone_number in phone_numbers:
            if len(phone_number) == 11:
                suffix = phone_number[-4:]
                data_to_insert.append((prefix, suffix, phone_number, province, city))
        
        (INSERT_QUERY, data_to_insert)
        ()
        return True
    
    except Exception as e:
        print(f"Failed to insert data: {e}")
        if connection:
            ()
        return False
    
    finally:
        if cursor:
            ()
        if connection:
            ()

2.2 Existing problems

Exception handling is not fine enough: only prints error messages and rolls back, without targeting specific errors (e.g.Duplicate entry) special treatment.
Low business logic coupling: When data is repeated, additional operations may be required (such as updating record status), but the original code does not provide such an extension point.

3. Optimization plan

3.1 Goal

Catch a specific exception: When the error message containsDuplicate entryExecute additional logic (such as invocationupdate_is_dealmethod).
Keep your code robust: Ensure transaction rollback and resource release are not affected.

3.2 Optimized Python code

def insert_into_mysql(phone_numbers, prefix, province, city, url=None):
    connection = None
    cursor = None
    try:
        connection = get_db_connection()
        if not connection:
            print("Database connection failed")
            return False

        cursor = ()
        data_to_insert = []
        
        for phone_number in phone_numbers:
            if len(phone_number) == 11:
                suffix = phone_number[-4:]
                data_to_insert.append((prefix, suffix, phone_number, province, city))

        if not data_to_insert:
            print("Warning: There is no valid mobile phone number to plug in")
            return False

        (INSERT_QUERY, data_to_insert)
        ()
        print(f"Successfully inserted {len(data_to_insert)} Data")
        return True

    except Exception as e:
        print(f"Failed to insert data: {e}")
        if connection:
            ()
        
        # Check if it is a unique key conflict        if "Duplicate entry" in str(e):
            if url:  # Make sure the url is valid                update_is_deal(url, province, city)  # Execute additional logic        
        return False
    
    finally:
        if cursor:
            ()
        if connection:
            ()

3.3 Key Optimization Points

Fine exception capture: check whether the exception information is included"Duplicate entry", determine whether it is a unique key conflict.
Support additional parameters: addedurlParameters, ensureupdate_is_dealThe method can be executed correctly.
Transaction security: Even if additional logic is executed, transaction rollback and resource release are guaranteed.

4. Java comparison implementation

In Java, MySQLDuplicate entryErrors usually correspond toSQLIntegrityConstraintViolationException, we can adopt similar optimization strategies.

4.1 Java version optimization code

import .*;
import ;

public class PhoneNumberDao {
    private static final String INSERT_QUERY = 
        "INSERT INTO phone_numbers (prefix, suffix, phone_number, province, city) " +
        "VALUES (?, ?, ?, ?, ?)";

    public boolean insertIntoMysql(List&lt;String&gt; phoneNumbers, String prefix, 
                                  String province, String city, String url) {
        Connection connection = null;
        PreparedStatement statement = null;
        
        try {
            connection = (); // Get the database connection            (false); // Start transactions            
            statement = (INSERT_QUERY);
            
            for (String phoneNumber : phoneNumbers) {
                if (() == 11) {
                    String suffix = (7); // The last 4 digits                    (1, prefix);
                    (2, suffix);
                    (3, phoneNumber);
                    (4, province);
                    (5, city);
                    (); // Add to batch processing                }
            }
            
            (); // Execute batch processing            (); // Submit transaction            return true;
            
        } catch (SQLIntegrityConstraintViolationException e) {
            //Catch unique key conflict exception            ("Insert data failed (unique key conflict): " + ());
            if (connection != null) {
                try {
                    (); // Roll back the transaction                } catch (SQLException ex) {
                    ();
                }
            }
            
            if (url != null) {
                updateIsDeal(url, province, city); // Execute extra logic            }
            return false;
            
        } catch (SQLException e) {
            ("Insert data failed: " + ());
            if (connection != null) {
                try {
                    ();
                } catch (SQLException ex) {
                    ();
                }
            }
            return false;
            
        } finally {
            // Close the resource            try {
                if (statement != null) ();
                if (connection != null) ();
            } catch (SQLException e) {
                ();
            }
        }
    }

    private void updateIsDeal(String url, String province, String city) {
        // Implement update logic        ("Duplicate data was detected, updated status: " + url);
    }
}

4.2 Java optimization points

Accurate captureSQLIntegrityConstraintViolationException, not generalSQLException。
Batch optimization: UseaddBatch()andexecuteBatch()Improve insertion efficiency.
Transaction management: explicit controlcommit()androllback(), ensure data consistency.

5. Summary and best practices

5.1 Key Summary

Optimization point	Python implementation	Java implementation
Exception capture	examine`str(e)`Whether it contains`"Duplicate entry"`	capture`SQLIntegrityConstraintViolationException`
Transaction Management	`()`	`()`
Resource release	`finally`Block Close Connection	`finally`Block Close Resource
Batch Optimization	`()`	`addBatch()` + `executeBatch()`

5.2 Best Practices

Fine exception handling: Don't just catchException, but different error types should be distinguished according to business needs.
Transaction security: Ensure that exceptions can be rolled back correctly when they occur and avoid dirty data.
Resource release: Usetry-finallyortry-with-resources(Java) Ensure that the database connection is closed.
Logging: Record enough information when handling exceptions to facilitate troubleshooting.

6. Expanding thinking

Should I query before inserting?
- If the data volume is large, querying first and then inserting may affect performance and directly captureDuplicate entryMore efficient.
How to optimizeupdate_is_deallogic?
- Asynchronous processing (such as message queues) can be introduced to avoid affecting the performance of the main process.
Is it possible to useINSERT IGNOREorON DUPLICATE KEY UPDATE？
- Depending on business requirements, these SQL syntaxes can be used if duplicate data is required to silently ignore.

7. Conclusion

Through the optimization case in this article, we learned how to finely process MySQL in Python and JavaDuplicate entryError and perform additional business logic. The key is:

Accurately capture exceptions
Ensure transaction security
Rationally optimize batch processing

This is the end of this article about the processing scheme of MySQL processing duplicate data insertion. For more related contents of MySQL duplicate data insertion, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!