summary
We often use useGenerateKeys to return the autoincrement primary key to avoid one more query. You will also often use on duplicate key update to perform insertOrUpdate to avoid querying in insert/update first. It's very good to use, but it's often stumbles, and for some reason. This article is an in-depth analysis of the principle of obtaining the self-increment primary key.
question
First, we will pick two pieces of bugs in our company's old code
Bulk insertion of user collections
for (tries = 0; tries < MAX_RETRY; tries++) { final int result = (collections); if (result == ()) { break; } } if (tries == MAX_RETRY) { throw new RuntimeSqlException("Insert collections error"); } // Rely on the collectionid generated by the databasereturn collections;
method
<insert parameterType="list" useGeneratedKeys="true" keyProperty="collectionId"> INSERT INTO collection( userid, item ) VALUES <foreach collection="list" item="collection" separator=","> (#{}, #{}) </foreach> ON DUPLICATE KEY UPDATE status = 0 </insert>
I wonder if you can find out the problem
analyze
There are two problems
Error in judgment of return value result
useon duplicate key
The number of rows affected by batch update is different from the number of inserted ones. This kind of mistake is mainly to take it for granted and not read the documents
Take a lookOfficial website documentation
Written very clearly
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to the mysql_real_connect() C API function when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.
There are three return values
0: No update 1: insert 2. update
There is another special case, update a value of the same value to the original value. According to the client configuration, it may be 0 and may be 1.
So this judgment is obviously wrong
Use the userGeneratedKey of batch InsertOrUpdate to return the auto-increment key
This problem will be found when there is an update statement when inserting batches. The returned autoincrement primary key is all wrong. Why is this?
1. First, let’s take a look at mybatis’ description of useGeneratedKey
>This tells MyBatis to use the JDBC getGeneratedKeys method to retrieve keys generated internally by the database (. auto increment fields in RDBMS like MySQL or SQL Server). Default: false.
It is obtained using JDBC's getGeneratedKeys method.
2. Let's look for it againJDBC specifications
Before version 3.0 of the JDBC API, there was no standard way of retrieving key values from databases that supported auto increment or identity columns. With older JDBC drivers for MySQL, you could always use a MySQL-specific method on the Statement interface, or issue the query SELECT LAST_INSERT_ID() after issuing an INSERT to a table that had an AUTO_INCREMENT key. Using the MySQL-specific method call isn't portable, and issuing a SELECT to get the AUTO_INCREMENT key's value requires another round-trip to the database, which isn't as efficient as possible. The following code snippets demonstrate the three different ways to retrieve AUTO_INCREMENT values. First, we demonstrate the use of the new JDBC 3.0 method getGeneratedKeys() which is now the preferred method to use if you need to retrieve AUTO_INCREMENT keys and have access to JDBC 3.0. The second example shows how you can retrieve the same value using a standard SELECT LAST_INSERT_ID() query. The final example shows how updatable result sets can retrieve the AUTO_INCREMENT value when using the insertRow() method.
It means that before JDBC3.0, there were some chaotic definitions and no unified definitions, and then it was unified into the getGeneratedKeys() method. Both sides are consistent. The main principle of implementation is to return a LAST_INSERT_ID on the database side. This is strongly related to auto_increment_id.
Let's take a lookDefinition of auto_increment_id. Focus on batch insertion
For a multiple-row insert, LAST_INSERT_ID() and mysql_insert_id() actually return the AUTO_INCREMENT key from the first of the inserted rows. This enables multiple-row inserts to be reproduced correctly on other servers in a replication setup.
When batch insertion is inserted, only one id will be returned, and this id value is the AUTO_INCREMENT value of the first inserted row. As for why this is done, the reason why mysql-server can also ensure the unification of id values under the master-slave architecture can be read in this article. This article will not be expanded.
Then mysql server only returns one id. How can the client obtain all ids when inserting batches?
3. Client implementation
Let's take a look at the implementation of the client getGeneratedKeys.
JDBC
public synchronized ResultSet getGeneratedKeys() throws SQLException { if (!) { throw ((""), "S1009", ()); } else if ( == null) { // The logic of going here in batches return ? (1) : (); } else { Field[] fields = new Field[]{new Field("", "GENERATED_KEY", -5, 17)}; fields[0].setConnection(); return (, fields, new RowDataStatic(), , this, false); } }
Look at the method called ()
protected ResultSet getGeneratedKeysInternal() throws SQLException { // Get the number of rows affected int numKeys = (); return (numKeys); }
There is an important knowledge point here. First, get the number of rows affected by this batch insertion, and then perform the specific obtaining id operation.
getGeneratedKeysInternal method
protected synchronized ResultSet getGeneratedKeysInternal(int numKeys) throws SQLException { Field[] fields = new Field[]{new Field("", "GENERATED_KEY", -5, 17)}; fields[0].setConnection(); fields[0].setUseOldNameMetadata(true); ArrayList rowSet = new ArrayList(); long beginAt = (); // Increasing step length according to affected range + increment for(int i = 0; i < numKeys; ++i) { if (beginAt > 0L) { // The value is stuffed in row[0] = ((beginAt)); } beginAt += (long)(); } }
The number of rows affected by iterating, and then get the id in turn.
So batch insert is correct to return.
However, there is a problem with batch insertOrUpdate. The number of rows affected by batch insertOrUpdate is not the number of inserted data rows, but may be 0, 1, and 2, which leads to the problem of self-incremental id.
For example, insert 3 pieces of data, 2 pieces will update, 1 piece will insert, at this time the updateCount is 5, generateid will 5, mybatis then get the first 3 pieces and stuff them into the data, which is obviously wrong.
The above is a principle analysis. If you want to know more detailed experimental results, you can check the experiment.
Summarize
batch insert
<insert useGeneratedKeys="true" keyProperty="id"> insert into Author (username, password, email, bio) values <foreach item="item" collection="list" separator=","> (#{}, #{}, #{}, #{}) </foreach> </insert>
From the example from the official website, the @Param parameter cannot be specified in the mapper, otherwise there will be problems.
batch insertOrUpdate
You cannot rely on useGeneratedKey to return the primary key.
OK, the above is the entire content of this article. I hope that the content of this article has certain reference value for your study or work. Thank you for your support.