Preface
In database design, the choice of primary key is crucial. It is not only the unique identifier of a row; it also directly affects query efficiency, storage, and even the scalability of the system architecture. Common primary key generation methods in MySQL include auto-increment IDs, UUIDs, and the Snowflake algorithm, each with its own applicable scenarios, advantages, and disadvantages. Below we analyze four common methods of primary key ID generation.
1. Auto-increment ID
What is an auto-increment ID?
Auto-increment ID is the built-in primary key generation method in MySQL. With the AUTO_INCREMENT keyword, every time a new row is inserted, the primary key value is automatically increased by 1.
Case: Order Table
CREATE TABLE orders ( id INT AUTO_INCREMENT PRIMARY KEY, user_id INT, amount DECIMAL(10,2) );
When inserting data, there is no need to specify the id field:
INSERT INTO orders (user_id, amount) VALUES (1001, 99.99);
Advantages
- Simple and efficient: IDs are generated automatically by the database, so development cost is low.
- Good query performance: The primary key increases sequentially, which reduces index fragmentation and keeps the B+ tree structure compact.
- Small storage space: Usually INT (4 bytes) or BIGINT (8 bytes).
Disadvantages
- Hard to shard: When databases and tables are split, each instance counts independently, so auto-increment IDs may repeat in a distributed system.
- Exposes business information: Continuously increasing IDs let outsiders infer business volume (such as the number of orders).
- Troublesome data migration: IDs easily conflict when merging tables from different databases.
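The duplication and merge-conflict drawbacks can be illustrated with a small simulation. This is only a sketch, not MySQL code: `Shard` is a hypothetical stand-in for a database instance that keeps its own auto-increment counter starting at 1, and `countCollisions` is an illustrative helper name.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for one database shard with its own auto-increment counter.
final class Shard {
    private long nextId = 1;

    // Mimics an INSERT that lets the database assign the next ID.
    long insert() {
        return nextId++;
    }
}

final class AutoIncrementCollisionDemo {
    // Counts IDs issued by both shards, i.e. conflicts that appear when the tables are merged.
    static int countCollisions(int rowsOnA, int rowsOnB) {
        Shard a = new Shard();
        Shard b = new Shard();
        Set<Long> idsFromA = new HashSet<>();
        for (int i = 0; i < rowsOnA; i++) {
            idsFromA.add(a.insert());
        }
        int collisions = 0;
        for (int i = 0; i < rowsOnB; i++) {
            if (idsFromA.contains(b.insert())) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Shard A issues 1..5, shard B issues 1..3, so three IDs clash.
        System.out.println(countCollisions(5, 3)); // 3
    }
}
```

Because both counters start from the same value, every row on the smaller shard collides with a row on the larger one.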
2. UUID
What is a UUID?
A UUID is a 128-bit identifier, usually written as a 36-character string (e.g. 550e8400-e29b-41d4-a716-446655440000), and is theoretically unique in the world.
Case: User table
CREATE TABLE users ( uuid CHAR(36) PRIMARY KEY, name VARCHAR(50) );
Generate UUID when inserting data:
INSERT INTO users (uuid, name) VALUES (UUID(), 'Zhang San');
Advantages
- Globally unique: There is no need to worry about ID conflicts in distributed systems.
- Harder to guess: Non-sequential IDs avoid exposing business volume.
Disadvantages
- Large storage space: A 36-character string takes up more space (even when stored as binary, it still requires 16 bytes).
- Poor performance: Unordered IDs cause frequent index page splits, which slows down inserts.
- Poor readability: Long strings are difficult to remember and are not intuitive when debugging.
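To make the storage comparison concrete, here is a minimal sketch that packs a UUID into its 16-byte binary form and back, using only `java.util.UUID` and `java.nio.ByteBuffer`. The class and method names are illustrative, and storing the result in a 16-byte binary column is an assumption that the table definition above does not make.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidBinaryDemo {
    // Packs the 128-bit UUID into 16 bytes, most significant bits first.
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Rebuilds the UUID from its 16-byte form.
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("550e8400-e29b-41d4-a716-446655440000");
        byte[] binary = toBytes(uuid);
        System.out.println(uuid.toString().length());       // 36
        System.out.println(binary.length);                  // 16
        System.out.println(fromBytes(binary).equals(uuid)); // true
    }
}
```

The binary form is less than half the size of the text form, but it is still twice as large as a BIGINT.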
3. Snowflake algorithm
What is the Snowflake algorithm?
Snowflake is a distributed ID generation algorithm open-sourced by Twitter. It generates a 64-bit long integer with the following structure: sign bit (1 bit) + timestamp (41 bits) + machine ID (10 bits) + sequence number (12 bits).
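The bit layout above can be sketched with plain shifts. This is an illustration of the structure only, not any library's implementation. `compose` is a hypothetical helper, the timestamp is assumed to be a millisecond offset from some chosen epoch, and callers are assumed to pass values that fit their fields (no range checks are done).

```java
final class SnowflakeLayout {
    static final int MACHINE_ID_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final int TIMESTAMP_SHIFT = MACHINE_ID_BITS + SEQUENCE_BITS; // 22

    // Packs the three fields into one long; the sign bit stays 0
    // as long as the timestamp offset fits in 41 bits.
    static long compose(long timestampOffsetMs, long machineId, long sequence) {
        return (timestampOffsetMs << TIMESTAMP_SHIFT)
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        // 1 << 22 = 4194304, 1 << 12 = 4096, plus sequence 1
        System.out.println(compose(1L, 1L, 1L)); // 4198401
    }
}
```

Since the timestamp occupies the highest bits, IDs generated later compare greater, which is where the increasing trend comes from.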
Case: Distributed log table
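// Java code example (using the Hutool tool library)
import cn.hutool.core.lang.Snowflake;
import cn.hutool.core.util.IdUtil;

Snowflake snowflake = IdUtil.getSnowflake(1, 1); // workerId = 1, datacenterId = 1
long id = snowflake.nextId(); // Generates an ID similar to 6726434627886811136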
Use the BIGINT type when creating the table:
CREATE TABLE logs ( id BIGINT PRIMARY KEY, content TEXT );
Advantages
- Distribution-friendly: IDs generated by different machines will not be repeated.
- Balanced performance and storage: The BIGINT type needs only 8 bytes, and IDs keep an increasing trend.
- Time-ordered: The creation time can be calculated directly from the ID.
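Recovering the creation time from an ID amounts to shifting away the low 22 bits (10 machine bits + 12 sequence bits) and adding the generator's epoch. The sketch below assumes a custom epoch of 2020-01-01T00:00:00Z purely for illustration. A real generator uses its own epoch, and decoding with the wrong one gives the wrong time.

```java
import java.time.Instant;

final class SnowflakeTimestampDemo {
    // Assumed custom epoch (2020-01-01T00:00:00Z); real generators pick their own.
    static final long EPOCH_MS = 1577836800000L;
    static final int TIMESTAMP_SHIFT = 22; // 10 machine bits + 12 sequence bits

    // Drops the machine ID and sequence bits, then converts the offset to an instant.
    static Instant creationTime(long id) {
        long offsetMs = id >>> TIMESTAMP_SHIFT;
        return Instant.ofEpochMilli(EPOCH_MS + offsetMs);
    }

    public static void main(String[] args) {
        // An ID created 1000 ms after the assumed epoch, on machine 1, sequence 7.
        long id = (1000L << TIMESTAMP_SHIFT) | (1L << 12) | 7L;
        System.out.println(creationTime(id)); // 2020-01-01T00:00:01Z
    }
}
```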
Disadvantages
- Depends on the server clock: If the clock moves backward (clock rollback), duplicate IDs may be generated, so special handling is required.
- Machine IDs must be allocated: They need to be planned in advance, because the 10-bit field supports at most 1024 instances.
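One common form of the "special handling" is to refuse to issue IDs while the clock is behind the last timestamp used. The following is a minimal sketch under stated assumptions, not Twitter's or Hutool's implementation: the class name, the 2020-01-01 epoch, and the injectable `LongSupplier` clock are all choices made for illustration. It also enforces the 1024-instance limit on the machine ID.

```java
import java.util.function.LongSupplier;

final class GuardedSnowflake {
    private static final long EPOCH_MS = 1577836800000L; // assumed custom epoch
    private static final long MAX_SEQUENCE = 4095;       // 12 bits
    private static final long MAX_MACHINE_ID = 1023;     // 10 bits

    private final long machineId;
    private final LongSupplier clock; // injectable so the guard can be exercised
    private long lastTimestamp = -1;
    private long sequence = 0;

    GuardedSnowflake(long machineId, LongSupplier clock) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("machineId must be in [0, 1023]");
        }
        this.machineId = machineId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            // Clock rollback: refuse rather than risk issuing a duplicate.
            throw new IllegalStateException(
                    "clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted in this millisecond: wait for the next one.
                while ((now = clock.getAsLong()) <= lastTimestamp) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH_MS) << 22) | (machineId << 12) | sequence;
    }

    public static void main(String[] args) {
        GuardedSnowflake generator = new GuardedSnowflake(1, System::currentTimeMillis);
        System.out.println(generator.nextId());
    }
}
```

Throwing is the simplest policy. Other designs wait until the clock catches up when the rollback is small, which trades availability for fewer errors.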
4. Business field combination
What is a business field combination?
Directly use business-related fields as primary keys (such as "User ID + Product ID").
Case: Student course selection
CREATE TABLE course_selection ( student_id INT, course_id INT, PRIMARY KEY (student_id, course_id) );
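The uniqueness rule a composite primary key enforces can be mimicked in application code. This is an in-memory sketch only, with hypothetical names, and it requires Java 16 or later for `record`. It shows that the pair is the identity: the same student may appear many times, but the same (student, course) pair may not.

```java
import java.util.HashSet;
import java.util.Set;

// Mirrors PRIMARY KEY (student_id, course_id): equality covers both fields.
record CourseSelectionKey(int studentId, int courseId) {}

final class CompositeKeyDemo {
    private final Set<CourseSelectionKey> rows = new HashSet<>();

    // Returns false when the pair already exists, similar to a duplicate-key rejection.
    boolean insert(int studentId, int courseId) {
        return rows.add(new CourseSelectionKey(studentId, courseId));
    }

    public static void main(String[] args) {
        CompositeKeyDemo table = new CompositeKeyDemo();
        System.out.println(table.insert(1, 101)); // true
        System.out.println(table.insert(1, 102)); // true: same student, different course
        System.out.println(table.insert(1, 101)); // false: duplicate pair
    }
}
```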
Advantages
- Intuitive and clear: The primary key directly reflects the business meaning.
- Avoid redundant fields: No additional primary key columns are required.
Disadvantages
- High maintenance cost: The primary key may need to be modified when business rules change.
- Large storage space: A composite primary key may take up more space.
- High query complexity: Queries must filter on multiple columns, which may affect performance.
5. How to choose the primary key generation method?
Scenario | Recommended method | Reason |
---|---|---|
Standalone MySQL | Auto-increment ID | Simple and efficient, suitable for small systems |
Distributed systems | Snowflake algorithm | Balances performance and global uniqueness, suitable for sharded databases and tables |
High security requirements | UUID | Avoids ID guessing, suitable for anonymous data |
Strong business association | Business field combination | For example, an order number that contains the date, user ID and other business information |
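As an illustration of the last row, an order number that embeds business information might be assembled as below. The layout (date as yyyyMMdd, then an 8-digit user ID, then a 4-digit daily sequence) is entirely hypothetical and is not prescribed by this article. Real systems choose field widths to match their own data.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class OrderNumberDemo {
    // Hypothetical layout: yyyyMMdd + 8-digit user ID + 4-digit daily sequence.
    static String orderNumber(LocalDate date, long userId, int dailySequence) {
        return date.format(DateTimeFormatter.BASIC_ISO_DATE)
                + String.format("%08d%04d", userId, dailySequence);
    }

    public static void main(String[] args) {
        System.out.println(orderNumber(LocalDate.of(2024, 5, 17), 1001, 42));
        // 20240517000010010042
    }
}
```

Such a number is readable at a glance, but it also shows the maintenance cost noted earlier: if the user ID outgrows eight digits, the format has to change.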
Conclusion
When selecting a primary key generation method, system size, scalability, performance and business requirements must all be considered together.
- Auto-increment ID is suitable for traditional stand-alone applications, but it is difficult to use in distributed scenarios.
- UUID solves the global uniqueness problem, but sacrifices performance and storage.
- The Snowflake algorithm performs well in distributed systems, but the clock rollback problem must be handled.
- Business field combinations are simple and effective in specific scenarios, but are less flexible.
In actual development, these methods are often combined. For example, in an e-commerce system, the order ID uses the Snowflake algorithm to ensure distributed uniqueness, while the order list uses auto-increment ID to improve insertion efficiency. Only by choosing flexibly according to business characteristics can you find the optimal solution.
The above are the four methods of MySQL primary key generation, explained in detail.