SoFunction
Updated on 2025-04-05

Website Optimization Series Database Optimization Partition Tips

If you have foresight, you will prefix the table name and stored procedure name, for example, the forum table is named BBS_xxx and the blog table is named BLOG_xxx; this will be easier when processing the table. Speaking of this, you may think of what to do with foreign key constraints. My blog table and forum post table all use the primary key of the User table as foreign keys. This is easy to deal with. We need to make a quick decision to delete foreign keys. This quick decision may cause some trouble. Let's analyze what problems we may encounter:

1. Divide into multiple libraries, without foreign keys, do the previous inner join operations cross libraries?

Suppose scenario: the blog table has a foreign key reference to the user table. We need to display the blog list on the homepage. The blog list needs to display the user name and user ID information.

In the past, when the user table and the blog table are in a library, we can obtain the user's associated information through the foreign key inner join. Now the user database and the blog database have been split into two libraries. I want to say no to cross-store inner join; why? Because this is not suitable for expansion. If one day our business volume increases again, we need to move the user database to another machine, which will cause the inner join to cross-server. This is obviously not a good idea. What should we do? I have two solutions, everyone judges whether it is good or bad:

1) Do designs that violate the paradigm, and store the user's unchanged information username and userid in the blog table, so that the username is redundant. This can ensure that the username is very efficient when obtaining the blog data and the username.

2) We no longer take username information from the database, but instead take it from the cache. We can form a recently active user data pool in the cache, and when we need a username, we will get it from this cache area.

Currently, I use the first solution in my application, and the second one is more scalable. The first one can only store usernames with redundant data. Sometimes it is enough to store usernames with just a username, and sometimes there may be insufficient problems.

2. If cascading deletion based on foreign keys is used, it is a nightmare

To deal with this problem, my solution is to modify the program. If cascading deletion is required, it is done in the program logic. Do not do cascading deletion in the database. Cascading deletion is a logic implicit in the database and a bad design solution.

3. Triggers may also cause the same trouble as foreign keys for cascading deletion, and they also modify program logic to replace this database-level implicit logic.

Maybe you would say that the performance improvement will definitely be brought about after the database is divided? This problem needs to be analyzed in detail. It depends on the performance of your server. If the CPU, io and memory pressure of the database are still very high after the database is divided; then you can migrate one of the libraries after the database is divided to another server, so that the two servers can share the pressure of data access, which will definitely improve performance.

Finally, the division of the database and whether it is divided is determined by the data volume and performance requirements. Stay tuned for the next section!