Or is it because of the big data of a table that causes serious performance decline? Do we have to store multiple tables to solve the problem? Below we use an example to parse and optimize the data management performance of dedecms. Don’t let mysql be a scapegoat, it’s a great sin.
The test data is the data of the enterprise yellow pages that were accidentally obtained. The data volume is nearly 900,000, which are completely real data. The program used for the test is dedecms4.0 version. Why don’t you use dedecms5.1? That's because we have made many modifications to dedecms for optimization. If we use dedecms5.1, we are afraid of receiving court summons.... To add, the following optimization methods can be used in dedecms5.1. Please complete them yourself based on understanding its principles.
Before optimization, we found that there are three main frequent operations that affect management performance under the situation of large amount of dede data, namely document generation, list page generation and column listing of all articles. We have carried out optimization practices for these three aspects.
The following are the basic information about the test data:
The number of documents is close to 900,000
Each column contains nearly 30,000 data
1. Improve document generation speed
Question raised
The same results as our previous evaluation, the generation speed of dedecms's document is terrible. Using the default template (article_article.htm), 20 pages can be generated within an average of nearly 30 seconds (as shown in the figure). If you generate at this speed, all 900,000 data can be generated to web pages until your hair turns white. So what is the problem?
The speed of single column document generation before optimization
Problem analysis
First rule out the problem of table indexing, because the dede database has already indexed the main fields of the data main table (dede_archives). The problem of extraction efficiency of main content is eliminated, because during the page generation process, the article data in the page is read, and each time you need to select the data content with the unique id value in the main table and the attached table. The efficiency of this SQL statement is run directly in mysql, and the execution time is very short, so this is not the biggest bottleneck.
Finally, during the page generation process, we found that the program has executed several main table (dede_archives) queries and fetched out operations that meet a set of complex query conditions. The query efficiency is very low. It turns out that it is affecting efficiency! Through debugging and tracking, we have located the key to the problem, and the culprit is the arclist tag in the template. The Arclist tag is a tag that many people like to use because it is more flexible and can extract various types of article lists from the data, but the arclist tag will query the main table with a large number of search conditions. In fact, for generating a large number of articles at once, if the same template is used, arclist's query operation on the database is just a simple mechanical repetition, and it is definitely not worth it to spend a lot of time on this. Next we give suggestions for problem solving.
Solve the problem
Solution 1: Remove the arclist tags in the final page template, or use as little as possible. Although this method can greatly improve efficiency, it is tantamount to splashing water to throw away the child. It is not recommended for websites that attempt to increase access to PV.
Solution 2: Create an arclist cache, put the data generated by each arclist into a temporary directory or cache, and determine whether the cache is updated during the document generation process. If there is no update, use the cache data directly. This method does not require changing the template, and it also has certain effects on improving the generation efficiency. However, due to the large changes to the program, it is considered as appropriate.
Solution 3: It is also the solution suggested by the group, which is to fully explore the functions of existing dedecms and greatly improve efficiency while trying not to change the program. The specific method is to use the freelist function to create a list page of popular articles, latest articles, related articles, etc., and then use the include tag provided by dedecms to directly introduce the document page. The label format is: {dede:include file='list page file name' ismake=' no'/}. The advantage of this solution is that only some operation steps are added, no programs are changed, and the performance improvement is also very obvious. The following figure shows the generation speed after we optimized using this method. It took only 50 seconds to complete the generation of more than 1,500 pages of articles to achieve the target optimization effect. Due to the addition of operational steps, lazy people can use this plan with caution.
Optimized single column document generation speed
2. Improve the speed of list page generation
Question raised
Next, we continue to test the generation of the list page. This time we learned our lesson. We first deleted the arclist tag in the template (list_article.htm) and then tested it, but the generation effect was still very unsatisfactory. As shown in the figure below, the generation time of each list page is nearly 20 seconds (we have modified the page result output prompt to make it easier for everyone to see the generation time of each list page). Calculated based on 50 pieces of data per page, it theoretically takes more than 3 hours to generate 900,000 data for a single column... Speechless. Since the list content uses the list tag, which is a tag that is somewhat similar to arclist, we cannot continue the above approach to solve the problem, we can only find another way.
List page generation speed before optimization
Problem analysis
Since the target is locked in the list tag, the testing process is simple. We directly used the list query statement in dedecms for optimization analysis, and soon discovered the problem. We tested the SQL query statement in list. The following code is the SQL statement used by list to query the corresponding conditions in the database. The execution time is about 15 seconds, and the efficiency is not ideal.
Select ,,,, ,,,,, ,,,, ,,,,, ,,,, ,tp.namerule2,,, from dede_archives arc left join dede_arctype tp on = left join qiye_addonarticle on = qiye_addonarticle.aid where > -1 And ( ( ='1′ ) or arc.typeid2='1′) order by desc limit 0,50
We noticed that the where clause in this SQL statement uses multiple conditional judgments of and and or. Experience tells us that if an in or or statement is used in the query clause, it will lead to full table scanning, so that the index efficiency will not be reflected. We simplified the judgment conditions of the where clause for testing, and found that after deleting the or clause, the query efficiency was greatly improved. The above query statement took less than 1 second to obtain the query result. This is the key to the problem.