The reason why you mention the word "theoretical" here is because if your clustered index is still blindly built on the primary key of ID, your query speed is not that high, even if you create an index on the field "date" (non-aggregated index). Let’s take a look at the speed performance of various queries under the 10 million data volume (250,000 data within 3 months):
(1) Only create clustered indexes on the primary key, and do not divide the time period:
Select gid,fariqi,neibuyonghu,title from tgongwen
Time taken: 128470 milliseconds (ie: 128 seconds)
(2) Create a clustered index on the primary key and create a non-clustered index on fariq:
select gid,fariqi,neibuyonghu,title from Tgongwen
where fariqi> dateadd(day,-90,getdate())
where fariqi> dateadd(day,-90,getdate())
Time taken: 53763 milliseconds (54 seconds)
(3) Create the aggregate index on the date column (fariqi):
select gid,fariqi,neibuyonghu,title from Tgongwen
where fariqi> dateadd(day,-90,getdate())
where fariqi> dateadd(day,-90,getdate())
Time taken: 2423 milliseconds (2 seconds)
Although each statement extracts 250,000 pieces of data, the differences in various situations are huge, especially when building clustered indexes on date columns. In fact, if your database really has 10 million capacity, build the primary key on the ID column, just like in the above 1 and 2 cases, the performance on the web page is a timeout and cannot be displayed at all. This is also one of the most important factors in my abandoning ID columns as clustered indexes. The method to obtain the above speed is: add:
declare @d datetime
set @d=getdate()
set @d=getdate()
And add:
select [Statement execution takes time (ms)]=datediff(ms,@d,getdate())
2. Just create an index to significantly improve query speed
In fact, we can find that in the above example, the statements 2 and 3 are exactly the same, and the fields that create the index are the same; the only difference is that the former establishes a non-aggregated index on the fariqi field, and the latter establishes a aggregate index on this field, but the query speed is huge. Therefore, simply indexing on any field can improve query speed.
From the statement of creating a table, we can see that there are 5003 different records in the fariqi field in this table with 10 million data. It is perfect to create an aggregate index on this field. In reality, we will post several files every day, and the dates of these files are the same, which fully meets the rule of establishing a clustered index: "The vast majority cannot be the same, and there cannot be only a very small number of the same". From this point of view, it is very important for us to establish "appropriate" aggregated indexes to improve query speed.
3. Add all fields that need to increase the query speed to the clustered index to improve the query speed
As mentioned above: What is inseparable from fields when conducting data queries is the "date" and the user's own "user name". Since both fields are so important, we can merge them and create a compound index.
Many people think that as long as any field is added to the clustered index, the query speed can be improved, while others are confused: if the composite clustered index fields are queryed separately, will the query speed be slowed down? With this question, let's take a look at the following query speed (the result set is 250,000 pieces of data): (The date column fariqi is first ranked in the starting column of the composite clustered index, and the user name neibuyonghu is ranked in the bottom column):
(1)select gid,fariqi,neibuyonghu,title from Tgongwen where fariqi>''2004-5-5''
Query speed: 2513 milliseconds
(2)select gid,fariqi,neibuyonghu,title from Tgongwen
where fariqi>''2004-5-5'' and neibuyonghu=''Office''
where fariqi>''2004-5-5'' and neibuyonghu=''Office''
Query speed: 2516 milliseconds
(3) select gid,fariqi,neibuyonghu,title from Tgongwen whereneibuyonghu=''Office''
Query speed: 60280 milliseconds
From the above experiments, we can see that if only the start column of the clustered index is used as the query condition and the query speed of all columns using the composite clustered index is almost the same, and it is even slightly faster than using all the composite index columns (in the case of the same number of query result sets); and if only the non-starting column of the composite clustered index is used as the query condition, this index will not play any role. Of course, the query speed of statements 1 and 2 is the same because the number of entries in the query is the same. If all columns of the composite index are used and there are few query results, a "index coverage" will be formed, so the performance can be optimal. Also, remember: Whether you use other columns of aggregated indexes frequently or not, its leading columns must be the ones that are most frequently used.
Previous page1234Next pageRead the full text