Analysis of problem of writing query of TDengine in time series database

Write problem

A table name must be given for each tag combination

Price paid:

Users must ensure that the table name assigned to each Tag combination is unique, and once the number of tag combinations is too large, it will be difficult for users to remember the corresponding table name of each Tag combination. When querying, they basically rely on the super table STable to query. So for users, this table name is almost useless, but it allows users to name it at a cost.

The ultimate goal of this design is to put together the data of the same tag combination, but when designing the system, you can record a unique id or unique string for this tag combination as the internal hidden table name to replace the operation of letting the user name the table by itself. You only need to present a super table STable to reduce the user's burden.

In fact, we can see that the above is actually transferring the only burden of internal judgments to the user, which has troubled the user. If the system automatically determines whether the tag combination is unique, it is necessary to determine whether the current tag combination exists and find the corresponding underlying unique id or unique string during the data writing process. Letting the user name a table is saved by the above cost, because the table name given by the user is a unique string, so the writing performance is naturally better.

Tag support and management

Support up to 6 tags. If you want to support more, you need to re-source code and compile it.
The index of the super table STable to the Tag combination is full memory and will eventually encounter a bottleneck. InfluxDB has already walked this path, from the previous full memory to the later tsi
The index of the tag combination of the super table STable is just to build a skiplist for the first tag as a key. That is to say, when your query uses the first tag, you can use the above index. When your query does not use the first tag, it is a brute-force scan. Therefore, the filtering and query ability is still very limited when there are too many tag combinations. Other timing databases, InfluxDB and Druid, have built inverted indexes for Tag combinations during the writing process to deal with filtering in any dimensions. The writing performance will naturally be worse than that of TDengine.
Expiration of Tag combinations that are no longer used is also a problem at present.

Unsupported writing

Each table records the maximum time for the current write time of the table. Once the subsequent write time is less than that time, write is not allowed. If you accidentally write data from 2021-07-24 00:00:00 to a certain table, then the data before that time cannot be written.

The benefits of doing this simplify the writing process, which is always an append operation. For example, if an array is used to store memory data, the data in the array is sorted by time. If the time of the subsequent data is not incremented, then the data needs to be inserted into a certain position in the middle of the array, and all the data after that position needs to be moved back. If the time of the subsequent data is incremented, then just put it at the end of the array, so it does not support out-of-order writing, which is to simplify the writing process and improve the writing performance at the expense of user use.

One of the troubles that does not support out-of-order writing is: compact common in LSM. If writing out of order is allowed, there will be overlapping time ranges in the two files. Then you need to perform compacts like RocksDB to eliminate overlaps, thereby reducing the number of files to be queried during query. So you will find that HBase, RocksDB, InfluxDB and other hard-working compacts basically do not exist in TDengine

To sum up, it is not supported to unsorted writes to improve write performance and simplify design at the expense of user usage.

Query questions

Find topN group

order by can only sort time and tags. Top or bottom can only request topN for a certain field

The topN group is very common in the timing field, such as finding the three machines with the largest CPU utilization rate, which is currently not enough.

downsampling and aggregation

downsampling: Collect 1s granular data on the same timeline into 10s granular data

aggregation:Aggregate multiple timelines at the same time into 1 timeline

For example, each appId has multiple machines, and each machine will record the number of connections to the machine every second. Currently, I want to draw a curve of the total number of connections to each appId.

If using standard SQL, it may be expressed as follows:

select sum(avg_host_conn),appid,new_time from (
		select avg(connection) as avg_host_conn,
				appid,host,time/10 as new_time 
		from t1 group by appid,host,time/10
) as t2 group by appid, new_time

The internal subquery will first calculate the average value of the connection within 10s of the host of each appid, that is, downsampling, and the external query will sum the above average value of the host under each appid, that is, aggregation

Since this type of requirement is too common in timing query, using the above SQL writing is very troublesome. Some systems simplify the writing of this type of query through function nesting.

At present, TDengine's aggregate function can either be downsampling or aggregation, and does not support subqueries, so it cannot meet the above requirements.

Query aggregation architecture

The query is divided into two stages: the first stage requests the management node to obtain meta information of all tables that meet the tag filtering (including which data node each table is on). If there are millions of tables that meet the conditions, the query at this stage is basically unbearable. The second stage query is querying the data node from each table, returning it to the client, and the client then performs the final aggregation.

This query solution will eventually face the bottleneck of client aggregation, and it still needs to use multi-machine coordinated distributed query solutions such as Presto, Impala, etc.

The above is the detailed content of the analysis of the writing problem of the timing database TDengine. For more information about the writing of the timing database TDengine, please pay attention to my other related articles!