A Detailed Explanation of PostgreSQL Indexes

I previously summarized what I know about PostgreSQL sequences; today I will summarize indexes.

We all know that the main purpose of a database index is to speed up data retrieval, but more indexes are not always better. Each index takes up additional storage space, and inserting, updating and deleting data takes more time because every index must be maintained.

1. Create an index

The SQL statement is as follows:

CREATE INDEX idx_commodity
  ON commodity        -- table name
  USING btree         -- implement with a B-tree
  (commodity_id);     -- column(s) to index
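As a rough, self-contained sketch, the statement above could be tried like this. The column definitions of commodity are assumptions made only for illustration, since the real table structure is not shown in this article.

```sql
-- Hypothetical table definition; only commodity_id comes from the statement above.
CREATE TABLE IF NOT EXISTS commodity (
    commodity_id bigint,
    name         text,
    price        numeric(10, 2),
    created_at   timestamptz DEFAULT now()
);

-- B-tree is the default method, so USING btree could also be omitted.
CREATE INDEX IF NOT EXISTS idx_commodity
    ON commodity USING btree (commodity_id);

-- Show the plan chosen for a lookup by commodity_id.
-- On a very small table the planner may still prefer a sequential scan.
EXPLAIN SELECT * FROM commodity WHERE commodity_id = 42;
```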

2. Delete the index

DROP INDEX idx_commodity;
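Two variants of the same statement are often handy in practice. This is only a sketch that reuses the idx_commodity name from above; the two statements are alternatives, not steps to run one after the other.

```sql
-- Does not raise an error if the index has already been removed.
DROP INDEX IF EXISTS idx_commodity;

-- Alternative for busy tables: drops the index without blocking
-- concurrent reads and writes on the table.
-- Note: this form cannot be run inside a transaction block.
DROP INDEX CONCURRENTLY IF EXISTS idx_commodity;
```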

3. Advantages of adding indexes:

Creating indexes can greatly improve system performance.

First, and most importantly, an index can greatly speed up data retrieval;

Second, creating a unique index guarantees the uniqueness of each row in the table;

Third, it can speed up joins between tables, which is especially useful when enforcing referential integrity;

Fourth, when queries use grouping and sorting clauses, the time spent grouping and sorting can also be significantly reduced;

Fifth, with indexes available, the query optimizer can choose more efficient execution plans, which improves overall system performance.
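The second point (uniqueness) is easy to see in a small sketch. The table commodity_sku and its columns are hypothetical names used only for this illustration.

```sql
-- Hypothetical table used only for this example.
CREATE TABLE IF NOT EXISTS commodity_sku (
    sku  text,
    name text
);

-- The unique index rejects any second row with the same sku.
CREATE UNIQUE INDEX IF NOT EXISTS idx_commodity_sku_unique
    ON commodity_sku (sku);

INSERT INTO commodity_sku (sku, name)
VALUES ('A-001', 'first row')
ON CONFLICT (sku) DO NOTHING;

-- Same sku again: without ON CONFLICT this would fail with a
-- unique-violation error; with it, the duplicate is silently skipped.
INSERT INTO commodity_sku (sku, name)
VALUES ('A-001', 'duplicate row')
ON CONFLICT (sku) DO NOTHING;

-- Still only one row for this sku.
SELECT sku, name FROM commodity_sku WHERE sku = 'A-001';
```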

4. Disadvantages of adding indexes:

First, creating and maintaining indexes takes time, and that time grows as the amount of data grows;

Second, indexes occupy physical space. In addition to the space taken by the table data itself, each index takes up a certain amount of physical space;

Third, whenever rows in the table are inserted, deleted or modified, the indexes must be maintained as well, which slows down data modification.
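The space cost can be measured directly with PostgreSQL's built-in size functions. The sketch below assumes a commodity table with illustrative columns; the table definition is an assumption, not something given in this article.

```sql
-- Hypothetical table definition so the queries below have something to inspect.
CREATE TABLE IF NOT EXISTS commodity (
    commodity_id bigint,
    name         text,
    price        numeric(10, 2),
    created_at   timestamptz DEFAULT now()
);

-- Space used by the table data versus all of its indexes together.
SELECT pg_size_pretty(pg_relation_size('commodity')) AS table_size,
       pg_size_pretty(pg_indexes_size('commodity'))  AS indexes_size;

-- Space used by each index on the table.
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'commodity';
```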

5. Choosing which columns to index

Generally speaking, indexes should be created on these columns:

First, on columns that are searched frequently, to speed up the search;

Second, on primary key columns, to enforce the uniqueness of the column (PostgreSQL creates this unique index automatically when a primary key is declared);

Third, on columns that are often used in joins, which are mostly foreign keys, to speed up the join;

Fourth, on columns that are often searched by range, because the index is already sorted and the rows in a given range are therefore contiguous in it;

Fifth, on columns that often need to be sorted, because the query can use the order of the index instead of sorting the rows itself;

Sixth, on columns that appear in WHERE clauses, to speed up the evaluation of the conditions.
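A small sketch of the join, range and sort cases above. The commodity_order table, its columns and the index names are all hypothetical; in a real schema commodity_id would normally be declared as a foreign key.

```sql
-- Hypothetical order table that refers to commodities by id.
CREATE TABLE IF NOT EXISTS commodity_order (
    order_id     bigint PRIMARY KEY,              -- primary key: indexed automatically
    commodity_id bigint NOT NULL,                 -- join column (a foreign key in practice)
    created_at   timestamptz NOT NULL DEFAULT now()
);

-- Join column: speeds up joins and lookups by commodity.
CREATE INDEX IF NOT EXISTS idx_commodity_order_commodity_id
    ON commodity_order (commodity_id);

-- Range and sort column: one B-tree index can serve both uses.
CREATE INDEX IF NOT EXISTS idx_commodity_order_created_at
    ON commodity_order (created_at);

-- A range condition plus ORDER BY on the same indexed column.
EXPLAIN
SELECT order_id, commodity_id, created_at
FROM commodity_order
WHERE created_at >= now() - interval '7 days'
ORDER BY created_at;
```

Whether the planner actually uses these indexes depends on the table size and its statistics, so the EXPLAIN output is worth checking on real data.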

Generally speaking, columns that should not be indexed have the following characteristics:

First, indexes should not be created on columns that are rarely used or referenced in queries. Since these columns are rarely used, an index on them does little to improve query speed. On the contrary, the extra index slows down data maintenance and increases the space requirement.

Second, indexes should not be added to columns with few distinct values. Because such a column has few values, the rows matching a condition on it make up a large proportion of the rows in the table, so a large share of the table has to be read anyway. Adding an index will not noticeably speed up the search.

Third, indexes should generally not be added to columns that store long text, binary data such as images (bytea in PostgreSQL), or bit and boolean values. The data in these columns is either very large or has very few distinct values.

Fourth, an index should not be created when modification performance matters much more than retrieval performance. The two goals conflict: adding indexes improves retrieval performance but reduces modification performance, while removing indexes improves modification performance but reduces retrieval performance.
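One way to judge the "few distinct values" case is to look at the planner statistics in the pg_stats view. This is only a sketch; the commodity table definition is an assumption made for illustration.

```sql
-- Hypothetical table definition.
CREATE TABLE IF NOT EXISTS commodity (
    commodity_id bigint,
    name         text,
    price        numeric(10, 2),
    created_at   timestamptz DEFAULT now()
);

-- Refresh the statistics so pg_stats reflects the current data.
ANALYZE commodity;

-- n_distinct > 0: estimated number of distinct values.
-- n_distinct < 0: distinct values as a fraction of the row count, negated
--                 (for example, -1 means every value is distinct).
-- null_frac:      fraction of rows where the column is NULL.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'commodity'
ORDER BY attname;
```

A column with only a handful of distinct values spread evenly over many rows is usually a poor candidate for a plain index.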

Supplement: PostgreSQL index classification and use

1. Index types

PostgreSQL supports single-column indexes, multicolumn (composite) indexes, partial indexes, unique indexes, expression indexes, implicit indexes, and indexes built concurrently.
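The sketch below shows one statement for each of the types just listed. The tables, columns and index names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical tables.
CREATE TABLE IF NOT EXISTS commodity (
    commodity_id bigint,
    name         text,
    price        numeric(10, 2),
    created_at   timestamptz DEFAULT now()
);

-- Implicit indexes: PRIMARY KEY and UNIQUE constraints create
-- commodity_category_pkey and commodity_category_code_key automatically.
CREATE TABLE IF NOT EXISTS commodity_category (
    category_id integer PRIMARY KEY,
    code        text UNIQUE
);

-- Single-column index.
CREATE INDEX IF NOT EXISTS idx_commodity_price ON commodity (price);

-- Multicolumn (composite) index.
CREATE INDEX IF NOT EXISTS idx_commodity_name_price ON commodity (name, price);

-- Partial index: only rows matching the WHERE condition are indexed.
CREATE INDEX IF NOT EXISTS idx_commodity_expensive
    ON commodity (price) WHERE price > 100;

-- Unique index.
CREATE UNIQUE INDEX IF NOT EXISTS idx_commodity_id_unique
    ON commodity (commodity_id);

-- Expression index: supports conditions such as lower(name) = 'abc'.
CREATE INDEX IF NOT EXISTS idx_commodity_lower_name
    ON commodity (lower(name));

-- Concurrent build: does not block writes to the table while building.
-- Note: this form cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_commodity_created_at
    ON commodity (created_at);
```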

2. Index methods

PostgreSQL supports the B-tree, hash, GiST, and GIN index methods; current versions also provide SP-GiST and BRIN. B-tree is the default.

3. When each index method applies

1).B-tree

B-tree indexes can be used effectively when a query contains the equality operator (=) or range operators (<, <=, >, >=), as well as BETWEEN and IN.

2).hash

Hash indexes handle only the equality operator (=) and are not suitable for range operators.

3).GiST

Suitable for custom and complex data types; modules built on it include rtree_gist, btree_gist, intarray, tsearch, ltree and cube.

4).GIN

A GIN index can take up roughly three times as much space as a comparable GiST index. It is suitable for complex pattern matching such as LIKE '%ABC12%', which requires the trigram operator class from the pg_trgm extension.
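A hedged sketch of how the non-default methods are selected with USING. The tables and index names are hypothetical, and the GIN example assumes the pg_trgm contrib extension is installed and that you have permission to enable it.

```sql
-- Hypothetical tables.
CREATE TABLE IF NOT EXISTS commodity (
    commodity_id bigint,
    name         text,
    price        numeric(10, 2),
    created_at   timestamptz DEFAULT now()
);

CREATE TABLE IF NOT EXISTS commodity_promo (
    commodity_id  bigint,
    active_during tstzrange      -- period in which the promotion is valid
);

-- hash: equality lookups only, e.g. WHERE name = 'ABC12'.
CREATE INDEX IF NOT EXISTS idx_commodity_name_hash
    ON commodity USING hash (name);

-- GiST: range types are supported out of the box,
-- e.g. "which promotions are active right now?"
CREATE INDEX IF NOT EXISTS idx_commodity_promo_active
    ON commodity_promo USING gist (active_during);

SELECT commodity_id
FROM commodity_promo
WHERE active_during @> now();

-- GIN with trigrams: lets LIKE '%...%' use an index.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX IF NOT EXISTS idx_commodity_name_trgm
    ON commodity USING gin (name gin_trgm_ops);

EXPLAIN SELECT * FROM commodity WHERE name LIKE '%ABC12%';
```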

4. Precautions for index use

1). When a table has many rows, it becomes important to index the columns used to look rows up.

2). When choosing columns to index, pick good candidates such as foreign keys or columns queried for their maximum and minimum values. The selectivity of the column is very important for how effective the index is.

3). For better performance, remove unused indexes, and rebuild indexes periodically (for example about once a month) to clear out dead entries.

4). If the amount of data is very large, consider partitioning the table and indexing each partition.

5). When a column contains many NULL values, consider creating a partial (conditional) index that excludes the NULLs.
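Points 3) and 5) above can be sketched as follows. The table definition and the index idx_commodity_price_not_null are assumptions for illustration; idx_commodity is the index created earlier in this article.

```sql
-- Hypothetical table and index so the statements below can run on their own.
CREATE TABLE IF NOT EXISTS commodity (
    commodity_id bigint,
    name         text,
    price        numeric(10, 2),
    created_at   timestamptz DEFAULT now()
);
CREATE INDEX IF NOT EXISTS idx_commodity ON commodity (commodity_id);

-- 3) Indexes that have never been scanned since statistics were last reset
--    are candidates for removal (check unique and constraint indexes by hand).
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname, indexrelname;

-- 3) Rebuild a single index, or every index on a table.
--    Plain REINDEX blocks writes to the table while it runs.
REINDEX INDEX idx_commodity;
REINDEX TABLE commodity;

-- 5) Partial index that leaves out rows where price is NULL.
CREATE INDEX IF NOT EXISTS idx_commodity_price_not_null
    ON commodity (price)
    WHERE price IS NOT NULL;
```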

The above is based on my personal experience. I hope it serves as a useful reference. If there are any mistakes or anything I have not fully considered, corrections are welcome.