1. Use EXPLAIN
The EXPLAIN command shows the execution plan for a query, as introduced in the previous blog post. It is our main debugging tool.
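As a minimal sketch, a plan for a query on the tenk1 table used in tip 2 could be inspected like this. The column name unique1 is an assumption inferred from the index name tenk1_unique1; substitute a column from your own schema.

```sql
-- Show the estimated plan without running the query.
EXPLAIN
SELECT * FROM tenk1 WHERE unique1 < 100;

-- Run the query and report actual row counts and timings next to the estimates.
EXPLAIN ANALYZE
SELECT * FROM tenk1 WHERE unique1 < 100;
```

EXPLAIN ANALYZE really executes the statement, so a data-modifying statement is better wrapped in BEGIN ... ROLLBACK if its changes should not persist.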
2. Keep the statistics used by the planner up to date
Statistics are not updated on every database operation. They are generally updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX.
As a result, the statistics the planner relies on are often somewhat out of date, and the error in the plan's estimates can grow accordingly.
The following are some statistics related to table tenk1.
SELECT relname, relkind, reltuples, relpages
FROM pg_class
WHERE relname LIKE 'tenk1%';
       relname        | relkind | reltuples | relpages
----------------------+---------+-----------+----------
 tenk1                | r       |     10000 |      358
 tenk1_hundred        | i       |     10000 |       30
 tenk1_thous_tenthous | i       |     10000 |       30
 tenk1_unique1        | i       |     10000 |       30
 tenk1_unique2        | i       |     10000 |       30
(5 rows)
Here relkind is the object type (r is an ordinary table, i is an index), reltuples is the estimated number of rows, and relpages is the number of disk blocks (pages) the object occupies.
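If these numbers look stale, the statistics can be refreshed by hand. The following is a small sketch using the tenk1 table from the query above; pg_stat_user_tables is a standard statistics view, but whether autovacuum has already done this work depends on your configuration.

```sql
-- Recompute planner statistics for one table.
ANALYZE tenk1;

-- Check when statistics were last collected, manually or by autovacuum.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'tenk1';
```

Re-running the pg_class query afterwards shows the refreshed reltuples and relpages values.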
3. Use explicit JOIN syntax to join tables
The usual form: SELECT * FROM a, b, c WHERE = AND = ;
With explicit JOIN syntax, it is easier to control the join order that the planner chooses.
Example:
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE = AND = ;
SELECT * FROM a JOIN (b JOIN c ON ( = )) ON ( = );
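By default the planner may still reorder explicit joins; according to the PostgreSQL documentation, setting join_collapse_limit to 1 makes it follow the order as written. The sketch below assumes hypothetical join columns a.id, b.id, b.ref and c.id, which are not given in the original statements.

```sql
-- Make the planner keep the join order exactly as written (session-level setting).
SET join_collapse_limit = 1;

-- b and c are joined first, then the result is joined to a.
EXPLAIN
SELECT *
FROM a
JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

-- Restore the default so other queries in this session are planned normally.
RESET join_collapse_limit;
```

Fixing the join order only helps when you know a better order than the planner would pick, so it is worth comparing both plans with EXPLAIN before keeping the setting.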
4. Turn off automatic commit (autocommit=false)
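With autocommit on, every INSERT runs as its own transaction and pays the commit cost each time. A minimal sketch of grouping the work into a single transaction, assuming a hypothetical table items(id, name):

```sql
BEGIN;
INSERT INTO items (id, name) VALUES (1, 'first');
INSERT INTO items (id, name) VALUES (2, 'second');
INSERT INTO items (id, name) VALUES (3, 'third');
-- A single commit covers all rows; if any INSERT fails, none of them are kept.
COMMIT;
```

In psql, \set AUTOCOMMIT off gives a similar effect for the whole session; most client libraries expose an equivalent switch.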
5. Use the COPY command for bulk inserts
Some of our processes perform many INSERT operations on the same table. In that case the COPY command is more efficient, because every individual INSERT has to update the relevant indexes once, which takes more time.
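As an illustration, a bulk load with COPY might look like the following. The table items(id, name) and the file path are hypothetical placeholders.

```sql
-- Load all rows in one command; the file is read by the server process,
-- so the path must be readable on the database server.
COPY items (id, name)
FROM '/path/to/items.csv'
WITH (FORMAT csv, HEADER true);
```

From psql, \copy accepts the same options but reads the file on the client side, which is usually more convenient when the file is not on the server.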
6. Temporarily drop indexes
When we back up and re-import data, a large data set can take several hours to load. In that case, you can drop the indexes first and rebuild them after the import completes.
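A rough sketch of that sequence, assuming a hypothetical table items(id, name) with an index named items_name_idx and a placeholder file path:

```sql
-- Remove the index so the load does not have to maintain it row by row.
DROP INDEX IF EXISTS items_name_idx;

-- Bulk-load the data.
COPY items (id, name) FROM '/path/to/items.csv' WITH (FORMAT csv);

-- Rebuild the index in one pass over the loaded data.
CREATE INDEX items_name_idx ON items (name);
```

Queries that depend on the index will be slow, and a unique index will not enforce uniqueness, until it has been recreated.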
7. Drop foreign key constraints
If the table has foreign keys, the foreign key constraint is checked on every operation, which makes the load slower. Creating the foreign keys after the data has been imported is also an option.
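One way this could look, assuming a hypothetical table orders(id, item_id) whose item_id column references items(id) through a constraint named orders_item_id_fkey:

```sql
-- Drop the constraint so rows are not checked one at a time during the load.
ALTER TABLE orders DROP CONSTRAINT orders_item_id_fkey;

-- Bulk-load the data (placeholder path).
COPY orders (id, item_id) FROM '/path/to/orders.csv' WITH (FORMAT csv);

-- Re-create the constraint; all existing rows are validated in one pass.
ALTER TABLE orders
    ADD CONSTRAINT orders_item_id_fkey
    FOREIGN KEY (item_id) REFERENCES items (id);
```

While the constraint is missing, nothing prevents invalid references from being loaded; the final ALTER TABLE fails if any exist.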
8. Increase the maintenance_work_mem parameter
Increasing this parameter can speed up CREATE INDEX and ALTER TABLE ADD FOREIGN KEY.
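The parameter can be raised for the current session only, which avoids changing the server-wide default. In this sketch the value 1GB and the index on a hypothetical items(name) column are illustrative assumptions; choose a value that fits the memory actually available.

```sql
-- Give maintenance operations in this session more memory.
SET maintenance_work_mem = '1GB';

CREATE INDEX items_name_idx ON items (name);

-- Return to the configured default.
RESET maintenance_work_mem;
```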
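9. Increase the checkpoint_segments parameter
Increasing this parameter can speed up loading a large amount of data, because checkpoints then occur less often. Note that checkpoint_segments exists only up to PostgreSQL 9.4; from 9.5 on it was replaced by max_wal_size.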
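A sketch for PostgreSQL 9.5 or later, where max_wal_size is the setting that controls checkpoint spacing. The value 4GB is an arbitrary example, and ALTER SYSTEM requires superuser rights. On older versions, checkpoint_segments would be edited in postgresql.conf instead.

```sql
-- Inspect the current value.
SHOW max_wal_size;

-- Allow more WAL to accumulate between checkpoints during the load.
ALTER SYSTEM SET max_wal_size = '4GB';

-- Apply the change without restarting the server.
SELECT pg_reload_conf();
```

A larger value means more disk space used for WAL and a longer crash recovery, so it may be worth lowering it again after the import.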
10. Disable archive_mode
When this parameter is turned off, the following operations can run faster:
・CREATE TABLE AS SELECT
・CREATE INDEX
・ALTER TABLE SET TABLESPACE
・CLUSTER, etc.
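A possible way to apply this on PostgreSQL 9.4 or later is sketched below. The PostgreSQL documentation pairs archive_mode = off with wal_level = minimal and max_wal_senders = 0; those two extra settings are an addition to the tip above and should be treated as an assumption to check against your version.

```sql
-- Inspect the current settings.
SHOW archive_mode;
SHOW wal_level;

-- Turn off WAL archiving and reduce WAL logging to the minimum.
ALTER SYSTEM SET archive_mode = 'off';
ALTER SYSTEM SET wal_level = 'minimal';
ALTER SYSTEM SET max_wal_senders = 0;

-- These parameters only take effect after a server restart.
```

This disables archiving and streaming replication, so it only suits an initial load or a server that does not need them; restore the previous settings and take a fresh base backup afterwards.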
11. Finally, run VACUUM ANALYZE
When the data in a table has changed significantly, running VACUUM ANALYZE is recommended.
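For example, using the tenk1 table from tip 2:

```sql
-- Reclaim dead rows and refresh planner statistics for one table.
VACUUM ANALYZE tenk1;

-- Without a table name, every table the current user may vacuum is processed.
VACUUM ANALYZE;
```

VACUUM cannot run inside a transaction block, so issue it outside BEGIN ... COMMIT.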