Techniques for Viewing and Analyzing Oracle Database Execution Plans

In Oracle Database, an execution plan shows how a SQL statement is carried out inside the database, which makes it the starting point for optimizing query performance and improving system efficiency. Whether you are new to databases or an experienced engineer, knowing how to view and analyze execution plans is essential.

1. What is an execution plan

An execution plan is the blueprint that the Oracle optimizer generates for a SQL statement; it describes how the database will retrieve the data the query asks for. Put simply, the plan lists the steps of the statement and how they fit together: which indexes are used to find data, which join methods are used to combine tables, how the data is sorted, and so on. The optimizer chooses the plan it estimates to be the cheapest, based on factors such as the statistics of the database objects, the structure of the SQL statement, and the database configuration parameters.

2. How to view the execution plan

(I) Use the EXPLAIN PLAN command

This is one of the most basic and most commonly used ways to view execution plans. Its syntax is as follows:

EXPLAIN PLAN FOR
<your_sql_statement>;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

For example, the following simple query retrieves employee information for a specific department from the employees and departments tables:

EXPLAIN PLAN FOR
SELECT e.employee_id, e.first_name, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name = 'Sales';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

After the above code runs, the second query displays the execution plan as a table. It includes key information such as the ID of each operation, the operation name (for example, TABLE ACCESS FULL is a full table scan and INDEX RANGE SCAN is an index range scan), the object name (the table or index involved), and the optimizer's row and cost estimates. Note that EXPLAIN PLAN does not execute the statement; it shows the plan the optimizer expects to use.
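When several plans are stored in the plan table at once, it can help to label each one and choose the level of detail explicitly. The following is a minimal sketch, assuming the default plan table named PLAN_TABLE; the statement ID 'sales_emp' is an arbitrary label chosen for this example.

```sql
-- Tag the plan so it can be told apart from other rows in PLAN_TABLE.
-- 'sales_emp' is an arbitrary label, not something Oracle requires.
EXPLAIN PLAN SET STATEMENT_ID = 'sales_emp' FOR
SELECT e.employee_id, e.first_name, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name = 'Sales';

-- Arguments: plan table name, statement ID, format.
-- 'TYPICAL' is the default format; 'BASIC' shows less and 'ALL' shows more.
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'sales_emp', 'TYPICAL'));
```

Calling DBMS_XPLAN.DISPLAY without arguments, as in the example above, shows the most recently explained statement in the default format.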

(II) View the plan in SQL Developer

SQL Developer is the database development tool provided by Oracle. While working on a SQL statement in it, you can view the corresponding execution plan with little effort. Click the "Explain Plan" button in the worksheet toolbar (shortcut F10), and the tool displays the execution plan as a tree in the panel below. This is more intuitive than the command line: each node shows the details of its operation, including predicate information (for example, the filter conditions from the WHERE clause).

(III) Enable the AUTOTRACE function

In SQL*Plus, enabling AUTOTRACE shows the execution plan together with runtime statistics such as physical reads and logical reads (consistent gets); elapsed time can be displayed in addition with SET TIMING ON. First make sure that the current user is allowed to use AUTOTRACE (the statistics part typically requires the PLUSTRACE role) and that a plan table is available. The command to enable AUTOTRACE is as follows:

SET AUTOTRACE ON;

Then execute the SQL statement, for example:

SELECT * FROM customers WHERE customer_city = 'New York';

After the statement runs, SQL*Plus prints the query results followed by the execution plan and the statistics mentioned above. This is very helpful for quickly gauging the performance overhead of a SQL statement. To turn AUTOTRACE off, use:

SET AUTOTRACE OFF;
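AUTOTRACE also has narrower modes that are handy when the result set is large or when only part of the report is needed. The sketch below reuses the customers query from above and runs in SQL*Plus; it shows the standard TRACEONLY, EXPLAIN, and STATISTICS options.

```sql
-- Show only the execution plan; for a SELECT, the query itself is not run.
SET AUTOTRACE TRACEONLY EXPLAIN
SELECT * FROM customers WHERE customer_city = 'New York';

-- Run the query and show only the statistics, suppressing the result rows.
SET AUTOTRACE TRACEONLY STATISTICS
SELECT * FROM customers WHERE customer_city = 'New York';

-- Run the query, suppress the result rows, and show both plan and statistics.
SET AUTOTRACE TRACEONLY
SELECT * FROM customers WHERE customer_city = 'New York';

SET AUTOTRACE OFF
```

TRACEONLY EXPLAIN is the cheapest mode for a slow query, since nothing is fetched; the statistics modes have to execute the statement to produce their numbers.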

3. Interpreting key information in the execution plan

(I) Operation Type

Full table scan (TABLE ACCESS FULL)

This means that the database reads all rows in the table and keeps those that satisfy the query conditions. The optimizer chooses it when no suitable index is available or when it estimates that a full table scan is cheaper. For example, when a query has no filter condition, when the condition is not selective, or when the table is small, a full table scan may be the fastest access path. For large tables, however, a full table scan usually causes a large amount of I/O and can seriously hurt performance.

Index scan (INDEX UNIQUE SCAN, INDEX RANGE SCAN, etc.)

Index access comes in several forms, including the index unique scan (INDEX UNIQUE SCAN) and the index range scan (INDEX RANGE SCAN). An index unique scan finds the row for a unique key value, such as looking up a single record by its primary key. An index range scan suits queries with a range condition. For example, when querying data within a certain time period, it uses the ordering of the index to locate the first entry that meets the condition and then scans the index entries up to the end of the range.
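The two index access paths can be provoked with simple predicates. The sketch below assumes that employee_id is the primary key of employees and that the table has a hire_date column; the index name emp_hire_date_ix is hypothetical. Which operation actually appears depends on the statistics and data in your database.

```sql
-- Assumption: employee_id is the primary key, so a unique index exists on it.
-- An equality predicate on it would typically appear as INDEX UNIQUE SCAN.
EXPLAIN PLAN FOR
SELECT first_name, last_name
FROM employees
WHERE employee_id = 100;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Hypothetical non-unique index on an assumed date column.
CREATE INDEX emp_hire_date_ix ON employees (hire_date);

-- A range predicate on the indexed column is a candidate for INDEX RANGE SCAN.
EXPLAIN PLAN FOR
SELECT first_name, last_name
FROM employees
WHERE hire_date >= DATE '2024-01-01'
  AND hire_date <  DATE '2024-02-01';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the range covers a large share of the table, the optimizer may still prefer a full table scan, which is consistent with the cost-based choice described above.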

Nested loops join (NESTED LOOPS)

This is a common way of joining tables. For each row of the outer table, the database looks up the matching rows in the inner table. It suits cases where the join condition is highly selective and the number of rows being joined is small. Its advantage is that it can return a small number of precisely matched rows quickly, but when many rows drive the join, the inner lookup is repeated a very large number of times and performance drops sharply.

Hash join (HASH JOIN)

The database first builds a hash table from one of the two row sources (normally the smaller one) and then probes it with the rows of the other, using the hash function to find matches quickly. It usually performs better when joining large data sets, especially when both tables are large and there is no suitable index. A hash join applies to equality join conditions and improves efficiency by reducing the number of row comparisons.
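One way to compare the two join methods on the same query is to request each with optimizer hints and look at the resulting plans. This is a sketch for experimentation rather than a recommendation to hint production SQL; it reuses the employees and departments query from earlier, and the optimizer may ignore a hint it cannot honor.

```sql
-- LEADING(d e): join departments first, then employees.
-- USE_NL(e): request a nested loops join with employees as the inner table.
EXPLAIN PLAN FOR
SELECT /*+ LEADING(d e) USE_NL(e) */
       e.employee_id, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name = 'Sales';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Same query, requesting a hash join instead.
EXPLAIN PLAN FOR
SELECT /*+ LEADING(d e) USE_HASH(e) */
       e.employee_id, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name = 'Sales';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the Rows and Cost columns of the two plans gives a feel for why the optimizer prefers one method over the other for a given data volume.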

(II) Execution order

The Id column in the execution plan identifies each operation, but it is not the execution order. The plan is a tree, and the indentation shows the parent/child relationships: a more deeply indented operation is a child of the less indented operation above it and supplies rows to that parent. As a rule of thumb, start reading at the most deeply indented operation; among operations at the same level under one parent, the upper one is generally started first, and each parent then processes the rows its children return. Following this order shows the direction of the data flow: which operations form the basis and which ones process earlier results further. For example, the tables are accessed first to obtain the raw rows, then filtering and joins are applied, and finally steps such as sorting or aggregation produce the result the query asks for.
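To make the parent/child relationships behind the indentation explicit, the plan rows can also be read directly from the plan table. The sketch below assumes the default PLAN_TABLE and uses an arbitrary statement ID, 'tree_demo'; ID, PARENT_ID, and POSITION are standard plan table columns.

```sql
-- Remove any earlier plan stored under the same (arbitrary) statement ID.
DELETE FROM plan_table WHERE statement_id = 'tree_demo';

EXPLAIN PLAN SET STATEMENT_ID = 'tree_demo' FOR
SELECT e.employee_id, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name = 'Sales';

-- Walk the plan as a tree: each row's PARENT_ID points at the operation
-- that consumes its rows; POSITION orders the children of one parent.
SELECT id,
       parent_id,
       position,
       LPAD(' ', 2 * (LEVEL - 1)) || operation || ' ' || options AS step,
       object_name
FROM plan_table
START WITH id = 0 AND statement_id = 'tree_demo'
CONNECT BY PRIOR id = parent_id AND statement_id = 'tree_demo'
ORDER SIBLINGS BY position;
```

Operations with no children, such as table and index accesses, are where data retrieval begins; every other operation works on the rows handed up by its children.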

(III) Predicate information

Predicates are the conditions in the WHERE clause and in join conditions. The Predicate Information section of the execution plan shows which predicates are used to locate rows, marked access(), and which are only applied to rows after they have been retrieved, marked filter(). If a predicate can be used to access an index, the condition narrows the data retrieval scope quickly. Conversely, if a predicate can only be applied as a filter after a full table scan, it may be worth rewriting the condition or adding an appropriate index. For example, a range predicate such as "WHERE column_name > 100 AND column_name < 200" on an indexed column may lead to an index range scan, whereas "WHERE function(column_name) = some_value" (a function applied to the column) generally prevents a regular index on that column from being used and leads to a full table scan, unless a matching function-based index exists.
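The effect of wrapping a column in a function can be observed directly in the predicate section of the plan. The sketch below assumes an employees table with a last_name column; the index name emp_upper_last_name_ix is hypothetical, and whether the index is used afterwards still depends on the optimizer's cost estimate.

```sql
-- A function on the column: a plain index on last_name generally cannot be
-- used for this predicate, so it tends to show up as a filter() predicate.
EXPLAIN PLAN FOR
SELECT employee_id
FROM employees
WHERE UPPER(last_name) = 'SMITH';

-- 'BASIC +PREDICATE' prints the operations plus the Predicate Information section.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format => 'BASIC +PREDICATE'));

-- Hypothetical function-based index that matches the expression exactly.
CREATE INDEX emp_upper_last_name_ix ON employees (UPPER(last_name));

-- Explain the same statement again and compare the predicate section.
EXPLAIN PLAN FOR
SELECT employee_id
FROM employees
WHERE UPPER(last_name) = 'SMITH';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format => 'BASIC +PREDICATE'));
```

If the second plan lists the predicate under access() on the new index, the condition is now being used to locate rows rather than to discard them afterwards.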

4. Techniques for analyzing the execution plan

(I) Pay attention to high-cost operations

Each operation in the execution plan carries a cost estimate, shown in the Cost column, which combines estimated I/O and CPU cost. Focus on the expensive operations, as they are often the performance bottlenecks. For example, when a full table scan accounts for a large share of the total cost and the table is huge, consider whether creating an appropriate index or rewriting the query conditions could change the plan and lower the cost. Comparing the cost of the plans produced by different optimization approaches gives a first indication of their effect, keeping in mind that the cost is an estimate rather than a measurement.
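The cost figures can also be pulled out of the plan table and sorted, which is convenient for long plans. This is a minimal sketch, assuming the default PLAN_TABLE and the customers query used earlier; 'cost_demo' is an arbitrary statement ID.

```sql
-- Remove any earlier plan stored under the same (arbitrary) statement ID.
DELETE FROM plan_table WHERE statement_id = 'cost_demo';

EXPLAIN PLAN SET STATEMENT_ID = 'cost_demo' FOR
SELECT * FROM customers WHERE customer_city = 'New York';

-- List the operations from most to least expensive.
-- COST is cumulative: a parent's cost includes the cost of its children,
-- so the top-level operation normally appears first.
SELECT id,
       operation,
       options,
       object_name,
       cardinality AS estimated_rows,
       cost,
       io_cost,
       cpu_cost
FROM plan_table
WHERE statement_id = 'cost_demo'
ORDER BY cost DESC NULLS LAST, id;
```

Because the cost is cumulative, the interesting rows are those where the cost jumps sharply compared with their children; that jump is the operation's own contribution.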

(II) Take data volume and distribution into account

Knowing the actual size of a table and how its data is distributed on the indexed columns is crucial for analyzing an execution plan accurately. For example, an index may look ideal in theory, but if most rows in the table share the same value in the indexed column (data skew), the index is far less selective for that value, and the optimizer may mistakenly choose this inefficient index and cause performance problems. In that case, consider collecting more accurate statistics (for example, a histogram on the skewed column) or adjusting the query to suit the data distribution, such as adding further filter conditions to reduce the impact of the skew.
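A quick check of the real distribution, next to what the optimizer has recorded about the column, often explains a surprising plan. The sketch below reuses the customers table and customer_city column from the AUTOTRACE example and assumes the table belongs to the current user.

```sql
-- Actual row count per value: a few values holding most of the rows indicates skew.
SELECT customer_city, COUNT(*) AS row_count
FROM customers
GROUP BY customer_city
ORDER BY row_count DESC;

-- What the optimizer currently knows about the column.
-- HISTOGRAM = 'NONE' means it assumes the values are evenly distributed.
SELECT column_name, num_distinct, num_nulls, histogram, last_analyzed
FROM user_tab_col_statistics
WHERE table_name = 'CUSTOMERS'
  AND column_name = 'CUSTOMER_CITY';
```

A large gap between the real distribution and the recorded statistics, or an old LAST_ANALYZED date, suggests that the statistics should be refreshed before the query itself is changed.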

(III) Compare different versions of the execution plan

Whenever you change something while tuning a SQL statement, such as modifying indexes, restructuring the query, or refreshing statistics, view the execution plan again and compare it with the previous one. Check whether the change had the expected effect, whether the high-cost operations have disappeared from the new plan, and whether the data access path is more reasonable. Repeating this comparison step by step moves the query toward its best achievable performance.
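Storing each plan under its own statement ID keeps the before and after versions available side by side. This is a sketch, assuming the default PLAN_TABLE and the customers query from earlier; the statement IDs and the index name cust_city_ix are hypothetical.

```sql
-- Clear earlier runs of this comparison.
DELETE FROM plan_table WHERE statement_id IN ('before_ix', 'after_ix');

-- Plan before the change.
EXPLAIN PLAN SET STATEMENT_ID = 'before_ix' FOR
SELECT * FROM customers WHERE customer_city = 'New York';

-- The change under test: a hypothetical index on the filter column.
CREATE INDEX cust_city_ix ON customers (customer_city);

-- Plan after the change.
EXPLAIN PLAN SET STATEMENT_ID = 'after_ix' FOR
SELECT * FROM customers WHERE customer_city = 'New York';

-- Display both plans and compare operations, estimated rows, and cost.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'before_ix', 'TYPICAL'));
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'after_ix', 'TYPICAL'));
```

The Plan hash value line printed at the top of each plan is a quick way to tell whether the plan changed at all.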

5. Case study: optimizing an execution plan

Suppose we have an e-commerce order database containing the orders, order_items, and products tables. A frequently executed query calculates the total order amount for a specific product category over a certain time period. The initial query is as follows:

-- oi.quantity is an assumed column name for the number of units ordered
SELECT p.product_category, SUM(oi.quantity * oi.unit_price) AS total_amount
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE o.order_date BETWEEN DATE '2024-01-01' AND DATE '2024-01-31'
AND p.product_category = 'Electronics'
GROUP BY p.product_category;

Viewing the execution plan with EXPLAIN PLAN revealed the following problems:
The orders table was read with a full table scan because the order_date column had no suitable index, which caused a large amount of unnecessary I/O and made the query inefficient.
In the joins, the join conditions between the tables were not particularly selective and the indexes were not fully used, so the nested loops joins were expensive.

Optimization solution:
Create an index on the order_date column of the orders table:

CREATE INDEX idx_order_date ON orders(order_date);

An analysis of the product_category column in the products table showed that its data was skewed: some categories contained far more rows than others. Collect more precise statistics so that the optimizer can take this into account:

BEGIN
  -- Replace 'your_schema' with the owner of the products table
  DBMS_STATS.GATHER_TABLE_STATS(ownname => 'your_schema', tabname => 'products');
END;
/

After re-running the query and checking the execution plan again, the orders table was accessed with an index range scan, which greatly reduced the amount of data read. With the refreshed statistics, the optimizer also chose a more suitable hash join for the join operations. Overall query performance improved several times over, and the execution time dropped from tens of seconds to a few seconds.
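If the default statistics gathering does not pick up the skew, a histogram on the skewed column can be requested explicitly. The following is a sketch only, assuming the products table belongs to the current user; the bucket count of 254 and the choice to gather index statistics are illustrative settings, not values taken from the case above.

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,        -- assumption: the current user owns the table
    tabname    => 'PRODUCTS',
    -- Default handling for all columns, plus a histogram of up to
    -- 254 buckets on the skewed column.
    method_opt => 'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS SIZE 254 product_category',
    cascade    => TRUE         -- also gather statistics for the table's indexes
  );
END;
/

-- Confirm that a histogram now exists on the column.
SELECT column_name, num_distinct, histogram
FROM user_tab_col_statistics
WHERE table_name = 'PRODUCTS'
  AND column_name = 'PRODUCT_CATEGORY';
```

After gathering, explain the case-study query again to see whether the row estimates for the category filter have moved closer to the real counts.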

Summary

Viewing and analyzing Oracle execution plans is a core skill in database optimization. By mastering the different ways of viewing a plan, interpreting its key information carefully, and applying effective analysis techniques, we can pinpoint the performance problems of a SQL statement and take targeted measures to fix them. Creating appropriate indexes, restructuring queries, and keeping statistics accurate can each be the key to better database performance. With continued practice and experience, optimizing queries in complex database environments becomes much easier, which helps keep the system running efficiently and stably.
