gather_plan_statistics hint tips
Oracle provides several tools to see the internals of the SQL optimizer and reveal execution plan details:
- TKPROF (SQL*Trace) 10046
- SQL*Plus “set autotrace”
- Extended execution plan statistics (v$sql_workarea)
- The gather_plan_statistics hint (or statistics_level=all)
The Oracle docs note that additional SQL execution statistics are collected when SQL statements are executed, either by using the /*+ gather_plan_statistics */ hint or by setting the parameter statistics_level=all.
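As a rough sketch of the two options just described, the statements below collect the extra statistics first for a single statement and then for a whole session. The table name orders and its status column are hypothetical placeholders, and the "set serveroutput off" line assumes you are working in SQL*Plus, where leaving serveroutput on tends to make display_cursor report on the wrong statement.

```sql
-- SQL*Plus only: keep the dbms_output call from becoming the "last" statement.
set serveroutput off

-- Option 1: collect row-source statistics for this one statement.
-- The table "orders" and the column "status" are hypothetical.
select /*+ gather_plan_statistics */ count(*)
from   orders
where  status = 'OPEN';

-- Option 2: collect row-source statistics for every statement in the session.
alter session set statistics_level = all;

select count(*)
from   orders
where  status = 'OPEN';

-- Report the last execution of the most recent statement in this session,
-- with estimated (E-Rows) and actual (A-Rows) row counts side by side.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

The hint is the lighter-weight choice when you can edit the SQL text; the session-level setting is handy when the statement comes from an application you cannot change, at the price of extra collection overhead for everything the session runs.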
These statistics are narrow in scope: what they mainly show is the optimizer's estimated number of rows versus the actual number of rows for each step. That comparison is exactly what makes gather_plan_statistics a great way to diagnose cardinality errors.
Greg Rahn of Oracle Corporation notes a great example of using the gather_plan_statistics hint:
As a first step of triage, I would suggest executing the query with a GATHER_PLAN_STATISTICS hint followed by a call to DBMS_XPLAN.DISPLAY_CURSOR.
The GATHER_PLAN_STATISTICS hint allows for the collection of extra metrics during the execution of the query. Specifically, it shows us the Optimizer's estimated number of rows (E-Rows) and the actual number of rows (A-Rows) for each row source. If the estimates are vastly different from the actual, one probably needs to investigate why. For example: In the below plan, look at line 8.
The Optimizer estimates 5,899 rows and the row source actually returns 5,479,000 rows. If the estimate is off by three orders of magnitude (1000), chances are the plan will be sub-optimal. Do note that with Nested Loop Joins you need to multiply the Starts column by the E-Rows column to get the A-Rows values (see line 10).
select /*+ gather_plan_statistics */ … from … ;
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
------------------------------------------------------------------------------------------
| Id  | Operation                               | Name        | Starts | E-Rows | A-Rows |
------------------------------------------------------------------------------------------
|   1 | SORT GROUP BY                           |             |      1 |      1 |      1 |
|*  2 |  FILTER                                 |             |      1 |        |  1728K |
|   3 |   NESTED LOOPS                          |             |      1 |      1 |  1728K |
|*  4 |    HASH JOIN                            |             |      1 |      1 |  1728K |
|   5 |     PARTITION LIST SINGLE               |             |      1 |   6844 |   3029 |
|*  6 |      INDEX RANGE SCAN                   | PROV_IX13   |      1 |   6844 |   3029 |
|   7 |     PARTITION LIST SINGLE               |             |      1 |   5899 |  5479K |
|*  8 |      TABLE ACCESS BY LOCAL INDEX ROWID  | SERVICE     |      1 |   5899 |  5479K |
|*  9 |       INDEX SKIP SCAN                   | SERVICE_IX8 |      1 |   4934 |  5479K |
|  10 |    PARTITION LIST SINGLE                |             |  1728K |      1 |  1728K |
|* 11 |     INDEX RANGE SCAN                    | CLAIM_IX7   |  1728K |      1 |  1728K |
------------------------------------------------------------------------------------------
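To make the arithmetic behind lines 8 and 10 of the plan concrete, here is a short worked calculation. It assumes the K-suffixed figures are the rounded values DBMS_XPLAN prints (5479K taken as 5,479,000 and 1728K as 1,728,000), so the results are approximate.

```latex
% Line 8: how far off is the estimate?
\[
\frac{\text{A-Rows}}{\text{E-Rows}} = \frac{5{,}479{,}000}{5{,}899} \approx 929 \approx 10^{3}
\]
% Line 10 (inner side of the nested loops join):
% E-Rows is an estimate per start, while A-Rows is the total over all starts.
\[
\text{Starts} \times \text{E-Rows} = 1{,}728{,}000 \times 1 = 1{,}728{,}000 = \text{A-Rows}
\]
```

Read this way, line 8 is off by roughly three orders of magnitude, while line 10 is accurate once the per-start estimate is scaled by the number of starts.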
Other SQL execution statistics
Starting with Oracle 9i, the execution plan includes several new columns, and you can run specialized queries to see estimates of the resource usage associated with specific steps of Oracle SQL execution:
- CPU_COST – The CPU cost of the operation as estimated by the optimizer’s cost-based approach. For statements that use the rule-based approach, this column is null. The value of this column is proportional to the number of machine cycles required for the operation.
- IO_COST – The I/O cost of the operation as estimated by the optimizer’s cost-based approach. For statements that use the rule-based approach, this column is null. The value of this column is proportional to the number of data blocks read by the operation.
- TEMP_SPACE – The temporary space, in bytes, used by the operation as estimated by the optimizer’s cost-based approach. For statements that use the rule-based approach, or for operations that don’t use any temporary space, this column is null.
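One way to look at these three columns is to explain a statement and then read the estimates back from the plan table. This is only a sketch: the orders table and customer_id column are hypothetical, the statement_id value 'demo' is an arbitrary label, and it assumes the standard PLAN_TABLE is available to your session.

```sql
-- Clear any earlier rows stored under the same (arbitrary) label.
delete from plan_table where statement_id = 'demo';

-- Ask the optimizer for a plan without executing the statement.
-- The table "orders" and the column "customer_id" are hypothetical.
explain plan set statement_id = 'demo' for
select customer_id, count(*)
from   orders
group  by customer_id
order  by count(*) desc;

-- Read the per-step estimates, including CPU_COST, IO_COST and TEMP_SPACE.
select id,
       operation,
       options,
       object_name,
       cost,
       cpu_cost,
       io_cost,
       temp_space
from   plan_table
where  statement_id = 'demo'
order  by id;
```

Steps that need no temporary space show a null TEMP_SPACE, so sorts and hash joins on large row sources are usually the ones to watch in this output.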
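Oracle also has two views that expose work area usage: v$sql_workarea, which holds cumulative work area statistics for cursors in the shared pool, and v$sql_workarea_active, which shows the work areas currently allocated. The following script queries the active work areas: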
select
   to_number(decode(SID, 65535, NULL, SID)) sid,
   operation_type                           OPERATION,
   trunc(WORK_AREA_SIZE/1024)               WSIZE,
   trunc(EXPECTED_SIZE/1024)                ESIZE,
   trunc(ACTUAL_MEM_USED/1024)              MEM,
   trunc(MAX_MEM_USED/1024)                 "MAX MEM",
   number_passes                            PASS
from
   v$sql_workarea_active
order by
   1,2;
Here is a sample listing from this script, showing the work area estimated vs. actual size, plus the memory used for the specific operation. This is a great way to see the actual RAM used in a sort or hash join operation.
SID OPERATION         WSIZE  ESIZE    MEM  MAX MEM PASS
--- ---------------- ------ ------ ------ -------- ----
 27 GROUP BY (SORT)      73     73     64       64    0
 44 HASH-JOIN          3148   3147   2437     6342    1
 71 HASH-JOIN         13241  19200  12884    34684    1
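The listing above only covers work areas that are active at the moment the script runs. For a historical view, a similar query can be pointed at v$sql_workarea. The sketch below assumes Oracle 10g or later (where the view carries a sql_id column), and the column aliases are my own.

```sql
-- Cumulative work area statistics per cursor, largest recent consumers first.
select sql_id,
       operation_type,
       policy,
       trunc(estimated_optimal_size/1024) est_optimal_kb,
       trunc(estimated_onepass_size/1024) est_onepass_kb,
       trunc(last_memory_used/1024)       last_mem_kb,
       last_execution,
       optimal_executions,
       onepass_executions,
       multipasses_executions
from   v$sql_workarea
order  by last_memory_used desc nulls last;
```

A cursor with a nonzero onepass_executions or multipasses_executions count had to spill to temporary space at least once, which makes it a natural candidate for a closer look at its sort or hash area sizing.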
For more details on gathering run-time SQL execution statistics, see my book “Oracle Tuning: The Definitive Reference”.
Also see the related statistics_level parameter tips.