Gather Statistics with DBMS_STATS by Jeff
Contents
- Overview
- Missing statistics
- Analyze vs. DBMS_STATS
- What gets collected?
- Where are the statistics stored?
- Compute statistics vs. Estimate statistics
- DBMS_STATS functions and variable definitions
- DBMS_STATS in action (Examples)
- Automated table monitoring and stale statistics gathering example
- How to determine if dictionary statistics are RDBMS-generated or user-defined
Oracle’s cost-based optimizer (COB) uses statistics to calculate the selectivity (the fraction of rows in a table that the SQL statement’s predicate chooses) of predicates and to estimate the “cost” of each execution plan. The COB will use the selectivity of a predicate to estimate the cost of a particular access method and to determin the optimal join order. Statistics are used to quantify the data distribution and storage characteristics of tables, columns, indexes and partitions. The COB uses these statistics to estimate how much I/O and memory are required to execute a SQL statement using a particular execution plan. Statistics are stored in the data dictionary, and they can be exported from one database and imported into another. Situations in where you would want to perform this, might be to transfer production statistics to a test system to simulate the real environment, even though the test system may only have small samples of the data.
Only statistics stored in the dictionary itself have an impact on the cost-based optimizer.
When statistics do not exist on schema objects, the optimizer uses the following default values.
Tables Statistic Default Value Used by Optimizer Cardinality 100 rows Avg. row len 20 bytes No. of blocks 100 Remote cardinality 2000 rows Remote average row length 100 bytes Indexes Statistic Default Value Used by Optimizer Levels 1 Leaf blocks 25 Leaf blocks/key 1 Data blocks/key 1 Distinct keys 100 Clustering factor 800 (8*no. of blocks)
The following is a quick overview of the two.
- Analyze
- The only method available for collecting statistics in Oracle 8.0 and lower.
ANALYZEcan only run serially.ANALYZEcannot overwrite or delete certain types of statistics that where generated byDBMS_STATS. (analyze不能改写或删除某些由dbms_stats收集的统计信息)ANALYZEcalculates global statistics for partitioned tables and indexes instead of gathering them directly. This can lead to inaccuracies for some statistics, such as the number of distinct values. (analyze对分区表和分区索引手机全局的信息而不是直接收集某个分区的信息,这样就会操场某些信息的不准确性,如唯一值的数量)
- For partitioned tables and indexes,
ANALYZEgathers statistics for the individual partitions and then calculates the global statistics from the partition statistics. (先收集独立分区的信息,然后计算分区的全局信息)- For composite partitioning,
ANALYZEgathers statistics for the subpartitions and then calculates the partition statistics and global statistics from the subpartition statistics. (混合分区先收集子分区信息,由子分区信息计算上级分区信息和全局信息)ANALYZEcan gather additional information that is not used by the optimizer, such as information about chained rows and the structural integrity of indexes, tables, and clusters.DBMS_STATSdoes not gather this information. (analyze能收集更多的信息,这些信息并不被优化器所使用,比如迁移行、table/index/cluster的结构完整性)- No easy way of knowing which tables or how much data within the tables have changed. The DBA would generally re-analyze all of their tables on a semi-regular basis.
- DBMS_STATS
- Only available for Oracle 8i and higher.
- Statistics can be generated to a statistics table and can then be imported or exported between databases and re-loaded into the data dictionary at any time. This allows the DBA to experiment with various statistics.
DBMS_STATSroutines have the option to run via parallel query or operate serially.- Can gather statistics for sub-partitions or partitions.
- Certain DDL commands (ie. create index) automatically generate statistics, therefore eliminating the need to generate statistics explicitly after DDL command. (某些DDL命令,如create index能自动生成统计信息,而不用执行完ddl之后再收集)
DBMS_STATSdoes not generate information about chained rows and the structural integrity of segments.- The DBA can set a particular table, a whole schema or the entire database to be automatically monitored when a modification occurs. When enabled, any change (insert, update, delete, direct load, truncate, etc.) that occurs on a table will be tracked in the SGA. This information is incorporated into the data dictionary by the SMON process at a pre-set interval (every 3 hours in Oracle 8.1.x, and every 15 minutes in Oracle 9i). The information collected by this monitoring can be seen in the
DBA_TAB_MODIFICATIONSview. Oracle 9i introduced a new function in theDBMS_STATSpackage called:FLUSH_DATABASE_MONITORING_INFO. The DBA can make use of this function to flush the monitored table data more frequently. Oracle 9i will also automatically call this procedure prior to executingDBMS_STATSfor statistics gathering purposes. Note that this function is not included with Oracle 8i..判断陈旧的统计数据
为了判断数据库对象是否需要新的统计数据,oracle提供一个表监控工具。当statistics_level=typical或all时,监控被开启。监控追踪表自从上次统计数据被收集以来,表被insert,update,delete的近似值,还有表是否被truncate。表的变化信息可以查看user_tab_modifications视图。随着数据修改,当oracle移植信息到视图的时候,可能会有几分钟的迟延。DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO过程来立即反映在内存中显著的监控信息。
当options参数被设置为gather stale或gather auto ,Gather_database_stats或gather_schema_stats过程使用旧的统计数据来收集新的统计数据。如果监控表的已经被修改了超过10%,那么这些统计数据被认为是旧的,需要再次被收集。
DBA_TAB_MODIFICATIONS:This view is populated only for tables with theMONITORINGattribute. It is intended for statistics collection over a long period of time. For performance reasons, the Oracle Database does not populate this view immediately when the actual modifications occur. Run theFLUSH_DATABASE_MONITORING_INFOprocedure in theDIMS_STATSPL/SQL package to populate this view with the latest information. TheANALYZE_ANYsystem privilege is required to run this procedure.DBMS_STATSprovides a more efficient, scalable solution for statistics gathering and should be used over the traditionalANALYZEcommand which does not support features such as parallelism and stale statistics collection.- Use of table monitoring in conjunction with
DBMS_STATSstale object statistics generation is highly recommended for environments with large, random and/or sporadic data changes. These features allow the database to more efficiently determine which tables should be re-analyzed versus the DBA having to force statistics collection for all tables. Including those that have not changed enough to merit a re-scan)
- * Number of data blocks below the high water mark (that is, the number of data blocks that have been formatted to receive data, regardless whether they currently contain data or are empty) (BLOCKS) 高水位下的block数量
- * Number of data blocks allocated to the table that have never been used (EMPTY_BLOCKS) 从未使用的空块数
- Average available free space in each data block in bytes (AVG_SPACE) 平均每个块的可用空间
- Number of chained rows. [Not collected by DBMS_STATS] (CHAIN_COUNT) 行迁移数
- Average row length, including the row’s overhead, in bytes (AVG_ROW_LEN) 平均行长
- * Depth of the index from its root block to its leaf blocks (BLEVEL) 根块到叶块的深度
- Number of leaf blocks (LEAF_BLOCKS) 叶块的数量
- Number of distinct index values (DISTINCT_KEYS) distinct值的数量
- Average number of leaf blocks per index value (AVG_LEAF_BLOCKS_PER_KEY) 平均每个索引值的叶块数量
- Average number of data blocks per index value (for an index on a table) (AVG_DATA_BLOCKS_PER_KEY) 平均每索引值的数据块数。
- Clustering factor (how well ordered the rows are about the indexed values) (CLUSTERING_FACTOR) 聚集因子
Where are the statistics stored?
Statistics are stored into the Oracle Data Dictionary, in tables owned by SYS. Views are created on these tables to retrieve data more easily. These views are prefixed with
DBA_orALL_orUSER_. For ease of reading, we will use DBA_% views, but ALL_% views or USER_% views could be used as well.- Statistics available only since 8.0.X rdbms release : (*) - Statistics available only since 8.1.X rdbms release : (**) - Statistics not available at partition or subpartition level : (G) - Statistics not available at subpartition level : (GP)Table level statistics can be retrieved from:
DBA_ALL_TABLES – (8.X onwards) DBA_OBJECT_TABLES – (8.X onwards DBA_TABLES – (all versions) DBA_TAB_PARTITIONS – (8.X onwards) DBA_TAB_SUBPARTITIONS – (8.1 onwards) Columns to look at are: NUM_ROWS : Number of rows (always exact even when computed with ESTIMATE method) BLOCKS : Number of blocks which have been used even if they are empty due to delete statements EMPTY_BLOCKS : Number of empty blocks (these blocks have never been used) AVG_SPACE : Average amount of FREE space in bytes in blocks allocated to the table : Blocks + Empty Blocks CHAIN_CNT : Number of chained or migrated rows AVG_ROW_LEN : Average length of rows in bytes AVG_SPACE_FREELIST_BLOCKS (*)(G) : Average free space of blocks in the freelist NUM_FREELIST_BLOCKS (*)(G) : Number of blocks in the freelist SAMPLE_SIZE : Sample defined in ESTIMATE method (0 if COMPUTE) LAST_ANALYZED : Timestamp of last analysis GLOBAL_STATS (**) : For partitioned tables, YES means statistics are collected for the TABLE as a whole NO means statistics are estimated from statistics on underlying table partitions or subpartitions USER_STATS (**) : YES if statistics entered directly by the userIndex level statistics can be retrieved from:
DBA_INDEXES – (all versions ) DBA_IND_PARTITIONS – (8.X onwards) DBA_IND_SUBPARTITIONS – (8.1 onwards ) Columns to look at are: BLEVEL : B*Tree level : depth of the index from its root block to its leaf blocks LEAF_BLOCKS : Number of leaf blocks DISTINCT_KEYS : Number of distinct keys AVG_LEAF_BLOCKS_PER_KEY : Average number of leaf blocks in which each distinct key appears (1 for a UNIQUE index) AVG_DATA_BLOCKS_PER_KEY : Average number of data blocks in the table that are pointed to by a distinct key CLUSTERING_FACTOR : - if near the number of blocks, then the table is ordered : index entries in a single leaf block tend to point to rows in same data block - if near the number of rows, the table is randomly ordered : index entries in a single leaf block are unlikely to point to rows in same data block SAMPLE_SIZE : Sample defined in ESTIMATE method (0 if COMPUTE) LAST_ANALYZED : Timestamp of last analysis GLOBAL_STATS (**) : For partitioned indexes, YES means statistics are collected for the INDEX as a whole NO means statistics are estimated from statistics on underlying index partitions or subpartitions USER_STATS (**) : YES if statistics entered directly by the user PCT_DIRECT_ACCESS (**)(GP) : For secondary indexes on IOTs, percentage of rows with VALID guessColumn level statistics can be retrieved from:
DBA_TAB_COLUMNS – (all versions) DBA_TAB_COL_STATISTICS – (Version 8.X onwards) DBA_PART_COL_STATISTICS – (Version 8.X onwards) DBA_SUBPART_COL_STATISTICS – (Version 8.1 onwards) The last three views extract statistics data from DBA_TAB_COLUMNS. Columns to look at are: NUM_DISTINCT : Number of distinct values LOW_VALUE : Lowest value LOW_VALUE : Highest value DENSITY : Density NUM_NULLS : Number of columns having a NULL value AVG_COL_LEN : Average length in bytes NUM_BUCKETS : Number of buckets in histogram for the column SAMPLE_SIZE : Sample defined in ESTIMATE method (0 if COMPUTE) LAST_ANALYZED : Timestamp of last analysis (**)GLOBAL_STATS : For partitioned tables, YES means statistics are collected for the TABLE as a whole NO means statistics are estimated from statistics on underlying table partitions or subpartitions (**)USER_STATS : YES if statistics entered directly by the user
Compute statistics vs. Estimate statistics
Both computed and estimated statistics are used by the Oracle optimizer to choose the execution plan for SQL statements that access analyzed objects. These statistics may also be useful to application developers who write such statements. COMPUTE STATISTICS
COMPUTE STATISTICSinstructs Oracle to compute exact statistics about the analyzed object and store them in the data dictionary. When computing statistics, an entire object is scanned to gather data about the object. This data is used by Oracle to compute exact statistics about the object. Slight variances throughout the object are accounted for in these computed statistics. Because an entire object is scanned to gather information for computed statistics, the larger the size of an object, the more work that is required to gather the necessary information.Some statistics are always computed exactly, such as the number of data blocks currently containing data in a table or the depth of an index from its root block to its leaf blocks. 某些信息总被精确地统计,如表当前使用的block数、index的高度
ESTIMATE STATISTICSinstructs Oracle to estimate statistics about the analyzed object and stores them in the data dictionary. When estimating statistics, Oracle gathers representative information from portions of an object. This subset of information provides reasonable, estimated statistics about the object. The accuracy of estimated statistics depends upon how representative the sampling used by Oracle is. Only parts of an object are scanned to gather information for estimated statistics, so an object can be analyzed quickly. You can optionally specify the number or percentage of rows that Oracle should use in making the estimate.estimate statistics会随机选择样本大小。你可以定义基于row数或block数样本的大小
- Row sampling reads rows without regard to their physical placement on disk. This provides the most random data for estimates, but it can result in reading more data than necessary. For example, in the worst case a row sample might select one row from each block, requiring a full scan of the table or index. 行取样不考虑行的物理分布,这种方式提供了最大的随机数据来估算。但是可能造成比需要的多读了很多数据。
- Block sampling reads a random sample of blocks and uses all of the rows in those blocks for estimates. This reduces the amount of I/O activity for a given sample size, but it can reduce the randomness of the sample if rows are not randomly distributed on disk. Block sampling is not available for index statistics. 索引统计不支持block取样方式。
Notes on estimating statistics
- The default estimate of the analyze command reads the first approx 1064 rows of the table so the results often leave a lot to be desired. 默认大概取1064行进行估算
- The general consensus is that the default value of 1064 is not sufficient for accurate statistics when dealing with tables of any size. Many claims have shown that estimating statistics on 30 percent produces very accurate results. I personally have been running estimate 35 percent. This seems to produce very accurate numbers. It also saves a lot of time over full scans. 1064可能太小,我个人总是选用35%
- Note that if an estimate does 50% or more of a table Oracle converts the estimate to a full compute statistics. 如果取样>=50%,oracle就会做full compute
DBMS_STATS functions and variable definitions
Most of the
DBMS_STATSprocedures include the three parametersstatown,stattab, andstatid. These parameters allow you to store statistics in your own tables (outside of the dictionary), which does not affect the optimizer. Therefore, you can maintain and experiment with sets of statistics. Thestattabparameter specifies the name of a table in which to hold statistics, and it is assumed that it resides in the same schema as the object for which statistics are collected (unless thestatownparameter is specified). Users may create multiple tables with differentstattabidentifiers to hold separate sets of statistics.DBMS_STATS.CREATE_STAT_TABLE ( ownname VARCHAR2, stattab VARCHAR2, tblspace VARCHAR2 DEFAULT NULL);
- ownname : Name of the schema.
- stattab : Name of the table to create. This value should be passed as the stattab parameter to other procedures when the user does not want to modify the dictionary statistics directly.
- tblspace : Tablespace in which to create the stat tables. If none is specified, then they are created in the user’s default tablespace.
DBMS_STATS.drop_stat_table ( ownname VARCHAR2, stattab VARCHAR2);DBMS_STATS.gather_schema_stats ( ownname VARCHAR2, estimate_percent NUMBER DEFAULT NULL, block_sample BOOLEAN DEFAULT FALSE, method_opt VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1', degree NUMBER DEFAULT NULL, granularity VARCHAR2 DEFAULT 'DEFAULT', cascade BOOLEAN DEFAULT FALSE, stattab VARCHAR2 DEFAULT NULL, statid VARCHAR2 DEFAULT NULL, options VARCHAR2 DEFAULT 'GATHER', objlist OUT ObjectTab, statown VARCHAR2 DEFAULT NULL);
- ownname : Schema to analyze (NULL means current schema).
- estimate_percent : Percentage of rows to estimate (NULL means compute): The valid range is [0.000001,100).
- block_sample : Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics.
- method_opt : Method options of the following format (the phrase 'SIZE 1' is required to ensure gathering statistics in parallel and for use with the phrase hidden):
FOR ALL [INDEXED | HIDDEN] COLUMNS [SIZE integer] SIZE :specifies the maximum number of partitions (buckets) in the histogram.
Default value: 75
Range of values: 1 – 254
- degree : Degree of parallelism (NULL means use table default value).
- granularity : Granularity of statistics to collect (only pertinent if the table is partitioned). 仅在分区表中生效
- cascade : Gather statistics on the indexes as well. Index statistics gathering is not parallelized. Using this option is equivalent to running the gather_index_stats procedure on each of the indexes in the schema in addition to gathering table and column statistics. cascade时索引信息统计不会使用并行方式
- stattab : User stat table identifier describing where to save the current statistics.
- statid : Identifier (optional) to associate with these statistics within stattab.
- options : Further specification of which objects to gather statistics for:
- GATHER: Gather statistics on all objects in the schema.
- GATHER STALE: Gather statistics on stale objects as determined by looking at the *_tab_modifications views. Also, return a list of objects found to be stale.
- GATHER EMPTY: Gather statistics on objects which currently have no statistics. also, return a list of objects found to have no statistics.
- LIST STALE: Return list of stale objects as determined by looking at the *_tab_modifications views.
- LIST EMPTY: Return list of objects which currently have no statistics.
- objlist : List of objects found to be stale or empty.
- statown : Schema containing stattab (if different than ownname).
DBMS_STATS.export_schema_stats ( ownname VARCHAR2, stattab VARCHAR2, statid VARCHAR2 DEFAULT NULL, statown VARCHAR2 DEFAULT NULL);
- ownname : Name of the schema.
- stattab : User stat table identifier describing where to store the statistics.
- statid : Identifier (optional) to associate with these statistics within stattab.
- statown : Schema containing stattab (if different than ownname).
DBMS_STATS.import_schema_stats ( ownname VARCHAR2, stattab VARCHAR2, statid VARCHAR2 DEFAULT NULL, statown VARCHAR2 DEFAULT NULL);
- ownname : Name of the schema.
- stattab : User stat table identifier describing from where to retrieve the statistics.
- statid : Identifier (optional) to associate with these statistics within stattab.
- statown : Schema containing stattab (if different than ownname).
DBMS_STATS.delete_schema_stats ( ownname VARCHAR2, stattab VARCHAR2 DEFAULT NULL, statid VARCHAR2 DEFAULT NULL, statown VARCHAR2 DEFAULT NULL);
- ownname : Name of the schema.
- stattab : User stat table identifier describing from where to delete the statistics. If stattab is NULL, then the statistics are deleted directly in the dictionary.
- statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
- statown : Schema containing stattab (if different than ownname).
Set Table Stats 手工设置统计信息
DBMS_STATS.set_table_stats ( ownname VARCHAR2, tabname VARCHAR2, partname VARCHAR2 DEFAULT NULL, stattab VARCHAR2 DEFAULT NULL, statid VARCHAR2 DEFAULT NULL, numrows NUMBER DEFAULT NULL, numblks NUMBER DEFAULT NULL, avgrlen NUMBER DEFAULT NULL, flags NUMBER DEFAULT NULL, statown VARCHAR2 DEFAULT NULL);
- ownname : Name of the schema.
- tabname : Name of the table.
- partname : Name of the table partition in which to store the statistics. If the table is partitioned and partname is NULL, then the statistics are stored at the global table level.
- stattab : User stat table identifier describing where to store the statistics. If stattab is NULL, then the statistics are stored directly in the dictionary.
- statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
- numrows : Number of rows in the table (partition).
- numblks : Number of blocks the table (partition) occupies.
- avgrlen : Average row length for the table (partition).
- flags : For internal Oracle use (should be left as NULL).
- statown : Schema containing stattab (if different than ownname).
DBMS_STATS.get_table_stats ( ownname VARCHAR2, tabname VARCHAR2, partname VARCHAR2 DEFAULT NULL, stattab VARCHAR2 DEFAULT NULL, statid VARCHAR2 DEFAULT NULL, numrows OUT NUMBER, numblks OUT NUMBER, avgrlen OUT NUMBER, statown VARCHAR2 DEFAULT NULL);
- ownname : Name of the schema.
- tabname : Name of the table to which this column belongs.
- partname : Name of the table partition from which to get the statistics. If the table is partitioned and if partname is NULL, then the statistics are retrieved from the global table level.
- stattab : User stat table identifier describing from where to retrieve the statistics. If stattab is NULL, then the statistics are retrieved directly from the dictionary.
- statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
- numrows : Number of rows in the table (partition).
- numblks : Number of blocks the table (partition) occupies.
- avgrlen : Average row length for the table (partition).
- statown : Schema containing stattab (if different than ownname).
DBMS_STATS.GET_INDEX_STATS ( ownname VARCHAR2, indname VARCHAR2, partname VARCHAR2 DEFAULT NULL, stattab VARCHAR2 DEFAULT NULL, statid VARCHAR2 DEFAULT NULL, numrows OUT NUMBER, numlblks OUT NUMBER, numdist OUT NUMBER, avglblk OUT NUMBER, avgdblk OUT NUMBER, clstfct OUT NUMBER, indlevel OUT NUMBER, statown VARCHAR2 DEFAULT NULL);
- ownname : Name of the schema.
- indname : Name of the index.
- partname : Name of the index partition for which to get the statistics. If the index is partitioned and if partname is NULL, then the statistics are retrieved for the global index level.
- stattab : User stat table identifier describing from where to retrieve the statistics. If stattab is NULL, then the statistics are retrieved directly from the dictionary.
- statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
- numrows : Number of rows in the index (partition).
- numlblks : Number of leaf blocks in the index (partition).
- numdist : Number of distinct keys in the index (partition).
- avglblk : Average integral number of leaf blocks in which each distinct key appears for this index (partition).
- avgdblk : Average integral number of data blocks in the table pointed to by a distinct key for this index (partition).
- clstfct : Clustering factor for the index (partition).
- indlevel : Height of the index (partition).
- statown : Schema containing stattab (if different than ownname).
DBMS_STATS in action (Examples)
BEGIN DBMS_STATS.create_stat_table ( ownname => 'scott', stattab => 'stats_table', tblspace => 'users'); END; /BEGIN DBMS_STATS.drop_stat_table ( ownname => 'scott', stattab => 'stats_table'); END; /Gather Schema Stats to Data Dictionary
BEGIN DBMS_STATS.gather_schema_stats ( ownname => 'scott', estimate_percent => null, block_sample => false, method_opt => 'FOR ALL COLUMNS SIZE 1', degree => null, granularity => 'ALL', cascade => true, options => 'GATHER'); END; /Gather Schema Stats to Stats Table
BEGIN DBMS_STATS.gather_schema_stats ( ownname => 'scott', estimate_percent => null, block_sample => false, method_opt => 'FOR ALL COLUMNS SIZE 1', degree => null, granularity => 'ALL', cascade => true, stattab => 'stats_table', statid => 'TEST1', options => 'GATHER', statown => 'scott'); END; /Export Schema Statistics from Data Dictionary to Stats Table
BEGIN DBMS_STATS.export_schema_stats ( ownname => 'scott', stattab => 'stats_table_backup', statid => 'BACKUP_TEST1', statown => 'scott'); END; /Import Schema Statistics from Data Dictionary to Stats Table
BEGIN DBMS_STATS.import_schema_stats ( ownname => 'scott', stattab => 'stats_table', statid => 'TEST1', statown => 'scott'); END; /Delete Schema Stats from Data Dictionary
BEGIN DBMS_STATS.delete_schema_stats ('scott'); END; /Delete Schema Stats from Stats Table
BEGIN DBMS_STATS.delete_schema_stats ( ownname => 'scott', stattab => 'stats_table_backup', statid => 'BACKUP_TEST1', statown => 'scott'); END; /Set Table Stats (Manually) in the Data Dictionary
BEGIN DBMS_STATS.set_table_stats ( ownname => 'scott', tabname => 'emp', partname => null, numrows => 650000000, numblks => 53455443, avgrlen => 212, flags => null); END; /Set Table Stats (Manually) in the Stats Table
BEGIN DBMS_STATS.set_table_stats ( ownname => 'scott', tabname => 'emp', partname => null, stattab => 'stats_table', statid => 'TEST1', numrows => 650000000, numblks => 53455443, avgrlen => 212, flags => null, statown => 'scott'); END; /Get Table Statistics in SQL*Plus Variables
SQL> variable NUMROWS number SQL> variable NUMBLKS number SQL> variable AVGRLEN number BEGIN DBMS_STATS.get_table_stats ( 'scott', 'emp', NUMROWS=>:numrows, NUMBLKS=>:numblks, AVGRLEN=>:avgrlen); END; / PL/SQL procedure successfully completed. SQL> print NUMROWS NUMBLKS AVGRLEN NUMROWS ---------- 1000 NUMBLKS ---------- 28 AVGRLEN ---------- 92Get Index Statistics in SQL*Plus Variables
SQL> variable NUMROWS number SQL> variable NUMLBLKS number SQL> variable NUMDIST number SQL> variable AVGLBLK number SQL> variable AVGDBLK number SQL> variable CLSTFCT number SQL> variable INDLEVEL number BEGIN DBMS_STATS.get_index_stats ( 'SCOTT', 'EMP_PK', NUMROWS => :NUMROWS, NUMLBLKS => :NUMLBLKS, NUMDIST => :NUMDIST, AVGLBLK => :AVGLBLK, AVGDBLK => :AVGDBLK, CLSTFCT => :CLSTFCT, INDLEVEL => :INDLEVEL); END; / PL/SQL procedure successfully completed. SQL> print NUMROWS NUMLBLKS NUMDIST AVGLBLK AVGDBLK CLSTFCT INDLEVEL NUMROWS ---------- 1000 NUMLBLKS ---------- 3 NUMDIST ---------- 1000 AVGLBLK ---------- 1 AVGDBLK ---------- 1 CLSTFCT ---------- 15 INDLEVEL ---------- 1
Automated table monitoring and stale statistics gathering example
You can automatically gather statistics or create lists of tables that have stale or no statistics.
- GATHER STALE : Gathers statistics on tables with stale statistics.
- GATHER : Gathers statistics on all tables. (default)
- GATHER EMPTY : Gathers statistics only on tables without statistics.
- LIST STALE : Creates a list of tables with stale statistics.
- LIST EMPTY : Creates a list of tables that do not have statistics.
The objlist parameter identifies an output parameter for the
LIST STALEandLIST EMPTYoptions. The objlist parameter is of typeDBMS_STATS.OBJECTTAB.Step 1 : Perform a quick analyze to load in base statistics
BEGIN DBMS_STATS.GATHER_SCHEMA_STATS ( ownname => 'scott', estimate_percent => null, -- Small table, lets compute block_sample => false, method_opt => 'FOR ALL COLUMNS', degree => null, -- No parallelism used in this example granularity => 'ALL', cascade => true, -- Make sure we include indexes options => 'GATHER' -- Gather mode ); END; / PL/SQL procedure successfully completed.Step 2 : Examine the current statistics
SELECT table_name, num_rows, blocks, avg_row_len FROM user_tables WHERE table_name='EMP'; TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN ------------------------------ ---------- ---------- ----------- EMP 1500 28 92Step 3 : Turn on Automatic Monitoring
Now turn on automatic monitoring for the emp table. This can be done using the
alter tablemethod. Starting with Oracle 9i, you can also perform this at the “schema”, and “entire database” level. I provide the syntax for all three methods below. Monitor only the EMP table.alter table emp monitoring; Table altered.Monitor all of the tables within Scott’s schema. (Oracle 9i and higher)
BEGIN DBMS_STATS.alter_schema_tab_monitoring('scott', true); END; / PL/SQL procedure successfully completed.Monitor all of the tables within the database. (Oracle 9i and higher) Note: Although the option to collect statistics for SYS tables is available via ALTER_DATABASE_TAB_MONITORING, Oracle continues to recommend against this practice until the next major release after 9i Release 2. Also note that the ALTER_DATABASE_TAB_MONITORING procedure in the DBMS_STATS package only monitors tables; there is an ALTER INDEX…MONITORING statement which can be used to monitor indexes. Thanks to Nabil Nawaz for providing this and pointing out an error I made in the previous version of this article.
BEGIN DBMS_STATS.alter_database_tab_monitoring ( monitoring => true, sysobjs => false); -- Don't set to true, see note above. END; / PL/SQL procedure successfully completed.Step 4 : Verify that monitoring is turned on.
Note: The results of the following query are from running the
alter table ...statement on the emp table only.SELECT table_name, monitoring FROM user_tables ORDER BY monitoring; TABLE_NAME MONITORING ------------------------------ ---------- DEPT NO EMP YESStep 5 : Delete some rows from the database.
SQL> DELETE FROM emp WHERE rownum < 501; 500 rows deleted. SQL> commit; Commit complete.Step 6 : Wait until the monitered data is flushed.
Data can be flushed in several ways.
- In Oracle 8i, you can wait it out for 3 hours.
- In Oracle 9i and higher, you only need to wait 15 minutes.
- In either version, restart the database.
- For immediate results in Oracle 9i and higher, use the DBMS_STATS.flush_database_monitoring_info package.
.判断陈旧的统计数据
为了判断数据库对象是否需要新的统计数据,oracle提供一个表监控工具。当statistics_level=typical或all时,监控被开启。监控追踪表自从上次统计数据被收集以来,表被insert,update,delete的近似值,还有表是否被truncate。表的变化信息可以查看user_tab_modifications视图。随着数据修改,当oracle移植信息到视图的时候,可能会有几分钟的迟延。DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO过程来立即反映在内存中显著的监控信息。
当options参数被设置为gather stale或gather auto ,Gather_database_stats或gather_schema_stats过程使用旧的统计数据来收集新的统计数据。如果监控表的已经被修改了超过10%,那么这些统计数据被认为是旧的,需要再次被收集。
exec dbms_stats.flush_database_monitoring_info; PL/SQL procedure successfully completed.Step 7 : Check for what it has collected.
As user “scott”, check USER_TAB_MODIFICATIONS to see what it was collected.
SELECT * FROM user_tab_modifications; TABLE_NAME PARTITION_NAME SUBPARTITION_NAME INSERTS UPDATES DELETES TIMESTAMP TRUNCATED ---------- -------------- ----------------- ------- ------- ------- --------- --------- EMP 0 0 500 18-SEP-02 NOStep 8 : Execute DBMS_STATS to gather stats on all “stale” tables.
BEGIN DBMS_STATS.GATHER_SCHEMA_STATS( ownname => 'scott', estimate_percent => null, block_sample => false, method_opt => 'FOR ALL COLUMNS', degree => null, granularity => 'ALL', cascade => true, options => 'GATHER STALE'); END; / PL/SQL procedure successfully completed.Step 9 : Verify that the table is no longer listed in USER_TAB_MODIFICATIONS.
SQL> SELECT * FROM user_tab_modifications; no rows selected.Step 10 : Examine some of new statistics collected.
SELECT table_name, num_rows, blocks, avg_row_len FROM user_tables where table_name='EMP'; TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN ------------------------------ ---------- ---------- ----------- EMP 1000 28 92
How to determine if dictionary statistics are RDBMS-generated or user-defined
The following section explains how to determine if your dictionary statistics are RDBMS-generated or set by users through one of the DBMS_STATS.SET_xx_STATS procedures. This is crucial for development environments that are testing the performance of SQL statements with various sets of statistics. The DBA will need to know if the relying statistics are RDBMS-defined or user-defined.
RDBMS-generated statistics are generated by the following:
- ANALYZE SQL command
- DBMS_UTILITY.ANALYZE_SCHEMA procedure
- DBMS_UTILITY.ANALYZE_DATABASE procedure
- DBMS_DDL.ANALYZE_OBJECT procedure
- 8.1 DBMS_STATS.GATHER_xx_STATS procedures
User generated statistics are only done through the use of the DBMS_STATS.SET_xx_STATS procedures The column USER_STATS from DBA_TABLES, ALL_TABLES, USER_TABLES displays:
Nice summation of stats.
Could use an example of how to list the stale tables.