解释直方图信息 - 数据库编程

--------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1111 | 6666 | 5 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| HTAB1 | 1111 | 6666 | 5 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | HTAB1_B | 1111 | | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------

In fact it may be preferable to use a Full Table Scan for the select where b=5 and index lookups for the others.

Gathering Histogram Statistics

If we collect histogram statistics with the recommended settings:

exec DBMS_STATS.GATHER_TABLE_STATS (NULL,'HTAB1', method_opt => 'FOR ALL COLUMNS SIZE AUTO');

The b=5 query now does a Full Table Scan

 select * from htab1 where b=5;

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |  9991 | 69937 |     7   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| HTAB1 |  9991 | 69937 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

The query where B is 3 still uses an index:

 select * from htab1 where b=3;

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     1 |     7 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| HTAB1   |     1 |     7 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | HTAB1_B |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

This is because a FREQUENCY Histogram has been created:

COL   NUM_DISTINCT        LOW       HIGH DENSITY NUM_NULLS NUM_BUCKETS LAST_ANALYZED        SAMPLE_SIZE HISTOGRAM
----- ------------ ---------- ---------- ------- --------- ----------- -------------------- ----------- ---------------
A            10000          1      10000       0         0           1 31-jan-2013 09:58:01       10000 NONE
B               10          1      10000       0         0          10 31-jan-2013 09:58:01       10000 FREQUENCY

TAB        COL        ENDPOINT_NUMBER ENDPOINT_VALUE
---------- ---------- --------------- --------------
     HTAB1          A               0              1
     HTAB1          A               1          10000
     HTAB1          B               1              1
     HTAB1          B               2              2
     HTAB1          B               3              3
     HTAB1          B               4              4
     HTAB1          B            9995              5
     HTAB1          B            9996           9996
     HTAB1          B            9997           9997
     HTAB1          B            9998           9998
     HTAB1          B            9999           9999
     HTAB1          B           10000          10000

12 rows selected.

On Column B there are 10 buckets matching up with the 10 distinct values.

The ENDPOINT_VALUE shows the column value and the ENDPOINT_NUMBER shows the cumulative number of rows. So the number of rows for ENDPOINT_VALUE 2, it has an ENDPOINT_NUMBER 2, the previous ENDPOINT_NUMBER is 1, hence the number of rows with value 2 is 1. Another example is ENDPOINT_VALUE 5. Its ENDPOINT_NUMBER is 9995. The previous bucket ENDPOINT_NUMBER is 4, so 9995 - 4 = 9991 rows containing the value 5.

Frequency histograms work fine with a low number of distinct values, but when the number exceeds the maximum number of buckets, you cannot create a bucket for each value. In this case the Optimizer creates Height balanced histograms.

Height Balanced Histograms

You can demonstrate this situation by forcing the optimizer to create fewer buckets than the Number of Distinct Values. i.e. using 8 buckets for 1

解释直方图信息(三)

Gathering Histogram Statistics

Height Balanced Histograms