Thursday, November 13, 2008

How Oracle decides whether to use an index or to do a full table scan ?

Let’s directly jump into the code without much bla, bla bla ...….

drop table t1;

create table t1 as
select
trunc((rownum-1)/15) n1,
trunc((rownum-1)/15) n2,
rpad('x',100) v1
from all_objects where rownum <= 3000; drop table t2; create table t2 as select mod(rownum,200) n1, mod(rownum,200) n2, rpad('x',100) v1 from all_objects where rownum <= 3000; create index t_i1 on t1(n1); create index t_i2 on t2(n1); analyze table t1 compute statistics for table for all indexes for all indexed columns; analyze table t2 compute statistics for table for all indexes for all indexed columns; If you look at both the tables, they have pretty much the same data. set autotrace traceonly explain; select * from t1 where n1 = 45; Execution Plan ---------------------------------------------------------- Plan hash value: 1661992259 ------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 15 | 1020 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 1020 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | T_I1 | 15 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------

Try this now :

select * from t2 where n1 = 45;

Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15 | 1020 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T2 | 15 | 1020 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------

Does a full table scan and the cost is 14.

The difference is in the data distribution, a.k.a. the way data is scattered.

Let's look into meta data and see how these tables are scattered.

select * from user_tables where table_name in (‘T1’, 'T2');

select * from USER_TAB_COLUMNS where table_name in (‘T1’, 'T2');

select * from user_indexes u where table_name in (‘T1’, 'T2');

select
table_name tab,
num_rows num_rows,
avg_leaf_blocks_per_key l_blocks,
avg_data_blocks_per_key d_blocks,
clustering_factor cl_fac
from user_indexes
where table_name in ('T1','T2');

Clustering_factor offers information on how the table rows are synchronized with the index. When the clustering factor is close to the number of data blocks and the column value is not row ordered when the clustering_factor approaches the number of rows in the table, the table rows are synchronized with the index.

Basically Four factors synchronize to help the CBO choose whether to use an index or a full-table scan: the selectivity of a column value; the db_block_size ; the avg_row_len ; and the cardinality.

An index scan is usually faster if a data column has high selectivity and a low clustering_factor.

If you look at the user_indexes table these are the facts.

To scan T2 by index : it takes one index block, 15 data blocks = 16

To scan T2 by full table scan : it takes 96 blocks / 8 (multiblock read) = 12
(this is a first approximation)

To scan T1 by index : it takes one index block, one data block = 2

So, since the cost of doing a full table scan (multi block) is lesser than going to an index, CBO choose to go with full table scan.

Let's see what happens if we change it.

alter session set DB_FILE_MULTIBLOCK_READ_COUNT = 4;

SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=16 Card=15)
INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)

alter session set DB_FILE_MULTIBLOCK_READ_COUNT = 16;

SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=15)
TABLE ACCESS (FULL) OF 'T2' (Cost=10 Card=15)

Hope this helps !!!

Long Live "Tom Kyte".

Ravi Vedala (r-a-v-i)

No comments: