You might face this question if you are going though job interviews. and these are some cases where the Optimizer thinks going for a full table scan is better than index scan.
1:- The selectivity of the records is very high:
We have this table and range of contents:
SQL> desc HR.ORDERS;
Name Null? Type
--------------------- --------- ---------------------------------------
ORDER_ID NOT NULL NUMBER(38)
CUSTOMER VARCHAR2(50)
ITEM VARCHAR2(50)
EVENT_DATE DATE
SQL>
SQL> select CUSTOMER, count(*) from ORDERS group by CUSTOMER order by 1,2;
CUSTOMER COUNT(*)
-------------------------------- ----------
JOHN 10
PEDRO 10
PEPE 10
RUBEN 10000
TOM 10
SQL>
We have an index and it is valid:
SQL> CREATE INDEX ORDERS_IDX
ON ORDERS(CUSTOMER);
Index created.
SQL>
col INDEX_NAME format a20
SELECT
index_name,
index_type,
visibility,
status
FROM
all_indexes
WHERE
table_name = 'ORDERS';
INDEX_NAME INDEX_TYPE VISIBILIT STATUS
-------------------- --------------------------- --------- --------
ORDERS_IDX NORMAL VISIBLE VALID
Lets get the plan for a lower count:
EXPLAIN PLAN FOR
select * from ORDERS where CUSTOMER = 'TOM';
SELECT
PLAN_TABLE_OUTPUT
FROM
TABLE(DBMS_XPLAN.DISPLAY());
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
Plan hash value: 3046898238
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 760 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS | 10 | 760 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | ORDERS_IDX | 10 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CUSTOMER"='TOM')
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
18 rows selected.
SQL>
So far, so good, query is using an index.
Lets try now with the higher count:
EXPLAIN PLAN FOR
select * from ORDERS where CUSTOMER = 'RUBEN';
SELECT
PLAN_TABLE_OUTPUT
FROM
TABLE(DBMS_XPLAN.DISPLAY());
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 1275100350
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 742K| 17 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| ORDERS | 10000 | 742K| 17 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CUSTOMER"='RUBEN')
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
17 rows selected.
SQL>
In this case the OPTIMIZER thinks it is better to do a full table scan. Lets see the COST of forcing to use the index:
EXPLAIN PLAN FOR
select /*+ INDEX(ORDERS ORDERS_IDX) */ * from ORDERS where CUSTOMER = 'RUBEN';
SELECT
PLAN_TABLE_OUTPUT
FROM
TABLE(DBMS_XPLAN.DISPLAY());
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3046898238
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 742K| 84 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS | 10000 | 742K| 84 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | ORDERS_IDX | 10000 | | 29 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CUSTOMER"='RUBEN')
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
18 rows selected.
SQL>
As we can see, the cost of using the index is way higher than doing a full table scan.
Comments