Tuesday, March 4, 2008

Oracle 10g : Case-Insensitive Searching

Say in the famous Scott's "emp" table, I have data like : "Ravi", "rAvi", "RAvi" ...etc.,. So, if I need to fire a query for "ravi", the standard practice is to create a Upper(emp_name) index and fire the following query :

select * from emp where upper(emp_name) like upper('%raVI%');

Prior to Oracle 10g, you had to adopt one of the following strategies:

* Use a function-based index on UPPER (column_name) and modify the queries to use WHERE UPPER (column_name) = value.
* Use a trigger to roll the column value to upper- or lowercase upon modification.
* Use Oracle Text to create a TEXT index on the column; text indexes (which would then mandate the use of the CONTAINS operator in the predicate) can be case-sensitive or -insensitive.

In each of these cases, one of your conditions would have been violated. You would have needed to use triggers, or UPPER() or CONTAINS in the WHERE clause.

In Oracle 10g, you can do this transparently, without effecting your query. Which means that, you don't have to use UPPER() or CONTAINS in your where clause.

Wait ...wait ...I know ... you will believe only if I show you an example. right ?

Here you go ...

drop table cit;

create table cit ( data varchar2(20) );

insert into cit values ( 'Ravi' );
insert into cit values ( 'rAVi' );
insert into cit values ( 'rAvI' );

commit;

/*Create a function-based index on the DATA column and use the binary case insensitive sort*/

create index cit_idx on cit( nlssort( data, 'NLS_SORT=BINARY_CI' ) );

select * from cit where data = 'ravi';
--You will get zero records.

alter session set nls_comp=ansi;
alter session set nls_sort=binary_ci;


select * from cit where data = 'ravi';
--You will get 3 records.

--Now, let's take a look at the explain plan. CBO is using FTS.
--Let's fake CBO that there are 10000000 in the table.
exec dbms_stats.set_table_stats (ownname=>user,tabname=>'CIT',numrows=> 10000000);

--Now run the select again and take a look at the explain plan.
select * from cit where data = 'ravi';

Execution Plan
------------------------------
SELECT STATEMENT (Cost=2)
TABLE ACCESS (BY INDEX ROWID)
INDEX (RANGE SCAN) OF 'CIT_IDX'

Isn't it just awsome !!


Long live "Tom Kyte".


Source : http://www.oracle.com/technology/oramag/oracle/04-jul/o44asktom.html

Good Luck !!

r-a-v-i

No comments: