Thursday, December 11, 2008

Performance Tuning - When to use Bind Variables, when not to ? When histograms work and when they don't ?

The usual nightmare of almost every other Java/Oracle developer ... "ooooh...this query is not coming back...what should I do now? How to tune my query. Where should I start ?".

Though there are many ways to tune a query, n this section we see how usage of bind variables, bind variable peeking and Histograms effect the query performance.

This test case here proves that, "Bind Variable Peeking and Histograms won't go together".

Almost all Java/Oracle developers use bind variables, as a rule of thumb.

But it is important to understand how the query execution plan is determined by bind variables.

It is "VERY" important to know the type of system you are working in - whether it is a transactional(OLTP) system or a reporting system. If you are working on a database where you fire the queries against the same transactional database for regular queries and reports, then this is a must read.

The Java developer(Mr.X) would say : Wait a minute .. I am Java/developer and I don't care what the difference between these two is... to me, database is just a data store ... I just store the data and retrieve it back. Let a Java Persistence framework like Hibernate deal with it.

It is important for a Java/Oracle developer to understand that a database is more than just a data store.

Let's cut the crap and get into action, right away.

-- Create a test table

create table test
as
select 1 X, object_name data from all_objects a
/

insert into test
select 2, object_name data from all_objects a where rownum

Bind Variable Peeking :

The first time a query is HARD PARSED, the bind variable values that are supplied with that query
are taken into consideration. That is:

select * from test t1 where x = :a

when hard parsed with a bind value of 2 will be parsed as if it were

select * from test t1 where x = 2;

Oracle uses all statistics, that are available to optimize that query.

The next time the query

select * from test t1 where x = :a

is soft parsed, it will use the existing plan (else, it would be a hard parse again!)

Mr.X : oooh......what the hell all this crap is ? How it effects me ?

As a Java/Oracle developer, it is important to understand how the data is distributed in a column, against which you are querying.

If you look at our test table, it has lot of rows with X=1 and less rows where X=2

So, logically, if we need to get data where x=2, it makes sense if we go to the index and if we need data where x=1, it makes sense if we go to the table(as we are getting most of the table's data).

Let's see how it works ?

Go to sqlplus and set the following :

>set autot trace exp;

-- Case 1 : Use Bind Variable Peeking

select * from test t1 where x = :a

-- pass 2. Oracle should do an Index Range Scan and it does the same. So - CORRECT PLAN

-- pass 1: Oracle should do a Full table scan and it doesn't. So -- INCORRECT PLAN.

Now, let's generate statistics and give the optimizer more info about the data.

analyze table test compute statistics
for table
for all indexes
for all indexed columns

Now it's important to talk about Histograms here. When there is skewed data in a column, when we generate statistics Oracle generates statistics
called Histograms.

select * from User_histograms h where table_name = 'TEST' ;

select * from DBA_histograms h where h.table_name = 'TEST';

What is Skewed data ? -- If the data distribution is uneven in a column, we call that data as skewed data. In our example, we have about 50K+ values of 1 and 9999 values of 2.

-- Case 2 : Use Bind Peeking, use histograms

-- Pass 2 (should do a Index Range Scan but does a Full Table Scan - WRONG PLAN)

select * from test t1 where x = :a

-- Pass 1 (should do a Full Table Scan and does Full Table Scan - CORRECT PLAN)

select * from test t1 where x = :a

-- So, bind variable peeking and histograms may work some times and may not work some times ...

-- Case 3 : No Binds, use histograms

-- Pass 2 (should do a Index Range Scan and does Index Range Scan - CORRECT PLAN)

select * from test t1 where x = 2

-- Pass 1 (should do a Full Table Scan and does Full Table Scan- CORRECT PLAN)

select * from test t1 where x = 1

Mr.X - So, does it mean that I should STOP using bind variables and use literals ?

The answer is - it depends.

Coming back to my original point, if you are working on a transactional system where thousands of queries execute per second, USE bind variable.

If you are firing the query against a reporting database where each query runs for seconds, DON't use bind variables.

Mr.X - Common, I work on a single database which is trasactional. I use the same for reporting too. What should I do in this case ?

The answer is simple. USE bind variables for all your regular transactional queries and DON'T use bind variables for reports.

Mr.X - Now, I understand when to use bin d variables and when not to use. Also the histograms.

Source : AskTom

Refer which columns need histograms by Tom, to get a picture of which columns need histograms. Also a good explanation on Cardinality.

Long Live "Sir, Tom Kyte".

Good Luck,
r-a-v-i