Saturday, December 19, 2009

Index column ordering and performance

When we are talking about why oracle is not picking up index, one of the well known reason is cardinality. When data is uniform, an index is selective only when cardinality is not very low. When data is not uniform, we can have histograms but here we will keep that feature apart.

So lets assume we want to create a composite index on two columns which both are part of the filter condition. One column (ID) is having high cardinality and another (Status) is having very low cardinality. Then what should be the column sequence for that index? The obivious answer is, for better performance first column should be one which is having high cardinality.

But in reality, any column sequence will give the same performance. To demonstrate lets create following testcase. We will create a table and will choose two columns object_id and status. The column object_id is distinct and update the status column such that it will only two values with same number of rows.


SQL> create table t as select * from dba_objects;

Table created.




SQL> update t set status=decode(mod(rownum,2),0,'VALID','INVALID');

50623 rows updated.




SQL> commit;

Commit complete.




SQL> select status, count(1) from t group by status;

STATUS COUNT(1)
------- ----------
INVALID 25312
VALID 25311




SQL> create index ind_t on t(object_id, status);

Index created.




SQL> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);

PL/SQL procedure successfully completed.





SQL> select * from t where object_id=100 and status='INVALID';


Execution Plan
----------------------------------------------------------
Plan hash value: 4013845416

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 94 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 94 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IND_T | 1 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_ID"=100 AND "STATUS"='INVALID')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1207 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed


The query executed with 4 consistent gets. Now lets create the index with reverse order and check the performance.


SQL> drop index ind_t;

Index dropped.




SQL> create index ind_t on t(status, object_id);

Index created.




SQL> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);

PL/SQL procedure successfully completed.




SQL> select * from t where object_id=100 and status='INVALID';


Execution Plan
----------------------------------------------------------
Plan hash value: 4013845416

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 94 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 94 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IND_T | 1 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("STATUS"='INVALID' AND "OBJECT_ID"=100)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1207 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed



Even after changing the column order, the query got executed with 4 consistent gets. So no performance degradation.

So now we have prooved that column order is having no performance on the query.

But this column order may cause performance degradation. Lets assume, In the application, we have some query only with object_id as a filter. If there is no individual index on object_id column, then oracle will use this index and here column ordering will impact.

We will execute the following query with index having order (object_id, status) and lets check the execution plan.



SQL> select * from t where object_id=100;


Execution Plan
----------------------------------------------------------
Plan hash value: 4013845416

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 94 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 94 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IND_T | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_ID"=100)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1207 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed



It is doing Index range scan with 4 consistent gets. Lets reverse the column order (status, object_id) and check the execution plan.


SQL> select * from t where object_id=100;


Execution Plan
----------------------------------------------------------
Plan hash value: 2246305531

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 94 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 94 | 4 (0)| 00:00:01 |
|* 2 | INDEX SKIP SCAN | IND_T | 1 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_ID"=100)
filter("OBJECT_ID"=100)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
8 consistent gets
0 physical reads
0 redo size
1207 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed


Oracle has opted for index skip scan and consistent gets are also double against our previous execution plan.

In the above example, other column is having only 4 distinct values, but if the distinct values are higher then oracle may ignore this index and will opt for Full table scan.

To sum up, when query is having multiple filter conditions, column order of a composite index on those columns does not matter. But it may have impact on other queries having not all filter conditions.

Friday, December 18, 2009

Query and Transitive Closure

I have got some of the comments on my last blog - "Query Performance & Join Conditions" which were asking about the transitive closure. The example that I have given was not for transitive closure. Oracle has introduced transitive

closure when the literal values are present for joining conditions. For example when I have join condition on two tables and same column of one of the table is having filter condition then other table will use the same filter condition.

To demonstrate, we will see the following test case.


SQL> create table t1 as select rownum "N", rownum*30 "M" from dual connect by level <=10000;

Table created.


SQL> create table t2 as select rownum+20 "R", rownum*50 "T" from dual connect by level <=10000;

Table created.



SQL> create index ind_t1 on t1(n);

Index created.



SQL> create index ind_t2 on t2(r);

Index created.



SQL> exec dbms_stats.gather_table_stats(user,'T1',cascade=>true);

PL/SQL procedure successfully completed.



SQL> exec dbms_stats.gather_table_stats(user,'T2',cascade=>true);

PL/SQL procedure successfully completed.



SQL> set autot trace exp stat
SQL> select n, m, t
2 from t1, t2
3 where t1.n = t2.r
4 and t1.n=30;


Execution Plan
----------------------------------------------------------
Plan hash value: 863060763

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 4 (0)| 00:00:01 |
| 1 | MERGE JOIN CARTESIAN | | 1 | 16 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID | T1 | 1 | 8 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_T1 | 1 | | 1 (0)| 00:00:01 |
| 4 | BUFFER SORT | | 1 | 8 | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 8 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IND_T2 | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T1"."N"=30)
6 - access("T2"."R"=30)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
512 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

Check for "Predicate Information" in above-mentioned execution plan. When I have assigned n=30, it has also picked up r=30. Now one might have question, what is the use of this and will it impact performance?

Based on the input filter condition, oracle may opt for better execution plan.
To demonstrate, lets create t2 table with R column as 4 distinct values and execute the above query again.


SQL> create table t1 as select rownum "N", rownum*30 "M" from dual connect by level <=10000;

Table created.



SQL> create table t2 as select mod(rownum,4) "R", rownum*50 "T" from dual connect by level <=10000;

Table created.



SQL> create index ind_t1 on t1(n);

Index created.



SQL> create index ind_t2 on t2(r);

Index created.



SQL> exec dbms_stats.gather_table_stats(user,'T1',cascade=>true);

PL/SQL procedure successfully completed.



SQL> exec dbms_stats.gather_table_stats(user,'T2',cascade=>true);

PL/SQL procedure successfully completed.



SQL> set autot trace exp stat
SQL> select n, m, t
2 from t1, t2
3 where t1.n = t2.r
4 and t2.r=3;

2500 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2470703826

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2500 | 37500 | 9 (12)| 00:00:01 |
|* 1 | HASH JOIN | | 2500 | 37500 | 9 (12)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 8 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_T1 | 1 | | 1 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | T2 | 2500 | 17500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T1"."N"="T2"."R")
3 - access("T1"."N"=3)
4 - filter("T2"."R"=3)


As per the above plan, based on filter condition(N=3), oracle opted for index scan. This is because of transitive closure. Now what would have happen if transitive closure was not taken place.
For this, lets modify query as follows -


SQL> select n, m, t
2 from t1, (select /*+ no_merge */ * from t2 where r=3) t2
3 where t1.n = t2.r;

2500 rows selected.



Execution Plan
----------------------------------------------------------
Plan hash value: 157564253

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2500 | 90000 | 14 (8)| 00:00:01 |
|* 1 | HASH JOIN | | 2500 | 90000 | 14 (8)| 00:00:01 |
| 2 | VIEW | | 2500 | 70000 | 6 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T2 | 2500 | 17500 | 6 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T1 | 10000 | 80000 | 7 (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T1"."N"="T2"."R")
3 - filter("R"=3)


If we execute same query, oracle would have opted for nested loops.

Conclusion : Transitive closure is the feature which works when we are using filter predicate in query with join condition. Oracle will consider transitive closure with all equility, non-equility predicates.

Tuesday, September 22, 2009

Query Performance & Join Conditions

Oracle database performance is measured by application queries those are having impact on host resources. Oracle generates plan as per the input values and that decides the resource usage effeciency.

While working on performance issue, one of the problem reported by the user that one of the module is not functioning and having some issue with query. When we traced the session found one of the query doing FTS and retriving few records only. This was one of the top resource consumer query in the trace file.

The query was looking very normal and checked for the indexes, statistics of the tables etc. Though the statistics were collected recently, it was not the problem with that.

After investigation, it was found that one of the condition was not correctly defined and the same has been confirmed by developer. When you have multi table join in the query, we have to be very careful about equility predicates.

Here I want to remind one of the fact in oracle -
As per mathematics rules, when we say a=b and b=c, we can conclude a=c. But in oracle this is not the case. Oracle never concludes a=c.

To illustrate this, let us create a test case -

Create two tables and insert some random records in it. Also create required indexes. As the query provided was from the production, here I am not using the same table names. Also I have picked up only the problematic part of the query.


SQL> create table cu_all (custid number, addr varchar2(200), ph number, cano number, acctype varchar2(10));

Table created.

SQL> create table ca_receipt (custid number, caamt number, cadt date, totbal number);

Table created.

SQL>
SQL> insert into cu_all
2 select lvl,
3 dbms_random.string('A',30),
4 round(dbms_random.value(1,100000)),
5 round(dbms_random.value(1,10000)),
6 dbms_random.string('A',10)
7 from (select level "LVL" from dual connect by level <=200000);

200000 rows created.

SQL> insert into ca_receipt
2 select round(dbms_random.value(1,10000)),
3 round(dbms_random.value(1,100000)),
4 sysdate - round(dbms_random.value(1,100000)),
5 round(dbms_random.value(1,100000))
6 from (select level "LVL" from dual connect by level <=500000);

500000 rows created.

SQL> create unique index pk_cu_all_ind on cu_all(custid);

Index created.

SQL> create index ind2_cu_all on cu_all(CANO);

Index created.

SQL> create index ind_ca_receipt_custid on ca_receipt(custid);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'CU_ALL', cascade=>true);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'CA_RECEIPT', cascade=>true);

PL/SQL procedure successfully completed.



Now let us execute the query with trace on. This is the similar query which was provided to me.



SQL> set autot trace
SQL> SELECT ca.*, cu.*
2 FROM ca_receipt CA,
3 cu_all CU
4 WHERE CA.CUSTID = CU.CUSTID
5 AND CA.CUSTID IN (SELECT CUSTID FROM cu_all START WITH custid = 2353
6 CONNECT BY PRIOR CUSTID = CANO)
7 ORDER BY ACCTYPE DESC;

289 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3186098611

-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 81000 | 504 (2)| 00:00:07 |
| 1 | SORT ORDER BY | | 1000 | 81000 | 504 (2)| 00:00:07 |
|* 2 | HASH JOIN | | 1000 | 81000 | 503 (2)| 00:00:07 |
| 3 | NESTED LOOPS | | | | | |
| 4 | NESTED LOOPS | | 1000 | 26000 | 112 (1)| 00:00:02 |
| 5 | VIEW | VW_NSO_1 | 20 | 100 | 21 (0)| 00:00:01 |
| 6 | HASH UNIQUE | | 20 | 180 | | |
|* 7 | CONNECT BY WITH FILTERING | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID | CU_ALL | 1 | 9 | 2 (0)| 00:00:01 |
|* 9 | INDEX UNIQUE SCAN | PK_CU_ALL_IND | 1 | | 1 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | | | | |
| 11 | CONNECT BY PUMP | | | | | |
| 12 | TABLE ACCESS BY INDEX ROWID| CU_ALL | 20 | 180 | 21 (0)| 00:00:01 |
|* 13 | INDEX RANGE SCAN | IND2_CU_ALL | 20 | | 1 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | IND_CA_RECEIPT_CUSTID | 50 | | 2 (0)| 00:00:01 |
| 15 | TABLE ACCESS BY INDEX ROWID | CA_RECEIPT | 50 | 1050 | 52 (0)| 00:00:01 |
| 16 | TABLE ACCESS FULL | CU_ALL | 200K| 10M| 389 (1)| 00:00:05 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CA"."CUSTID"="CU"."CUSTID")
7 - access("CANO"=PRIOR "CUSTID")
9 - access("CUSTID"=2353)
13 - access("CANO"=PRIOR "CUSTID")
14 - access("CA"."CUSTID"="CUSTID")


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2249 consistent gets
25 physical reads
0 redo size
11748 bytes sent via SQL*Net to client
729 bytes received via SQL*Net from client
21 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
289 rows processed

If you look at the query, it seems to be normal one.

But the problem is here-

Query is having two tables CA and CU. From the inner CU table query, it fetches records and joins with CA table an CA table Joins with CU table using the same column.
Here the inner query joins with CA table and cardinality of the query gets changed. So it is opting FTS when joining to CU table again.
This is causing the performance bottleneck. So to resolve the issue, I have change the joining condition.

Now if we check, following is the proper execution plan. Also the consistents gets have been reduced to 797 against 2249 in original query.


SQL> SELECT ca.*, cu.*
2 FROM ca_receipt CA,
3 cu_all CU
4 WHERE CA.CUSTID = CU.CUSTID
5 AND CU.CUSTID IN (SELECT CUSTID FROM cu_all START WITH custid = 2353
6 CONNECT BY PRIOR CUSTID = CANO)
7 ORDER BY ACCTYPE DESC;

289 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3713271440

-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 81000 | 133 (2)| 00:00:02 |
| 1 | SORT ORDER BY | | 1000 | 81000 | 133 (2)| 00:00:02 |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 1000 | 81000 | 132 (1)| 00:00:02 |
| 4 | NESTED LOOPS | | 20 | 1200 | 42 (3)| 00:00:01 |
| 5 | VIEW | VW_NSO_1 | 20 | 100 | 21 (0)| 00:00:01 |
| 6 | HASH UNIQUE | | 20 | 180 | | |
|* 7 | CONNECT BY WITH FILTERING | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID | CU_ALL | 1 | 9 | 2 (0)| 00:00:01 |
|* 9 | INDEX UNIQUE SCAN | PK_CU_ALL_IND | 1 | | 1 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | | | | |
| 11 | CONNECT BY PUMP | | | | | |
| 12 | TABLE ACCESS BY INDEX ROWID| CU_ALL | 20 | 180 | 21 (0)| 00:00:01 |
|* 13 | INDEX RANGE SCAN | IND2_CU_ALL | 20 | | 1 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID | CU_ALL | 1 | 55 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_CU_ALL_IND | 1 | | 0 (0)| 00:00:01 |
|* 16 | INDEX RANGE SCAN | IND_CA_RECEIPT_CUSTID | 50 | | 2 (0)| 00:00:01 |
| 17 | TABLE ACCESS BY INDEX ROWID | CA_RECEIPT | 50 | 1050 | 52 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

7 - access("CANO"=PRIOR "CUSTID")
9 - access("CUSTID"=2353)
13 - access("CANO"=PRIOR "CUSTID")
15 - access("CU"."CUSTID"="CUSTID")
16 - access("CA"."CUSTID"="CU"."CUSTID")


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
797 consistent gets
1 physical reads
0 redo size
11748 bytes sent via SQL*Net to client
729 bytes received via SQL*Net from client
21 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
289 rows processed


Hence to sum up, oracle gives us the output based on our input values/conditions. Any query must have proper joining condition when multiple tables are involved.

Sunday, June 21, 2009

ORA-600 [ktadrprc-1]

During my visit to the one of the customer, I found following error while dropping old partition from partitioned table.

ORA-00600: internal error code, arguments: [ktadrprc-1], [], [], [], [], [], [], []

When we search metalink, it was found that this error can occur when there is mismatch between seg$ and ind$ entries. There was a mismatch because of an index entry with the no-segment. This is the feature introduced by oracle to test the index selectivity without creating an index. Following is the test case for the same –

Create a partitioned table



SQL> create table partt (n number, dt date, v varchar2(20)) partition by range (dt)
2 (
3 partition p1 values less than (to_date('01-JAN-2009','DD-MON-YYYY')),
4 partition p2 values less than (to_date('02-JAN-2009','DD-MON-YYYY')),
5 partition p3 values less than (to_date('03-JAN-2009','DD-MON-YYYY'))
6 );

Table created.





Create one normal index and one nosegment index.



SQL> create index indpart on partt(n) local;

Index created.

SQL> create index indpartx on partt(v) local nosegment;

Index created.



Note that, virtual index entry cannot be found in dba_indexes but is available in dba_ind_columns.



SQL> select owner, index_name, table_name from dba_indexes where table_name='PARTT';

OWNER INDEX_NAME TABLE_NAME
---------- ------------------------------ ------------------------------
SYS INDPART PARTT

1 row selected.

SQL> select index_owner, index_name, table_owner, table_name, column_name
2 from dba_ind_columns where table_name='PARTT';

INDEX_OWNER INDEX_NAME TABLE_OWNER TABLE_NAME COLUMN_NAME
------------ ------------ ------------ ------------ ------------
SYS INDPARTX SYS PARTT V
SYS INDPART SYS PARTT N

2 rows selected.





Now, when we try to drop one of the partitions, we will get the error.



SQL> alter table partt drop partition p1;
alter table partt drop partition p1
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [ktadrprc-1], [], [], [], [], [], [], []



Solution:

To resolve this error, when we drop the virtual index, the partition got dropped.



SQL> drop index indpartx;

Index dropped.

SQL> alter table partt drop partition p1;

Table altered.


Note :

The solution given above is valid for 10gR2 (tested) version. For previous versions (Oracle 8i and 9i), even dropping an index will give following ORA-600 error.



SQL> drop index indpartx;

drop index indpartx
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [4823], [], [], [], [], [], [], []


Whereas in 8.1.7.4 version while dropping partition, you may get following error –


SQL> alter table partt drop partition p1;
alter table partt drop partition p1
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [4610], [], [], [], [], [], [], []



Conclusion:

Be aware of the creating virtual indexes on partitioned tables just for testing purpose. If it is very big production table, you may end up with unwanted troubles.

Monday, January 26, 2009

Missing Tables in Execution Plan

In Oracle 10g, one of the new features introduced for optimizer, named transformation. Using this features it may be possible that oracle internally removes some of the redundant tables those were involved only in join predicates.

To demonstrate the same following test case created –
1. Table with primary key.
2. Table with foreign key referencing the above created table.
3. Query that eliminates the primary key table and hence leads to faster execution.


SQL> create table pk_table (id number, name varchar2(20));

Table created.



SQL> alter table pk_table add constraint pk_id primary key(id);

Table altered.



SQL> create table fk_table (id number, phone number, addr varchar2(200));

Table created.



SQL> alter table fk_table add constraint fk_id foreign key (id) references pk_table (id);

Table altered.



SQL> begin
2 for x in 1..100 loop
3 insert into pk_table values (x, dbms_random.string('A',20));
4 end loop;
5 commit;
6* end;
SQL> /

PL/SQL procedure successfully completed.




SQL> begin
2 for x in 1..500 loop
3 insert into fk_table values (round(dbms_random.value(1,100)), dbms_random.value(1,100000), dbms_random.string('A',100));
4 end loop;
5 commit;
6 end;
7 /

PL/SQL procedure successfully completed.



SQL> exec dbms_stats.gather_table_stats (user,'PK_TABLE', cascade=>true);

PL/SQL procedure successfully completed.



SQL> exec dbms_stats.gather_table_stats (user,'FK_TABLE', cascade=>true);

PL/SQL procedure successfully completed.



SQL> select b.id, b.phone, b.addr
2 from pk_table a,
3 fk_table b
4 where a.id=b.id;



SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1108179040

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 62500 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FK_TABLE | 500 | 62500 | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

1 - filter("B"."ID" IS NOT NULL)

13 rows selected.

As shown above, the primary key table is missing in the execution plan. The table PK_TABLE is redundant. As all of us know, every record of foreign key column must be present in primary key table. Hence to fetch data from foreign key table we need have to check primary table for that record.

Optimizer added one extra filter condition i.e. “B.ID is not null”. This is to ensure the data consistency when transformation takes place.

Starting from Oracle 11g, the optimizer has advanced to eliminate anti joins as well. For e.g. check the plan for following query in 10g and 11g.

Query
select * from fk_table b where not exists (select 1 from pk_table a where a.id=b.id);

Plan in 10g

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 2976039779

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 128 | 5 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 1 | 128 | 5 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| FK_TABLE | 500 | 62500 | 5 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN| PK_ID | 100 | 300 | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID"="B"."ID")

Plan in 11g

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 1108179040

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 125 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FK_TABLE | 1 | 125 | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("B"."ID" IS NULL)

This elimination is applicable to view as well. That means, if my view is fetching all columns from primary and foreign key tables and query is written to access only foreign key columns from this view, then also optimizer will eliminate primary table.

Limitations

As this seems to be very interesting changes in optimizer, it is not applicable for the multi-column primary/foreign key columns.

To sum up, now optimizer itself takes care of such unnecessary joins to eliminate table access to improve the query performance.