Tuesday, September 22, 2009

Query Performance & Join Conditions

Oracle database performance is measured by application queries those are having impact on host resources. Oracle generates plan as per the input values and that decides the resource usage effeciency.

While working on performance issue, one of the problem reported by the user that one of the module is not functioning and having some issue with query. When we traced the session found one of the query doing FTS and retriving few records only. This was one of the top resource consumer query in the trace file.

The query was looking very normal and checked for the indexes, statistics of the tables etc. Though the statistics were collected recently, it was not the problem with that.

After investigation, it was found that one of the condition was not correctly defined and the same has been confirmed by developer. When you have multi table join in the query, we have to be very careful about equility predicates.

Here I want to remind one of the fact in oracle -
As per mathematics rules, when we say a=b and b=c, we can conclude a=c. But in oracle this is not the case. Oracle never concludes a=c.

To illustrate this, let us create a test case -

Create two tables and insert some random records in it. Also create required indexes. As the query provided was from the production, here I am not using the same table names. Also I have picked up only the problematic part of the query.


SQL> create table cu_all (custid number, addr varchar2(200), ph number, cano number, acctype varchar2(10));

Table created.

SQL> create table ca_receipt (custid number, caamt number, cadt date, totbal number);

Table created.

SQL>
SQL> insert into cu_all
2 select lvl,
3 dbms_random.string('A',30),
4 round(dbms_random.value(1,100000)),
5 round(dbms_random.value(1,10000)),
6 dbms_random.string('A',10)
7 from (select level "LVL" from dual connect by level <=200000);

200000 rows created.

SQL> insert into ca_receipt
2 select round(dbms_random.value(1,10000)),
3 round(dbms_random.value(1,100000)),
4 sysdate - round(dbms_random.value(1,100000)),
5 round(dbms_random.value(1,100000))
6 from (select level "LVL" from dual connect by level <=500000);

500000 rows created.

SQL> create unique index pk_cu_all_ind on cu_all(custid);

Index created.

SQL> create index ind2_cu_all on cu_all(CANO);

Index created.

SQL> create index ind_ca_receipt_custid on ca_receipt(custid);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'CU_ALL', cascade=>true);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'CA_RECEIPT', cascade=>true);

PL/SQL procedure successfully completed.



Now let us execute the query with trace on. This is the similar query which was provided to me.



SQL> set autot trace
SQL> SELECT ca.*, cu.*
2 FROM ca_receipt CA,
3 cu_all CU
4 WHERE CA.CUSTID = CU.CUSTID
5 AND CA.CUSTID IN (SELECT CUSTID FROM cu_all START WITH custid = 2353
6 CONNECT BY PRIOR CUSTID = CANO)
7 ORDER BY ACCTYPE DESC;

289 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3186098611

-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 81000 | 504 (2)| 00:00:07 |
| 1 | SORT ORDER BY | | 1000 | 81000 | 504 (2)| 00:00:07 |
|* 2 | HASH JOIN | | 1000 | 81000 | 503 (2)| 00:00:07 |
| 3 | NESTED LOOPS | | | | | |
| 4 | NESTED LOOPS | | 1000 | 26000 | 112 (1)| 00:00:02 |
| 5 | VIEW | VW_NSO_1 | 20 | 100 | 21 (0)| 00:00:01 |
| 6 | HASH UNIQUE | | 20 | 180 | | |
|* 7 | CONNECT BY WITH FILTERING | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID | CU_ALL | 1 | 9 | 2 (0)| 00:00:01 |
|* 9 | INDEX UNIQUE SCAN | PK_CU_ALL_IND | 1 | | 1 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | | | | |
| 11 | CONNECT BY PUMP | | | | | |
| 12 | TABLE ACCESS BY INDEX ROWID| CU_ALL | 20 | 180 | 21 (0)| 00:00:01 |
|* 13 | INDEX RANGE SCAN | IND2_CU_ALL | 20 | | 1 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | IND_CA_RECEIPT_CUSTID | 50 | | 2 (0)| 00:00:01 |
| 15 | TABLE ACCESS BY INDEX ROWID | CA_RECEIPT | 50 | 1050 | 52 (0)| 00:00:01 |
| 16 | TABLE ACCESS FULL | CU_ALL | 200K| 10M| 389 (1)| 00:00:05 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CA"."CUSTID"="CU"."CUSTID")
7 - access("CANO"=PRIOR "CUSTID")
9 - access("CUSTID"=2353)
13 - access("CANO"=PRIOR "CUSTID")
14 - access("CA"."CUSTID"="CUSTID")


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2249 consistent gets
25 physical reads
0 redo size
11748 bytes sent via SQL*Net to client
729 bytes received via SQL*Net from client
21 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
289 rows processed

If you look at the query, it seems to be normal one.

But the problem is here-

Query is having two tables CA and CU. From the inner CU table query, it fetches records and joins with CA table an CA table Joins with CU table using the same column.
Here the inner query joins with CA table and cardinality of the query gets changed. So it is opting FTS when joining to CU table again.
This is causing the performance bottleneck. So to resolve the issue, I have change the joining condition.

Now if we check, following is the proper execution plan. Also the consistents gets have been reduced to 797 against 2249 in original query.


SQL> SELECT ca.*, cu.*
2 FROM ca_receipt CA,
3 cu_all CU
4 WHERE CA.CUSTID = CU.CUSTID
5 AND CU.CUSTID IN (SELECT CUSTID FROM cu_all START WITH custid = 2353
6 CONNECT BY PRIOR CUSTID = CANO)
7 ORDER BY ACCTYPE DESC;

289 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3713271440

-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 81000 | 133 (2)| 00:00:02 |
| 1 | SORT ORDER BY | | 1000 | 81000 | 133 (2)| 00:00:02 |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 1000 | 81000 | 132 (1)| 00:00:02 |
| 4 | NESTED LOOPS | | 20 | 1200 | 42 (3)| 00:00:01 |
| 5 | VIEW | VW_NSO_1 | 20 | 100 | 21 (0)| 00:00:01 |
| 6 | HASH UNIQUE | | 20 | 180 | | |
|* 7 | CONNECT BY WITH FILTERING | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID | CU_ALL | 1 | 9 | 2 (0)| 00:00:01 |
|* 9 | INDEX UNIQUE SCAN | PK_CU_ALL_IND | 1 | | 1 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | | | | |
| 11 | CONNECT BY PUMP | | | | | |
| 12 | TABLE ACCESS BY INDEX ROWID| CU_ALL | 20 | 180 | 21 (0)| 00:00:01 |
|* 13 | INDEX RANGE SCAN | IND2_CU_ALL | 20 | | 1 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID | CU_ALL | 1 | 55 | 1 (0)| 00:00:01 |
|* 15 | INDEX UNIQUE SCAN | PK_CU_ALL_IND | 1 | | 0 (0)| 00:00:01 |
|* 16 | INDEX RANGE SCAN | IND_CA_RECEIPT_CUSTID | 50 | | 2 (0)| 00:00:01 |
| 17 | TABLE ACCESS BY INDEX ROWID | CA_RECEIPT | 50 | 1050 | 52 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

7 - access("CANO"=PRIOR "CUSTID")
9 - access("CUSTID"=2353)
13 - access("CANO"=PRIOR "CUSTID")
15 - access("CU"."CUSTID"="CUSTID")
16 - access("CA"."CUSTID"="CU"."CUSTID")


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
797 consistent gets
1 physical reads
0 redo size
11748 bytes sent via SQL*Net to client
729 bytes received via SQL*Net from client
21 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
289 rows processed


Hence to sum up, oracle gives us the output based on our input values/conditions. Any query must have proper joining condition when multiple tables are involved.

Sunday, June 21, 2009

ORA-600 [ktadrprc-1]

During my visit to the one of the customer, I found following error while dropping old partition from partitioned table.

ORA-00600: internal error code, arguments: [ktadrprc-1], [], [], [], [], [], [], []

When we search metalink, it was found that this error can occur when there is mismatch between seg$ and ind$ entries. There was a mismatch because of an index entry with the no-segment. This is the feature introduced by oracle to test the index selectivity without creating an index. Following is the test case for the same –

Create a partitioned table



SQL> create table partt (n number, dt date, v varchar2(20)) partition by range (dt)
2 (
3 partition p1 values less than (to_date('01-JAN-2009','DD-MON-YYYY')),
4 partition p2 values less than (to_date('02-JAN-2009','DD-MON-YYYY')),
5 partition p3 values less than (to_date('03-JAN-2009','DD-MON-YYYY'))
6 );

Table created.





Create one normal index and one nosegment index.



SQL> create index indpart on partt(n) local;

Index created.

SQL> create index indpartx on partt(v) local nosegment;

Index created.



Note that, virtual index entry cannot be found in dba_indexes but is available in dba_ind_columns.



SQL> select owner, index_name, table_name from dba_indexes where table_name='PARTT';

OWNER INDEX_NAME TABLE_NAME
---------- ------------------------------ ------------------------------
SYS INDPART PARTT

1 row selected.

SQL> select index_owner, index_name, table_owner, table_name, column_name
2 from dba_ind_columns where table_name='PARTT';

INDEX_OWNER INDEX_NAME TABLE_OWNER TABLE_NAME COLUMN_NAME
------------ ------------ ------------ ------------ ------------
SYS INDPARTX SYS PARTT V
SYS INDPART SYS PARTT N

2 rows selected.





Now, when we try to drop one of the partitions, we will get the error.



SQL> alter table partt drop partition p1;
alter table partt drop partition p1
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [ktadrprc-1], [], [], [], [], [], [], []



Solution:

To resolve this error, when we drop the virtual index, the partition got dropped.



SQL> drop index indpartx;

Index dropped.

SQL> alter table partt drop partition p1;

Table altered.


Note :

The solution given above is valid for 10gR2 (tested) version. For previous versions (Oracle 8i and 9i), even dropping an index will give following ORA-600 error.



SQL> drop index indpartx;

drop index indpartx
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [4823], [], [], [], [], [], [], []


Whereas in 8.1.7.4 version while dropping partition, you may get following error –


SQL> alter table partt drop partition p1;
alter table partt drop partition p1
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [4610], [], [], [], [], [], [], []



Conclusion:

Be aware of the creating virtual indexes on partitioned tables just for testing purpose. If it is very big production table, you may end up with unwanted troubles.

Monday, January 26, 2009

Missing Tables in Execution Plan

In Oracle 10g, one of the new features introduced for optimizer, named transformation. Using this features it may be possible that oracle internally removes some of the redundant tables those were involved only in join predicates.

To demonstrate the same following test case created –
1. Table with primary key.
2. Table with foreign key referencing the above created table.
3. Query that eliminates the primary key table and hence leads to faster execution.


SQL> create table pk_table (id number, name varchar2(20));

Table created.



SQL> alter table pk_table add constraint pk_id primary key(id);

Table altered.



SQL> create table fk_table (id number, phone number, addr varchar2(200));

Table created.



SQL> alter table fk_table add constraint fk_id foreign key (id) references pk_table (id);

Table altered.



SQL> begin
2 for x in 1..100 loop
3 insert into pk_table values (x, dbms_random.string('A',20));
4 end loop;
5 commit;
6* end;
SQL> /

PL/SQL procedure successfully completed.




SQL> begin
2 for x in 1..500 loop
3 insert into fk_table values (round(dbms_random.value(1,100)), dbms_random.value(1,100000), dbms_random.string('A',100));
4 end loop;
5 commit;
6 end;
7 /

PL/SQL procedure successfully completed.



SQL> exec dbms_stats.gather_table_stats (user,'PK_TABLE', cascade=>true);

PL/SQL procedure successfully completed.



SQL> exec dbms_stats.gather_table_stats (user,'FK_TABLE', cascade=>true);

PL/SQL procedure successfully completed.



SQL> select b.id, b.phone, b.addr
2 from pk_table a,
3 fk_table b
4 where a.id=b.id;



SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1108179040

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 62500 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FK_TABLE | 500 | 62500 | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

1 - filter("B"."ID" IS NOT NULL)

13 rows selected.

As shown above, the primary key table is missing in the execution plan. The table PK_TABLE is redundant. As all of us know, every record of foreign key column must be present in primary key table. Hence to fetch data from foreign key table we need have to check primary table for that record.

Optimizer added one extra filter condition i.e. “B.ID is not null”. This is to ensure the data consistency when transformation takes place.

Starting from Oracle 11g, the optimizer has advanced to eliminate anti joins as well. For e.g. check the plan for following query in 10g and 11g.

Query
select * from fk_table b where not exists (select 1 from pk_table a where a.id=b.id);

Plan in 10g

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 2976039779

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 128 | 5 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 1 | 128 | 5 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| FK_TABLE | 500 | 62500 | 5 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN| PK_ID | 100 | 300 | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID"="B"."ID")

Plan in 11g

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 1108179040

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 125 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| FK_TABLE | 1 | 125 | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("B"."ID" IS NULL)

This elimination is applicable to view as well. That means, if my view is fetching all columns from primary and foreign key tables and query is written to access only foreign key columns from this view, then also optimizer will eliminate primary table.

Limitations

As this seems to be very interesting changes in optimizer, it is not applicable for the multi-column primary/foreign key columns.

To sum up, now optimizer itself takes care of such unnecessary joins to eliminate table access to improve the query performance.

Monday, September 15, 2008

Materialized view fast refresh and performance.

Oracle uses materialized view to replicate data to remote database in distributed environment. Also it can be used for cache expensive queries in a data warehouse environment. Many of the times, an issue is reported for low performance in fast or incremental refresh. There could be many of the reasons, bugs participating performance issues. One of them is to create an index on snaptime$$ column of corresponding MLOG$ table, which helps to fasten the fast refresh. It is required when MLOG$ table size grows due to any of the reason.

Recently when I was looking into one of such issue, it was found that following query was getting executed (internally by Oracle) in primary database.

Query

SELECT /*+ */ "A2"."CO_ID","A2"."TMCODE","A2"."SPCODE","A2"."SNCODE","A2"."CS_SEQNO",
"A2"."CS_CHANNEL_NUM","A2"."CS_STAT_CHNG","A2"."CS_ON_CBB","A2"."CS_DATE_BILLED",
"A2"."CS_REQUEST","A2"."SN_CLASS","A2"."CS_OVW_SUBSCR","A2"."CS_SUBSCRIPT",
"A2"."CS_OVW_ACCESS","A2"."CS_OVW_ACC_PRD","A2"."CS_OVW_ACC_FIRST","A2"."CS_ACCESS",
"A2"."CS_PENDING_STATE","A2"."CS_CHANNEL_EXCL","A2"."CS_DIS_SUBSCR","A2"."CS_ADV_CHARGE",
"A2"."CS_SRV_TYPE","A2"."SUBPAYER","A2"."USGPAYER","A2"."ACCPAYER","A2"."CS_ENTDATE",
"A2"."CS_OVW_LAST","A2"."INSTALL_DATE","A2"."TRIAL_END_DATE","A2"."CS_ADV_CHARGE_END_DATE",
"A2"."PRM_VALUE_ID","A2"."CURRENCY","A2"."CS_ADV_CHARGE_CURRENCY",
"A2"."REC_VERSION","A2"."SRV_SUBTYPE"
FROM "CONTR_SERVICES" "A2",
(SELECT DISTINCT "A3"."CO_ID" "CO_ID","A3"."SNCODE" "SNCODE","A3"."CS_SEQNO" "CS_SEQNO"
FROM "SYSADM"."MLOG$_CONTR_SERVICES" "A3"
WHERE "A3"."SNAPTIME$$">:1
AND "A3"."DMLTYPE$$"<>'D') "A1"
WHERE "A2"."CO_ID"="A1"."CO_ID"
AND "A2"."SNCODE"="A1"."SNCODE"
AND "A2"."CS_SEQNO"="A1"."CS_SEQNO";


Plan Table
-------------------------------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------
| SELECT STATEMENT | | 748K| 87M|1382913 | | |
| HASH JOIN | | 748K| 87M|1382913 | | |
| VIEW | | 751K| 27M| 3585 | | |
| SORT UNIQUE | | 751K| 16M| 3585 | | |
| TABLE ACCESS BY INDEX |MLOG$_CONTR_SER| 751K| 16M| 16 | | |
| INDEX RANGE SCAN |MLOGSNAPTIME | 751K| | 655 | | |
| TABLE ACCESS FULL |CONTR_SERVICES | 125M| 9G| 332467 | | |
-------------------------------------------------------------------------------------

Though there was an index available on SNAPTIME$$ table, the refresh was taking time because the base table size was huge (around 17GB) and it was doing FTS on it.

SQL> select owner, segment_name, bytes/1024/1024 "SIZE" from dba_segments
2 where segment_name='CONTR_SERVICES';

OWNER SEGMENT_NAME SIZE
---------- ------------------------------ ----------
SYSADM CONTR_SERVICES 17216.0156

The mlog$ table size was also very big (around 750 MB) and having millions of records. Oracle opted for FTS because of table statistics available on MLOG$_CONTR_SERVICES table. Following are the available stats –

SQL> select owner, table_name, partitioned, num_rows, blocks, global_stats
2 from dba_tables where table_name='MLOG$_CONTR_SERVICES';

OWNER TABLE_NAME PAR NUM_ROWS BLOCKS GLO
------------ ------------------------------ --- --------------- ---------- ---
SYSADM MLOG$_CONTR_SERVICES NO 22,531,900 95364 YES

Most of the cases, fast refresh materialized view refresh is based on primary key of the base table. If regular fast refresh is happening, the number of rows gets replicated are very less and ideally it should do index scan with primary key. In the above mentioned case, few thousands rows were suppose to replicate.

The customer was having dynamic analyze script wherein they were also analyzing mlog$ tables. To have better performance in fast refresh, we need to purge the mlog$ table regularly and stats on this table must be gathered when the table is empty. To prove this point, I have set customized stats on mlog table.

SQL> exec dbms_stats.set_table_stats ('SYSADM','MLOG$_CONTR_SERVICES',NUMROWS=>0, NUMBLKS=>0);

PL/SQL procedure successfully completed.
.
.
SQL> select owner, table_name, partitioned, num_rows, blocks, global_stats
2 from dba_tables where table_name='MLOG$_CONTR_SERVICES';

OWNER TABLE_NAME PAR NUM_ROWS BLOCKS GLO
------------ ------------------------------ --- --------------- ---------- ---
SYSADM MLOG$_CONTR_SERVICES NO 0 0 YES

After setting above stats, the plan got changed.

Plan Table
-------------------------------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------
| SELECT STATEMENT | | 1 | 122 | 27 | | |
| NESTED LOOPS | | 1 | 122 | 27 | | |
| VIEW | | 1 | 39 | 24 | | |
| SORT UNIQUE | | 1 | 23 | 24 | | |
| TABLE ACCESS BY INDEX |MLOG$_CONTR_SER| 1 | 23 | 1 | | |
| INDEX RANGE SCAN |MLOGSNAPTIME | 1 | | 655 | | |
| TABLE ACCESS BY INDEX RO|CONTR_SERVICES | 125M| 9G| 3 | | |
| INDEX UNIQUE SCAN |PKCONTR_SERVICE| 125M| | 2 | | |
-------------------------------------------------------------------------------------

In this case, it was doing proper index scan on primary key and fast refresh get successfully completed within minutes.

To sum up, statistics play an important role and must have to be used in proper way. In materialized view refresh following things needs to be remembered

1. The MLOG$ table must be analyze when table is empty.
2. Regular MLOG$ purging is required, where size grows.

Sunday, July 27, 2008

Do we need stats on all columns?

Recently I had visited one of the customers for performance review. As per the current stats gathering policy, they use following method to gather stats on the tables.

dbms_stats.gather_table_stats (user, tabname, cascade=>true, method_opt=>'FOR ALL INDEXED COLUMNS 75');

As can be seen from the command, the input from method_opt is to gather stats on all the indexed columns with bucket size as 75. This means the stats on the other non-index columns will not be collected. This is a wrong practice and stats on the columns of the table are crucial, particularly if these are use in queries.

When I was discussed with the customer, the customer had following misconception.

1. To make decision for index scan; the stats on indexed columns are available.
2. When there is no index available on column, why oracle needs stats on it. It will use FTS for such queries.

This is purely a misconception, even if a column, use in a query, is not indexed; the correct cardinality is very crucial in determining a correct join order. To prove this point, I have created following small test case.

I created two tables (test1 and test2) and indexes on it.

SQL> create table test1 as
2 select mod(LVL, 3) ID, LVL VAL from
3 (select level "LVL" from dual connect by level <= 100000);

Table created.

SQL> create index ind_test1 on test1 (id);

Index created.

SQL> create table test2 as
2 select lvl "ID", dbms_random.string('A',10) "STR" from (select
3 level "LVL" from dual connect by level <=50000);

Table created.

SQL> create index ind_test2 on test2 (id);

Index created.

Let us generate optimizer statistics based on the policy as the customer used.

SQL> exec dbms_stats.gather_table_stats (user, 'TEST1', cascade=>true, method_opt=>'FOR ALL INDEXED COLUMNS SIZE 75');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats (user, 'TEST2', cascade=>true, method_opt=>'FOR ALL INDEXED COLUMNS SIZE 75');

PL/SQL procedure successfully completed.

The stats on the table are


SQL> select owner, table_name, partitioned, num_rows, blocks
2 from dba_tables
3 where table_name in ('TEST1','TEST2')
4 and owner = 'TEST';

OWNER TABLE_NAME PAR NUM_ROWS BLOCKS
---------- ---------- --- ---------- ----------
TEST TEST1 NO 100000 186

TEST TEST2 NO 50000 156


break on table_name skip 1
select table_name, column_name,num_distinct, num_nulls, density
from dba_tab_columns
where table_name in ('TEST1','TEST2')
and owner='TEST'
order by table_name;

TABLE_NAME COLUMN_NAM NUM_DISTINCT NUM_NULLS DENSITY
---------- ---------- ------------ ---------- ----------
TEST1 ID 3 0 .000005
VAL

TEST2 ID 50000 0 .00002
STR

Based on the input to dbms_stats, stats are collected only on indexed columns and not for other non-indexed columns.
The following query is used to check the execution plan.

SQL> set autot trace
SQL> select a.id, a.val, b.str
2 from test1 a,
3 test2 b
4 where a.id = b.id
5 and a.val = 40;

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=46 Card=667 Bytes=77372)
1 0 HASH JOIN (Cost=46 Card=667 Bytes=77372)
2 1 TABLE ACCESS (FULL) OF 'TEST1' (Cost=19 Card=1000 Bytes=16000)
3 1 TABLE ACCESS (FULL) OF 'TEST2' (Cost=17 Card=50000 Bytes=5000000)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
327 consistent gets
315 physical reads
0 redo size
489 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed


Now, let’s delete and gather new statistics on the same tables.

SQL> exec dbms_stats.delete_table_stats (user, 'TEST1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.delete_table_stats (user, 'TEST2');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats (user, 'TEST1', cascade=>true, method_opt=>'FOR ALL COLUMNS');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats (user, 'TEST2', cascade=>true, method_opt=>'FOR ALL COLUMNS');

PL/SQL procedure successfully completed.

Let us check the stats again.

TABLE_NAME COLUMN_NAM NUM_DISTINCT NUM_NULLS DENSITY
---------- ---------- ------------ ---------- ----------
TEST1 ID 3 0 .000005
VAL 100000 0 .00001

TEST2 ID 50000 0 .00002
STR 50000 0 .00002

The plan by the same query is

SQL> select a.id, a.val, b.str
2 from test1 a,
3 test2 b
4 where a.id = b.id
5 and a.val = 40;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1 Bytes=22)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST2' (Cost=2 Card=1 Bytes=15)
2 1 NESTED LOOPS (Cost=21 Card=1 Bytes=22)
3 2 TABLE ACCESS (FULL) OF 'TEST1' (Cost=19 Card=1 Bytes=7)
4 2 INDEX (RANGE SCAN) OF 'IND_TEST2' (NON-UNIQUE) (Cost=1 Card=1)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
183 consistent gets
174 physical reads
0 redo size
489 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

The comparison between both of the above plans is -

Plan when stats are gathered
Cardinality of TEST1 table
Consistent Gets
Only on indexed columns 1000 327
On all columns 1 183


Why such difference in both the execution plans?

Cardinality plays a very big role in opting an execution plan. In first case, as the stats are not available on VAL column of the table TEST1.

The cardinality when statistics are available only on indexed columns

The filter condition is on ID column of table TEST1 and as there are no stats available on this column, oracle will take the default selectivity as 1/100 and calculated cardinality as

num_rows*selectivity = 100000*(1/100) = 1000

As per the plan, the TEST1 is driving table and cardinality is very high and hence oracle opted the FTS for TEST2 table and cardinality is calculated as

num_rows = 50000

When stats are available on both the all column of the tables, for TEST1 table cardinality is calculated as

Num_rows*(“ID column selectivity” * “VAL column Selectivity)
= round (100000 * (1/3 * 1/100000))
= 0

As cardinality cannot be 0 it will consider as 1.

As the cardinality for TEST1 table is very low i.e. 1, TEST2 will be opted for index scan and hence cardinality is calculated as

Num_rows * (1/distinct)
= round (50000 * (1/50000))
= 1

From above discussion it is concluded that the stats are necessary on all columns. In the above-mentioned test case, we have calculated the stats with histograms but the same result can be found without histograms, as the column stats are getting updated.

To sum up, the stats on all columns are required for optimal execution plan.

Sunday, May 18, 2008

Functions and Performance.

Function plays an important role in development environment. The functions can be implicit (defined by oracle) or explicit (defined by user – PL/SQL code). Developers use the PL/SQL functions as they simplify the large code.

The function may accept the value and must return the value. The user-defined functions are easy to use but they may degrade the performance badly. Hence it is always recommended to use the functions only and only when they are unavoidable. I have demonstrated the performance benefits by the simple query over user-defined functions.

If the condition does not satisfy, function will return null value. It means the output rows will be the number of rows satisfied by the main query. For e.g. if suppose the main query returns 1000 rows and function satisfies only 400 rows, then the output will have 1000 rows, wherein the function column will show 600 null values and 400 value returned by the function. Hence we must have to use outer join whenever we are merging function into the simple query. The example below uses select sub-query that plays the role of function.


SQL> create table fn_tables as select * from dba_tables;

Table created.

SQL> create table fn_indexes as select * from dba_indexes;

Table created.

SQL> create index fn_ind_tables on fn_tables (table_name);

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'FN_TABLES', cascade=>true);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user, 'FN_INDEXES', cascade=>true);

PL/SQL procedure successfully completed.

SQL> create or replace function fn_test (tabowner varchar2, tabname varchar2)
2 return varchar2 is
3 tbsname varchar2(200);
4 begin
5 select tablespace_name into tbsname
6 from fn_tables
7 where owner=tabowner
8 and table_name=tabname;
9 return tbsname;
10 end;
11 /

Function created.


SQL> set autot trace
SQL> select a.owner, a.index_name, a.status,
2 fn_test(a.owner, a.table_name) "TBS_NAME"
3 from fn_indexes a
4 /

1072 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2626245312

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1072 | 51456 | 9 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| FN_INDEXES | 1072 | 51456 | 9 (0)| 00:00:01 |
--------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
1097 recursive calls
0 db block gets
3308 consistent gets
0 physical reads
0 redo size
36706 bytes sent via SQL*Net to client
1162 bytes received via SQL*Net from client
73 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1072 rows processed


SQL> select a.owner, a.index_name, a.status,
2 (select tablespace_name from fn_tables b
3 where b.owner=a.owner
4 and a.table_name=b.table_name) "TBS_NAME"
5 from fn_indexes a
6 /

1072 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 321380953

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1072 | 51456 | 9| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| FN_TABLES | 1 | 29 | 2| 00:00:01 |
|* 2 | INDEX RANGE SCAN | FN_IND_TABLES| 1 | | 1| 00:00:01 |
| 3 | TABLE ACCESS FULL | FN_INDEXES | 1072 | 51456 | 9| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("B"."OWNER"=:B1)
2 - access("B"."TABLE_NAME"=:B1)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1573 consistent gets
0 physical reads
0 redo size
36706 bytes sent via SQL*Net to client
1162 bytes received via SQL*Net from client
73 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1072 rows processed


SQL> select a.owner, a.index_name, a.status, b.tablespace_name "TBS_NAME"
2 from fn_indexes a,
3 fn_tables b
4 where b.owner(+) = a.owner
5 and b.table_name(+) = a.table_name
6 /

1072 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 893717710

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1072 | 82544 | 18 (6)| 00:00:01 |
|* 1 | HASH JOIN RIGHT OUTER| | 1072 | 82544 | 18 (6)| 00:00:01 |
| 2 | TABLE ACCESS FULL | FN_TABLES | 879 | 25491 | 8 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | FN_INDEXES | 1072 | 51456 | 9 (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("B"."OWNER"(+)="A"."OWNER" AND
"B"."TABLE_NAME"(+)="A"."TABLE_NAME")


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
135 consistent gets
0 physical reads
0 redo size
36706 bytes sent via SQL*Net to client
1162 bytes received via SQL*Net from client
73 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1072 rows processed


Place all above required values in tabular format.


Query No. Query Criteria Cost Recursive Calls Consistent Gets
1 With Function 9 1097 3308
2 With Sub-Query 9 0 1573
3 With Simple Query 18 0 135


As shown in above table, though the cost of the third query is high, the number of consistent gets has been reduced tremendously. When we are using function, the recursive calls and consistent gets are very high. But when using sub-query both the values have been reduced but this query also suffers from performance compared to third query.

Recently I had got the following query from one of our customer. Due to confidentiality, I will not be posting the original query from the customer site but in this example I have made similar kind of query.


create or replace function get_minamount (cust number)
return number is
amt number;

begin
select min (amount) into amt
from contract
where customer=cust and status = 'a';

if (amt is null) then
select min (amount) into amt
from contract
where customer=cust and status = 'd';
end if;

if (amt is null) then
select min (amount) into amt
from contract
where customer=cust and status = 's';
end if;

Return amt;

End;
/


The main query is as follows.

select customer, name, address,
get_minamount (customer) "MIN_AMT"
from customer_master
where custcode is like 'Large%';


The customer_master is having millions of records and to execute this query it was taking around 8 Hrs. The given suggestion on this query is as follows.


select customer, name, address, min_amt, status
from (select a.customer, name, address, status,
min(case when status in ('a','s','d') then amount end)
over (partition by b.customer, b.status) "MIN_AMT",
rank() over (partition by b.customer
order by (case when status = 'a' and amount is not null then 1
when status = 's' and amount is not null then 2
when status = 'd' and amount is not null then 3
end), rownum) "RNK"
from customer_master a,
contract b
where a.customer = b.customer(+))
where rnk = 1;


After implementing the query the performance gain was very high. The modified query is taking around 20 minutes to execute against 8 Hrs.

To sum up, the user-defined functions may contribute in performance degradation of the query. The simple query will give major performance benefits against the query with functions. The performance of the query is inversely proportional to the number of consistent gets. Hence to improve the performance, we must minimize the consistent gets.

Sunday, May 4, 2008

Migration from Filesystem to ASM Files.

This is continuation to my previous blog, wherein we discussed about interacting with ASM and OS database file.

ASM is an integrated volume manager and file system for Oracle Database files. It simplifies, automates storage management, increase storage utilization and delivers predictable performance. Due to this many of us wants to migrate filesystem database to ASM. If the database is very large and critical, we may not be able to afford the migration downtime. In this blog, I would discuss the procedure to dramitically reduce downtime during the process of migration from raw/OS filesystem to ASM using Oracle Data Guard and RMAN. This blog will also be helpful to create standby database. This procedure explains how data guard can help to reduce the downtime of migration.

Following is the summarised procedure

1. Create standby (dataguard) of the production database.
2. Migrate the standby database to ASM
3. Do the switchover to migate the database to ASM.

As mentioned above, it seems, migration is very simple and yes it is. Below mentioned is the detailed procedure.

PREPARE FOR STANDBY DATABASE

In this demonstration, I have named primary instance as “ORA10G” and standby database as “ORA10GSTD”

Create a backup of the database using RMAN.

RMAN> connect target /

connected to target database: ORA10G (DBID=3970225046)

RMAN> backup database include current controlfile for standby;

Starting backup at 08-APR-08
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=154 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=D:\ORACLE\DATAFILES\ORA10G\SYSTEM01.DBF
input datafile fno=00002 name=D:\ORACLE\DATAFILES\ORA10G\UNDOTBS01.DBF
input datafile fno=00003 name=D:\ORACLE\DATAFILES\ORA10G\SYSAUX01.DBF
input datafile fno=00004 name=D:\ORACLE\DATAFILES\ORA10G\USERS01.DBF
channel ORA_DISK_1: starting piece 1 at 08-APR-08
channel ORA_DISK_1: finished piece 1 at 08-APR-08
piece handle=D:\ORACLE\ORA10GHOME\DATABASE\01JDBCU5_1_1 tag=TAG20080408T230957 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:56
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including standby control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 08-APR-08
channel ORA_DISK_1: finished piece 1 at 08-APR-08
piece handle=D:\ORACLE\ORA10GHOME\DATABASE\02JDBCVT_1_1 tag=TAG20080408T230957 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:05
Finished backup at 08-APR-08


Make the tnsnames.ora changes in both (production and standby) servers to make connections between both the servers for communication.

Create and copy the parameter file and make appropriate changes for standby database.

SQL> create pfile from spfile;

We assume that, ASM instance is already created, mounted and ready for use. We are using diskgroup, named as ORADG. Also make sure this diskgroup should have sufficient space available.

Following are the standby parameters, which we have to modify.

Edit control_files parameter to appropriate location. This location should be ASM diskgroup.

control_files = ‘+ORADG\ORA10g\controlfiles\control01.ctl’

Add or edit DB_UNIQUE_NAME parameter to define the database uniqueness.

db_unique_name = ORA10GSTD ---- Instance name of Standby.

Add or edit LOG_ARCHIVE_CONFIG parameter to identify primary and standby database using unique name. The parameter value should be primary and standby instance name.
 
log_archive_config = ’dg_config=(ORA10G,ORA10GSTD)’

Set following parameters to point to ASM diskgroup.

db_create_file_dest = '+ORADG'
db_create_online_log_dest_1 = '+ORADG'

The other parameters that needs to change are –

instance_name = ORA10gSTD
fal_server = ora10g --- TNSNAMES entry for Primary
fal_client = ora10gSTD --- TNSNAMES entry for Standby

Create Password file

$ orapwd file=$ORACLE_HOME/dbs/orapw password=

INITIATE THE STANDBY DATABASE IN ASM

After confirmation ASM instance up and parametter, password file is prepared, now we initiate standby database.

SQL> startup force nomount
ORACLE instance started.

Total System Global Area 301989888 bytes
Fixed Size 1248672 bytes
Variable Size 96469600 bytes
Database Buffers 201326592 bytes
Redo Buffers 2945024 bytes

RMAN provides single command that initiates the standby database using source database information. Connect to the production database using RMAN and initiate standby database.

C:\> rman target sys/oracle@ora10g

Recovery Manager: Release 10.2.0.1.0 - Production

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: ORA10G (DBID=3970225046)

RMAN> connect auxiliary /

connected to auxiliary database: ORA10G (not mounted)

RMAN> duplicate target database for standby;

Starting Duplicate Db at 08-APR-08
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: sid=157 devtype=DISK

contents of Memory Script:
{
restore clone standby controlfile;
sql clone 'alter database mount standby database';
}
executing Memory Script

Starting restore at 08-APR-08
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backupset restore
channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: reading from backup piece D:\ORACLE\ORA10GHOME\DATABASE\02JDBCVT_1_1
channel ORA_AUX_DISK_1: restored backup piece 1
piece handle=D:\ORACLE\ORA10GHOME\DATABASE\02JDBCVT_1_1 tag=TAG20080408T230957
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:17
output filename=+ORADG/ora10g/controlfiles/control01.ctl
Finished restore at 08-APR-08

sql statement: alter database mount standby database
released channel: ORA_AUX_DISK_1

contents of Memory Script:
{
set newname for clone tempfile 1 to new;
switch clone tempfile all;
set newname for clone datafile 1 to new;
set newname for clone datafile 2 to new;
set newname for clone datafile 3 to new;
set newname for clone datafile 4 to new;
restore
check readonly
clone database
;
}
executing Memory Script

executing command: SET NEWNAME

renamed temporary file 1 to +ORADG in control file

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting restore at 08-APR-08
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: sid=157 devtype=DISK

channel ORA_AUX_DISK_1: starting datafile backupset restore
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to +ORADG
restoring datafile 00002 to +ORADG
restoring datafile 00003 to +ORADG
restoring datafile 00004 to +ORADG
channel ORA_AUX_DISK_1: reading from backup piece D:\ORACLE\ORA10GHOME\DATABASE\01JDBCU5_1_1
channel ORA_AUX_DISK_1: restored backup piece 1
piece handle=D:\ORACLE\ORA10GHOME\DATABASE\01JDBCU5_1_1 tag=TAG20080408T230957
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:01:46
Finished restore at 08-APR-08

contents of Memory Script:
{
switch clone datafile all;
}
executing Memory Script

datafile 1 switched to datafile copy
input datafile copy recid=5 stamp=651539654 filename=+ORADG/ora10gstd/datafile/system.257.651539549
datafile 2 switched to datafile copy
input datafile copy recid=6 stamp=651539654 filename=+ORADG/ora10gstd/datafile/undotbs1.258.651539549
datafile 3 switched to datafile copy
input datafile copy recid=7 stamp=651539654 filename=+ORADG/ora10gstd/datafile/sysaux.259.651539551
datafile 4 switched to datafile copy
input datafile copy recid=8 stamp=651539654 filename=+ORADG/ora10gstd/datafile/users.260.651539551
Finished Duplicate Db at 08-APR-08

We have successfully transferred files from production to standby database in ASM datagroup. Modify log_archive_dest_2 parameter so that archive logs file will transfer from primary to standby database. Then place the standby database recovery mode so archive logs can be applied automatically.

SQL> recover managed standby database disconnect;

DO SWITCHOVER TO MIGRATE DATABASE TO ASM

This is final step of the migration. We will do switchover to migrate the database to ASM. Check that, both primary and standby must be in sync.

On Primary Database –

Select max(sequence#) “MAXSEQ” from v$log_history;

MAXSEQ
---------
124

On Standby Database –

Select max(sequence#) “MAXSEQ” from v$log_history;

MAXSEQ
---------
124

As both primary and standby database, we can do switchover.

SQL> alter database commit to switchover to primary;

Database altered.

SQL> shut immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.


SQL> startup
ORACLE instance started.

Total System Global Area 301989888 bytes
Fixed Size 1248672 bytes
Variable Size 96469600 bytes
Database Buffers 201326592 bytes
Redo Buffers 2945024 bytes
Database mounted.
Database opened.

We have completed ASM migration.

Sunday, April 20, 2008

Manual Switchover.

Data Guard maintains the standby databases as transactionally consistent copies of the production database. Then, if the production database becomes unavailable because of a planned or an unplanned outage, Data Guard can switch any standby database to the production role, minimizing the downtime associated with the outage.

Due to low network bandwidth or when using Standard Edition, we maintain manual dataguard, in which, we transfer the archive log files manually to the DR site and then apply them there. In such cases, it is very difficult to use oracle internal switch over mechanism. This document is prepared for the customers having manual dataguard and wants to do the switch over.

This document consider following pre-requisites.

 All Dataguard pre-requisites (same OS and Oracle version etc.) are fulfilled.
 Manual Dataguard setup is in place having standby database with standby controlfile.
 All the archive logs are properly applied on DR site.

Switch-over Process

1. Shutdown the primary database.
2. Make sure all the archive logs are transferred and applied on DR site.
3. Shutdown the standby database
4. Copy the controlfile and online redo log files of primary database to DR site.
5. Copy the standby controlfile from DR site to primary site.
6. Startup the primary database (Current standby database). Here we may have to do recovery to apply changes of online redo log files. This must be a complete recovery.
7. Startup the standby database (Current primary)
8. Check the databases & their role.

We have successfully completed the switchover activity and same steps can be followed for switch back activity.

Following is the demonstration of the above-mentioned steps. We will create a table and check the availability of it after switchover.

Primary Database –

SQL> create table std_test_1 as select * from dba_objects;


Table created.


SQL> insert into std_test_1 select * from std_test_1;


6165 rows created.


SQL> insert into std_test_1 select * from std_test_1;

12330 rows created.

SQL> commit;

Commit complete.

Note down the current sequence number of primary database.

SQL> select max(sequence#) from v$log_history;

MAX(SEQUENCE#)
--------------
235

Now we will check the status of standby database. (Its current sequence no.)

SQL> select max(sequence#) from v$log_history;

MAX(SEQUENCE#)
--------------
233

The standby database is lagging of primary database. So we will recover the
standby database and sync it with primary.

SQL> recover standby database;
ORA-00279: change 50652 generated at 04/02/2008 22:32:37 needed for thread 1
ORA-00289: suggestion : D:\ORACLE\DATAFILES\ORA9ISTD\ARCH\ORA9I_234.DBF
ORA-00280: change 50652 for thread 1 is in sequence #234

Specify log: {=suggested filename AUTO CANCEL}
ORA-00279: change 50750 generated at 04/02/2008 22:36:08 needed for thread 1
ORA-00289: suggestion : D:\ORACLE\DATAFILES\ORA9ISTD\ARCH\ORA9I_235.DBF
ORA-00280: change 50750 for thread 1 is in sequence #235
ORA-00278: log file 'D:\ORACLE\DATAFILES\ORA9ISTD\ARCH\ORA9I_234.DBF' no longer needed for this recovery

Specify log: {=suggested filename AUTO CANCEL}
ORA-00279: change 50759 generated at 04/02/2008 22:36:14 needed for thread 1
ORA-00289: suggestion : D:\ORACLE\DATAFILES\ORA9ISTD\ARCH\ORA9I_236.DBF
ORA-00280: change 50759 for thread 1 is in sequence #236
ORA-00278: log file 'D:\ORACLE\DATAFILES\ORA9ISTD\ARCH\ORA9I_235.DBF' no longer needed for this recovery

Specify log: {=suggested filename AUTO CANCEL}
ORA-00308: cannot open archived log
'D:\ORACLE\DATAFILES\ORA9ISTD\ARCH\ORA9I_236.DBF'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.

Check the current status of standby database.

SQL> select max(sequence#) from v$log_history;

MAX(SEQUENCE#)
--------------
235

Now we are ready to proceed for switchover process. To start with, we will first
shutdown both primary and standby database.

Primary –

SQL> shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

Standby –

SQL> shut immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.

We are doing manual switchover process, so we need to copy the control and
redo log files from primary to standby database. At the same time, we have to
copy the controlfile from standby database location to primary database.

Assuming copy process is over, we will startup the Current Primary (former
standby) database.

SQL> startup nomount
ORACLE instance started.

Total System Global Area 101784276 bytes
Fixed Size 453332 bytes
Variable Size 75497472 bytes
Database Buffers 25165824 bytes
Redo Buffers 667648 bytes

SQL>
SQL> alter database mount;

Database altered.

SQL> alter database open;

alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\ORACLE\DATAFILES\ORA9ISTD\SYSTEM01.DBF'

Here we are getting recovery error because this is standby database; the changes
lying in current redo logs (which we have copied) are yet to apply on current
primary. The important thing to note here is that, this recovery must be complete
recovery. Post to this, we will open the check the status of the database.

SQL> recover database;

Media recovery complete.

SQL> alter database open;

Database altered.

SQL> select max(sequence#) from v$log_history;

MAX(SEQUENCE#)
--------------
235

As shown above, the recovery has been completed and current primary (former
standby) database is up. We will check for test table created.

SQL> select count(1) from std_test_1;

COUNT(1)
----------
24660

We have successfully completed one phase of the switchover process. Now we
will startup current standby (former primary) database to proceed further.

SQL> startup nomount
ORACLE instance started.

Total System Global Area 101784276 bytes
Fixed Size 453332 bytes
Variable Size 75497472 bytes
Database Buffers 25165824 bytes
Redo Buffers 667648 bytes

SQL> alter database mount standby database;

Database altered.

SQL> select max(sequence#) from v$log_history;

MAX(SEQUENCE#)
--------------
235

SQL> select open_mode, database_role from v$database;

OPEN_MODE DATABASE_ROLE
---------- ----------------
MOUNTED PHYSICAL STANDBY

The primary and standby databases are in sync. We will generate the archive log
files at primary site and will apply these files at standby database.

SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> select max(sequence#) from v$log_history;

MAX(SEQUENCE#)
--------------
238

SQL> select open_mode, database_role from v$database;

OPEN_MODE DATABASE_ROLE
---------- ----------------
READ WRITE PRIMARY

We have generated three archive log files here and now we will copy and apply
them at standby database.

SQL> recover standby database;
ORA-00279: change 50759 generated at 04/02/2008 22:36:14 needed for thread 1
ORA-00289: suggestion : D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_236.DBF
ORA-00280: change 50759 for thread 1 is in sequence #236

Specify log: {=suggested filename AUTO CANCEL}
ORA-00279: change 51098 generated at 04/02/2008 22:49:37 needed for thread 1
ORA-00289: suggestion : D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_237.DBF
ORA-00280: change 51098 for thread 1 is in sequence #237
ORA-00278: log file 'D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_236.DBF' no
longer needed for this recovery

Specify log: {=suggested filename AUTO CANCEL}
ORA-00279: change 51119 generated at 04/02/2008 22:49:50 needed for thread 1
ORA-00289: suggestion : D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_238.DBF
ORA-00280: change 51119 for thread 1 is in sequence #238
ORA-00278: log file 'D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_237.DBF' no
longer needed for this recovery

Specify log: {=suggested filename AUTO CANCEL}
ORA-00279: change 51126 generated at 04/02/2008 22:49:51 needed for thread 1
ORA-00289: suggestion : D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_239.DBF
ORA-00280: change 51126 for thread 1 is in sequence #239
ORA-00278: log file 'D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_238.DBF' no
longer needed for this recovery

Specify log: {=suggested filename AUTO CANCEL}
ORA-00308: cannot open archived log 'D:\ORACLE\DATAFILES\ORA9I\ARCH\ORA9I_239.DBF'
ORA-27041: unable to open file
OSD-04002: unable to open file
O/S-Error: (OS 2) The system cannot find the file specified.

The recovery is completed successfully. Here we have completed our switchover
process.

It is important to note that, in standby database, it is mandatory to have the database in force logging mode. This feature has been introduced by oracle in version 9i. For pre-9i versions, do check UNRECOVERABLE_TIME column value of V$DATAFILE view. This value must be less than the standby database creation time fir all existing datafiles. If any datafile is having value greater than standby creation time, we need to restore the backup of this datafile and continue with recovery. The unrecoverable_time value gets updated by oracle for any nologging operation. To avoid such nologging operations, do ensure that, all objects must be in logging mode.

Note – The above-mentioned steps can be used for switchback process.

Sunday, March 2, 2008

ASM File Handling.

Automatic Storage Management (ASM) is a feature of Oracle Database 10g that provides integrated cluster file system and volume management capabilities at no additional cost. ASM provided a foundation for highly efficient storage management with kernelized asynchronous I/O, direct I/O, redundancy, striping, and an easy way to manage storage.

ASM disk groups are not visible outside the database for regular file system administration tasks such as copying and creating directories. It is always seems to be difficult task to backup or migrating the database. But in reality it is not the case. Oracle has provided various methods to deal with ASM files, discussed in following sections –

ASM FILES HANDLING USING FTP

Starting with oracle 10g Release 2, Oracle has provided an option to transfer the files from ASM to locations outside of the diskgroups via FTP and ussing HTTP web browser.

To access ASM files, file-mapping functionalities can be used provided by Oracle XML DB feature. Following is the method to setup FTP and HTTP to access and trasnfer the ASM files to regular OS files. FTP uses 21 as defualt port and HTTP uses port 80. ASM do not use default services, so to access ASM files, it is required to use two different unused ports. For e.g. we will use 8080 port for HTTP and 2100 port for FTP.

1. Install XMLDB feature.

   # cd $ORACLE_HOME/rdbms/admin
# sqlplus “/ as sysdba”

SQL> @catqm.sql XDB SYSAUX TEMP

Here XDB is username
SYSAUX is default tablespace for xdb user.
TEMP is temporary tablespace for xdb user.

SQL> @catxdbdbca 2100 8080

Here 2100 is FTP port.
8080 is HTTP port.

Check for any invalid XDB objects and compile, if any.


SQL> select object_name, object_type, status
from dba_objects
where owner=’XDB’ and status=’INVALID’;

SQL> select comp_name, status, version from dba_registry
where comp_name=’Oracle XML Database’;


2. Configure FTP and HTTP Ports.

   SQL> execute dbms_xdb.sethttpport(8080);
SQL> execute dbms_xdb.setftpport(2100);


You can also check the ports being used.

   Select dbms_xdb.gethttpport(), dbms_xdb.getftpport() 
from dual;


3. Listener and Instance modifications

For Single Instance –

SQL> alter system set dispatchers = ‘(PROTOCOL=TCP)
(SERVICE=XDB)’ scope=both;


For RAC Instance –

   SQL> alter system set dispatchers = ‘(PROTOCOL=TCP) 
(SERVICE=XDB)’ scope=both sid=INSTANCE1;

SQL> alter system set dispatchers = ‘(PROTOCOL=TCP)
(SERVICE=XDB)’ scope=both sid=INSTANCE2;


Restart the listener

   # lsnrctl stop
# lsnrctl start


Listener should show http and ftp port entries.

# lsnrctl status

………… Truncated some entries ……………

Listening Endpoints Summary...

(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=test)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=test)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=test)(PORT=8080))(Presentation=HTTP)(Session=RAW))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=test)(PORT=2100))(Presentation=FTP)(Session=RAW))


4. Use FTP at O/S level to do interaction with ASM files.

   C:\> ftp 
ftp> open test 2100 ---- Here test is the server name and 2100 is ftp port.
Username: system ---- System is the database user.
Password: ****** ---- Password of database system user.

ftp> cd /sys/asm


Now directory can be changed to the directory containing the files to start FTP service and ls, get or put commands can be used as normal o/s level ftp command.


DBMS_FILE_TRANSFER Utility

Oracle also provides another feature to move files. The dbms_file_transfer utility introduced by oracle to copy/transfer the files to and from ASM. If you want to copy the file that is attached to a tablespace, you must put that tablespace into read-only mode or take it offline. Because of this limitation, you cannot copy the data files associated with the system, undo, sysaux or temporary tablespace. This utility treats the file as binary file during copy also character conversion is not possible during copy.

Following is the method to transfer the files from file-system to ASM.

List the file to copy.

SQL> select file_name from dba_data_files 
2 where tablespace_name=’USERS’;


Create directory for source and destination.

SQL> create directory source_osfs as ‘/oracle/datafile’;
SQL> create directory dest_asmfs as ‘+asmdsk1/orcl/datafile';
SQL> grant read, write on source_osfs to testuser;
SQL> grant read, write on dest_asmfs to testuser;


Connect to the user and transfer the file.

SQL> connect testuser/testuser@ora10gdb
SQL> alter tablespace users read only;
-OR-
SQL> alter tablespace users offline;


SQL> exec dbms_transfer.copy_file (source_directory_object=>’source_osfs’, source_file_name=>’users01.dbf’, destination_directory_object=>’dest_asmfs’, destination_file_name=>’users01.dbf’);


File has been copied so now datafile path can be changed using “alter database rename command”. Finally make the tablespace online for use.

SQL> alter tablespace users read write;
-OR-
SQL> alter tablespace users online;


We can use get_file and put_file procedures of dbms_transfer package to copy the files across the database using database link.

Limitations –

1. File that we want to copy should be multiple of 512 bytes. Otherwise we would get following error. Also maximum file of file to transfer is 2 terabyte.

ERROR at line 1: 
ORA-19505: failed to identify file "/oracle/datafile/users01.dbf"
ORA-27046: file size is not a multiple of logical block size
Additional information: 1
ORA-06512: at "SYS.DBMS_FILE_TRANSFER", line 84
ORA-06512: at "SYS.DBMS_FILE_TRANSFER", line 193
ORA-06512: at line 2


2. Till Oracle 10g Rel 1 only following transfer is allowed.

ASM to ASM.
OS file to OS file.

This restriction is no more in Oracle 10g Rel 2. DBMS_TRANSFER has been enhanced to support following combinations of ASM.

ASM to OS file.
OS file to ASM.

RMAN & ASM FILES

So far we have completed two methods to copy the files from ASM to OS file system. Finally how can we forget RMAN utility to interact with ASM Files? Following is the method to copy the files using RMAN utility. We will copy the files from one disk group to another diskgroup.

1. Identify the file.

SQL> select file_name from dba_data_files 
2 where tablespace_name='TBS_TEST';

FILE_NAME
---------------------------------------------------
+ORA_DG/oraprod/datafile/tbs_test.1442.642809147


2. Take the file offline and copy it using rman utility.

SQL> alter database datafile 
2 ‘+ORA_DG/oraprod/datafile/tbs_test.1442.642809147’ offline;

# rman target /

RMAN> COPY DATAFILE ‘+ORA_DG/oraprod/datafile/tbs_test.1442.642809147’ TO ‘+ORA_DG_NEW’;


3. Now file has been copied, we can update the dictionary with the new ASM Diskgroup location.

SQL> ALTER DATABASE RENAME FILE
2 ‘+ORA_DG/oraprod/datafile/tbs_test.1442.642809147’
3 TO ‘+ORA_DG_NEW/oraprod/datafile/tbs_test.264.59829765’;


We got the destination file name from rman copy output.

4. Use RMAN to rename the ASM database file copy.

RMAN> switch datafile ‘+ORA_DG/oraprod/datafile/tbs_test. 264.59829765’ to copy;


5. Finally recover and take the file online.

SQL> Recover datafile ‘+ORA_DG/oraprod/datafile/tbs_test. 264.59829765’;

SQL> alter database datafile ‘+ORA_DG/oraprod/datafile/tbs_test. 264.59829765’ online;


6. Delete the old file from ASM directory.

$ export ORACLE_SID=+ASM
$ sqlplus “/ as sysdba”
SQL> alter diskgroup ora_dg_new drop file datafile ‘+ORA_DG/oraprod/datafile/tbs_test. 264.59829765’;

Saturday, December 29, 2007

Index Internals.

Tables can grow large, and when they do, it becomes difficult for users to quickly find the data they need. Oracle offers indexes as a method of speeding database performance when accessing tables with a lot of data. The b-tree index is the traditional and most useful indexing mechanism. It stores the data in a tree like structure.

The b-tree index structure is made up of root block, branch blocks and leaf blocks. The root block is an entry point where search for data in index starts. Any index contains only and only one root block. The root block is always physically the same block. The branch blocks are next level of the root block that is having pointers to leaf blocks in the index.

The leaf blocks are the highest level of the index, which contain indexed column values and the corresponding ROWIDs. Each leaf block is comprised of double-linked list structure. It means each leaf block is linked to the other block on its left and right, in order to make it possible to search in both the directions through a range of values in index. The index entries are always in ordered.

The oracle's index is always maintains the balanced structure. To understand this, it is necessary to understand block split operation in index. There are two ways of block splits

50-50 BLOCK SPLIT

The 50-50-block split can occur when there is an insert operation of a non-maximum value and when the corresponding block is full. The indexed column update operation for an index is internally delete followed by an insert. The split operation steps are as follows –

1. Request for new block from free-list/bitmap structure (Depending on non-ASSM and ASSM tablespace option)
2. Distribute existing block so that upper half volume of an index move to the new requested block
3. Insert the column value in appropriate block.
4. Update the leaf block pointers such that previously full block right pointer will point to the new block and new block’s right pointer will point to the right pointed block of previously full block.
5. Finally update the branch block to reference previous full block and add a new entry for to point to new leaf block.

The similar kind of operation is applicable to the branch and root block split. Even branch and root block split is more expensive as it involves corresponding next level pointer updations. Root block split allocate two new blocks wherein data is evenly distributed and root block is updated such that it will now point to these new blocks. So root block will always physically the same block. The root block split can increase the height of the index by 1.

90-10 BLOCK SPLIT

The 90-10 block split can occur, when the new indexed column entry is the maximum value. In this case, new block will be requested and corresponding branch blocks are updated accordingly.

Can deleted space of an index be reused?

There are multiple answers to this question –
1. Index will never use deleted space.
2. Index will use deleted space if the same column value is inserted again.

But in reality, both the above statements are myth. Index will use deleted space even when the new inserted value is not same.

Test Case – 

To validate above statemenet, we will create the test table and insert some records into it. Here temp table contains the serial values.


SQL> create table temp as select rownum "A" from dba_objects a, dba_objects;

Table created.


SQL> create table test (a number, b number);

Table created.

SQL> create index ind_test on test(a);

Index created.

SQL> insert into test select a, a+50000 from temp where a >10000 and a<=20000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> analyze index ind_test validate structure;

Index analyzed.


SQL> select name, lf_rows, del_lf_rows, used_space
From index_stats where name='IND_TEST';

NAME LF_ROWS DEL_LF_ROWS USED_SPACE
------------------------------ ---------- ----------- ----------
IND_TEST 10000 0 160127


As LF_ROWS column shows, there are 10000 records present in the index. Now check the values after deletion of some records.


SQL> delete from test where a > 14000 and a <= 16000;

2000 rows deleted.


SQL> commit;

Commit complete.


SQL> analyze index ind_test validate structure;


Index analyzed.


SQL> select name, lf_rows, del_lf_rows, used_space from index_stats
where name='IND_TEST';


NAME LF_ROWS DEL_LF_ROWS USED_SPACE
---------- ---------- ----------- ----------
IND_TEST 10000 2000 160127


SQL> select count(*) from test;

COUNT(*)
----------
8000


Here the number of rows (LF_ROWS – DEL_LF_ROWS) = (10000-2000) = 8000 which matches the count of the test table.

To check whether index uses the deleted space, we will add new records which are not same as the current or deleted rows.


SQL> insert into test select a, a+50000 from temp where a > 20000 and a<=22000;

2000 rows created.


SQL> commit;

Commit complete.


SQL> analyze index ind_test validate structure;

Index analyzed.



SQL> select name, lf_rows, del_lf_rows, used_space from index_stats
where name='IND_TEST';


NAME LF_ROWS DEL_LF_ROWS USED_SPACE
---------- ---------- ----------- ----------
IND_TEST 10500 500 168146


SQL>
SQL> select count(1) from test;

COUNT(1)
----------
10000



The total added rows are 2000. As the del_lf_rows column shows, there are only 500 deleted records found. Which means, out of 2000 deleted records, the index has used deleted space of 1500 records. Now check the status after insertion of the some previously deleted records. We had deleted records between 14000 and 16000 values, and now will insert any of these 500 values.


SQL> insert into test select a, a+1000
From temp where a > 14500 and a <= 15500 and mod(a,2)=0;

500 rows created.


SQL> commit;

Commit complete.


SQL> analyze index ind_test validate structure;

Index analyzed.


SQL> select name, lf_rows, del_lf_rows, used_space
From index_stats where name='IND_TEST';


NAME LF_ROWS DEL_LF_ROWS USED_SPACE
---------- ---------- ----------- ----------
IND_TEST 10500 0 168141



The above statistics shows that index will always use the deleted space.

Conclusion –

1. Rows deletions will either make the blocks empty or some space in block.
2. The empty blocks can be used for any rows irrespective of previous deleted column values.
3. If there is space available in existing blocks, space can be used if it satifies the column value between lower and upper existing values in that block.

In many cases, it has also been observed that DBA's rebuild indexes to reclaim deleted space, but the test case above shows that the deleted space is reclaimed and hence does not require any rebuilding. Hence, if this is one of the primary reason to rebuild the indexes, then, dba's now can wonder whether do they really require index rebuilding?