Monday, September 15, 2008

Materialized view fast refresh and performance.

Oracle uses materialized view to replicate data to remote database in distributed environment. Also it can be used for cache expensive queries in a data warehouse environment. Many of the times, an issue is reported for low performance in fast or incremental refresh. There could be many of the reasons, bugs participating performance issues. One of them is to create an index on snaptime$$ column of corresponding MLOG$ table, which helps to fasten the fast refresh. It is required when MLOG$ table size grows due to any of the reason.

Recently when I was looking into one of such issue, it was found that following query was getting executed (internally by Oracle) in primary database.

Query

SELECT /*+ */ "A2"."CO_ID","A2"."TMCODE","A2"."SPCODE","A2"."SNCODE","A2"."CS_SEQNO",
"A2"."CS_CHANNEL_NUM","A2"."CS_STAT_CHNG","A2"."CS_ON_CBB","A2"."CS_DATE_BILLED",
"A2"."CS_REQUEST","A2"."SN_CLASS","A2"."CS_OVW_SUBSCR","A2"."CS_SUBSCRIPT",
"A2"."CS_OVW_ACCESS","A2"."CS_OVW_ACC_PRD","A2"."CS_OVW_ACC_FIRST","A2"."CS_ACCESS",
"A2"."CS_PENDING_STATE","A2"."CS_CHANNEL_EXCL","A2"."CS_DIS_SUBSCR","A2"."CS_ADV_CHARGE",
"A2"."CS_SRV_TYPE","A2"."SUBPAYER","A2"."USGPAYER","A2"."ACCPAYER","A2"."CS_ENTDATE",
"A2"."CS_OVW_LAST","A2"."INSTALL_DATE","A2"."TRIAL_END_DATE","A2"."CS_ADV_CHARGE_END_DATE",
"A2"."PRM_VALUE_ID","A2"."CURRENCY","A2"."CS_ADV_CHARGE_CURRENCY",
"A2"."REC_VERSION","A2"."SRV_SUBTYPE"
FROM "CONTR_SERVICES" "A2",
(SELECT DISTINCT "A3"."CO_ID" "CO_ID","A3"."SNCODE" "SNCODE","A3"."CS_SEQNO" "CS_SEQNO"
FROM "SYSADM"."MLOG$_CONTR_SERVICES" "A3"
WHERE "A3"."SNAPTIME$$">:1
AND "A3"."DMLTYPE$$"<>'D') "A1"
WHERE "A2"."CO_ID"="A1"."CO_ID"
AND "A2"."SNCODE"="A1"."SNCODE"
AND "A2"."CS_SEQNO"="A1"."CS_SEQNO";


Plan Table
-------------------------------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------
| SELECT STATEMENT | | 748K| 87M|1382913 | | |
| HASH JOIN | | 748K| 87M|1382913 | | |
| VIEW | | 751K| 27M| 3585 | | |
| SORT UNIQUE | | 751K| 16M| 3585 | | |
| TABLE ACCESS BY INDEX |MLOG$_CONTR_SER| 751K| 16M| 16 | | |
| INDEX RANGE SCAN |MLOGSNAPTIME | 751K| | 655 | | |
| TABLE ACCESS FULL |CONTR_SERVICES | 125M| 9G| 332467 | | |
-------------------------------------------------------------------------------------

Though there was an index available on SNAPTIME$$ table, the refresh was taking time because the base table size was huge (around 17GB) and it was doing FTS on it.

SQL> select owner, segment_name, bytes/1024/1024 "SIZE" from dba_segments
2 where segment_name='CONTR_SERVICES';

OWNER SEGMENT_NAME SIZE
---------- ------------------------------ ----------
SYSADM CONTR_SERVICES 17216.0156

The mlog$ table size was also very big (around 750 MB) and having millions of records. Oracle opted for FTS because of table statistics available on MLOG$_CONTR_SERVICES table. Following are the available stats –

SQL> select owner, table_name, partitioned, num_rows, blocks, global_stats
2 from dba_tables where table_name='MLOG$_CONTR_SERVICES';

OWNER TABLE_NAME PAR NUM_ROWS BLOCKS GLO
------------ ------------------------------ --- --------------- ---------- ---
SYSADM MLOG$_CONTR_SERVICES NO 22,531,900 95364 YES

Most of the cases, fast refresh materialized view refresh is based on primary key of the base table. If regular fast refresh is happening, the number of rows gets replicated are very less and ideally it should do index scan with primary key. In the above mentioned case, few thousands rows were suppose to replicate.

The customer was having dynamic analyze script wherein they were also analyzing mlog$ tables. To have better performance in fast refresh, we need to purge the mlog$ table regularly and stats on this table must be gathered when the table is empty. To prove this point, I have set customized stats on mlog table.

SQL> exec dbms_stats.set_table_stats ('SYSADM','MLOG$_CONTR_SERVICES',NUMROWS=>0, NUMBLKS=>0);

PL/SQL procedure successfully completed.
.
.
SQL> select owner, table_name, partitioned, num_rows, blocks, global_stats
2 from dba_tables where table_name='MLOG$_CONTR_SERVICES';

OWNER TABLE_NAME PAR NUM_ROWS BLOCKS GLO
------------ ------------------------------ --- --------------- ---------- ---
SYSADM MLOG$_CONTR_SERVICES NO 0 0 YES

After setting above stats, the plan got changed.

Plan Table
-------------------------------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------
| SELECT STATEMENT | | 1 | 122 | 27 | | |
| NESTED LOOPS | | 1 | 122 | 27 | | |
| VIEW | | 1 | 39 | 24 | | |
| SORT UNIQUE | | 1 | 23 | 24 | | |
| TABLE ACCESS BY INDEX |MLOG$_CONTR_SER| 1 | 23 | 1 | | |
| INDEX RANGE SCAN |MLOGSNAPTIME | 1 | | 655 | | |
| TABLE ACCESS BY INDEX RO|CONTR_SERVICES | 125M| 9G| 3 | | |
| INDEX UNIQUE SCAN |PKCONTR_SERVICE| 125M| | 2 | | |
-------------------------------------------------------------------------------------

In this case, it was doing proper index scan on primary key and fast refresh get successfully completed within minutes.

To sum up, statistics play an important role and must have to be used in proper way. In materialized view refresh following things needs to be remembered

1. The MLOG$ table must be analyze when table is empty.
2. Regular MLOG$ purging is required, where size grows.