Sunday, May 18, 2008

Functions and Performance.

Function plays an important role in development environment. The functions can be implicit (defined by oracle) or explicit (defined by user – PL/SQL code). Developers use the PL/SQL functions as they simplify the large code.

The function may accept the value and must return the value. The user-defined functions are easy to use but they may degrade the performance badly. Hence it is always recommended to use the functions only and only when they are unavoidable. I have demonstrated the performance benefits by the simple query over user-defined functions.

If the condition does not satisfy, function will return null value. It means the output rows will be the number of rows satisfied by the main query. For e.g. if suppose the main query returns 1000 rows and function satisfies only 400 rows, then the output will have 1000 rows, wherein the function column will show 600 null values and 400 value returned by the function. Hence we must have to use outer join whenever we are merging function into the simple query. The example below uses select sub-query that plays the role of function.


SQL> create table fn_tables as select * from dba_tables;

Table created.

SQL> create table fn_indexes as select * from dba_indexes;

Table created.

SQL> create index fn_ind_tables on fn_tables (table_name);

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'FN_TABLES', cascade=>true);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user, 'FN_INDEXES', cascade=>true);

PL/SQL procedure successfully completed.

SQL> create or replace function fn_test (tabowner varchar2, tabname varchar2)
2 return varchar2 is
3 tbsname varchar2(200);
4 begin
5 select tablespace_name into tbsname
6 from fn_tables
7 where owner=tabowner
8 and table_name=tabname;
9 return tbsname;
10 end;
11 /

Function created.


SQL> set autot trace
SQL> select a.owner, a.index_name, a.status,
2 fn_test(a.owner, a.table_name) "TBS_NAME"
3 from fn_indexes a
4 /

1072 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2626245312

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1072 | 51456 | 9 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| FN_INDEXES | 1072 | 51456 | 9 (0)| 00:00:01 |
--------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
1097 recursive calls
0 db block gets
3308 consistent gets
0 physical reads
0 redo size
36706 bytes sent via SQL*Net to client
1162 bytes received via SQL*Net from client
73 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1072 rows processed


SQL> select a.owner, a.index_name, a.status,
2 (select tablespace_name from fn_tables b
3 where b.owner=a.owner
4 and a.table_name=b.table_name) "TBS_NAME"
5 from fn_indexes a
6 /

1072 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 321380953

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1072 | 51456 | 9| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| FN_TABLES | 1 | 29 | 2| 00:00:01 |
|* 2 | INDEX RANGE SCAN | FN_IND_TABLES| 1 | | 1| 00:00:01 |
| 3 | TABLE ACCESS FULL | FN_INDEXES | 1072 | 51456 | 9| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("B"."OWNER"=:B1)
2 - access("B"."TABLE_NAME"=:B1)


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1573 consistent gets
0 physical reads
0 redo size
36706 bytes sent via SQL*Net to client
1162 bytes received via SQL*Net from client
73 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1072 rows processed


SQL> select a.owner, a.index_name, a.status, b.tablespace_name "TBS_NAME"
2 from fn_indexes a,
3 fn_tables b
4 where b.owner(+) = a.owner
5 and b.table_name(+) = a.table_name
6 /

1072 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 893717710

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1072 | 82544 | 18 (6)| 00:00:01 |
|* 1 | HASH JOIN RIGHT OUTER| | 1072 | 82544 | 18 (6)| 00:00:01 |
| 2 | TABLE ACCESS FULL | FN_TABLES | 879 | 25491 | 8 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | FN_INDEXES | 1072 | 51456 | 9 (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("B"."OWNER"(+)="A"."OWNER" AND
"B"."TABLE_NAME"(+)="A"."TABLE_NAME")


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
135 consistent gets
0 physical reads
0 redo size
36706 bytes sent via SQL*Net to client
1162 bytes received via SQL*Net from client
73 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1072 rows processed


Place all above required values in tabular format.


Query No. Query Criteria Cost Recursive Calls Consistent Gets
1 With Function 9 1097 3308
2 With Sub-Query 9 0 1573
3 With Simple Query 18 0 135


As shown in above table, though the cost of the third query is high, the number of consistent gets has been reduced tremendously. When we are using function, the recursive calls and consistent gets are very high. But when using sub-query both the values have been reduced but this query also suffers from performance compared to third query.

Recently I had got the following query from one of our customer. Due to confidentiality, I will not be posting the original query from the customer site but in this example I have made similar kind of query.


create or replace function get_minamount (cust number)
return number is
amt number;

begin
select min (amount) into amt
from contract
where customer=cust and status = 'a';

if (amt is null) then
select min (amount) into amt
from contract
where customer=cust and status = 'd';
end if;

if (amt is null) then
select min (amount) into amt
from contract
where customer=cust and status = 's';
end if;

Return amt;

End;
/


The main query is as follows.

select customer, name, address,
get_minamount (customer) "MIN_AMT"
from customer_master
where custcode is like 'Large%';


The customer_master is having millions of records and to execute this query it was taking around 8 Hrs. The given suggestion on this query is as follows.


select customer, name, address, min_amt, status
from (select a.customer, name, address, status,
min(case when status in ('a','s','d') then amount end)
over (partition by b.customer, b.status) "MIN_AMT",
rank() over (partition by b.customer
order by (case when status = 'a' and amount is not null then 1
when status = 's' and amount is not null then 2
when status = 'd' and amount is not null then 3
end), rownum) "RNK"
from customer_master a,
contract b
where a.customer = b.customer(+))
where rnk = 1;


After implementing the query the performance gain was very high. The modified query is taking around 20 minutes to execute against 8 Hrs.

To sum up, the user-defined functions may contribute in performance degradation of the query. The simple query will give major performance benefits against the query with functions. The performance of the query is inversely proportional to the number of consistent gets. Hence to improve the performance, we must minimize the consistent gets.

No comments: