Saturday, December 29, 2007

Index Internals.

Tables can grow large, and when they do, it becomes difficult for users to quickly find the data they need. Oracle offers indexes as a method of speeding database performance when accessing tables with a lot of data. The b-tree index is the traditional and most useful indexing mechanism. It stores the data in a tree like structure.

The b-tree index structure is made up of root block, branch blocks and leaf blocks. The root block is an entry point where search for data in index starts. Any index contains only and only one root block. The root block is always physically the same block. The branch blocks are next level of the root block that is having pointers to leaf blocks in the index.

The leaf blocks are the highest level of the index, which contain indexed column values and the corresponding ROWIDs. Each leaf block is comprised of double-linked list structure. It means each leaf block is linked to the other block on its left and right, in order to make it possible to search in both the directions through a range of values in index. The index entries are always in ordered.

The oracle's index is always maintains the balanced structure. To understand this, it is necessary to understand block split operation in index. There are two ways of block splits

50-50 BLOCK SPLIT

The 50-50-block split can occur when there is an insert operation of a non-maximum value and when the corresponding block is full. The indexed column update operation for an index is internally delete followed by an insert. The split operation steps are as follows –

1. Request for new block from free-list/bitmap structure (Depending on non-ASSM and ASSM tablespace option)
2. Distribute existing block so that upper half volume of an index move to the new requested block
3. Insert the column value in appropriate block.
4. Update the leaf block pointers such that previously full block right pointer will point to the new block and new block’s right pointer will point to the right pointed block of previously full block.
5. Finally update the branch block to reference previous full block and add a new entry for to point to new leaf block.

The similar kind of operation is applicable to the branch and root block split. Even branch and root block split is more expensive as it involves corresponding next level pointer updations. Root block split allocate two new blocks wherein data is evenly distributed and root block is updated such that it will now point to these new blocks. So root block will always physically the same block. The root block split can increase the height of the index by 1.

90-10 BLOCK SPLIT

The 90-10 block split can occur, when the new indexed column entry is the maximum value. In this case, new block will be requested and corresponding branch blocks are updated accordingly.

Can deleted space of an index be reused?

There are multiple answers to this question –
1. Index will never use deleted space.
2. Index will use deleted space if the same column value is inserted again.

But in reality, both the above statements are myth. Index will use deleted space even when the new inserted value is not same.

Test Case – 

To validate above statemenet, we will create the test table and insert some records into it. Here temp table contains the serial values.


SQL> create table temp as select rownum "A" from dba_objects a, dba_objects;

Table created.


SQL> create table test (a number, b number);

Table created.

SQL> create index ind_test on test(a);

Index created.

SQL> insert into test select a, a+50000 from temp where a >10000 and a<=20000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> analyze index ind_test validate structure;

Index analyzed.


SQL> select name, lf_rows, del_lf_rows, used_space
From index_stats where name='IND_TEST';

NAME LF_ROWS DEL_LF_ROWS USED_SPACE
------------------------------ ---------- ----------- ----------
IND_TEST 10000 0 160127


As LF_ROWS column shows, there are 10000 records present in the index. Now check the values after deletion of some records.


SQL> delete from test where a > 14000 and a <= 16000;

2000 rows deleted.


SQL> commit;

Commit complete.


SQL> analyze index ind_test validate structure;


Index analyzed.


SQL> select name, lf_rows, del_lf_rows, used_space from index_stats
where name='IND_TEST';


NAME LF_ROWS DEL_LF_ROWS USED_SPACE
---------- ---------- ----------- ----------
IND_TEST 10000 2000 160127


SQL> select count(*) from test;

COUNT(*)
----------
8000


Here the number of rows (LF_ROWS – DEL_LF_ROWS) = (10000-2000) = 8000 which matches the count of the test table.

To check whether index uses the deleted space, we will add new records which are not same as the current or deleted rows.


SQL> insert into test select a, a+50000 from temp where a > 20000 and a<=22000;

2000 rows created.


SQL> commit;

Commit complete.


SQL> analyze index ind_test validate structure;

Index analyzed.



SQL> select name, lf_rows, del_lf_rows, used_space from index_stats
where name='IND_TEST';


NAME LF_ROWS DEL_LF_ROWS USED_SPACE
---------- ---------- ----------- ----------
IND_TEST 10500 500 168146


SQL>
SQL> select count(1) from test;

COUNT(1)
----------
10000



The total added rows are 2000. As the del_lf_rows column shows, there are only 500 deleted records found. Which means, out of 2000 deleted records, the index has used deleted space of 1500 records. Now check the status after insertion of the some previously deleted records. We had deleted records between 14000 and 16000 values, and now will insert any of these 500 values.


SQL> insert into test select a, a+1000
From temp where a > 14500 and a <= 15500 and mod(a,2)=0;

500 rows created.


SQL> commit;

Commit complete.


SQL> analyze index ind_test validate structure;

Index analyzed.


SQL> select name, lf_rows, del_lf_rows, used_space
From index_stats where name='IND_TEST';


NAME LF_ROWS DEL_LF_ROWS USED_SPACE
---------- ---------- ----------- ----------
IND_TEST 10500 0 168141



The above statistics shows that index will always use the deleted space.

Conclusion –

1. Rows deletions will either make the blocks empty or some space in block.
2. The empty blocks can be used for any rows irrespective of previous deleted column values.
3. If there is space available in existing blocks, space can be used if it satifies the column value between lower and upper existing values in that block.

In many cases, it has also been observed that DBA's rebuild indexes to reclaim deleted space, but the test case above shows that the deleted space is reclaimed and hence does not require any rebuilding. Hence, if this is one of the primary reason to rebuild the indexes, then, dba's now can wonder whether do they really require index rebuilding?