Let's take a look at the simple table :
create table testts (ide int, name varchar2(100)) create index ideindx on testts(ide)and simple statement:
create table testts (ide int, name varchar2(100)) select * from testts where ide=10In order to execute this statement DB2 (like any other database engine) creates "access plan". Access plan is simply SQL statement decomposed into more atomic operations like "scan table", "apply predicate", "look up in the index", "acquire lock" etc. The simple SQL statement above can be executed in two ways:
- Scan the table and find rows which fulfill the predicate.
- Look up the index and then pick up row(s) from the table.
Problem
But take a closer look at this simple example.
To populate this table with some test data I created a simple procedure
CREATE OR REPLACE PROCEDURE INSERTS (NUMB IN INT) AS MAXI INT; I INT; BEGIN SELECT MAX(IDE) INTO MAXI FROM TESTTS; IF MAXI IS NULL THEN MAXI := 0; END IF; FOR I IN 1..NUMB LOOP INSERT INTO TESTTS VALUES(MAXI + I, 'AUTO NAME ' || (MAXI + I)); END LOOP; END;Add one record to the table and check access plan for this statement.
call inserts(1) runstats on table testts explain plan for select * from testts where ide=10Using db2exfmt utility we can export access plan in human readable format. It is available here.
Access Plan: ----------- Total Cost: 6.87148 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 1 TBSCAN ( 2) 6.87148 1 | 1 TABLE: DB2INST3 TESTTS Q1But there there is a surprise. Although index was created, DB2 optimizer decided that table scanning is a better option. After adding more rows:
call inserts(10) runstats on table testtsthe access plan is as one could expect.
Total Cost: 6.87692 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 1 FETCH ( 2) 6.87692 1 /----+----\ 1 11 IXSCAN TABLE: DB2INST3 ( 3) TESTTS 0.0538374 Q1 0 | 11 INDEX: DB2INST3 IDEINDX Q1What's going on
Obviously DB2 decides that scanning table is better option for one row but worse for eleven rows. But how to check it ? By virtue of optimization guidelines (practical example is described here) we can force DB2 engine to use index for one row and scan table for eleven rows.
Access plan for one row with index lookup is available here. General cost in case of index lookup is 6.87458 and is a little greater then table scanning (6.87148). It is so because index lookup requires two steps : index lookup and later table access to pick up rows. Table scanning requires only one step. Although estimated I/O cost is the same (equal to 1), CPU consumption is higher for two steps (57021 aginsts 54482).
Access plan for eleven rows ignoring index is available here. This time total cost is equal to 6.89792 and is higher then total cost with index lookup : 6.87692. Although the dilemma is the same : two steps (index and table) against one step (table only) the optimizer decides that predicated CPU consumption for scanning through 11 rows in search for one row is higher. Cost for table scanning here is : 76152. CPU cost for index lookup and picking up one row is : 58941.
But add much more rows to the table and again check access plan for this SQL statement with and without index.
call inserts(100000) runstats on table testtsAccess plan with index lookup (default) is available here. The access plan with force table scanning is available here. The difference is overwhelming: 665.297 against 30.3052. But the main factor are not subtle differences in CPU consumption but obviously I/O cost :430 against 1. It goes without saying that I/O access with table scanning is more extensive than index lookup. This difference in also reflected in SQL real execution (DB10.1, 32 bit Ubuntu machine, 2GB memory).
(table scanning)
$ time db2 "select * from testts where ide=10" IDE NAME ----------- ---------------------------------------------------------------------------------------------------- 10 AUTO NAME 10 Wybrano rekordów: 1. real 0m0.155s user 0m0.016s sys 0m0.032s(default, index lookop)
$ time db2 "select * from testts where ide=10" IDE NAME ----------- ---------------------------------------------------------------------------------------------------- 10 AUTO NAME 10 Wybrano rekordów: 1. real 0m0.062s user 0m0.028s sys 0m0.020sExecution is more then 2 times faster with index lookup and probable more rows in the table the difference is more distinct.
Conclusion
In this simple example one can see how DB2 optimizer works and how the decision is taken which access plan to choose. But it is important to underline that the decision is based on resource consumption and it not always goes with execution speed. This distinction is very important because sometimes the best execution time is achieved by more extensive CPU consumption. It does not matter in a test environment but is not always a good decision in production, multiuser and mulitiprocess environment. This fastest and CPU thirsty access plan can starve to death the other processes.