Feeds:
Posts
Comments

Archive for the ‘Index And CBO’ Category

You know that Query optimizer is responsible to determine the best execution/explain plan. In many cases you may be astonished that the data is same in both database and you have gather statistics but both database give different execution plan. The possible reason for it it the variation of initialization parameter between two database. Now in the later section in this post we will have a look at the initialization parameter that affect the optimizer to determine execution plan.

A)Enable Query Optimizer Feature
——————————————–
OPTIMIZER_FEATURES_ENABLE Parameter
————————————————
•It is a string type parameter and takes oracle version number as argument.
•Based on this parameter settings it is determined how oracle optimizer behaves.
•Every new release of oracle version comes with new feature for optimizer. Thus new version of optimizer can collect extra features of a query based on which execution plan can changes. If you upgrade your oracle to newer version and your don’t want to change your execution plan according to new one (keep like older) then you can set this parameter to older one.
•The valid values of this parameter can be,
8.0.0 | 8.0.3 | 8.0.4 | 8.0.5 | 8.0.6 | 8.0.7 | 8.1.0 | 8.1.3 | 8.1.4 | 8.1.5 | 8.1.6 | 8.1.7 | 9.0.0 | 9.0.1 | 9.2.0 | 10.0.0 | 10.1.0 | 10.1.0.3 | 10.1.0.4 | 10.2.0.1|10.2.0.2|10.2.0.3| etc.

B)Control the Behavior of the Query Optimizer
—————————————————————————
Here is the list of initialization parameters that can be used to control the behavior of the query optimizer.

1)CURSOR_SHARING
2)DB_FILE_MULTIBLOCK_READ_COUNT
3)OPTIMIZER_INDEX_CACHING
4)OPTIMIZER_INDEX_COST_ADJ
5)OPTIMIZER_MODE
6)PGA_AGGREGATE_TARGET
7)STAR_TRANSFORMATION_ENABLED

Read Full Post »

By default, the goal of the query optimizer is the best throughput. This means that it chooses the least amount of resources necessary to process all rows accessed by the statement. Oracle can also optimize a statement with the goal of best response time. This means that it uses the least amount of resources necessary to process the first row accessed by a SQL statement.

You always choose a goal for the optimizer based on the needs of your application from two options.

For, Oracle Reports applications throughput is more important ,because the user initiating the application is only concerned with the time necessary for the application to complete. Response time is less important, because the user does not examine the results of individual statements while the application is running.

For Oracle Forms applications or SQL*Plus queries response time is important because the interactive user is waiting to see the first row or first few rows accessed by the statement.

The goal of the optimizer is affected by,

A)OPTIMIZER_MODE Initialization Parameter
B)Optimizer SQL Hints for Changing the Query Optimizer Goal
C)Query Optimizer Statistics in the Data Dictionary

A)OPTIMIZER_MODE Initialization Parameter
—————————————————————————
The OPTIMIZER_MODE initialization parameter establishes the default behavior for choosing an optimization approach for the instance. To know your current settings issue,

SQL> show parameter optimizer_mode
NAME TYPE VALUE
———————————— ———– ——————————
optimizer_mode string ALL_ROWS

This parameter can have three types of values.
1)ALL_ROWS: The optimizer uses a cost-based approach for all SQL statements in the session regardless of the presence of statistics and optimizes with a goal of best throughput (minimum resource use to complete the entire statement). This is the default value.

2)FIRST_ROWS_n: The optimizer uses a cost-based approach, regardless of the presence of statistics, and optimizes with a goal of best response time to return the first n number of rows; n can equal 1, 10, 100, or 1000.

3)FIRST_ROWS: The optimizer uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows. FIRST_ROWS is available for backward compatibility and plan stability; use FIRST_ROWS_n instead.

B)Optimizer SQL Hints for Changing the Query Optimizer Goal
——————————————————————————————-
To specify the goal of the query optimizer for an individual SQL statement, use one of the hints from FIRST_ROWS(n) or ALL_ROWS. Actually hints in an individual SQL statement override the OPTIMIZER_MODE initialization parameter for that SQL statement.

C)Query Optimizer Statistics in the Data Dictionary
—————————————————————————————
The statistics used by the query optimizer are stored in the data dictionary. If no statistics are available when using query optimization, the optimizer will do dynamic sampling depending on the setting of the OPTMIZER_DYNAMIC_SAMPLING initialization parameter. This may cause slower parse times so for best performance, the optimizer should have representative optimizer statistics.

Read Full Post »

Whenever you write an sql query, the query can be executed in many different ways, such as full table scans, index scans, nested loops, and hash joins. The query optimizer checks the various ways and determines the most efficient way to execute a SQL statement after considering many factors related to the objects referenced and the conditions specified in the query. This determination is an important step in the processing of any SQL statement and can greatly affect execution time.

The output from the optimizer is a plan that describes an optimum method of execution. The Oracle server provides query optimization.

Let’s have a look at the operations that optimizer performs while processing SQL operations.

1)Evaluation of expressions and conditions: The optimizer first evaluates expressions and conditions containing constants as fully as possible.

2)Statement transformation: For complex statements involving, for example, correlated subqueries or views, the optimizer might transform the original statement into an equivalent join statement.

3)Choice of optimizer goals: The optimizer determines the goal of optimization. This topic is discussed later.

4)Choice of access paths: For each table accessed by the statement, the optimizer chooses one or more of the available access paths to obtain table data.

5)Choice of join orders: For a join statement that joins more than two tables, the optimizer chooses which pair of tables is joined first, and then which table is joined to the result, and so on.

Besides optimizer determination, the application designer can use hints in SQL statements to instruct the optimizer about how a statement should be executed.

Note: Before oracle 10g there was available CBO and RBO in case of optimizer optimization.

CBO-Cost Based Optimizer: Execution plan is calculated by taking into account the distribution of data. Starting from 10g optimizer only use CBO while optimization.

RBO-Rule-based optimizer: Chooses an execution plan for SQL statements based on the access paths available and the ranks of these access paths. If there is more than one way, then the RBO uses the operation with the lowest rank. This feature has been desupported.

Read Full Post »

Error Description
Whenever I try to rebuild index online then it fails with message ORA-00604: error occurred at recursive SQL level 1 along with ORA-01450: maximum key length (3215) exceeded. Below is the scenario.
SQL> create table tab1(a varchar2(3000),b varchar2(2000));
Table created.

SQL> create index tab1_I on tab1(a,b);
Index created.

SQL> alter index tab1_I rebuild online;
alter index tab1_I rebuild online
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01450: maximum key length (3215) exceeded

Let’s now create one table with column length 3000+199=3199 bytes and see what happens.
SQL> create table tab3(a varchar2(3000),b varchar2(199));
Table created.

SQL> create index tab3_I on tab3(a,b);
Index created.

Try to rebuild it online and it works.
SQL> alter index tab3_I rebuild online;
Index altered.

Now just add extra 1 bytes on column b. And whenever we try to rebuild it online it will fail.

SQL> alter table tab3 modify b varchar2(200);
Table altered.

SQL> alter index tab3_I rebuild online;
alter index tab3_I rebuild online
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01450: maximum key length (3215) exceeded

Cause of the Problem
When creating a index the total length of the index cannot exceed a certain value. Primarily this value depends on DB_BLOCK_SIZE.
If 2K block size then maximum index key length=758
If 4K block size then maximum index key length=1578
If 8K block size then maximum index key length=3218
If 16K block size then maximum index key length=6498
How the maximum index key length is measured by?
Maximum index key length=Total index length (Sum of width of all indexed column+the number of indexed columns)+Length of the key(2 bytes)+ROWID(6 bytes)+the length of the rowid(1 byte)

The index key size is limited by the value of db_block_size, because a key value may not span multiple blocks. So, based on the size of the block size of index depends. In fact, it is required that any index block must contain at least TWO index entries per block.

So we can say that the maximum key length for an index will be less than half of
the DB_BLOCK_SIZE. But we know that in a block there also needed space for PCTFREE, INITRANS and space for block overhead(Block Header,ROW Directory, Table Directory, etc). After considering these bytes the actual space that can be used for the Index key is actually just over 1/3 of the DB_BLOCK_SIZE.

Solution of the Problem
The causes already indicates what might be the solutions. Solution may be,
1)Increase your database block size. Create a tablespace with bigger block size and create index on that tablespace.

2)If you have index on multiple columns then you can split index to single or 2 columns so that size does not extended over it can handle.

3)Rebuild the index without online clause. That is
ALTER INDEX index_name REBUILD;
Because The online rebuild of the index creates a journal table and index. This internal journal IOT table contains more columns in its index. This is a feature of online rebuild. This time the error arises because that current value of the initialization parameter db_block_size is not large enough to create internal journal IOT.

Expert are always welcome for their valuable comment or suggestion for the above post.

 

Read Full Post »

Out of all Oracle RDBMS modules, optimizer code is actually the most complicated code and different optimizer modes seem like jack while lifting your car in case of a puncture.

This paper focuses on how optimizer behaves differently when you have optimizer mode set to ALL_ROWS or FIRST_ROWS.

Possible values for optimizer_mode = choose/ all_rows/ first_rows/ first_rows[n]

By default, the value of optimizer_mode is CHOOSE which basically means ALL_ROWS (if statistics on underlying tables exist) else RULE (if there are no statistics on underlying tables). So it is very important to have statistics collected on your tables on regular intervals or else you are living in Stone Age.

FIRST_ROWS and ALL_ROWS are both cost based optimizer features. You may use them according to their requirement.

FIRST_ROWS/ FIRST_ROWS[n]

In simple terms it ensures best response time of first few rows (n rows).

This mode is good for interactive client-server environment where server serves first few rows and by the time user scroll down for more rows, it fetches other. So user feels that he has been served the data he requested, but in reality the request is still pending and query is still fetching the data in background.

Best example for this is toad, if you click on data tab, it instantaneously start showing you data and you feel toad is faster than sqlplus, but the fact is if you scroll down, you will see the query is still running.

Ok, let us simulate this on SQLPLUS

Create a table and index over it:
SQL> create table test as select * from all_objects;

Table created.

SQL> create index test_in on test(object_type);

Index created.

SQL> exec dbms_stats.gather_table_stats(‘SAC’,’TEST’)

PL/SQL procedure successfully completed.

SQL> select count(*) from test;

COUNT(*)
———-
37944

SQL> select count(*) from test where object_type=’JAVA CLASS’;

COUNT(*)
———-
14927

You see out of almost 38k records, 15k are of JAVA class. And now if you select the rows having object_type=’JAVA_CLASS’, it should not use index as almost half of the rows are JAVA_CLASS. It will be foolish of optimizer to read the index first and then go to table.

Check out the Explain plans

SQL> set autotrace traceonly exp
SQL> select * from test where object_type=’JAVA CLASS’;

Execution Plan
———————————————————-
Plan hash value: 1357081020

————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 1001 | 94094 | 10 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 1001 | 94094 | 10 (0)| 00:00:01 |
————————————————————————–

As you see above, optimizer has not used Index we created on this table.

Now use FIRST_ROWS hint:
SQL> select /*+ FIRST_ROWS*/ * from test where object_type=’JAVA CLASS’;

Execution Plan
———————————————————-
Plan hash value: 3548301374

—————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————————
| 0 | SELECT STATEMENT | | 14662 | 1345K| 536 (1)| 00:00:07 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 14662 | 1345K| 536 (1)| 00:00:07 |
|* 2 | INDEX RANGE SCAN | TEST_IN | 14662 | | 43 (3)| 00:00:01 |
—————————————————————————————

In this case, optimizer has used the index.

Q> Why?

Ans> Because you wanted to see first few rows quickly. So, following your instructions oracle delivered you first few rows quickly using index and later delivering the rest.

See the difference in cost, although the response time (partial) of second query was faster but resource consumption was high.

But that does not mean that this optimizer mode is bad. As I said this mode may be good for interactive client-server model. In most of OLTP systems, where users want to see data fast on their screen, this mode of optimizer is very handy.

Important facts about FIRST_ROWS

  1. It gives preference to Index scan Vs Full scan (even when index scan is not good).
  2. It prefers nested loop over hash joins because nested loop returns data as selected (& compared), but hash join hashes one first input in hash table which takes time.
  3. Cost of the query is not the only criteria for choosing the execution plan. It chooses plan which helps in fetching first rows fast.
  4. It may be a good option to use this in an OLTP environment where user wants to see data as early as possible.

 


ALL_ROWS

In simple terms, it means better throughput

While FIRST_ROWS may be good in returning first few rows, ALL_ROWS ensures the optimum resource consumption and throughput of the query. In other words, ALL_ROWS is better to retrieve the last row first.

In above example while explaining FIRST_ROWS, you have already seen how efficient ALL_ROWS is.

Important facts about ALL_ROWS

  1. ALL_ROWS considers both index scan and full scan and based on their contribution to the overall query, it uses them. If Selectivity of a column is low, optimizer may use index to fetch the data (for example ‘where employee_code=7712’), but if selectivity of column is quite high (‘where deptno=10’), optimizer may consider doing Full table scan. With ALL_ROWS, optimizer has more freedom to its job at its best.
  2. Good for OLAP system, where work happens in batches/procedures. (While some of the report may still use FIRST_ROWS depending upon the anxiety level of report reviewers)
  3. Likes hash joins over nested loop for larger data sets.

Conclusion

Cost based optimizer gives you flexibility to choose response time or throughput. So use them based on your business requirement.

Read Full Post »