
Archive for January, 2011

By default, the goal of the query optimizer is the best throughput. This means that it uses the least amount of resources necessary to process all rows accessed by the statement. Oracle can also optimize a statement with the goal of best response time. This means that it uses the least amount of resources necessary to process the first row accessed by a SQL statement.

You choose a goal for the optimizer from these two options based on the needs of your application.

For Oracle Reports applications, throughput is more important, because the user initiating the application is only concerned with the time necessary for the application to complete. Response time is less important, because the user does not examine the results of individual statements while the application is running.

For Oracle Forms applications or SQL*Plus queries, response time is important, because the interactive user is waiting to see the first row or first few rows accessed by the statement.

The goal of the optimizer is affected by,

A)OPTIMIZER_MODE Initialization Parameter
B)Optimizer SQL Hints for Changing the Query Optimizer Goal
C)Query Optimizer Statistics in the Data Dictionary

A)OPTIMIZER_MODE Initialization Parameter
—————————————————————————
The OPTIMIZER_MODE initialization parameter establishes the default behavior for choosing an optimization approach for the instance. To know your current setting, issue,

SQL> show parameter optimizer_mode
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_mode                       string      ALL_ROWS

This parameter can have three types of values.
1)ALL_ROWS: The optimizer uses a cost-based approach for all SQL statements in the session regardless of the presence of statistics and optimizes with a goal of best throughput (minimum resource use to complete the entire statement). This is the default value.

2)FIRST_ROWS_n: The optimizer uses a cost-based approach, regardless of the presence of statistics, and optimizes with a goal of best response time to return the first n number of rows; n can equal 1, 10, 100, or 1000.

3)FIRST_ROWS: The optimizer uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows. FIRST_ROWS is available for backward compatibility and plan stability; use FIRST_ROWS_n instead.
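The instance-wide default can be overridden for a session. A quick sketch, assuming you have the ALTER SESSION privilege:

```sql
-- Optimize the current session for fast delivery of the first 10 rows
ALTER SESSION SET OPTIMIZER_MODE = FIRST_ROWS_10;

-- Revert to the default throughput-oriented behavior
ALTER SESSION SET OPTIMIZER_MODE = ALL_ROWS;
```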

B)Optimizer SQL Hints for Changing the Query Optimizer Goal
——————————————————————————————-
To specify the goal of the query optimizer for an individual SQL statement, use one of the hints FIRST_ROWS(n) or ALL_ROWS. A hint in an individual SQL statement overrides the OPTIMIZER_MODE initialization parameter for that statement.
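For example, a statement can be optimized for fast delivery of its first 10 rows even when the instance runs with ALL_ROWS (the EMP table and its columns here are illustrative):

```sql
-- Optimizer goal for this statement only: best response time for 10 rows
SELECT /*+ FIRST_ROWS(10) */ empno, ename
FROM   emp
ORDER BY empno;

-- The ALL_ROWS hint works the same way for best throughput
SELECT /*+ ALL_ROWS */ empno, ename FROM emp;
```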

C)Query Optimizer Statistics in the Data Dictionary
—————————————————————————————
The statistics used by the query optimizer are stored in the data dictionary. If no statistics are available when using query optimization, the optimizer will do dynamic sampling depending on the setting of the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter. This may cause slower parse times, so for best performance the optimizer should have representative optimizer statistics.
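You can check the current dynamic sampling level and, if needed, change it at the session level; a sketch:

```sql
-- In SQL*Plus: show parameter optimizer_dynamic_sampling
-- Level 2 (the 10g default) samples all tables that lack statistics
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 2;
```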


Whenever you write a SQL query, the query can be executed in many different ways, such as full table scans, index scans, nested loops, and hash joins. The query optimizer checks the various ways and determines the most efficient way to execute a SQL statement after considering many factors related to the objects referenced and the conditions specified in the query. This determination is an important step in the processing of any SQL statement and can greatly affect execution time.

The output from the optimizer is a plan that describes an optimum method of execution. The Oracle server provides query optimization.

Let’s have a look at the operations that the optimizer performs while processing SQL statements.

1)Evaluation of expressions and conditions: The optimizer first evaluates expressions and conditions containing constants as fully as possible.

2)Statement transformation: For complex statements involving, for example, correlated subqueries or views, the optimizer might transform the original statement into an equivalent join statement.

3)Choice of optimizer goals: The optimizer determines the goal of optimization. This topic is discussed later.

4)Choice of access paths: For each table accessed by the statement, the optimizer chooses one or more of the available access paths to obtain table data.

5)Choice of join orders: For a join statement that joins more than two tables, the optimizer chooses which pair of tables is joined first, and then which table is joined to the result, and so on.

Besides optimizer determination, the application designer can use hints in SQL statements to instruct the optimizer about how a statement should be executed.

Note: Before Oracle 10g, two optimizers were available: the CBO and the RBO.

CBO-Cost Based Optimizer: The execution plan is calculated by taking into account the distribution of data. Starting with 10g, the optimizer uses only the CBO.

RBO-Rule-based optimizer: Chooses an execution plan for SQL statements based on the access paths available and the ranks of these access paths. If there is more than one way, then the RBO uses the operation with the lowest rank. This feature has been desupported.


When Oracle encounters a table with missing statistics, it dynamically gathers the statistics needed by the optimizer. Based on the setting of the OPTIMIZER_DYNAMIC_SAMPLING parameter, which is discussed in my previous topic, the optimizer gathers statistics and generates an execution plan for the table with missing statistics.

Note, however, that Oracle does not perform dynamic sampling for remote tables and external tables.

When dynamic sampling is disabled or cannot be performed, the optimizer uses default values for its statistics.

The values are listed below.

A)Default Table Values When Statistics Are Missing
—————————————————————–
1)Cardinality=num_of_blocks * (block_size – cache_layer) / avg_row_len
2)Average row length=100 bytes
3)Number of blocks=100 or actual value based on the extent map
4)Remote cardinality=2000 rows
5)Remote average row length=100 bytes

B)Default Index Values When Statistics Are Missing
——————————————————————
1)Levels=1
2)Leaf blocks=25
3)Leaf blocks/key=1
4)Data blocks/key=1
5)Distinct keys=100
6)Clustering factor=800
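To see which of your tables would fall back on these defaults, you can look for segments that have never been analyzed; a minimal sketch:

```sql
-- Tables in the current schema with no optimizer statistics
SELECT table_name
FROM   user_tables
WHERE  last_analyzed IS NULL;
```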


•Before 10g you enabled DBMS_STATS to automatically gather statistics for a table by specifying the MONITORING keyword in the CREATE (or ALTER) TABLE statement. It is good to remember that starting with Oracle Database 10g, the MONITORING and NOMONITORING keywords have been deprecated and statistics are collected automatically. If you do specify these keywords, they are ignored.

•The GATHER_STATS_JOB job automatically gathers optimizer statistics.

•This job gathers statistics on all objects in the database which have either
-Missing statistics or
-Stale statistics

•This job is created automatically at database creation time and is managed by the Scheduler. It gathers statistics for objects whose statistics are missing or stale.

•In fact the Scheduler runs this job when the maintenance window is opened. By default, the maintenance window opens every night from 10 P.M. to 6 A.M. and all day on weekends.

•The stop_on_window_close attribute controls whether the GATHER_STATS_JOB continues when the maintenance window closes. The default setting for the stop_on_window_close attribute is TRUE, causing Scheduler to terminate GATHER_STATS_JOB when the maintenance window closes. The remaining objects are then processed in the next maintenance window.
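If you want the job to continue past the window close instead, the attribute can be changed through DBMS_SCHEDULER (this sketch assumes you have the MANAGE SCHEDULER privilege):

```sql
-- Let GATHER_STATS_JOB run to completion even after the window closes
BEGIN
  DBMS_SCHEDULER.SET_ATTRIBUTE(
    name      => 'GATHER_STATS_JOB',
    attribute => 'stop_on_window_close',
    value     => FALSE);
END;
/
```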

•Database automatically collects statistics on database objects when the object has no previously gathered statistics or the existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows).

•If the parameter STATISTICS_LEVEL is set to TYPICAL or ALL, the database automatically gathers statistics for objects with stale statistics. If it is set to BASIC, the automatic statistics gathering job is not able to detect stale statistics.

•To know about the job GATHER_STATS_JOB issue the following query,
SQL>select JOB_TYPE,SCHEDULE_TYPE,START_DATE,REPEAT_INTERVAL,END_DATE,ENABLED,STATE,RUN_COUNT FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB';

JOB_TYPE   SCHEDULE_TYP START_DATE REPEA END_DATE   ENABL STATE     RUN_COUNT
---------- ------------ ---------- ----- ---------- ----- --------- ----------
           WINDOW_GROUP                             TRUE  SCHEDULED         31

To know the database creation date issue,
SQL> select created, sysdate from v$database;
CREATED SYSDATE
——— ———
06-MAY-08 12-JUN-08

So between May 6 and June 12, 2008, this job ran 31 times.

•In order to determine whether or not a given database object needs new database statistics, Oracle provides a table monitoring facility. If STATISTICS_LEVEL is set to TYPICAL or ALL then monitoring is enabled. Monitoring tracks the approximate number of INSERTs, UPDATEs, and DELETEs for that table, as well as whether the table has been truncated, since the last time statistics were gathered. The information about changes of tables can be viewed in the USER_TAB_MODIFICATIONS view.

For example, you can query,
select TABLE_NAME,INSERTS,UPDATES,DELETES from USER_TAB_MODIFICATIONS;

•Following a data modification, there may be a few minutes' delay while Oracle propagates the information to this view. Use the DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO procedure to immediately reflect the outstanding monitored information kept in memory.

Example:
—————
SQL> insert into test values('hi');
1 row created.

SQL> select TABLE_NAME,INSERTS,UPDATES,DELETES from USER_TAB_MODIFICATIONS where table_name='TEST';
no rows selected

SQL> exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
PL/SQL procedure successfully completed.

SQL> select TABLE_NAME,INSERTS,UPDATES,DELETES from USER_TAB_MODIFICATIONS where table_name='TEST';
TABLE_NAME INSERTS UPDATES DELETES
—————————— ———- ———- ———-
TEST 1 0 0

After you analyze the table, the counters disappear,
SQL> analyze table test estimate statistics;
Table analyzed.

SQL> select TABLE_NAME,INSERTS,UPDATES,DELETES from USER_TAB_MODIFICATIONS where table_name='TEST';
no rows selected

•If a monitored table has been modified more than 10%, then these statistics are considered stale and gathered again.
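In 10g and later you can ask the dictionary directly which tables are currently considered stale; flush the monitoring information first so the answer is current:

```sql
-- In SQL*Plus: exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
-- STALE_STATS is 'YES' once roughly 10% of the rows have changed
SELECT table_name, stale_stats
FROM   user_tab_statistics;
```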


•Optimizer statistics are the collection of data that describe more details about the database and the objects in the database.

•These statistics are used by the query optimizer to choose the best execution plan for each SQL statement.

•These statistics include tables, column, system, index level statistics. The following is listed with attributes that query optimizer use.

A)Table statistics
————————-
Number of rows
Number of blocks
Average row length

B)Column statistics
——————————
Number of distinct values (NDV) in column
Number of nulls in column
Data distribution (histogram)

C)Index statistics
——————————–
Number of leaf blocks
Levels
Clustering factor

D)System statistics
———————————-
I/O performance and utilization
CPU performance and utilization

•These optimizer statistics are stored in data dictionary views such as DBA_TABLES, DBA_TAB_STATISTICS, DBA_INDEXES, DBA_IND_STATISTICS etc.

•Statistics are maintained automatically by Oracle or you can maintain the optimizer statistics manually using the DBMS_STATS package.
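For example, the table-level attributes listed above can be read straight from the dictionary (replace SCOTT with your own schema):

```sql
SELECT table_name, num_rows, blocks, avg_row_len, last_analyzed
FROM   dba_tables
WHERE  owner = 'SCOTT';
```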


Automatic statistics gathering should be sufficient for most database objects that are modified at a moderate rate. However, there are cases where it may not be adequate. Because the automatic statistics gathering runs during an overnight batch window, the statistics on tables which are significantly modified during the day may become stale.

There may be two scenarios in this case.
•Volatile tables that are being deleted or truncated and rebuilt during the course of the day.
•Objects which are the target of large bulk loads which add 10% or more to the object’s total size.

So you may wish to manually gather statistics on those objects so that the optimizer can choose the best execution plan. There are two ways to gather statistics.

1)Using DBMS_STATS package.
2)Using ANALYZE command.

1)Using DBMS_STATS package
——————————————–
The DBMS_STATS package has several procedures which help to generate statistics.

1)GATHER_DATABASE_STATS Procedures-Gathers statistics for all objects in the database

2)GATHER_DICTIONARY_STATS Procedure-Gathers statistics for dictionary schemas ‘SYS’, ‘SYSTEM’ and schemas of RDBMS components.

3)GATHER_FIXED_OBJECTS_STATS Procedure-Gathers statistics of fixed objects.

4)GATHER_INDEX_STATS Procedure-Gathers index statistics.

5)GATHER_SCHEMA_STATS Procedures-Gathers statistics for all objects in a schema.

6)GATHER_SYSTEM_STATS Procedure-Gathers system statistics.

7)GATHER_TABLE_STATS Procedure-Gathers table and column (and index) statistics.

8)GENERATE_STATS Procedure-Generates object statistics from previously collected statistics of related objects. The currently supported objects are only b-tree and bitmap indexes.

Example:
To gather statistics for all objects in the A schema, use
SQL>EXEC DBMS_STATS.GATHER_SCHEMA_STATS('A');
PL/SQL procedure successfully completed.

To gather statistics for the TEST table in the SAM schema, use,
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SAM','TEST');
PL/SQL procedure successfully completed.
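GATHER_TABLE_STATS also accepts optional parameters that control sampling and whether index statistics are gathered; a sketch, reusing the schema and table from the example above:

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SAM',
    tabname          => 'TEST',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,  -- let Oracle pick the sample size
    cascade          => TRUE);                        -- gather index statistics too
END;
/
```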

2)Using Analyze:
————————
Oracle strongly recommends not using the ANALYZE command to gather statistics; it is supported only for backward compatibility. To generate statistics for the TEST table using ANALYZE, use the COMPUTE STATISTICS (or ESTIMATE STATISTICS) clause.

In this example I verify that the NUM_ROWS column of USER_TABLES is updated after analyzing.

SQL> SELECT NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME='TEST';
NUM_ROWS
———-
2
SQL> INSERT INTO TEST VALUES('before');
1 row created.

SQL> SELECT NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME='TEST';
NUM_ROWS
———-
2

SQL> ANALYZE TABLE TEST COMPUTE STATISTICS;
Table analyzed.

SQL> SELECT NUM_ROWS FROM USER_TABLES WHERE TABLE_NAME='TEST';
NUM_ROWS
———-
3

After gathering statistics, NUM_ROWS contains accurate information.


1)Verify Optimizer Statistics:
——————————————–
The query optimizer uses statistics gathered on tables and indexes when determining the optimal execution plan. If these statistics have not been gathered, or if the statistics are no longer representative of the data stored within the database, then the optimizer does not have sufficient information to generate the best plan.

So, gather statistics for all tables that are involved in your SQL statements. You can check whether your statistics are up to date by comparing
SELECT COUNT(*) FROM table_name;
and,
select NUM_ROWS from dba_tables where table_name='TABLE_NAME';

If the results are almost the same, you have correct optimizer statistics. If they don't match, gather new statistics.

2)Review the Execution Plan:
—————————————
When tuning (or writing) a SQL statement, the goal is to drive from the table that has the most selective filter. This means that there are fewer rows passed to the next step. If the next step is a join, then this means that fewer rows are joined. Check to see whether the access paths are optimal.

We can check this by examining the optimizer execution plan for the following,

•The plan is such that the driving table has the best filter.

•The join order in each step means that the fewest number of rows are being returned to the next step (that is, the join order should reflect, where possible, going to the best not-yet-used filters).

•The join method is appropriate for the number of rows being returned. For example, nested loop joins through indexes may not be optimal when many rows are being returned.

•Views are used efficiently. Look at the SELECT list to see whether access to the view is necessary.

•There are no unintentional Cartesian products (even with small tables).

•Each table is being accessed efficiently:

-Consider the predicates in the SQL statement and the number of rows in the table. Look for suspicious activity, such as a full table scan on a table with a large number of rows that has predicates in the WHERE clause. Determine why an index is not used for such a selective predicate.

-A full table scan does not necessarily mean inefficiency. It might be more efficient to perform a full table scan on a small table, or to perform a full table scan to leverage a better join method (for example, a hash join) for the number of rows returned.

If any of these conditions are not optimal, then consider restructuring the SQL statement or the indexes available on the tables.

3)Restructuring the SQL Statements
———————————————-
Often, rewriting an inefficient SQL statement is easier than modifying it. If you understand the purpose of a given statement, then you might be able to quickly and easily write a new statement that meets the requirement.

While restructuring SQL statements, keep the following issues in mind.

•Use equijoins whenever possible.
That is, compose predicates using AND and =.

•Avoid Transformed Columns in the WHERE Clause.
That is use
a=b instead of to_number(a)=to_number(b)

When you need to use SQL functions on filters or join predicates, do not apply them to the columns on which you want to have an index; rather, use them on the opposite side of the predicate. For example, if you have an index on varcol, write

TO_CHAR(numcol) = varcol

rather than

varcol = TO_CHAR(numcol)

•Write Separate SQL Statements for Specific Tasks.
SQL is not a procedural language. Using one piece of SQL to do many different things usually results in a less-than-optimal result for each task. If you want SQL to accomplish different things, then write various statements, rather than writing one statement to do different things depending on the parameters you give it.

It is always better to write separate SQL statements for different tasks, but if you must use one SQL statement, then you can make a very complex statement slightly less complex by using the UNION ALL operator.

•Use of EXISTS versus IN for Subqueries.
In certain circumstances, it is better to use IN rather than EXISTS. In general, if the selective predicate is in the subquery, then use IN. If the selective predicate is in the parent query, then use EXISTS.
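As a sketch, using the familiar EMP and DEPT demo tables (illustrative only):

```sql
-- Selective predicate in the subquery: prefer IN
SELECT e.ename
FROM   emp e
WHERE  e.deptno IN (SELECT d.deptno FROM dept d WHERE d.loc = 'DALLAS');

-- Selective predicate in the parent query: prefer EXISTS
SELECT e.ename
FROM   emp e
WHERE  e.sal > 5000
AND    EXISTS (SELECT 1 FROM dept d WHERE d.deptno = e.deptno);
```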

4)Controlling the Access Path and Join Order with Hints
————————————————————————————–
You can use hints in SQL statements to instruct the optimizer about how the statement should be executed. Hints, such as /*+ FULL */, control access paths.

Join order can have a significant effect on performance. The main objective of SQL tuning is to avoid performing unnecessary work to access rows that do not affect the result. This leads to three general rules:

•Avoid a full-table scan if it is more efficient to get the required rows through an index.

•Avoid using an index that fetches 10,000 rows from the driving table if you could instead use another index that fetches 100 rows.

•Choose the join order so as to join fewer rows to tables later in the join order.

•Be careful when joining views, when performing outer joins to views, and when reusing an existing view for a new purpose.

•Joins to complex views are not recommended, particularly joins from one complex view to another. Often this results in the entire view being instantiated, and then the query is run against the view data.

•Beware of writing a view for one purpose and then using it for other purposes to which it might be ill-suited. Querying from a view requires all tables from the view to be accessed for the data to be returned. Before reusing a view, determine whether all tables in the view need to be accessed to return the data. If not, then do not use the view. Instead, use the base table(s), or if necessary, define a new view. The goal is to refer to the minimum number of tables and views necessary to return the required data.

•An outer join within a view is problematic because the performance implications of the outer join are not visible.

•Consider using materialized views.

5)Restructuring the Indexes
———————————————–
Often, there is a beneficial impact on performance by restructuring indexes. This can involve the following:

Remove nonselective indexes to speed the DML.
Index performance-critical access paths.
Consider reordering columns in existing concatenated indexes.
Add columns to the index to improve selectivity.

6)Modifying or Disabling Triggers and Constraints
———————————————————–
Using triggers consumes system resources. If you use too many triggers, then you can find that performance is adversely affected and you might need to modify or disable them.

7)Restructuring the Data
—————————————–
After restructuring the indexes and the statement, you can consider restructuring the data.

Introduce derived values. Avoid GROUP BY in response-critical code.

Review your data design. Change the design of your system if it can improve performance.

Consider partitioning, if appropriate.

8)Combine Multiples Scans with CASE Statements
———————————————————–
Often, it is necessary to calculate different aggregates on various sets of tables. Usually, this is done with multiple scans on the table, but it is easy to calculate all the aggregates with one single scan. Eliminating n-1 scans can greatly improve performance.
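For example, instead of scanning the EMP demo table once per salary band (the table and bands here are illustrative), a single scan with CASE expressions computes all the aggregates:

```sql
-- One pass over EMP instead of three separate scans
SELECT COUNT(CASE WHEN sal < 2000                 THEN 1 END) AS low_cnt,
       COUNT(CASE WHEN sal BETWEEN 2000 AND 4000  THEN 1 END) AS mid_cnt,
       COUNT(CASE WHEN sal > 4000                 THEN 1 END) AS high_cnt
FROM   emp;
```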

9)Maintaining Execution Plans Over Time
——————————————————-
You can maintain the existing execution plan of SQL statements over time either using stored statistics or stored SQL execution plans. Storing optimizer statistics for tables will apply to all SQL statements that refer to those tables. Storing an execution plan (that is, plan stability) maintains the plan for a single SQL statement. If both statistics and a stored plan are available for a SQL statement, then the optimizer uses the stored plan.


While tuning a SQL statement, you need to identify the high-load SQL statements along with their associated indexes. The following section lists the information that needs to be gathered during tuning.

•Complete SQL text from V$SQLTEXT

•Structure of the tables (for example, from SQL*Plus: DESCRIBE table_name;)

•Definitions of any indexes (columns, column orderings), and whether the indexes are unique or non-unique.

•Optimizer statistics for the segments (including the number of rows in each table and the selectivity of the index columns), including the date when the segments were last analyzed.

•Definitions of any views referred to in the SQL statement.

•Repeat the previous three steps for any tables referenced in the view definitions found in the previous step.

•Optimizer plan for the SQL statement (either from EXPLAIN PLAN, V$SQL_PLAN, or the TKPROF output).

•Any previous optimizer plans for that SQL statement.

It is important to generate and review execution plans for all of the key SQL statements in your application. Doing so lets you compare the optimizer execution plan of a SQL statement when the statement performed well against the plan when the statement is not performing well. This comparison, along with information such as changes in data volumes, can assist in identifying the cause of performance degradation.
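The optimizer plan mentioned above can be generated and displayed with EXPLAIN PLAN and DBMS_XPLAN; for example, for an illustrative query:

```sql
EXPLAIN PLAN FOR
  SELECT * FROM emp WHERE deptno = 10;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```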

