Feeds:
Posts
Comments

Archive for December, 2010

Table fragmentation – when?

If a table is only subject to inserts, there will not be any fragmentation.
Fragmentation comes with when we update/delete data in table.
The space which gets freed up during non-insert DML operations is not immediately re-used (or sometimes, may not get reuse ever at all). This leaves behind holes in table which results in table fragmentation.

To understand it more clearly, we need to be clear on how oracle manages space for tables.

“High water mark” of table actually defines the border line between (ever) used and unused (never) space. While performing full table scan, Oracle will always read the data up to HWM. And if there is lot of free space with-in HWM, that is read too, and hence degrading the performance of FTS.

Now lets see how to identify HWM, unused (never used) space and free space (used but deleted/updated) and then take a call whether the concerned table is candidate for a reorganization or not.

SQL> create table test as select * from dba_tables; — Create a table

Table created.

SQL> analyze table test compute statistics; — Analyze it

Table analyzed.

SQL> select blocks “Ever Used”, empty_blocks “Never Used”, num_rows “Total rows”
2 from user_tables where table_name=’TEST’; — The number of blocks used/free

Ever Used Never Used Total rows
———- ———- ———-
49 6 1680

SQL> delete from test where owner=’SYS’; — Im deleting almost half the number of rows.

764 rows deleted.

SQL> commit;

Commit complete.

SQL> analyze table test compute statistics; — Analyze it again

Table analyzed.

SQL> select blocks “Ever Used”, empty_blocks “Never Used”, num_rows “Total rows”

2 from user_tables where table_name=’TEST’; — No difference in blocks usage

Ever Used Never Used Total rows
———- ———- ———-
49 6 916

PL/SQL procedure successfully completed.

Even though you deleted almost half the rows, the above shows that table HWM is up to 49 blocks, and to perform any FTS, Oracle will go up to 49 blocks to search the data. If your application is so-written that there are many FTS on this table, you may consider, reorganizing this table.

Reasons to reorganization

a) Slower response time (from that table)
b) High number of chained (actually migrated) rows.
c) Table has grown many folds and the old space is not getting reused.

Note: Index based queries may not get that much benefited by reorg as compared to queries which does Full table scan.

How to reorganize?

Before Oracle 10g, there were mainly 2 ways to do the reorganization of the table
a) Export, drop, and import.
b) Alter table move (to another tablespace, or same tablespace).

Oracle 10g provides us a new way of reorganizing the data.

Shrink command: This command is only applicable for tables which are tablespace with auto segment space management.

Before using this command, you should have row movement enabled.

SQL> alter table test enable row movement;

Table altered.

There are 2 ways of using this command.

1. Break in two parts: In first part rearrange rows and in second part reset the HWM.

Part 1: Rearrange (All DML’s can happen during this time)

SQL> alter table test shrink space compact;

Table altered.

Part 2: Reset HWM (No DML can happen. but this is fairly quick, infact goes unnoticed.)

SQL> alter table sa shrink space;

Table altered.

2. Do it in one go:

SQL> alter table sa shrink space; (Both rearrange and restting HWM happens in one statement)

Table altered.

Few advantages over the conventional methods

1. Unlike “alter table move ..”, indexes are not in UNUSABLE state. After shrink command, indexes are updated also.

2. Its an online operation, So you dont need downtime to do this reorg.

3. It doesnot require any extra space for the process to complete.

Conclusion

Its a new 10g feature to shrink (reorg) the tables (almost) online which can be used with automatic segment space management.

Advertisements

Read Full Post »

Top SQL’s

One of the important tasks of the DBA is to know what the high CPU consuming processes on database server are.
In my last organization, we used get number of request saying that DB server is running slow.
Now the problem is that, this server is hosting many databases, and finding out which is the culprit process and database sounds a daunting task (but it isn’t).

See this:

First find out the top CPU processes on your system:

You may use TOP (or ps aux) or any other utility to find the top cpu consuming process.

Here is a sample top output:

bash-3.00$ top
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
17480 oracle 11 59 0 1576M 1486M sleep 0:09 23.51% oracle
9172 oracle 258 59 2 1576M 1476M sleep 0:43 1.33% oracle
9176 oracle 14 59 2 1582M 1472M sleep 0:52 0.43% oracle
17550 oracle 1 59 0 3188K 1580K cpu/1 0:00 0.04% top
9178 oracle 13 59 2 1571M 1472M sleep 2:30 0.03% oracle

You can see the bold section. Process# 17480 is consuming 23.51 % CPU.

Now this process can belong to any process out of many instances on this server.
To find out which instance this process belongs to:

bash-3.00$ ps -ef grep 17480

oracle 17480 17479 0 03:02:03 ? 0:48 oracleorcl (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

The instance name is highlighted in bold.

Now you know which instance is holding that session.

Change your environmental settings (ORACLE_SID, ORACLE_HOME) related to this database.
and connect to the database as SYSDBA

bash-3.00$ sqlplus “/ as sysdba”

SQL*Plus: Release 10.2.0.2.0 – Production on Thu Dec 21 04:03:44 2006

Copyright (c) 1982, 2005, Oracle. All Rights Reserved.

Connected to:
Oracle Database 10g Release 10.2.0.2.0 – Production

SQL> select ses.sid SID,sqa.SQL_TEXT SQL from
2 v$session ses, v$sqlarea sqa, v$process proc
3 where ses.paddr=proc.addr
4 and ses.sql_hash_value=sqa.hash_value
5 and proc.spid=17480;

SID SQL
——— —————–
67 delete from test
Now you have the responsible SQL behind 23% CPU using process.
In my case it was a deliberate DELETE statement to induct this test but in your case it can be a query worth tuning.
Mostly knowing what is the problem is solution of the problem. (At least you know what the issue is).

Read Full Post »

Out of all Oracle RDBMS modules, optimizer code is actually the most complicated code and different optimizer modes seem like jack while lifting your car in case of a puncture.

This paper focuses on how optimizer behaves differently when you have optimizer mode set to ALL_ROWS or FIRST_ROWS.

Possible values for optimizer_mode = choose/ all_rows/ first_rows/ first_rows[n]

By default, the value of optimizer_mode is CHOOSE which basically means ALL_ROWS (if statistics on underlying tables exist) else RULE (if there are no statistics on underlying tables). So it is very important to have statistics collected on your tables on regular intervals or else you are living in Stone Age.

FIRST_ROWS and ALL_ROWS are both cost based optimizer features. You may use them according to their requirement.

FIRST_ROWS/ FIRST_ROWS[n]

In simple terms it ensures best response time of first few rows (n rows).

This mode is good for interactive client-server environment where server serves first few rows and by the time user scroll down for more rows, it fetches other. So user feels that he has been served the data he requested, but in reality the request is still pending and query is still fetching the data in background.

Best example for this is toad, if you click on data tab, it instantaneously start showing you data and you feel toad is faster than sqlplus, but the fact is if you scroll down, you will see the query is still running.

Ok, let us simulate this on SQLPLUS

Create a table and index over it:
SQL> create table test as select * from all_objects;

Table created.

SQL> create index test_in on test(object_type);

Index created.

SQL> exec dbms_stats.gather_table_stats(‘SAC’,’TEST’)

PL/SQL procedure successfully completed.

SQL> select count(*) from test;

COUNT(*)
———-
37944

SQL> select count(*) from test where object_type=’JAVA CLASS’;

COUNT(*)
———-
14927

You see out of almost 38k records, 15k are of JAVA class. And now if you select the rows having object_type=’JAVA_CLASS’, it should not use index as almost half of the rows are JAVA_CLASS. It will be foolish of optimizer to read the index first and then go to table.

Check out the Explain plans

SQL> set autotrace traceonly exp
SQL> select * from test where object_type=’JAVA CLASS’;

Execution Plan
———————————————————-
Plan hash value: 1357081020

————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 1001 | 94094 | 10 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 1001 | 94094 | 10 (0)| 00:00:01 |
————————————————————————–

As you see above, optimizer has not used Index we created on this table.

Now use FIRST_ROWS hint:
SQL> select /*+ FIRST_ROWS*/ * from test where object_type=’JAVA CLASS’;

Execution Plan
———————————————————-
Plan hash value: 3548301374

—————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————————
| 0 | SELECT STATEMENT | | 14662 | 1345K| 536 (1)| 00:00:07 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 14662 | 1345K| 536 (1)| 00:00:07 |
|* 2 | INDEX RANGE SCAN | TEST_IN | 14662 | | 43 (3)| 00:00:01 |
—————————————————————————————

In this case, optimizer has used the index.

Q> Why?

Ans> Because you wanted to see first few rows quickly. So, following your instructions oracle delivered you first few rows quickly using index and later delivering the rest.

See the difference in cost, although the response time (partial) of second query was faster but resource consumption was high.

But that does not mean that this optimizer mode is bad. As I said this mode may be good for interactive client-server model. In most of OLTP systems, where users want to see data fast on their screen, this mode of optimizer is very handy.

Important facts about FIRST_ROWS

  1. It gives preference to Index scan Vs Full scan (even when index scan is not good).
  2. It prefers nested loop over hash joins because nested loop returns data as selected (& compared), but hash join hashes one first input in hash table which takes time.
  3. Cost of the query is not the only criteria for choosing the execution plan. It chooses plan which helps in fetching first rows fast.
  4. It may be a good option to use this in an OLTP environment where user wants to see data as early as possible.

 


ALL_ROWS

In simple terms, it means better throughput

While FIRST_ROWS may be good in returning first few rows, ALL_ROWS ensures the optimum resource consumption and throughput of the query. In other words, ALL_ROWS is better to retrieve the last row first.

In above example while explaining FIRST_ROWS, you have already seen how efficient ALL_ROWS is.

Important facts about ALL_ROWS

  1. ALL_ROWS considers both index scan and full scan and based on their contribution to the overall query, it uses them. If Selectivity of a column is low, optimizer may use index to fetch the data (for example ‘where employee_code=7712’), but if selectivity of column is quite high (‘where deptno=10’), optimizer may consider doing Full table scan. With ALL_ROWS, optimizer has more freedom to its job at its best.
  2. Good for OLAP system, where work happens in batches/procedures. (While some of the report may still use FIRST_ROWS depending upon the anxiety level of report reviewers)
  3. Likes hash joins over nested loop for larger data sets.

Conclusion

Cost based optimizer gives you flexibility to choose response time or throughput. So use them based on your business requirement.

Read Full Post »

There is one more critical aspect which I wanted to discuss is the cost/effort related to rebuilding indexes.

Rebuilding an index is quite a costly operation and you must evaluate the benefit Vs effort before rebuilding an index.

Rebuilding (online) an index requires additional resources like space, cpu usage, time.

Here is one more option, which is less used or probably less popular “coalesce”.

Rebuild Vs Coalesce

Rebuild

  • Can move an index to a different tablespace
  • Resource consuming process
  • Takes more time
  • Creates a new tree
  • Shortens the height of an index if it was increased due to DML activities
  • Rebuilding can affect future DML’s because index becomes compact and for future DML’s index has to be extend dynamically.

Coalesce

  • Cannot move an index to a different tablespace
  • Comparatively less resource consuming
  • Takes relatively less time
  • Only merge the adjacent free space between leaf blocks within a branch
  • Doesn’t shorten height of index
  • Since coalesce doesn’t effect the total size and only frees up the unused space, it doesn’t affect future DML’s

Coalescing the index, frees up space of adjacent leaf blocks within a branch block. This way the number of blocks or extents which an index is using will not change but there will be more number of free blocks which can be used for future inserts or updates.

In an OLTP environment, where data is highly volatile, coalesce is better because it doesn’t shrink the index and the free space remains with the index segment.

Read Full Post »

Need is necessary for any change. I hope all agree to this. So why many DBA’s (not all) rebuilds indexes on periodical basis without knowing the impact of it?

Let’s revisit the facts stated by many Oracle experts:

– B Tree indexes can never be unbalanced
– The distance of any leaf block (which contains the index value) from the root block is always same.
– B Tree performance is good for both small and large tables and does not degrade with the growth of table

When will rebuild help?

• When the data in index is sparse (lots of holes in index, due to deletes or updates) and your query is usually range based.

If your SQL’s use “=” predicate, rebuilding of index may not help you at all unless the DML’s (updates/deletes) on the table are cause of increasing the height of index. Due to heavy updates/deletes on a table, the space in the index can be left unused (unless it is reused) due to which block splits and if the splitting goes beyond the level of accomodation, the height of index increases by 1.
In simpler terms, unless you reduce the height (BLEVEL) of index, the time taken to reach the leaf block from root block will not change.

As explained above, the height of index can be decreased by rebuild of an index(only if height was increased to updates/deletes).

Only in case of range scan, rebuild (consider coalesce option also) may help.

Select * from emp where empno between 1000 and 2000;

In above case there could be lot of deleted/updated data which will also be read while reading between 1000 to 2000. Rebuilding will reclaim any unused space and hence the select could be faster.

• Index fast full scan/ Index full scan. (rare case)
If your SQL’s are designed in a way that mostly data is selected from index (it doesn’t goes to table). In this case the whole index is read for data not for redirecting to table.

Ex:
Select count(empno) from emp; — you have an index in empno
Select empno from emp — you have an index in empno

Both the above SELECTs will fetch the data from Index only. Since the whole index is getting read, the space used by deleted rows is also read at the time of SELECT operation

Read Full Post »

A materialized view is a database object that contains the results of a query. They are local copies of data located remotely, or are used to create summary tables based on aggregations of a table’s data. Materialized views, which store data based on remote tables are also, know as snapshots.

A materialized view can query tables, views, and other materialized views. Collectively these are called master tables (a replication term) or detail tables (a data warehouse term).

For replication purposes, materialized views allow you to maintain copies of remote data on your local node. These copies are read-only. If you want to update the local copies, you have to use the Advanced Replication feature. You can select data from a materialized view as you would from a table or view.

For data warehousing purposes, the materialized views commonly created are aggregate views, single-table aggregate views, and join views.

In this article, we shall see how to create a Materialized View and discuss Refresh Option of the view.

In replication environments, the materialized views commonly created are primary key, rowid, and subquery materialized views.

Primary Key Materialized Views
The following statement creates the primary-key materialized view on the table emp located on a remote database.

SQL> CREATE MATERIALIZED VIEW mv_emp_pk
REFRESH FAST START WITH SYSDATE
NEXT SYSDATE + 1/48
WITH PRIMARY KEY
AS SELECT * FROM emp@remote_db;

Materialized view created.

Note: When you create a materialized view using the FAST option you will need to create a view log on the master tables(s) as shown below:

SQL> CREATE MATERIALIZED VIEW LOG ON emp;
Materialized view log created.

Rowid Materialized Views
The following statement creates the rowid materialized view on table emp located on a remote database:

SQL> CREATE MATERIALIZED VIEW mv_emp_rowid
REFRESH WITH ROWID
AS SELECT * FROM emp@remote_db;

Materialized view log created.

Subquery Materialized Views

The following statement creates a subquery materialized view based on the emp and dept tables located on the remote database:

SQL> CREATE MATERIALIZED VIEW mv_empdept
AS SELECT * FROM emp@remote_db e
WHERE EXISTS
(SELECT * FROM dept@remote_db d
WHERE e.dept_no = d.dept_no)

REFRESH CLAUSE
[refresh [fast|complete|force]
[on demand | commit]
[start with date] [next date]
[with {primary key|rowid}]]

The refresh option specifies:

The refresh method used by Oracle to refresh data in materialized view
Whether the view is primary key based or row-id based
The time and interval at which the view is to be refreshed
Refresh Method – FAST Clause

The FAST refreshes use the materialized view logs (as seen above) to send the rows that have changed from master tables to the materialized view.

You should create a materialized view log for the master tables if you specify the REFRESH FAST clause.

SQL> CREATE MATERIALIZED VIEW LOG ON emp;

Materialized view log created.

Materialized views are not eligible for fast refresh if the defined subquery contains an analytic function.

Refresh Method – COMPLETE Clause
The complete refresh re-creates the entire materialized view. If you request a complete refresh, Oracle performs a complete refresh even if a fast refresh is possible.

Refresh Method – FORCE Clause
When you specify a FORCE clause, Oracle will perform a fast refresh if one is possible or a complete refresh otherwise. If you do not specify a refresh method (FAST, COMPLETE, or FORCE), FORCE is the default.

PRIMARY KEY and ROWID Clause
WITH PRIMARY KEY is used to create a primary key materialized view i.e. the materialized view is based on the primary key of the master table instead of ROWID (for ROWID clause). PRIMARY KEY is the default option. To use the PRIMARY KEY clause you should have defined PRIMARY KEY on the master table or else you should use ROWID based materialized views.

Primary key materialized views allow materialized view master tables to be reorganized without affecting the eligibility of the materialized view for fast refresh.

Rowid materialized views should have a single master table and cannot contain any of the following:

Distinct or aggregate functions
GROUP BY Subqueries , Joins & Set operations
Timing the refresh
The START WITH clause tells the database when to perform the first replication from the master table to the local base table. It should evaluate to a future point in time. The NEXT clause specifies the interval between refreshes

SQL> CREATE MATERIALIZED VIEW mv_emp_pk
REFRESH FAST
START WITH SYSDATE
NEXT SYSDATE + 2
WITH PRIMARY KEY
AS SELECT * FROM emp@remote_db;

Materialized view created.

In the above example, the first copy of the materialized view is made at SYSDATE and the interval at which the refresh has to be performed is every two days.

Related post:

https://samadhandba.wordpress.com/2010/12/29/materialized-views-in-oracle/

https://samadhandba.wordpress.com/2011/02/04/what-is-view-in-oracle/

https://samadhandba.wordpress.com/2011/02/04/benefits-of-using-views/

Read Full Post »

————————————
DATABASE PERFORMANCE
————————————-
FREQUENTLY ASKED QUESTIONS
———————————————-
QUESTIONS
—————
1. What are the prioritized tuning steps to implement a database with optimal performance?
2. What are the tools that can be used to monitor database performance?
3. Who should be involved in the database tuning process?
4. What are the major tuning areas in database performance tuning?
5. How to identify which tuning area contributes to the database performance problem?
6. How to improve the identified poor performing area?
7. Where can one find the descriptions of v$ views and Wait Events?
8. What is the appropriate size of SGA?
9. How to analyze and interpret results from utlbstat/utlestat or statspack?
10. Why does the database performance change after upgrade or migration?
11. Where can one learn more about database performance tuning?

ANSWERS
————–

1. What are the prioritized tuning steps to implement a database with optimal performance?

Proactive tuning during the database design and development stage is the most effective way to achieve optimal
database performance. The following list has been ordered according to their general return on effort.

– Tune the business rules.
– Tune the data design
– Tune the application design.
– Tune the logical structure of the database.
– Tune the database operations.
– Tune the access paths.
– Tune memory allocation.
– Tune the I/O and physical structure.
– Tune the resource contention.
– Tune the underlying platform(s).

The reactive tuning approach is driven by identifying the most significant bottleneck and making the appropriate
changes to reduce or eliminate the effect of that bottleneck It requires to gather statistical information to detect
where the bottleneck is taking place and then apply corrective actions. It is recommended that changes be made
to a system only after you have confirmed that there is a bottleneck. The detailed steps are described in the section:
How to identify which tuning area contributes to the database performance problem

2. What are the tools that can be used to monitor database performance?

The database has a complete set of statistics and mechanisms to alert when problems are occurring.
Gathering information to tune an Oracle database can be obtained with the following tools and database resources:

– Alert log and trace files: The first step in detecting a performance problem is searching for errors or warnings
issued by the database. These files keep track of this information.
– V$ views: Database statistical information is stored in the V$ Views.
– Utlbstat/utlestat and STATSPACK (available from 8.1.6): These tools bundled with the Oracle Server
generate complete reports of the database activity. The new STATSPACK utility bundled with Oracle 8.1.6
and above provides more flexibility in managing statistical snapshots.
– OEM Performance Pack. The Performance pack offers a complete set of graphical tools to monitor the
performance of the database.

3. Who should be involved in the database tuning process?

The tuning process is usually viewed as a task of the database administrator. Ideally, the database tuning process
should involve database administrators, system administrators, application developers, and end users.
Involving these resources, will ensure to set performance targets and user expectations according with the business
needs and available resources.

4. What are the major tuning areas in database performance tuning?

– Memory – shared pool, large pool, buffer cache, redo log buffer, and sort area size.
– I/O – distributing I/O, striping, multiple DBWn processes, and DBWn I/O slaves.
– CPU – CPU utilization.
– Space management – extent allocation and Oracle block efficiency.
– Redo log and checkpoint – redo log file configuration, redo entries, and checkpoint.
– Rollback segment – sizing rollback segments.
– Network

5. How to identify which tuning area contributes to the database performance problem?

– Gather Database statistical information using STATSPACK or UTLSTAT reports when
the performance is both good and bad.
– Obtain operating system and application statistical information using the same approach.
Oracle relies on the hardware capacity and when the system resources are permanently
busy you can be facing a capacity problem.
– Then Examine the Host System and Oracle Statistics for any evidence.
– Define actions to solve bottlenecks prioritizing those actions that will impact the most
the performance of the application/database. Some simple actions can improve dramatically
the performance, so consider to implement them first. (See Note:148373.1)

6. How to improve the identified poor performing area?

Although each tuning area requires its specific tuning methods, the underlying tuning methodology for all areas is
the same. The general tuning process can be divided into the following steps:

– Set a realistic and quantitative performance target. The target should match the business rules and users’
expectations, but also be bounded by available resources of the system.
– Change one thing at a time. Evaluating the effect from manipulating one variable at a time is more efficient
than manipulating more than one variable at the same time, especially when there could be more than one
factor contributing to the performance problem.
– Evaluate the effect of change. Examine the new results from utlbstat/utlestat or statspack.
Is there any improvement from the change?
> If there is improvement resulting from the change, more change to the same factor can be made.
> If there is no improvement resulting from the change, the change needs to be reversed, and go back to
Step 2 to change another factor.
– Stop when the performance target is reached.

Multiple cycles of the tuning process may be needed to reach the final performance target.

7. Where can one find the descriptions of v$ views and wait events?

8. What is the appropriate size of SGA?

The appropriate size of SGA is system dependent, and it is limited by the available system resources. For optimal
performance, SGA should fit into real memory avoiding the need for swapping. When sizing the SGA consider to
tune all memory structures (the shared pool, the buffer cache and the redolog buffer cache) together so you can
assign appropriate resources to each according with the database requirements.

9. How to analyze and interpret results from utlbstat/utlestat or statspack?

10. Why does the database performance change after an upgrade or migration?

Database performance tuning is an ongoing process throughout the life of a database. Any changes to the system
could disrupt the balance of a previously well-tuned database. Database upgrades or migrations usually come with
a great deal of changes to the system, so one expects to see some changes to the database performance. It is best to
perform intensive testing on a development database before upgrading or migrating the production database.

11. Where can one learn more about database performance tuning?

– Oracle 7/8/8i Designing and Tuning for Performance Manual – Note 152140.1
– Oracle9i Database Performance Methods Manual
– Oracle9i Database Performance Guide and Reference
– Oracle Education Instructor-Led Training, Enterprise DBA Part 2: Performance Tuning.

Read Full Post »

Older Posts »