# Expert Oracle Practices # Understanding Performance Optimization Methods
January 31, 2013 1 Comment
I am reading Expert Oracle Practices by Oracle Database Administration from the Oak Table. I’d like to summary and share what I learned.
This chapter, Understanding Performance Optimization Methods by Charles Hooper and Randolf Geist, has tons of userful information. I think it can be rewrote to a book. I will just list the some useful information:
The following SQL statement shows currently blocked sessions as well as the blocking sessions:
SQL> SELECT S.SID, S.USERNAME, S.PROGRAM, S.SQL_HASH_VALUE SQL_HASH, L.TYPE, L.LMODE LM, L.REQUEST RQ, DECODE(L.BLOCK,1,'Blocker','Blocked') BLOCK, L.ID1, L.ID2 FROM V$LOCK L, V$SESSION S WHERE L.SID=S.SID AND (L.ID1, L.ID2, L.TYPE) IN (SELECT ID1, ID2, TYPE FROM V$LOCK WHERE REQUEST > 0);
Monitoring CPU Utilization
Metalink Doc ID 164768.1, “Diagnosing High CPU Utilization”
Metalink Doc ID 301137.1 describes a set of logging scripts named OS Watcher for the Unix and Linux platforms that logs the output of the ps, top, mpstat, iostat, netstat, traceroute, and vmstat utilities on scheduled time intervals.
Metalink Doc ID 433472.1 describes a similar OS Watcher script for the Windows platform that queries the Windows performance data by using the LOGMAN Windows utility.
Oracle Cluster Health Monitor tool, previously known as the Instantaneous Problem Detection tool, or PD/OS(www.oracle.com/technology/products/database/clustering/ipd_download_homepage.html).
This tool captures a variety of information from nodes participating in a Real Application Clusters (RAC) configuration, storing information about network I/O rates, file I/O rates, CPU usage, CPU run queue, and available memory, with information captured as frequently as once a second typically with a 24-hour rolling history.
Metalink Doc ID 352363.1 : LTOM (The Lite Onboard Monitor)
On Solaris the command prstat -m -s cpu may be used to list the kernel mode and user mode time consumed by each process (developers.sun.com/solaris/articles/prstat.html).
CPU Run Queue
The underlying mechanics of the CPU run queue vary by operating system. For example, on Windows a single run queue is shared by all CPUs in the server.This approach tends to spread the existing CPU load evenly across all CPUs as it allows processes and process threads to run on a different CPU during each time slice. However, this approach also means that the cache memory built into CPUs tends to be a bit less useful than it would be if the same processes always executed on the same CPU.
Linux kernel versions prior to 2.6 also maintained a single run queue. With the 2.6 Linux kernel, each CPU has a separate run queue. An independent run queue potentially permits a process to continue running on the same CPU, but an unfortunate side effect of this approach may occur when two or more critical background processes (such as LGWR and DBWR) must compete for CPU time on a CPU that is heavily loaded with other process activity. With the 2.6 Linux kernel, the process scheduler potentially rebalances processes set to run on a particular CPU, switching processes to less-utilized CPUs every 200ms, and artificially increases the priority of I/O-intensive processes while artificially decreasing the priority of CPU-intensive processes.
Unix and Linux provide run queue and CPU utilization information with the sar command. With the -q 60 5 parameters, the runq-sz column in the following output indicates the average run queue length every 60 seconds with five sampling iterations:
sar –q 60 5 03:58:27 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 03:59:27 PM 42 332 39.21 20.83 8.81 04:00:27 PM 48 348 47.13 26.40 11.47 04:01:27 PM 57 361 52.45 31.67 14.21 04:02:27 PM 45 416 59.42 37.36 17.25 04:03:27 PM 26 338 57.97 41.22 19.85 Average: 44 359 51.24 31.50 14.32
Metalink Doc ID 436797.1, which shows usage of strace
SYSTEMTAP (sourceware.org/systemtap/) on the Linux platform provides a toolset for monitoring system (kernel mode) activity
Metalink Doc ID 94224.1, “FAQ—Statspack Complete Reference”
Metalink Doc ID 394937.1, “Statistics Package (STATSPACK) Guide
Statspack-related entries on Jonathan Lewis’s blog http://jonathanlewis.wordpress.com/category/statspack/
Alberto Dell’Era wrote a utility script, simply named xplan, that lists optimizer parameters in effect during the creation of an execution plan, information from V$SQL (CPU time, elapsed time, consistent gets, physical block reads, and so forth), execution plan statistics, tables accessed by the plan with the table statistics, and indexes accessed along with the statistics for these indexes (www.adellera.it/blog/2009/08/07/xplan-20/).
Metalink Doc IDs 215187.1 and 235530.1 describe SQLTXPLAIN, which generates HTML reports that include a wide range of information such as table statistics, stored outlines, execution plans for child cursors, and initialization parameters (note that the default configuration accesses the SQL Tuning Advisor, which requires an Enterprise Edition license and a license for the Oracle Tuning Pack).
SQL> ALTER SESSION SET STATISTICS_LEVEL='ALL'; SQL> run the sql SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST')); --add hint /*+ GATHER_PLAN_STATISTICS */ to the SQL also works if you cannot set statistics_level to all --This one is one of my favorite. By comparing the oracle estimated rows and actual rows of each step, you can quickly know where is wrong in the plan.
Generating 10053 Cost-Based Optimizer Traces
SQL> ALTER SESSION SET TRACEFILE_IDENTIFIER = '<string_to_identify_the_trace_file>'; SQL> ALTER SESSION SET EVENTS '10053 trace name context forever, level 1'; SQL> ALTER SESSION SET EVENTS '10053 trace name context off';
There are many pages tell us how to read 10053 trace files.
Generating 10046 Extended Traces
Extended 10046 SQL traces, enabled by setting event 10046, or through the use of procedures in the packages DBMS_SUPPORT, DBMS_MONITOR, DBMS_SYSTEM, or SQL*Plus’s ORADEBUG, provide a wealth of performance information specific to an individual session
10046 trace fie analyze tool:
Metalink Doc ID 41634.1 tkprof
Metalink Doc ID 224270.1 TRCANLZR, format output to html
Troubleshooting Oracle Performance by Christian Antognini : TVD$XTAT
Secrets of the Oracle Database by Norbert Debes, : ESQLTRCPROF
Method R : Hotsos Profiler
Providing tkprof with an insert=statfile.sql parameter causes tkprof to generate a file named statfile.sql that includes generated insert statements, allowing the permanent storage of the tkprof output in the database
Providing tkprof with a record=sqlfile.txt parameter causes tkprof to record all user-executed SQL statements (dep=0) found in the 10046 trace file to a log file name sqlfile.txt.
There are many pages tell us how to read 10046 trace files in “Brief Summary of a Raw 10046 Extended Trace File’s Contents”. See Optimizing Oracle Performance or Metalink Doc ID 39817.1 for descriptions of additional keywords found in 10046 extended trace files.
A trigger to enable 10046 by giving filters:
CREATE OR REPLACE TRIGGER LOGON_CAPTURE_10046 AFTER LOGON ON DATABASE DECLARE SHOULD_EXECUTE INTEGER; TRACEFILE VARCHAR2(150); BEGIN SELECT DECODE(SUBSTR(UPPER(PROGRAM),1,5),'MYAPP',1,0) INTO SHOULD_EXECUTE FROM V$SESSION WHERE SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1); IF SHOULD_EXECUTE > 0 THEN TRACEFILE := 'ALTER SESSION SET TRACEFILE_IDENTIFIER = '''||USER||TO_CHAR(SYSDATE,'YYMMDDHH24MI')||''''; EXECUTE IMMEDIATE TRACEFILE; EXECUTE IMMEDIATE 'ALTER SESSION SET MAX_DUMP_FILE_SIZE=UNLIMITED'; EXECUTE IMMEDIATE 'ALTER SESSION SET TIMED_STATISTICS=TRUE'; EXECUTE IMMEDIATE 'ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT FOREVER, LEVEL 12'''; END IF; END; /
Get session trace status
SQL> SELECT SQL_TRACE, SQL_TRACE_WAITS, SQL_TRACE_BINDS FROM V$SESSION WHERE SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1); SQL_TRACE SQL_TRACE_WAITS SQL_TRACE_BINDS --------- --------------- --------------- ENABLED TRUE TRUE
Examining Server Stack Traces
Julian Dyke’s website (www.juliandyke.com) documents most of the ALTER SYSTEM/SESSION commands and their equivalent ORADEBUG syntax used to write trace information.
Tanel Poder’s blog (blog.tanelpoder.com) documents operating system utilities that may be used to retrieve stack traces for Oracle processes
Oracle’s ERRORSTACK level 3 trace dump and PROCESSSTATE trace dump output all open cursors for the session with bind variable values. Such traces are one of the few methods that allow another session’s bind variable values to be viewed.
Oracle is able to automatically generate trace files when a particular error message, such as ORA-01722 Invalid Number, is returned to the client application. To enable this feature including the error stack (level 0), function call stack (level 1—the execution stack for the process), process state (level 2), and context area (level 3), execute the following in the session (note that 1722 refers to the ORA-01722 error):
SQL> ALTER SESSION SET EVENTS ’1722 TRACE NAME ERRORSTACK LEVEL 3′;
HANGANALYZE dumps drill into the specific processes in the database instance that are blocked by another session (an enqueue-type wait, for instance), processes that are spinning because of a problem in the database instance (such as an internal deadlock on pins or latches), and other situations involving severe performance problems.
A level 3 HANGANALYZE dump with a level 2 SYSTEMSTATE dump are typically sufficient for analysis.
SQL> ORADEBUG HANGANALYZE 3
Metalink Doc ID 215858.1 “Interpreting HANGANALYZE Trace Files to Diagnose Hanging and Performance Problems”
HEAPDUMP dumps provide a method to determine the contents of various memory regions in the database instance including the SGA, PGA, large pool, Java pool, and so forth.
HEAPDUMPs are helpful when troubleshooting ORA-4030 (PGA) and ORA-4031 (shared pool) errors.
Attention: the shared pool latch is held while creating the dump file.
Operating-System-Generated Stack Traces
Solaris (and FreeBSD) provides the DTrace utility and the pstack utility to perform stack traces at the operating system level.
The pstack utility on Linux differs from that on other Unix platforms because it is a wrapper script around the GNU Project Debugger (GDB),which must pause the execution of the process to retrieve the process’s stack trace. Doing so might crash either the session or the instance.
Tanel Poder’s blog (blog.tanelpoder.com) for his dstackprof script and various demonstrations of performance diagnostics using operating system stack traces.
Examining Network Packets
Network packet-capture utilities, such as tcpdump and Wireshark/Ethereal provide insight into the performance problems
Investigating Enqueue Waits
Metalink Doc ID 62354.1, “TX Transaction Locks—Example WaitScenarios”;
Doc ID 33453.1, “Referential Integrity and Locking”; and Doc ID 102925.1, “Tracing Sessions: Waiting on an Enqueue.”
Kyle Hailey published a helpful presentation on the topic of enqueue waits (www.perfvision.com/papers/09_enqueues.ppt).
The trick is to make the right choice at any given time.
When you face a performance problem, how do you choose the method to apply?