# Expert Oracle Practices # Battle Against Any Guess

I start to read Expert Oracle Practices by Oracle Database Administration from the Oak Table. I’d like to write a summary for what I read, or I may forget what I learned from it after a few days.

Let’s start by the first chapter Battle Against Any Guess by Alex Gorbachev. The things I learned (although something cannot be more obviously, we do forgot it somehow sometime) is the steps to solve a problem:

1. Understand the Problem — describe the problem in your own words.
Sometimes we waste lots of times without result only because we didn’t really understand the problem itself, especially when the problem is describled by end-users or business owners. In my opinion, the best way to make sure you understand the problem is to describe the problem in DBA words.

2. Analyze the problem — by scientific approaches not by guess-and-try.

I think handling situations based on our past experiences is a human nature. It’s a short-cut reaction to complex or dangerous situations in nature. While when the habit comes to science area, most likely it becomes guess-and-try iterations. If we use scientific techniques to guess-and-try, like deduction(“top-down” method, using theory to explain symptoms) and induction(bottom-up approach, using symptoms to create a theory), I think it’s ok in some cases when we have limited methodology to tackle a problem. But most time, the guess-and-try path is not as good as a scientific analysis because 1) you have no idea how long it will take, most likely it’s not as short as you expected, or forever in some cases 2) it may not get the root cause but just fixed it temporarily. Just like when the TV didn’t work, we hit it in some directions, and it worked again.

3. Fix the problem — based on understanding of business requirements.

Alex Gorbachev gave a good example:

Let’s say we are troubleshooting a significant slowdown in application performance (online orders
in the Internet book shop), including timeouts that happen regularly around lunch time. We’ve gone
through the required troubleshooting and documented our conclusions:
* User activity is growing by 50 percent during the lunch-hour; that is, there are 50 percent more new orders during that time period.
* The number of online orders is growing steadily every month.
* The time spent on physical I/O contributes 90 percent to response time based on a performance profile we’ve built.
* Three SQL statements are responsible for about 90 percent of the I/O.
* Random single-block I/O time almost doubles during the lunch period, from 8 ms to 15 ms.

Can we identify the root cause from all these items?

How do we choose what areas to improve without taking a guess?
The answer is in the requirements—never lose sight of the business requirements.
In this case, our requirement is to handle the current peak number of online orders.
We might also have a requirement to be capable of handling three times the current order rate in one year as our business keeps growing.

The next factor is the cost and—often forgotten—common sense.
If that heavy batch can be moved to another time, then moving it is probably the cheapest approach to a solution.
If SQL tuning is not an option or requires a costly redesign of the underlying database, then adding I/O capacity could be easier at the moment.
On the other hand, sustaining three times more traffic next year might change the equation, and redesigning the data model might end up being the most reasonable thing to do after all. In fact, it might be the only option, as otherwise the bottleneck could move from the I/O subsystem to
contention in the database instance caused by inefficient SQL.

Whenever you meet a chicken-and-egg problem such as I’ve described, make sure you keep the requirements in mind. Think about the requirements and the cost, and apply common sense.

Read The Fine Manual

I cannot agree more with the author’s about the manual:
Oracle Database Concepts Guide

The Oracle Database Concepts Guide has become my favorite book from Oracle’s documentation
set. It’s a great starting point for learning the basic concepts of all database components and how they
fit together. The Oracle Database product has become very complex in the past years, and it’s become
impossible to know in depth every feature and how it works, but every DBA should know where to start
when it comes to something new. The Oracle Database Concepts Guide often serves as that starting
point.

Oracle’s SQL Language Reference

Oracle’s SQL Language Reference is another guide that you should keep handy and review with every
new database release. Its syntax diagrams and comments are priceless, and many times have clued me in to important features that I never knew existed.
The Oracle Database Reference is where you’ll find information about all of the dynamic and static
dictionary views, init.ora parameters, wait events, and other goodies.

Oracle Database Reference

Oracle Database is a complex product. It’s physically impossible for a single person to know all its
bells and whistles and to have in-depth knowledge of each feature and area. Take Oracle XMLDB, Oracle Streams, and Oracle Spatial as examples. Unless you have worked with these features, you are unlikely to know in detail how they work. However, you do want to have a general idea of such features. Most importantly, you want an idea of where to go to learn more. Just knowing where to go can help you get up to speed relatively quickly in any area of Oracle Database.

Last but not least, it’s response time that matters.

One of the most revolutionary paradigm shifts that I personally had about Oracle Databases came
from reading the book Optimizing Oracle Performance (O’Reilly, 2003) by Cary Millsap and Jeff Holt,
fellow members of OakTable Network. That book is what shifted my paradigm of performance
troubleshooting. Performance is all about time, so time is what you need to focus on and analyze. It is
response time that matters. Now that I understand the idea, it seems so natural that I can’t imagine why I ever believed in relying upon indirectly related counters to troubleshoot Oracle database performance
problems.

Advertisements

About Alex Zeng
I would be very happy if this blog can help you. I appreciate every honest comments. Please forgive me if I'm too busy to reply your comments in time.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: