Case Study: Buffer-Busy-Waits

In this Buffer Busy Waits CASE pressed by Oracle, we learned that setting larger process freelists didn’t help in some cases, instead we should setting larger FREELIST GROUPs. We can also overcome this problem by using ASSM (Automatic Segment Space Management). Here are some clips:

Symptom:

We have a big 1.4TB table and every day the application inserts around 6 to 9 million rows.
The row contains a long raw column and is around 3KB. During the time of the inserts we see lots of sessions waiting on buffer busy waits on the same datablocks.

The customer reported they were carrying out the inserts from a number of concurrent sessions, and had set the number of process freelists on the table to 23.

Analysis:

It was reported that these indexes showed no signs of contention, and they could always see one session waiting on a db file sequential read and all others are waiting on a buffer busy wait with reason code 120, which indicates the session is waiting for the block to be read into the buffer cache.

Because all processes will move to searching the master freelist if no suitable blocks are found on their process freelist this highlights a possible limitation that can occur when several process freelists are empty resulting in a number of processes trying to search the same master freelist.The severity of this seemed unexpected so a bug (4523986) was opened to get some input from Oracle Development. Development came back with the following thoughts:

We attempt to move a section of the MFL in one go - in this case 5 blocks
and obtain the necessary info with shared locks. We then attempt the move,
taking an exclusive lock. If the sublist we've identified to move has been
changed, we start over.
The crux of the problem is that all sessions are reading the same five blocks at the same time (and only one of them will eventually succeed in moving them to their PFL which means the problem repeats on another five block list for the rest of them).
Your I/O issue is probably making this worse as presumably the cache is running slowly - the five block lists are read in current shared mode.

Conclusion:

Use FREELIST GROUPs (even in single instance this can make a difference), I came across some note stating possible resolutions for high buffer busy waits.
Freelist groups are mapped as (in a non-OPS environment or OPS and Single-Instance environment):
Free list group is: (Process Id % Number of Free group ) + 1

In this customer’s case, the suggested workaround of using freelist groups makes perfect sense.

In conclusion, when seeing a high number of sessions waiting on buffer busy waits for the same datablock, that are continually changing, on a table that has a number of process freelists defined, it is possible you may be running into the serialization problem with searching and moving blocks from the master freelist to the assigned process freelist. 

A workaround is also provided of rebuilding the table with multiple freelist groups, which was demonstrated to relieve the buffer busy waits. It is important to note that removing one area of contention often highlights a different area that needs further optimization.
Advertisements

About Alex Zeng
I would be very happy if this blog can help you. I appreciate every honest comments. Please forgive me if I'm too busy to reply your comments in time.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: