Debug a simple PK update SQL takes 500+ seconds in Oracle

We run into a problem recently in Oracle database at Solaris. A simple PK update SQL takes 500+ seconds to finish. It didn’t happen all the time, less than 1% chances. It causes app timeout when it happens.

We observed more wait time on “SQL*Net Message to client” in statspack data, no other obviously abnormal events beside this. We suspect it’s caused by network issue. SA told us the network usage is only 20%~30%. We did ping test from db host to client. It didn’t show any obviously problems.

Later, we did a test to switch 2 DB’s host in a db family. It makes the problem moved to the other host. At this point, we are sure it’s not DB problem. It must be host or network issue. Later we found the “TCP retransmission” rate is high at the problematic host. Finally, SA changed the network interface to fix this issue.

We can check the “TCP retransmission” using below scripts:

while true loop
do
 echo -n "`date` : "
 netstat -s -P tcp | grep tcpRetrans
 sleep 1
done;

You can reference this webpage for more about the TCP retrasmission issue.

About Alex Zeng
I would be very happy if this blog can help you. I appreciate every honest comments. Please forgive me if I'm too busy to reply your comments in time.

One Response to Debug a simple PK update SQL takes 500+ seconds in Oracle

  1. Karthik says:

    Hi Alex,

    Thanks for this post, can you please share more details on this.. with some statistics.

    Thanks
    Karthik

Leave a comment