I had a funny thing happen today. I had recently participated in a forum thread about library cache: mutex X waits. In that discussion the library cache: mutex X waits were on an AWR report but when we dug deeper we realized it was because the database server’s CPU was totally pegged out and overloaded.
Today, I saw the same thing on one of my own servers. Here are the top 5 events:
Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
---|---|---|---|---|---|
DB CPU | 53,149 | 8.67 | |||
db file sequential read | 3,188,834 | 21,758 | 7 | 3.55 | User I/O |
library cache: mutex X | 4,260 | 8,943 | 2099 | 1.46 | Concurrency |
log file sync | 1,226,545 | 4,562 | 4 | 0.74 | Commit |
latch: cache buffers chains | 21,097 | 4,281 | 203 | 0.70 | Concurrency |
Notice how small a percentage of the DB time these top events are. This is because most of the time is spent waiting on the CPU queue. Here was the load when I checked it:
$ uptime
11:35am up 268 days, 13:27, 4 users, load average: 14.40, 14.52, 14.76
This server has 2 cores so a load of 14 means a lot of queuing.
The operating system statistics on the AWR report show the same thing – a load of 15 and a lot of OS_CPU_WAIT_TIME.
Operating System Statistics
- *TIME statistic values are diffed. All others display actual values. End Value is displayed if different
- ordered by statistic type (CPU Use, Virtual Memory, Hardware Config), Name
Statistic | Value | End Value |
---|---|---|
AVG_BUSY_TIME | 3,925,986 | |
AVG_IDLE_TIME | 1,115,182 | |
AVG_IOWAIT_TIME | 606,908 | |
AVG_SYS_TIME | 383,368 | |
AVG_USER_TIME | 3,541,505 | |
BUSY_TIME | 7,853,118 | |
IDLE_TIME | 2,231,218 | |
IOWAIT_TIME | 1,214,686 | |
SYS_TIME | 768,403 | |
USER_TIME | 7,084,715 | |
LOAD | 1 | 15 |
OS_CPU_WAIT_TIME | 17,375,799,211,900 | |
RSRC_MGR_CPU_WAIT_TIME | 23,323 | |
VM_IN_BYTES | 442,109,952 | |
VM_OUT_BYTES | 540,237,824 | |
PHYSICAL_MEMORY_BYTES | 68,687,269,888 | |
NUM_CPUS | 2 | |
NUM_CPU_CORES | 2 | |
NUM_CPU_SOCKETS | 1 | |
TCP_RECEIVE_SIZE_DEFAULT | 32,768 | |
TCP_RECEIVE_SIZE_MAX | 1,073,725,440 | |
TCP_RECEIVE_SIZE_MIN | 24,576 | |
TCP_SEND_SIZE_DEFAULT | 32,768 | |
TCP_SEND_SIZE_MAX | 2,147,483,647 |
So, I thought I would post this so people could see what an AWR report looks like when the cpu it totally pegged out and queuing big time. Note that the AWR report was from 1 am to 3 pm so the starting load was low but the ending load during the afternoon was at 15.
– Bobby