Analyzing Database Server Bottlenecks using VMSTAT

vmstat (virtual memory statistics) reports information about processes, memory, paging, block IO, traps, and cpu activity. Below is the sample default output generated by vmstat with no options:

$ vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41496  35836  16708 255712    0    0     5    24    3    3  2  0 97  1  0

Column description of default output.

Procs
- r: The number of processes waiting for run time.
- b: The number of processes in uninterruptible sleep.
Memory
- swpd: the amount of virtual memory used in KB.
- free: the amount of idle memory in KB.
- buff: the amount of memory used as buffers in KB.
- cache: the amount of memory used as cache in KB.
Swap
- si: Amount of memory swapped in from disk (KB/s).
- so: Amount of memory swapped to disk (KB/s).
IO
- bi: Blocks received from (read in) a block device (blocks/s).
- bo: Blocks sent to (written out) a block device (blocks/s).
System
- in: The number of interrupts per second, including the clock.
- cs: The number of context switches per second.
CPU – These are percentages of total CPU time.
- us: Time spent running non-kernel code. (user time, including nice time)
- sy: Time spent running kernel code. (system time)
- id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
- wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
- st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.

Below are some general tips for interpreting the output:

The run queue (r) is a queue of processes that are ready to run but must wait for their turn on a CPU; a run queue of 5 means that 5 processes are currently waiting to execute. When the CPU is pegged at 100% utilization, the severity of the CPU starvation won’t be reflected in the percentage of CPU utilization, but the run queue (r) will clearly show the impact. As a rule of thumb, a run queue that stays above the number of CPUs means processes are competing for CPU time.
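The number of processes in uninterruptible sleep (b) counts processes that are blocked, usually waiting for I/O to complete, so it indicates I/O pressure rather than a lack of CPU power. If the value is constantly greater than zero, look at the storage subsystem first. A shortage of CPU power shows up instead as a run queue (r) that stays above the number of CPUs; in that case, find the most CPU-consuming processes and SQL statements. You can use the ps command to list the most CPU-consuming processes.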
The amount of memory swapped in from disk (si) and the amount of memory swapped out to disk (so) can be used to identify a memory bottleneck. If these values are constantly greater than zero, it is an indication that the system is short of memory. Find the most memory-consuming processes and SQL statements. You can use the ps command to list the most memory-consuming processes.
If the time spent running non-kernel code (us) is high (i.e. above normal usage), there is a possibility that some user-initiated processes are consuming a lot of CPU; use ps to list the most CPU-consuming processes.
If the time spent waiting for IO (wa) is high, there is likely an issue in the disk storage subsystem, and you will have to identify the sources of I/O contention using iostat.
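
As a rough sketch of the ps usage mentioned in the tips above, the following shell function lists the top consumers by CPU or by memory. It assumes the Linux procps-ng version of ps, which supports --sort; the function name top_by is made up for illustration. Note that the %CPU figure reported by ps is CPU time divided by the lifetime of the process, not an instantaneous reading, so it is only a first approximation.

```shell
# List the processes using the most CPU or the most memory.
# Assumption: procps-ng ps (Linux). `top_by` is a hypothetical helper name.
top_by() {
    # $1: ps sort key, e.g. pcpu (CPU share) or pmem (memory share)
    # $2: number of processes to show (default 10)
    key="$1"
    count="${2:-10}"
    # ps prints one header line, so keep count + 1 lines in total.
    ps -eo pid,user,pcpu,pmem,rss,comm --sort="-$key" | head -n "$((count + 1))"
}

top_by pcpu 5   # five most CPU-consuming processes
top_by pmem 5   # five most memory-consuming processes
```

On a database server the top entries will often be the database's own worker processes; mapping a process ID back to a SQL statement has to be done with the database's own session views.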

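One of the most common invocations is shown below. The first parameter is the interval in seconds and the second is the number of reports; this example prints a vmstat snapshot every 6 seconds, 10 times. The first line of output reports averages since the last reboot; each subsequent line covers the preceding interval.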

$ vmstat 6 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0  41496  45392  19380 243528    0    0     5    24    4    3  2  0 97  1  0
 0  0  41496  45392  19388 243528    0    0     0    23 1012   47  0  0 99  0  0
 0  0  41496  45392  19396 243560    0    0     4    38 1017   57  6  1 93  0  0
 0  0  41496  45392  19404 243560    0    0     0    21 1013   44  0  0 99  0  0
 0  0  41496  45392  19412 243560    0    0     0    18 1012   45  0  0 100  0  0
 0  0  41496  45392  19420 243560    0    0     0    18 1012   44  0  0 100  0  0
 0  0  41496  45392  19428 243560    0    0     0    18 1013   44  0  0 99  0  0
 0  0  41496  45392  19436 243560    0    0     0    18 1012   50  0  0 99  0  0
 0  0  41496  45392  19444 243560    0    0     0    18 1013   44  0  0 99  0  0
 0  0  41496  45392  19452 243560    0    0     0    19 1013   46  0  0 100  0  0
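
Reading many samples by eye gets tedious, so a small filter can flag the rows worth a closer look. The sketch below is one possible approach, not standard tooling: the function name flag_vmstat, the message labels, and the default wa threshold of 20 are all assumptions made for illustration. It relies on the default column order shown above (r, b, si, so and wa in columns 1, 2, 7, 8 and 16).

```shell
# Flag vmstat samples whose run queue, uninterruptible-sleep count,
# swap activity, or IO wait look suspicious.
# Assumptions: default vmstat column layout; `flag_vmstat` and the
# thresholds are hypothetical, chosen only for this example.
flag_vmstat() {
    # $1: number of CPUs to compare the run queue against (default: nproc)
    # $2: wa percentage treated as high (default 20, an arbitrary choice)
    ncpu="${1:-$(nproc)}"
    wa_max="${2:-20}"
    awk -v ncpu="$ncpu" -v wa_max="$wa_max" '
        # Header lines do not start with a number; skip them.
        $1 !~ /^[0-9]+$/ { next }
        {
            sample++
            msg = ""
            if ($1 > ncpu)        msg = msg " run-queue(r=" $1 ")"
            if ($2 > 0)           msg = msg " uninterruptible(b=" $2 ")"
            if ($7 > 0 || $8 > 0) msg = msg " swapping(si=" $7 ",so=" $8 ")"
            if ($16 > wa_max)     msg = msg " io-wait(wa=" $16 ")"
            if (msg != "") print "sample " sample ":" msg
        }
    '
}
```

A typical use would be to pipe live output into it, for example `vmstat 6 10 | flag_vmstat`. Sample 1 is then the since-boot average rather than a current reading, and a single flagged row matters less than a flag that repeats across consecutive samples.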
