sar -d on Linux

I started using sar -d to look at disk performance on a Linux system this week and had to look up what some of the returned numbers meant. I’ve used sar -d on HP-UX but the Linux output format is different.

Here is an edited output from a Linux VM that we are copying files to:

$ sar -d 30 1
Linux 2.6.32-504.3.3.el6.x86_64 (myhostname)  04/01/2015      _x86_64_        (4 CPU)

05:26:55 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
05:27:25 PM  dev253-9   7669.55      2.44  61353.93      8.00     35.39      4.61      0.03     19.80

I edited out the real host name and I removed all the lines with devices except the one busy device, dev253-9.

Earlier today I got confused and thought that rd_sec/s meant read I/O requests per second, but it does not. Here is how the Linux man page describes rd_sec/s:

Number  of  sectors  read from the device. The size of a sector is 512
bytes.

In the example above all the activity is writing so if you look at wr_sec/s it is the same kind of measure of activity:

Number of sectors written to the device. The size of a sector  is  512
bytes.

So in the example you have 61353.93 sectors of 512 bytes each written per second. Divide by 2 to get kilobytes: about 30677 KB/sec. Divide by 1024 and round up to get 30 megabytes per second.
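Here is a small Python sketch of that conversion, assuming the 512-byte sector size the man page states. The function name is just something I made up for illustration.

```python
SECTOR_BYTES = 512  # sector size stated in the sar man page

def sectors_per_sec_to_throughput(sectors_per_sec):
    """Return (KB/s, MB/s) for a rate given in 512-byte sectors per second."""
    bytes_per_sec = sectors_per_sec * SECTOR_BYTES
    kb_per_sec = bytes_per_sec / 1024
    mb_per_sec = kb_per_sec / 1024
    return kb_per_sec, mb_per_sec

# wr_sec/s for dev253-9 from the sample output
kb, mb = sectors_per_sec_to_throughput(61353.93)
print(f"{kb:.0f} KB/s, {mb:.2f} MB/s")
```

For the sample line this comes out to roughly 30 MB/s of writes.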

But, how many write I/O operations per second does this translate to?  It looks like you can’t tell in this listing.  You can get overall I/O operations per second including both reads and writes from the tps value which the man page defines as:

Total number of transfers per second  that  were  issued  to  physical
devices.   A transfer is an I/O request to a physical device. Multiple
logical requests can be combined into a  single  I/O  request  to  the
device.  A transfer is of indeterminate size.

Since there are hardly any read requests, we can assume nearly all the transfers are writes, which makes about 7669.55 write IOPS. Also, you can find the average I/O size by dividing rd_sec/s + wr_sec/s by tps. This comes out to just about 8, which is the same as avgrq-sz, which the man page defines as:

The  average size (in sectors) of the requests that were issued to the
device.

So, avgrq-sz is kind of superfluous since I can calculate it from the other values but it means that our average I/O is 8 * 512 bytes = 4 kilobytes.  This seems like a small I/O size considering that we are copying large data files over NFS.  Hmmm.
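The average request size calculation can be sketched in Python like this. The function name is hypothetical, and the guard for tps of zero is my own addition for idle devices.

```python
SECTOR_BYTES = 512  # sector size stated in the sar man page

def average_request_size(rd_sec_per_s, wr_sec_per_s, tps):
    """Return the average request size as (sectors, bytes)."""
    if tps == 0:
        # An idle device has no requests to average over.
        return 0.0, 0.0
    sectors = (rd_sec_per_s + wr_sec_per_s) / tps
    return sectors, sectors * SECTOR_BYTES

# rd_sec/s, wr_sec/s and tps for dev253-9 from the sample output
sectors, size_bytes = average_request_size(2.44, 61353.93, 7669.55)
print(f"{sectors:.2f} sectors = {size_bytes / 1024:.2f} KB per request")
```

With the sample values this lands on about 8 sectors, or about 4 KB per request, matching the avgrq-sz column.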

Also, the disk device is queuing the I/O requests but the device is only in use about 20% of the time (%util is 19.80). Maybe there are bursts of 4K writes which queue up and then gaps in activity? Here are the definitions for the remaining items.

avgqu-sz

The average queue length of the  requests  that  were  issued  to  the
device.

await

The  average  time  (in  milliseconds)  for I/O requests issued to the
device to be served. This includes the time spent by the  requests  in
queue and the time spent servicing them.

svctm

The  average service time (in milliseconds) for I/O requests that were
issued to the device.

%util

Percentage of CPU time during which I/O requests were  issued  to  the
device  (bandwidth  utilization  for  the  device).  Device saturation
occurs when this value is close to 100%.

The service time is good, only 0.03 milliseconds, so I assume that the I/Os are writing to a memory cache. But the total time (await) is higher, 4.61 milliseconds, which is mostly time spent waiting in the queue. The average queue length of 35.39 makes sense given that I/Os spend so much time waiting in the queue. But it’s weird that utilization isn’t close to 100%. That’s what makes me wonder if we are having bursts of activity.
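One way to sanity check these numbers is Little's law, which says the average number of requests in a system equals the arrival rate times the average time each request spends there. The sketch below assumes tps is a fair stand-in for the arrival rate and that await and svctm are in milliseconds as the man page says. The function names are mine, and the utilization estimate (tps times svctm) is only a rough cross-check because svctm is printed with just two decimal places.

```python
def expected_queue_length(tps, await_ms):
    """Little's law: requests in flight = arrival rate * time in system."""
    return tps * (await_ms / 1000.0)

def expected_utilization_pct(tps, svctm_ms):
    """Rough busy percentage: requests per second * service time per request."""
    return tps * (svctm_ms / 1000.0) * 100.0

# tps, await and svctm for dev253-9 from the sample output
tps, await_ms, svctm_ms = 7669.55, 4.61, 0.03

queue_len = expected_queue_length(tps, await_ms)
util_pct = expected_utilization_pct(tps, svctm_ms)
queue_wait_ms = await_ms - svctm_ms

print(f"expected queue length: {queue_len:.2f}")
print(f"expected utilization:  {util_pct:.1f}%")
print(f"time waiting in queue: {queue_wait_ms:.2f} ms")
```

The estimated queue length comes out near 35.4, very close to the reported avgqu-sz of 35.39, and the estimated utilization is in the low twenties, in the same range as the reported 19.80. So the columns look consistent with each other, though that by itself does not say whether the load arrives in bursts.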

Anyway, I have more to learn but I thought I would pass along my thoughts on Linux’s version of sar -d.

– Bobby

P.S. Here is the output on HP-UX that I am used to:

HP-UX myhostname B.11.31 U ia64    04/02/15

11:27:14   device   %busy   avque   r+w/s  blks/s  avwait  avserv
11:27:44    disk1    1.60    0.50       3      95    0.00   10.27
            disk6    0.03    0.50       1       6    0.00    0.64
           disk15    0.00    0.50       0       0    0.00    3.52
           disk16  100.00    0.50     337    5398    0.00    5.52

r+w/s on HP-UX sar -d seems to be the equivalent of tps on Linux. blks/s on HP-UX appears to be the same as rd_sec/s + wr_sec/s on Linux. The other weird difference is that in HP-UX avwait is just the time spent in the queue, which I believe is equal to await minus svctm on Linux. I am more accustomed to the HP-UX tool so I needed to get up to speed on the differences.
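To make the comparison concrete, here is a Python sketch that turns the Linux sample line into HP-UX style columns using only the three correspondences above. These mappings are my guesses from comparing the two outputs, not something either man page promises, and the function and dictionary layout are made up for illustration.

```python
def linux_to_hpux_style(row):
    """Map a Linux sar -d row (as a dict) to approximate HP-UX style columns."""
    return {
        "r+w/s": row["tps"],                         # total transfers per second
        "blks/s": row["rd_sec/s"] + row["wr_sec/s"],  # sectors read plus written
        "avwait": row["await"] - row["svctm"],       # time in queue only
    }

# Values for dev253-9 from the Linux sample output
linux_row = {
    "tps": 7669.55,
    "rd_sec/s": 2.44,
    "wr_sec/s": 61353.93,
    "await": 4.61,
    "svctm": 0.03,
}

hpux_like = linux_to_hpux_style(linux_row)
for name, value in hpux_like.items():
    print(f"{name:>7} {value:10.2f}")
```

For the busy Linux device this gives an avwait of about 4.58 milliseconds, which is the number I would compare against the avwait column on HP-UX.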

About Bobby

I live in Chandler, Arizona with my wife and three daughters. I work for US Foods, the second largest food distribution company in the United States. I have worked in the Information Technology field since 1989. I have a passion for Oracle database performance tuning because I enjoy challenging technical problems that require an understanding of computer science. I enjoy communicating with people about my work.

4 Responses to sar -d on Linux

  1. Anonymous says:

    Very good one.
