Introduction to Disk Bottlenecks on Windows Servers
This page explains how to use Performance Monitor to log disk counters. I also
recommend solutions to disk bottlenecks on Windows Servers (2003 or 2008).
Disk Monitoring Homily
Firstly, a homily to explain why you should always monitor these 'big four'
objects: Memory, Processor, Disk and
Network. Beware of monitoring one counter in isolation because that
can lead to the wrong conclusions.
One company thought they had a problem with slow disks on a Windows 2003 Server. Performance
Monitor confirmed long queues and slow disk access times. Their
conclusion was that the bottleneck was the disk, and so they bought faster
disks. Unfortunately, the slow response persisted and they called me in to
investigate. By monitoring all the 'big four' performance objects, I
found excessive paging and less than 2 MB of available bytes. The true
ailment was lack of memory; high disk usage was a symptom and not the cause. The
lesson: incomplete monitoring can mean a waste of time and money, so always
record these four objects: Memory, Processor, Disk and Network.
The Windows server roles most likely to experience disk problems are web servers
with lots of graphics, and file servers. On the other hand, Domain Controllers, DNS
or DHCP servers are unlikely to have disk bottlenecks.
PhysicalDisk
- PhysicalDisk: Avg. Disk Read Queue Length: should be less than 2
- PhysicalDisk: Avg. Disk Write Queue Length: should be less than 2
- PhysicalDisk: % Disk Time: more than 50% indicates a bottleneck
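The three guidelines above can be written down as a simple check. The following is only a minimal sketch in Python: the function name `disk_warnings` and the sample figures are my own illustrations, and it assumes you have already worked out the average of each counter from your log.

```python
# Rule-of-thumb limits quoted in the list above.
QUEUE_LIMIT = 2.0        # read and write queue length should stay below this
DISK_TIME_LIMIT = 50.0   # % Disk Time above this suggests a bottleneck

def disk_warnings(read_queue, write_queue, disk_time_pct):
    """Return the names of the PhysicalDisk guidelines that the averages break."""
    warnings = []
    if read_queue >= QUEUE_LIMIT:
        warnings.append("Avg. Disk Read Queue Length")
    if write_queue >= QUEUE_LIMIT:
        warnings.append("Avg. Disk Write Queue Length")
    if disk_time_pct > DISK_TIME_LIMIT:
        warnings.append("% Disk Time")
    return warnings

# Illustrative averages only (hypothetical, not read from a real log).
print(disk_warnings(read_queue=0.4, write_queue=3.8, disk_time_pct=95.0))
# prints ['Avg. Disk Write Queue Length', '% Disk Time']
```

An empty list means none of the three guidelines was broken; it does not prove the disk is healthy, only that these particular averages look normal.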
In Diagram 1, Performance Monitor shows classic symptoms of a disk bottleneck. My diagnosis is
based on the disk write queue counter: you can see that this queue averages
more than 2. In fact, the average is nearly 4 (with a peak of over 8).
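If you save the counter log as a CSV file, you can work out the average and peak for yourself rather than reading them off the graph. Here is a rough sketch in Python; the column name `Write Queue` and the sample values are made up for illustration (a real Performance Monitor log uses the full counter path as the column header), and `summarise_counter` is a hypothetical helper, not part of any Windows tool.

```python
import csv
import io

def summarise_counter(csv_text, column):
    """Return (average, peak) for one counter column of a CSV log."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = []
    for row in reader:
        cell = (row.get(column) or "").strip()
        if cell:                       # skip samples that were left blank
            values.append(float(cell))
    if not values:
        raise ValueError(f"no samples found for {column!r}")
    return sum(values) / len(values), max(values)

# Hypothetical log extract: four one-minute samples of the write queue.
SAMPLE_LOG = """Time,Write Queue
16:40:00,1.0
16:41:00,2.5
16:42:00,8.5
16:43:00,4.0
"""

average, peak = summarise_counter(SAMPLE_LOG, "Write Queue")
print(f"average={average:.1f} peak={peak:.1f}")   # prints average=4.0 peak=8.5
```

Reporting the peak alongside the average matters because a short burst of heavy activity can hide inside a modest-looking average.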
I wanted to be unbiased. So, to ensure that it was not a processor or memory bottleneck, I also
recorded % Processor Time and available bytes. As you can see from Diagram 1, the processor's
average was below 30%. If the processor were the bottleneck, the trace would be over 80%.
Similarly, if there were a memory shortage, available bytes would drop below 10 MB.
The graph shows there was always 70 MB of available memory (the Available MBytes counter).
The performance bottleneck may be worse than the average figures above suggest. In
Diagram 2, I have legitimately chopped the graph to isolate the period of intense
disk activity. For these 5 minutes (4:46), the average is almost 6
against the bottleneck threshold of 2. The other difference is that in
Diagram 2 (taken from Performance Monitor), I have included % Disk Time, which exceeds 100% for the
duration of the trace. In other words, the disk is working flat out
writing data to the hard drive. There is one more deduction we
can make from the queue data on the chart. If you compare the
white line with the thick green line near the bottom, you can tell that the
disk is writing more than it is reading. To see the diagrams more clearly, double click and expand the thumbnails into larger diagrams.
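The cross-checking described above can be summarised as a small decision routine. This Python sketch uses the thresholds mentioned in the text (10 MB of available memory, 80% processor time, a queue length of 2); the function name `likely_bottleneck` and the order of the checks are my own assumptions, chosen so that memory is ruled out first because paging can make a healthy disk look busy.

```python
def likely_bottleneck(cpu_pct, available_mb, write_queue):
    """Apply the rules of thumb, ruling out memory and processor before disk."""
    if available_mb < 10:
        return "memory"        # a memory shortage can inflate disk activity
    if cpu_pct > 80:
        return "processor"
    if write_queue > 2:
        return "disk"
    return "none detected"

# Figures similar to the trace discussed above: low CPU, ample memory, long queue.
print(likely_bottleneck(cpu_pct=30, available_mb=70, write_queue=4))   # prints disk

# The same disk queue with only 2 MB free points at memory instead.
print(likely_bottleneck(cpu_pct=30, available_mb=2, write_queue=4))    # prints memory
```

Treat the result as a pointer for where to look next, not as a final verdict; a real investigation would still examine the Network object and the individual counters behind each average.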