Originally posted by: stephen2615
We have an MS Exchange cluster that is behaving very badly, and it has to be the SAN, right? I have HTnM reports showing that the USP is happy but the Windows host is not.
For example, for a device that is a filesystem on the server, the Host/Array Correlation report shows the following:
The Windows device reports a Response Time of 200 ms+, but the USP logical disk reports under 6 ms. IOPS are identical at both ends and not particularly high.
None of the Array Group LDEVs that make up the LUSE volumes are more than 10% busy. Both USP ports are barely pushing 40 MB/s at peak, and their IOPS stay below 2500.
The server's maximum has been about 500 IOPS all day. At least 10 different Array Groups are used to provide the LUSE LUNs.
The USP ports and the host ports on the switches are doing almost nothing most of the time. In fact, the Exchange cluster runs at our DR site, which is idle most of the time, so the infrastructure is barely ticking over.
I did note that the Exchange host sometimes goes ballistic, with disk queue lengths of about 25 per LUN on all LUNs at the same time, which adds up to a queue length of upward of 300 for the server (14 LUNs over two HBA ports). It does not happen all that often, but when it does, Exchange also behaves strangely when reading certain files on the devices. That information comes from Performance Reporter during real-time troubleshooting.
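As a rough sanity check on those numbers, Little's Law (average I/Os in flight = I/O rate x response time) can be applied to both ends. This is only a sketch: the function name is made up for illustration, and it assumes the 500 IOPS peak and the 200 ms response time occur at the same moment, which the reports may not actually show.

```python
def outstanding_ios(iops: float, response_time_ms: float) -> float:
    """Little's Law: average I/Os in flight = rate x time in system."""
    return iops * (response_time_ms / 1000.0)

# Figures taken from the post; treating them as simultaneous is an assumption.
HOST_IOPS = 500          # peak IOPS seen from the server
HOST_RESPONSE_MS = 200   # response time reported by the Windows device
ARRAY_RESPONSE_MS = 6    # response time reported by the USP logical disk
LUN_COUNT = 14
QUEUE_PER_LUN = 25       # queue length per LUN during a burst

host_in_flight = outstanding_ios(HOST_IOPS, HOST_RESPONSE_MS)    # about 100
array_in_flight = outstanding_ios(HOST_IOPS, ARRAY_RESPONSE_MS)  # about 3
burst_queue_total = LUN_COUNT * QUEUE_PER_LUN                    # 350

print(f"I/Os in flight as seen by the host:  {host_in_flight:.0f}")
print(f"I/Os in flight as seen by the array: {array_in_flight:.0f}")
print(f"Total host queue during a burst:     {burst_queue_total}")
```

If the inputs hold, the array only ever has a handful of I/Os outstanding while the host sees around a hundred or more, so most of the waiting would be happening somewhere between the application and the array port rather than inside the USP.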
So everyone is blaming the SAN, but everything out of HTnM suggests nothing is wrong with it. I have been told to fix this problem, so can anyone suggest what the cause could be? Everything except the host is barely doing any work.