Jean Louis Di Domenico

What tools do you use for z/OS performance monitoring on a specific group of Logical Volumes (LVs), say, IMS-Loggers?

Discussion created by Jean Louis Di Domenico Employee on Jul 18, 2013
Latest reply on May 7, 2014 by Jean Louis Di Domenico

Greetings,

Monitoring by cluster of volumes is of real interest.

For years (centuries?) people have been chasing tools and expert systems to do "troubleshooting" for IBM z/OS.

When a need to troubleshoot a configuration arises, one should perhaps consider "monitoring" first, to detect the onset of upcoming bottlenecks before having to pick up the red telephone and rush to fix a slowdown on a mission-critical application.

However, after so many years (centuries?) of troubleshooting practice, many people love (and nurture) the habit of identifying the worst volume as far as performance is concerned.

Because monitoring is a different beast, one can group LVs handling the same workload type, then monitor, with all the known and classic metrics, how this "group of LVs" is performing, by looking at averages per 15-minute interval for that group, as if the group were a single giant Logical Volume supporting that load. This is very close to Storage Group monitoring. However, my concept goes beyond that granularity.

It allows grouping together, say, all IMS-Loggers of all LPARs inside the same sysplex, and/or all DB2-LOGs of all LPARs in the same sysplex.
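To make the "one giant Logical Volume" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record layout (ts, lpar, volser, io_count, resp_ms), the volsers IMSL01/IMSL02, the LPAR names and the sample numbers are all made up, and in practice the input would come from whatever device activity data your site collects. I also assume the group response time should be weighted by I/O count, so a busy volume counts for more than an idle one.

```python
from collections import defaultdict
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=15)


def interval_start(ts):
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def group_averages(samples, volume_groups):
    """Aggregate per-volume samples into per-group, per-interval metrics.

    samples: iterable of dicts with keys ts, lpar, volser, io_count, resp_ms
             (hypothetical layout, one record per volume per LPAR).
    volume_groups: dict mapping a group name to a set of volsers.
    Returns {(group, interval_start): {"io_per_sec": ..., "resp_ms": ...}}.
    """
    volser_to_group = {v: g for g, vols in volume_groups.items() for v in vols}
    totals = defaultdict(lambda: [0, 0.0])  # [I/O count, sum of io_count * resp_ms]
    for s in samples:
        group = volser_to_group.get(s["volser"])
        if group is None:
            continue  # volume does not belong to any monitored group
        key = (group, interval_start(s["ts"]))
        totals[key][0] += s["io_count"]
        totals[key][1] += s["io_count"] * s["resp_ms"]
    result = {}
    for key, (ios, weighted) in totals.items():
        result[key] = {
            "io_per_sec": ios / INTERVAL.total_seconds(),
            # I/O-weighted average: the group behaves like one big volume
            "resp_ms": weighted / ios if ios else 0.0,
        }
    return result


# Hypothetical data: IMS logger volumes seen from two LPARs of one sysplex.
VOLUME_GROUPS = {"IMS-Loggers": {"IMSL01", "IMSL02"}}
SAMPLES = [
    {"ts": datetime(2013, 7, 18, 10, 2), "lpar": "SYSA", "volser": "IMSL01",
     "io_count": 9000, "resp_ms": 1.0},
    {"ts": datetime(2013, 7, 18, 10, 9), "lpar": "SYSB", "volser": "IMSL02",
     "io_count": 1000, "resp_ms": 3.0},
    {"ts": datetime(2013, 7, 18, 10, 15), "lpar": "SYSA", "volser": "IMSL01",
     "io_count": 4500, "resp_ms": 2.0},
    {"ts": datetime(2013, 7, 18, 10, 16), "lpar": "SYSA", "volser": "WORK01",
     "io_count": 7000, "resp_ms": 9.0},  # not in any group, ignored
]

averages = group_averages(SAMPLES, VOLUME_GROUPS)
```

The same function handles DB2-LOGs or any other group just by adding an entry to VOLUME_GROUPS; the LPAR is deliberately not part of the key, so all LPARs of the sysplex fold into one figure per interval.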

Monitoring will then identify "hot processing windows" right away, and assist in troubleshooting (if needed) on THAT specific window.

Also, monitoring here means pulling graphs with 2 thresholds, a green level and a red level. Between zero and the green level is the green zone, or happy zone. Between green and red is the orange zone, or warning zone, also known as best ROI but requiring a watchful eye. Above the red level is the danger zone, where you have to do something, and fast.
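The three zones can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the threshold values (2.0 ms green, 5.0 ms red) and the interval labels are invented, and I arbitrarily treat a value sitting exactly on a threshold as belonging to the higher zone.

```python
def classify_zone(value, green, red):
    """Return "green", "orange" or "red" for one metric value.

    green and red are the two threshold levels; a value exactly on a
    threshold is assigned to the higher zone (an assumption).
    """
    if green >= red:
        raise ValueError("green threshold must be below red threshold")
    if value < green:
        return "green"   # happy zone
    if value < red:
        return "orange"  # warning zone: best ROI, but keep watching
    return "red"         # danger zone: act fast


def hot_windows(series, green, red):
    """Keep only the intervals that left the green zone.

    series: list of (interval_label, value) pairs for one group of LVs.
    """
    return [(label, classify_zone(value, green, red))
            for label, value in series
            if value >= green]


# Hypothetical 15-minute response times (ms) for one group of LVs.
GROUP_SERIES = [("10:00", 1.2), ("10:15", 2.0), ("10:30", 6.5), ("10:45", 0.9)]
windows = hot_windows(GROUP_SERIES, green=2.0, red=5.0)
```

Here windows holds the 10:15 interval as orange and the 10:30 interval as red, which are the "hot processing windows" worth a closer look.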

There can be many groups of LVs... Another example is Oracle-DB LVs together and Oracle-LOGs together.

Then you can decide on remedial action based on the internal functionality of Oracle.

TSM can be monitored too... 3 levels of backups, etc.

The core idea of this question is to keep customers happy by seeing a bottleneck or slowdown coming weeks BEFORE it happens, and deciding on a configuration reorg or expansion.

I welcome any comment, discussion, suggestion, or wish for a way to accomplish this that does not exist today.
