technical: Performance Tuning vs CPU Reduction
I was trained as an engineer: so, I love elegance. I love systems that are simple, and I hate waste. I have more psychological issues, but these come in handy when making things run better: when tuning.
But what does 'better' really mean? As a consultant, my clients usually want to do one of two things: make things run faster, or reduce the resources used (almost always CPU). Sometimes, my clients would like me to do both. But here's the thing: tuning to make things run faster is different to reducing CPU. Let me explain.
The First Difference: Target
When doing any tuning, the first step is to figure out why we're tuning: a CICS transaction may be running too slowly, a batch stream may not be finishing on time, or we may be using too much CPU. Once we've figured out the why, we then measure to see if there is a problem, and where it is.
This is the first place where tuning for performance differs from tuning for CPU usage.
If we want things to run faster, then we already have a target: one or more CICS transactions that are running too slowly, or batch jobs that aren't finishing by a deadline. With CPU, we usually don't have a specific target. In some cases, I may get asked to reduce the CPU used by a job or CICS region. But usually, my scope is far larger. For example: reduce the CPU used by a group of z/OS systems. So, I need to figure out what is using CPU, and then dig down from there.
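To make "figure out what is using CPU" a little more concrete, here is a minimal Python sketch of that first pass. It assumes the CPU figures have already been extracted (for example from SMF interval data) into simple tuples. The system names, address space names, numbers and the rank_cpu_consumers function are all made up for illustration; this is not a parser for any real record format.

```python
from collections import defaultdict

# Hypothetical, already-extracted interval data:
# (system, address space, CPU seconds used in the interval).
# Every name and value here is invented for illustration.
SAMPLE_INTERVALS = [
    ("SYSA", "CICSPRD1", 420.0),
    ("SYSA", "DB2PDBM1", 310.5),
    ("SYSB", "BATCHJ01", 95.0),
    ("SYSA", "CICSPRD1", 380.0),
    ("SYSB", "CICSPRD2", 150.0),
    ("SYSB", "BATCHJ01", 105.0),
]


def rank_cpu_consumers(intervals):
    """Sum CPU seconds per (system, address space) and rank largest first.

    Returns a list of (system, name, cpu_seconds, percent_of_total).
    """
    totals = defaultdict(float)
    for system, name, cpu_seconds in intervals:
        totals[(system, name)] += cpu_seconds

    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

    return [
        (system, name, cpu, (100.0 * cpu / grand_total) if grand_total else 0.0)
        for (system, name), cpu in ranked
    ]


if __name__ == "__main__":
    for system, name, cpu, share in rank_cpu_consumers(SAMPLE_INTERVALS):
        print(f"{system} {name:<10} {cpu:8.1f}s {share:5.1f}%")
```

The biggest consumers at the top of this list are where I would start digging. With a performance problem, the starting point is usually handed to you.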
The Second Difference: Tools
When doing tuning, I'll use tools to dig deeper and see if there's anything I can do. Now, there are a few tools that can be used for both CPU reduction and performance improvement. Here are some examples:
- Sampling Tools: tools like IBM Application Performance Analyzer (APA) and BMC Compuware Strobe. These can provide an X-ray of a process: what is using CPU, and why it is waiting. Can also provide interesting information like options used to open or access a dataset.
- IMS Logs: essential for measuring the performance of IMS transactions.
- SMF Type 30 Records: Interval records show the CPU used for each address space in a period.
- CICS Statistics: usually written to SMF Type 110 records. Breaks down CICS transactions: how long they took, what they were doing, how much CPU they were using. Similar statistics provided by BMC MAINVIEW.
- RMF Monitor I: great information about z/OS related delays (think waiting for HSM, ENQs, or CPU) for service or reporting class.
- RMF Monitor III: the workload delay information is brilliant: showing z/OS related delays for individual jobs.
Although I could use all of these for both CPU reduction and performance, I usually use sampling tools for CPU reduction, and the other statistics for performance.
The Third Difference: Changing Things
There's no doubt that some things I change to reduce CPU also improve performance. And vice versa. But this happens less often than you might think. For example:
- Often batch job elapsed times can be reduced with correct buffering options. These rarely save CPU.
- Often, traces consume a lot of CPU, with far less effect on performance.
Most of my changes to improve performance concentrate on two things:
- Improving I/O: dataset I/O, Db2, MQ, TCP/IP
- Reducing waits: ENQs, wait for processor, waits for other resources
However, CPU reduction is more about what is using the CPU: eliminating loops, reducing the amount of data processed, eliminating unnecessary processing.
A Similarity: Measurement
Whenever I change something, I need to know the effect. So, I'll want to measure the performance before the change, and after. The tools I use for this are often the same:
- CICS transactions: CICS statistics
- IMS transactions: IMS logs
- Batch jobs and started tasks: SMF Type 30 records
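As a rough illustration of a before-and-after comparison, the Python sketch below computes the percentage change for each metric. The metric names, the values and the compare_measurements function are hypothetical; real figures would come from the sources listed above.

```python
def percent_change(before, after):
    """Percentage change from before to after (negative means a reduction)."""
    if before == 0:
        raise ValueError("cannot compute a percentage change from a zero baseline")
    return (after - before) / before * 100.0


def compare_measurements(before, after):
    """Compare two dicts of metric -> value; only metrics present in both are used."""
    return {
        metric: percent_change(before[metric], after[metric])
        for metric in before
        if metric in after
    }


if __name__ == "__main__":
    # Invented figures for one batch job, measured before and after a change.
    before = {"elapsed_seconds": 1200.0, "cpu_seconds": 300.0}
    after = {"elapsed_seconds": 900.0, "cpu_seconds": 297.0}

    for metric, change in compare_measurements(before, after).items():
        print(f"{metric}: {change:+.1f}%")
```

In this invented example the elapsed time drops by a quarter while the CPU barely moves: a good result if the goal was performance, and a poor one if the goal was CPU.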
There are some cases when the measurement will differ. For example, I may look at the z/OS capture ratio or Relative Nest Intensity (RNI) to measure a CPU reduction, but never a performance benefit.
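For readers who haven't met the capture ratio: it is commonly described as the share of the total CPU consumed that is attributed (captured) to individual workloads. A minimal Python sketch, with a hypothetical function name and invented numbers:

```python
def capture_ratio(captured_cpu_seconds, total_cpu_busy_seconds):
    """Fraction of total CPU busy time that is attributed to workloads."""
    if total_cpu_busy_seconds <= 0:
        raise ValueError("total CPU busy time must be positive")
    return captured_cpu_seconds / total_cpu_busy_seconds


if __name__ == "__main__":
    # Invented figures: 3240 CPU seconds attributed to workloads,
    # 3600 CPU seconds busy in total for the same period.
    ratio = capture_ratio(3240.0, 3600.0)
    print(f"capture ratio: {ratio:.2f}")
```

The remaining time is CPU that was used but not attributed to any particular workload.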
Important To Understand the Why
Maybe the biggest difference between improving performance and reducing CPU is how I think about the problem. There are many things that may impact CPU usage, but not performance. And vice versa. So, before I start any tuning, I always agree with the customer on the why and what: why am I doing this, what is in scope? Specifically, I need to understand if I'm improving performance, or reducing CPU.