technical: Five Issues With SMF30
In our partner article, we talk about how SMF 30 records are brilliant for finding out information about a z/OS system.
However, there are a few interesting issues when working with SMF30 records that can trap the unwary. Let's look at five of them.
1. Interval Initiator Time
SMF30 interval (subtype 2 and friends) records show the number of CPU seconds used in a period. However, sometimes there is CPU time that is only measured at the end of a job or job step. For example, the initiator time is the CPU time used by the initiator to start up jobs. It is only measured at step-end: it isn't included in the SMF30 interval records.
Now, initiator time is usually very small (around 0.1 CPU seconds) and normally not a big deal. However, I had one client with short-running jobs that had lots of steps and executed hundreds of times every hour. Here is the CPU time measured in one hour for some of these jobs (11 jobs, 108 steps) by the SMF30 interval, step end (SMF30 subtypes 3 and 4), and job end (SMF30 subtype 5) records:
The initiator time is the same for the step-end and job-end statistics, and is included in the total CPU time figure for these records. Interestingly, this initiator time is more than the CPU time used by the job. However, this initiator time is not included in the CPU statistics for the SMF30 interval records.
So, when analysing short-running jobs, I usually use the step-end or job-end statistics.
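To make the comparison concrete, here is a minimal Python sketch. The numbers are invented for illustration (they are not the client's figures), and the names `StepEnd`, `total_cpu`, `initiator_cpu`, and `compare_with_intervals` are hypothetical stand-ins for whatever your reporting tool extracts from the step-end records. Times are held as integer hundredths of a CPU second.

```python
from dataclasses import dataclass


@dataclass
class StepEnd:
    total_cpu: int      # hundredths of a CPU second, includes initiator time
    initiator_cpu: int  # hundredths of a CPU second used by the initiator


def compare_with_intervals(interval_cpu, step_ends):
    """Compare CPU summed from interval records with step-end totals."""
    step_total = sum(s.total_cpu for s in step_ends)
    initiator_total = sum(s.initiator_cpu for s in step_ends)
    return {
        "step_end_total": step_total,
        "initiator_total": initiator_total,
        # What the step-end records show once initiator time is removed
        "step_end_less_initiator": step_total - initiator_total,
        # CPU seen at step end but absent from the interval records
        "missing_from_intervals": step_total - interval_cpu,
    }


# Made-up data: three short steps, each with 0.10s of initiator time
steps = [StepEnd(15, 10), StepEnd(14, 10), StepEnd(16, 10)]
result = compare_with_intervals(interval_cpu=15, step_ends=steps)
print(result)
```

In this invented case the initiator time is larger than the CPU used by the steps themselves, which is the pattern described above.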
2. Missing Address Spaces
SMF30 interval records are usually created for all running address spaces. However, there are a couple of exceptions:
- If using Hyperswap, SMF30 interval records may not be created for XCFAS, GEOHSWP, and GEOXCFST.
- SMF30 interval records are not created for the master address space.
3. Small Values
SMF30 records store total CPU time used in hundredths of a second in the SMF30CPT field. So, if your address space used 0.01 CPU seconds, SMF30 records will record it. Most SMF processing tools like Merrill MXG and Broadcom MICS also store CPU usage in hundredths of a second. This sounds like enough granularity, and usually is.
However, suppose we have a batch job that runs 1000 times an hour. It has an elapsed time of less than one second and uses 0.009 CPU seconds. These jobs use nine CPU seconds that won't appear in the SMF30CPT field of SMF30 interval records.
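The arithmetic can be sketched in a few lines of Python. This assumes that each run is recorded separately and that anything below one hundredth of a second is truncated rather than rounded; `to_hundredths` is a hypothetical helper, not part of any SMF tool.

```python
RUNS_PER_HOUR = 1000
CPU_PER_RUN_MS = 9  # 0.009 CPU seconds, held as whole milliseconds


def to_hundredths(cpu_ms):
    # Assumption: sub-hundredth remainders are truncated, not rounded
    return cpu_ms // 10


# CPU actually consumed across all runs, in seconds
actual_seconds = RUNS_PER_HOUR * CPU_PER_RUN_MS / 1000

# CPU that survives when each run is stored in hundredths of a second
recorded_seconds = sum(
    to_hundredths(CPU_PER_RUN_MS) for _ in range(RUNS_PER_HOUR)
) / 100

print(f"actual:   {actual_seconds} CPU seconds")
print(f"recorded: {recorded_seconds} CPU seconds")
```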
More granularity can be achieved using other SMF30 fields. For example, the SMF30 documentation explains that CPU time can be calculated to 1/1000 of a second from service unit data using the formula:
CPU time in microseconds = (SMF30CSU * 10) / SMF30CPC * SMF30SUS / 16
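As a sketch, the formula translates directly into Python. The function name `cpu_microseconds` is my own, the input values below are invented purely to exercise the arithmetic, and the expression is evaluated left to right exactly as written above.

```python
def cpu_microseconds(smf30csu, smf30cpc, smf30sus):
    """CPU time in microseconds derived from service unit fields.

    Implements: (SMF30CSU * 10) / SMF30CPC * SMF30SUS / 16
    """
    if smf30cpc == 0:
        raise ValueError("SMF30CPC is zero; cannot derive CPU time")
    return (smf30csu * 10) / smf30cpc * smf30sus / 16


# Invented values, chosen only so the result is easy to check by hand
micros = cpu_microseconds(smf30csu=1000, smf30cpc=10, smf30sus=16)
print(micros)         # CPU time in microseconds
print(micros / 1e6)   # the same value in seconds
```

The guard against a zero SMF30CPC is a defensive choice of mine, not something the SMF documentation prescribes.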
4. Enclaves
SMF30 records record activity by address space. So, we can find out how much CPU a batch job consumed between 10 and 11am by looking at the SMF30CPT field.
Enclave CPU is included in the SMF30CPT field, but only for the owning address space of the enclave. So, an independent enclave using 10 CPU seconds in two different address spaces will record 20 CPU seconds in the owning address space, and zero seconds in the other.
Foreign enclaves are an exception: CPU is not recorded in the SMF30CPT field. Rather, it is recorded in the SMF30MRI or SMF30MRD fields: but only in the owning address space. A foreign enclave using 10 CPU seconds in the owning address space on MVS1, and another 10 seconds in another address space on MVS2 will record 20 CPU seconds for MVS1, and zero CPU seconds for MVS2 in SMF30.
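The attribution rule in both examples can be modelled with a toy Python function. This is only a simplified illustration of the behaviour described above, not how z/OS implements it; `recorded_enclave_cpu` and the address space labels are hypothetical.

```python
def recorded_enclave_cpu(usage, owner):
    """Return the enclave CPU seconds SMF30 would show per address space.

    usage: mapping of address space label -> enclave CPU seconds used there
    owner: label of the address space that owns the enclave
    """
    if owner not in usage:
        raise ValueError("owner must be one of the address spaces in usage")
    total = sum(usage.values())
    # All enclave CPU is reported against the owner; the others show zero
    return {space: (total if space == owner else 0) for space in usage}


# Independent enclave: the total lands in the owner's SMF30CPT
independent = recorded_enclave_cpu({"AS1": 10, "AS2": 10}, owner="AS1")

# Foreign enclave: same split, but reported in SMF30MRI or SMF30MRD
foreign = recorded_enclave_cpu({"MVS1.AS1": 10, "MVS2.AS2": 10}, owner="MVS1.AS1")

print(independent)
print(foreign)
```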
5. No Program
Sometimes, I'll find an SMF 30 interval record with a blank program. For example:
These are all z/OS system tasks, and they don't register a program. If we are creating reports based on the program name (and I often do), then these address spaces will appear as blank programs.
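One way to stop these records from disappearing into an unlabelled row is to give blank programs an explicit label when grouping. The following Python sketch assumes the records have already been parsed into dictionaries; the keys `program` and `cpu_hundredths`, the label text, and the sample data are all hypothetical.

```python
from collections import defaultdict


def cpu_by_program(records, blank_label="(no program)"):
    """Total CPU (hundredths of a second) per program name."""
    totals = defaultdict(int)
    for rec in records:
        # Program names are blank-padded; an all-blank name gets a label
        program = rec["program"].strip() or blank_label
        totals[program] += rec["cpu_hundredths"]
    return dict(totals)


# Invented sample records
records = [
    {"program": "PGMA    ", "cpu_hundredths": 120},
    {"program": "        ", "cpu_hundredths": 30},
    {"program": "PGMA    ", "cpu_hundredths": 80},
    {"program": "", "cpu_hundredths": 5},
]
report = cpu_by_program(records)
print(report)
```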
If you compare the total CPU usage measured by SMF30 interval records against the total CPU usage measured by z/OS (SMF 72 records), you'll notice a difference: SMF 30 interval records don't record everything. In this article, we've looked at some of the things that are missed by SMF30 interval records.
Watson & Walker cover some of these and other issues in their brilliant 2013 SHARE session, "The Many CPU Fields of SMF".
But despite these pitfalls, SMF30 interval records are still brilliant. If you understand their shortcomings, SMF30 interval records can give you amazing, accurate information about z/OS address spaces.