technical: Ten Things I Can Find Out From SMF30
SMF type 30 records hold a bounty of information about address spaces, including started tasks, TSO sessions and batch jobs. Whenever I start working on an unfamiliar z/OS system, one of the first things I want to see is the SMF type 30 records, regardless of the goals of the project. So, let me tell you 10 things I can find out from these records.
1. How Many CICS Regions?
Ok, so I'm new to a site. How many CICS regions do they run? What are their names? I can find this using the SMF 30 interval records: records that are produced regularly (say, every 15 minutes). From these, I can list all the address spaces running the CICS program: usually DFHSIP.
I can also find out which regions run on each z/OS system. I can even find out if they run 24 hours a day or are shut down regularly by looking at SMF 30 interval records over a 24-hour period.
This can be done for pretty much everything. Want to know how many MQ systems there are? Look for the CSQYASCP program. How many TCP/IP stacks does a z/OS system have? Look for EZBTCPIP. How many Db2 subsystems? Look for DSNYASCP where the started task name ends in 'MSTR' (other Db2 address spaces use the same program).
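As a rough sketch of this idea, assume the interval records have already been extracted and parsed into Python dictionaries. The keys `system`, `job_name` and `program`, and the sample job names, are illustrative assumptions rather than actual SMF field names. Listing the address spaces that run a given program on each system could then look like this:

```python
from collections import defaultdict

def address_spaces_by_program(records, program):
    """Return {system: sorted job names} for address spaces running `program`."""
    found = defaultdict(set)
    for rec in records:
        if rec["program"] == program:
            # A set removes the duplicates produced by repeated interval records.
            found[rec["system"]].add(rec["job_name"])
    return {system: sorted(names) for system, names in found.items()}

# Hypothetical sample data: one dictionary per interval record.
sample = [
    {"system": "SYSA", "job_name": "CICSPRD1", "program": "DFHSIP"},
    {"system": "SYSA", "job_name": "CICSPRD1", "program": "DFHSIP"},
    {"system": "SYSA", "job_name": "MQP1MSTR", "program": "CSQYASCP"},
    {"system": "SYSB", "job_name": "CICSPRD2", "program": "DFHSIP"},
]

cics_regions = address_spaces_by_program(sample, "DFHSIP")
```

The same function works for any of the program names mentioned above: only the `program` argument changes.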
2. Is Connect:Direct Used?
I can use SMF30 interval records to see if a product is installed and used. For example, I can find out if they use IBM NetView (program DSIMNT) or Connect:Direct (DMINIT).
I can find out the z/OS systems where SAS is used (SAS*), or if they are replicating Db2 data using InfoSphere Data Replication (ASNAPPLY). I can see if they're using OMEGAMON (KLV) or MAINVIEW (SSLOAD) as their monitoring tool.
3. How Much CPU?
This is perhaps the most common thing I use SMF 30 records for. Using the SMF30 interval records, I can graph the CPU usage of one or more address spaces over time. From such a graph, I can determine how much CPU they're using, and when.
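A minimal sketch of the data behind such a graph, again assuming the interval records are already parsed into dictionaries. The keys `job_name`, `interval_end` and `cpu_seconds` are hypothetical names chosen for illustration:

```python
from collections import defaultdict

def cpu_by_interval(records, job_name):
    """Sum CPU seconds per interval end time for one address space."""
    totals = defaultdict(float)
    for rec in records:
        if rec["job_name"] == job_name:
            totals[rec["interval_end"]] += rec["cpu_seconds"]
    # Sort by interval so the result can be plotted directly.
    return dict(sorted(totals.items()))

# Hypothetical sample data.
sample = [
    {"job_name": "CICSPRD1", "interval_end": "09:30", "cpu_seconds": 40.0},
    {"job_name": "CICSPRD1", "interval_end": "09:15", "cpu_seconds": 12.5},
    {"job_name": "CICSPRD1", "interval_end": "09:15", "cpu_seconds": 3.5},
    {"job_name": "DB2PMSTR", "interval_end": "09:15", "cpu_seconds": 1.0},
]

usage = cpu_by_interval(sample, "CICSPRD1")
```

The resulting mapping of interval to CPU seconds can be passed to any charting tool.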
4. Is Connect:Direct Using zIIP?
Connect:Direct is a managed file transfer utility, and if it is correctly configured, it can offload processing to zIIP. With SMF 30 interval records, I can create a similar chart to the GP CPU usage chart shown previously, but for zIIP. So, I can determine if an address space (like Connect:Direct) is using zIIP, and how much. This is also brilliant when measuring the effect of enabling zIIP: I can see exactly how much work is offloaded to zIIP.
5. Any zIIP Eligible on GP?
If zIIP processors are busy, zIIP eligible work may run on GP CPUs. This isn't ideal. It affects performance, as the work must wait for a set period to get a zIIP processor before being dispatched on a GP CPU. It can also affect a mainframe site's costs: more GP CPU usually means higher costs.
SMF 30 interval records include measurements of how much zIIP eligible work ran on a GP CPU for each address space.
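One way to use these measurements is to work out, for each address space, what share of its zIIP eligible work ended up on a GP CPU. The sketch below assumes parsed records with the hypothetical keys `job_name`, `ziip_seconds` (time on zIIP) and `ziip_on_gp_seconds` (zIIP eligible time that ran on GP):

```python
from collections import defaultdict

def ziip_overflow(records):
    """Return {job_name: share of zIIP eligible CPU time that ran on GP}."""
    on_ziip = defaultdict(float)
    on_gp = defaultdict(float)
    for rec in records:
        on_ziip[rec["job_name"]] += rec["ziip_seconds"]
        on_gp[rec["job_name"]] += rec["ziip_on_gp_seconds"]
    shares = {}
    for job in on_ziip:
        eligible = on_ziip[job] + on_gp[job]
        # Address spaces with no zIIP eligible work are reported as 0.0.
        shares[job] = on_gp[job] / eligible if eligible else 0.0
    return shares

# Hypothetical sample data.
sample = [
    {"job_name": "CDPROD", "ziip_seconds": 20.0, "ziip_on_gp_seconds": 4.0},
    {"job_name": "CDPROD", "ziip_seconds": 10.0, "ziip_on_gp_seconds": 6.0},
    {"job_name": "BATCH01", "ziip_seconds": 0.0, "ziip_on_gp_seconds": 0.0},
]

overflow = ziip_overflow(sample)
```

A share that stays well above zero for an important address space suggests the zIIP processors are a bottleneck at that time.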
6. Is DD FILE1 used?
SMF 30 records don't just record CPU usage. They also record I/Os. SMF 30 interval records can be used to find out how many I/Os an address space performed in a period. SMF30 step end records can tell me how many I/Os were performed for each DD statement. So, I can use these step end records to determine if a DD statement was used, or not.
This can be really handy. For example, we can see the STEPLIB DD performed many I/Os: maybe using features like LLA/VLF can improve our performance. Or, we may want to look at buffering options when our address space performs a lot of I/Os to a DD pointing to a VSAM dataset.
These records also record the device used. So, we can find out if VIO was used, or if the job wrote to tape.
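To illustrate, the following sketch totals I/Os per DD name for one step and reports the DDs that performed none. It assumes the DD-level entries of a step end record have been parsed into dictionaries with the hypothetical keys `dd_name` and `io_count`:

```python
from collections import defaultdict

def io_by_dd(dd_entries):
    """Total I/O count per DD name (concatenated datasets share a DD name)."""
    totals = defaultdict(int)
    for entry in dd_entries:
        totals[entry["dd_name"]] += entry["io_count"]
    return dict(totals)

def unused_dds(dd_entries):
    """Return the DD names that performed no I/O at all in the step."""
    return sorted(dd for dd, count in io_by_dd(dd_entries).items() if count == 0)

# Hypothetical sample data for one job step.
sample = [
    {"dd_name": "STEPLIB", "io_count": 1200},
    {"dd_name": "STEPLIB", "io_count": 300},
    {"dd_name": "FILE1", "io_count": 0},
    {"dd_name": "SYSPRINT", "io_count": 15},
]

totals = io_by_dd(sample)
not_used = unused_dds(sample)
```

A zero count for a single run only shows that the DD was not used in that run, so it is worth checking several executions before removing a DD statement.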
7. How Many Batch Jobs Run?
Using SMF30 job end statistics, we can find out how many batch jobs ran. We can find out their start and end times, and from there list all jobs that ran for longer than one hour. We can see the JES job class used, and so see if there are unused classes. We can find out how long a job waited to be executed: handy if we want to see if we need more initiators for a JES class.
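Both calculations are simple once the timestamps are available. This sketch assumes each job end record has been parsed into a dictionary with the hypothetical keys `job_name`, `submitted`, `start` and `end`, holding `datetime` values:

```python
from datetime import datetime, timedelta

def long_running(jobs, limit=timedelta(hours=1)):
    """Return the names of jobs whose execution time exceeded `limit`."""
    return [job["job_name"] for job in jobs if job["end"] - job["start"] > limit]

def queue_wait(job):
    """Time between the job entering the system and starting to execute."""
    return job["start"] - job["submitted"]

# Hypothetical sample data.
jobs = [
    {"job_name": "PAYROLL1",
     "submitted": datetime(2024, 1, 1, 1, 0),
     "start": datetime(2024, 1, 1, 1, 20),
     "end": datetime(2024, 1, 1, 3, 0)},
    {"job_name": "BACKUP01",
     "submitted": datetime(2024, 1, 1, 2, 0),
     "start": datetime(2024, 1, 1, 2, 0, 30),
     "end": datetime(2024, 1, 1, 2, 10)},
]

slow_jobs = long_running(jobs)
waits = {job["job_name"]: queue_wait(job) for job in jobs}
```

Grouping the wait times by job class would show which classes are short of initiators.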
8. Which Jobs are Discretionary?
z/OS Workload Manager (WLM) can assign a discretionary goal to a service class. Workloads using this service class will generally get CPU only when no workloads in non-discretionary service classes need it.
The service class used is kept in the SMF30 records. So, we can list all the batch jobs using service classes that are defined as discretionary.
WLM provides a discretionary service class called SYSOTHER. Workloads that do not match any WLM classification rules are assigned SYSOTHER. So, we could use SMF30 records to confirm that there are no important workloads 'falling' into SYSOTHER (and so being treated as discretionary).
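A sketch of this check follows. It assumes parsed records with the hypothetical keys `job_name` and `service_class`. The list of discretionary service classes has to be supplied by hand from the WLM service definition; the class name BATCHLOW in the sample is made up for illustration:

```python
def discretionary_jobs(records, discretionary_classes):
    """Return {service_class: sorted job names} for discretionary classes."""
    result = {}
    for rec in records:
        service_class = rec["service_class"]
        if service_class in discretionary_classes:
            result.setdefault(service_class, set()).add(rec["job_name"])
    return {sc: sorted(names) for sc, names in result.items()}

# Hypothetical sample data.
sample = [
    {"job_name": "REPORT01", "service_class": "BATCHLOW"},
    {"job_name": "PAYROLL1", "service_class": "BATCHHI"},
    {"job_name": "NEWJOB01", "service_class": "SYSOTHER"},
    {"job_name": "REPORT01", "service_class": "BATCHLOW"},
]

found = discretionary_jobs(sample, {"BATCHLOW", "SYSOTHER"})
```

Any job listed under SYSOTHER is a candidate for a new classification rule.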
9. Who is Using All the Memory?
Today, most z/OS systems do not page. Memory is cheap, CPU is expensive. So, ideally our page rates should be zero. If a z/OS system is paging, either we add more memory, or we find out who is using all the memory.
SMF30 records store the amount of memory used by an address space. Even better, they have paging statistics, so we can determine if an address space was paging.
SMF 30 records also store main storage occupancy (MSO) service units. So, we could compare the MSO service units of two jobs to see which used more memory.
10. Who is Using Tape?
SMF 30 records keep a record of the number of tape mounts an address space uses. So, we could find out which address spaces are performing tape mounts. Using those I/O stats, we can also see which DD statements are allocating tape devices, and how many I/Os they sent to each.
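As a final sketch, tape mounts can be totalled per address space and ranked. The keys `job_name` and `tape_mounts` are hypothetical names for fields in already-parsed records:

```python
from collections import Counter

def tape_mounts_by_job(records):
    """Return (job name, total tape mounts) pairs, highest first."""
    mounts = Counter()
    for rec in records:
        mounts[rec["job_name"]] += rec["tape_mounts"]
    # Drop address spaces that never mounted a tape.
    return [(job, count) for job, count in mounts.most_common() if count > 0]

# Hypothetical sample data.
sample = [
    {"job_name": "BACKUP01", "tape_mounts": 4},
    {"job_name": "ARCHIVE1", "tape_mounts": 1},
    {"job_name": "BACKUP01", "tape_mounts": 2},
    {"job_name": "CICSPRD1", "tape_mounts": 0},
]

tape_users = tape_mounts_by_job(sample)
```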
Just The Start
SMF Type 30 records are brilliant and can be used to find out so much information. I've given a lot of possible uses for SMF type 30 records, and this is just the beginning.
David Stephens