Three Letters To Help Problem Diagnosis
Ever used IDCAMS? You probably have. But have you ever noticed those first three letters: 'IDC'? Those three letters are interesting and can help when diagnosing problems. Let's see how.
Messages and Modules
VTAM is important to z/OS: it's the base of all network communications. If you look at the VTAM started task, you'll see messages like:
IST2158I VTAM HAS JOINED THE SYSPLEX GROUP ISTCFS01
IST093I TCP ACTIVE
IST020I VTAM INITIALIZATION COMPLETE FOR CSV2R4
Each message starts with the same three characters: 'IST'. If you look in the started task JCL, you'll see something like:
VTAM EXEC PGM=ISTINM01,REGION=0M
Yep, the program the VTAM started task executes starts with the same three letters: 'IST'. In fact, many VTAM modules start with 'IST'. Wander around some of the other started tasks on your z/OS system, and you'll see the same pattern. CICS modules and messages start with 'DFH', MQ with 'CSQ', and RACF with 'ICH'.
And this isn't an accident. For decades, IBM has used three-character prefixes to identify individual products. But not only products: they're often used for product components too. For example, the characters 'IDC' are used by DFSMSdfp access method services, 'IOS' by the z/OS I/O subsystem, and 'BPX' by z/OS UNIX.
It's not only IBM: most other vendors play ball. For example, ACF is used by Broadcom CA ACF2, and BMC is used by, well, BMC. Some vendors bend the rules a little. For example, BMC TMON uses the four characters TMON.
Identifying the Product and Code
IBM publishes a table in the z/OS MVS Diagnosis: Reference manual listing each code and the z/OS product or feature it relates to. Here's a part of it:
So, modules and messages beginning with 'ADR' relate to DFSMSdss.
CICS messages start with 'DFH', but some subcomponents add a couple of letters. CICS also publishes information about this:
So, any message beginning with DFHAP is produced by the application domain (a subcomponent of CICS).
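A lookup like this is easy to sketch in code. The following is a minimal illustration in Python, not a real tool: the table holds only the prefixes mentioned in this article (the published IBM and CICS tables are far longer), and the function name identify_owner is my own invention. Longer prefixes are tried first, so a CICS subcomponent such as DFHAP wins over plain DFH.

```python
# Prefixes taken from the examples in this article only.
# The tables IBM publishes are much longer; this is an assumption-laden sample.
KNOWN_PREFIXES = {
    "ACF": "Broadcom CA ACF2",
    "ADR": "DFSMSdss",
    "BPX": "z/OS UNIX",
    "CSQ": "MQ",
    "DFH": "CICS",
    "DFHAP": "CICS application domain",
    "ERB": "RMF",
    "ICH": "RACF",
    "IDC": "DFSMSdfp access method services",
    "IOS": "z/OS I/O subsystem",
    "IST": "VTAM",
}


def identify_owner(name):
    """Return the owner of a message ID or module name, or None if unknown.

    Longer prefixes are checked first so that DFHAP matches before DFH.
    """
    name = name.strip().upper()
    for prefix in sorted(KNOWN_PREFIXES, key=len, reverse=True):
        if name.startswith(prefix):
            return KNOWN_PREFIXES[prefix]
    return None
```

With this table, identify_owner("IST020I") gives "VTAM", identify_owner("DFHJUP") gives "CICS", and a name with an unlisted prefix gives None.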
But How Does This Help?
Suppose we have an abend:
IEA995I SYMPTOM DUMP OUTPUT 998
SYSTEM COMPLETION CODE=0C4 REASON CODE=00000004
TIME=10.51.49 SEQ=00786 CPU=0000 ASID=0057
PSW AT TIME OF ERROR 070C1000 81594E22 ILC 2 INTC 0D
MODULE=ERBRMFC
NAME=UNKNOWN
DATA AT PSW 01594E1C - 8400181E 0A0D18FB 180C181D
AR/GR 0: 008FEAEC/00001F00 1: 00000000/84806000
2: 00000000/00000000 3: 00000000/00000000
4: 00000000/00000000 5: 00000000/008F8340
6: 00000000/000000FF 7: 00000000/00000000
8: 00000000/7F690050 9: 00000000/01595348
A: 00000000/00000000 B: 00000000/00000000
C: 00000000/00000000 D: 00000000/7F690050
E: 00000000/84806000 F: 00000000/00000004
END OF SYMPTOM DUMP
I know straight away that this abend occurred in an IBM RMF module (the abending module ERBRMFC starts with ERB). Suppose I see a message that looks concerning:
IOS569I DEVICE IN PERMANENT ERROR STATE
It starts with 'IOS': the z/OS I/O subsystem. But if I see the message:
ACFE2002 NOT AUTHORIZED TO ISSUE SVC 512
I know it's Broadcom ACF2.
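If you wanted to do this for many dumps, the first step could be scripted. Here's a rough Python sketch that pulls the MODULE= value out of symptom dump text like the one above and takes its first three letters. It assumes the dump has already been captured as a plain string; the function name abending_module is hypothetical.

```python
import re


def abending_module(symptom_dump):
    """Return the value of the MODULE= field in symptom dump text, or None."""
    match = re.search(r"\bMODULE=(\S+)", symptom_dump)
    return match.group(1) if match else None


# A few lines from the symptom dump shown in this article.
dump = """IEA995I SYMPTOM DUMP OUTPUT 998
SYSTEM COMPLETION CODE=0C4 REASON CODE=00000004
MODULE=ERBRMFC
NAME=UNKNOWN"""

module = abending_module(dump)
print(module, module[:3])  # prints "ERBRMFC ERB"
```

The three-letter prefix can then be checked against whatever prefix list you maintain.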
With experience, we can become familiar with many three-letter combinations and what produces them. This can speed up error identification and debugging. But this isn't the only reason to know this.
Other Uses
Sometimes we want to find out what a job or started task does. The first step is to look at what program each step is executing. Suppose we have a job with the following step:
//PRNTJNL EXEC PGM=DFHJUP
The first three letters are 'DFH': CICS. I can now search CICS documentation to see what DFHJUP does (it's a batch utility to archive CICS journals).
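For a job with many steps, the program names can be collected with a few lines of code. This Python sketch is deliberately simple and makes some assumptions: the JCL is available as a string, PGM= is the first parameter on the EXEC statement, and continuation lines and cataloged procedures are ignored. The name program_names is made up for this example.

```python
import re

# An EXEC statement whose first parameter is PGM=. The (?!\*) skips //* comments.
PGM_PATTERN = re.compile(r"^//(?!\*)\S*\s+EXEC\s+PGM=([A-Z0-9$#@]+)")


def program_names(jcl):
    """Return the PGM= program names found on EXEC statements in JCL text."""
    names = []
    for line in jcl.splitlines():
        match = PGM_PATTERN.match(line)
        if match:
            names.append(match.group(1))
    return names
```

For the step above, program_names("//PRNTJNL EXEC PGM=DFHJUP") returns ["DFHJUP"], and the first three letters point to CICS.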
These three letters can sometimes help when trying to find out what a dataset is for. Suppose we have a dataset called AA1.PROD.ADFHCOB. The low-level qualifier (ADFHCOB) may help.
Some IBM products have a low-level qualifier format for datasets that they supply with the product. The first letter indicates the type of dataset: 'A' indicates that this is an SMP/E DLIB, 'S' indicates an SMP/E target library. The next three letters identify the product: 'DFH' indicates CICS. So, this may be a dataset supplied with CICS.
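The naming format just described can be expressed as a small Python sketch. It only encodes the two library types mentioned here ('A' and 'S') and treats the result as a hint, since plenty of datasets have low-level qualifiers that start with A or S by coincidence. The function name decode_llq is an assumption for illustration.

```python
# Library type letters as described in this article.
LIBRARY_TYPES = {"A": "SMP/E DLIB", "S": "SMP/E target library"}


def decode_llq(dataset_name):
    """Split a dataset's low-level qualifier into (library type, product prefix).

    Returns None when the qualifier does not fit the type-letter plus
    three-letter-prefix format. A match is only a hint, not proof.
    """
    llq = dataset_name.strip().upper().split(".")[-1]
    if len(llq) < 4 or llq[0] not in LIBRARY_TYPES:
        return None
    return LIBRARY_TYPES[llq[0]], llq[1:4]
```

For AA1.PROD.ADFHCOB, this returns ("SMP/E DLIB", "DFH"): possibly a distribution library supplied with CICS.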
Your Own Applications
If you're developing your own software or applications, it makes sense to choose your own three letters, different from those used by any other product. This way, you can identify the application issuing a message or abending. But there are other reasons.
Unlike many other operating systems, z/OS systems run a lot of different applications, tasks, and products in a single system. If your modules have the same name as those of other applications, or as modules used by z/OS and other components, it can cause problems: for example, the wrong module may be loaded.
It also makes sense to have unique message IDs. This way, automation rules can perform actions based on a message ID. Sometimes, the message text may change with different software releases: but the message IDs rarely do.
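To show why a stable message ID matters, here's a toy Python sketch of a rule keyed on the ID rather than the text. Real automation products have their own rule syntax; the RULES table, the action string, and the function names here are all hypothetical.

```python
# Hypothetical rule table: message ID -> action to take.
RULES = {"IOS569I": "notify the storage team"}


def message_id(console_line):
    """Return the first blank-delimited token of a console message."""
    parts = console_line.split(maxsplit=1)
    return parts[0] if parts else ""


def action_for(console_line):
    """Look up the action for a message by its ID, ignoring the message text."""
    return RULES.get(message_id(console_line))
```

Because the rule matches on IOS569I alone, it would still fire if a later release reworded the text after the ID.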
So, don't write messages to the z/OS console beginning with ERB (RMF is using those characters). Don't give modules names beginning with IDC (DFSMSdfp access method services modules begin with those characters). And use the same three characters for each application, component, or project so it's easier to identify the modules and messages it uses.
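These rules could be checked automatically, say as part of a build. The Python sketch below is one way to do it under stated assumptions: IN_USE holds only the prefixes named in this article (a real check would need the full published lists), and the prefix PAY and the module names in the usage note are invented for the example.

```python
# Prefixes this article names as already taken. Far from a complete list.
IN_USE = {"ACF", "ADR", "BPX", "CSQ", "DFH", "ERB", "ICH", "IDC", "IOS", "IST"}


def check_names(names, own_prefix):
    """Return a list of problems with an application's module or message names."""
    problems = []
    own_prefix = own_prefix.upper()
    if own_prefix in IN_USE:
        problems.append(f"prefix {own_prefix} is already used by another product")
    for name in names:
        if not name.upper().startswith(own_prefix):
            problems.append(f"{name} does not start with {own_prefix}")
    return problems
```

For a hypothetical payroll application using PAY, check_names(["PAYMAIN", "PAY001I"], "PAY") returns an empty list, while choosing ERB as the prefix would be flagged.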
Three Very Helpful Letters
Those three letters can be useful: helping you to quickly identify the product or application that owns a module, dataset, or message.
David Stephens