technical: Do We Really Need Systems Automation Software?
I recently looked at a z/OS site that was interesting: no automation software. No Tivoli Systems Automation, CA Ops/MVS or BMC AutoOperator. Which kind of begs the question: do we really need this automation software?
In this article, I'm going to see how much automation we can get without any automation software.
This exercise is interesting not only as a look at automation software and what it does. It also takes a fresh look at some old and new z/OS facilities that may be useful even if we have such automation software. In our partner article, we will do the same for automated job schedulers.
Automation Functions
Ok, so let's start by reviewing what we're trying to do. Systems automation software generally provides four basic features:
- Startup/Shutdown - automatically start up or shut down a system, and confirm that it is up or down and that the startup or shutdown completed without error.
- Command automation - automatically issue commands at certain times of the day, or after certain periods.
- Message automation - do things automatically when certain messages appear, or when operators issue certain commands. For example, raise an alert, issue a command.
- Message / command suppression - prevent messages from 'clogging' up a console. Prevent operators from issuing commands that are not appropriate.
So, let's look at each of these.
Startup/Shutdown
Commands to be automatically issued after an IPL can be specified in the z/OS IEACMDxx and COMMNDxx parmlib members. So, we can easily code all the commands to start up everything: JES, VTAM, CICS etc.
The problem is that we may need some systems up, before starting others. For example, we want VTAM to be up before starting CICS. Automation software usually has some mechanism to create a 'hierarchy' of system tasks. For example:
JES2 -> VTAM -> DB2 -> CICS
So VTAM will start up once JES2 is available, DB2 after VTAM is active, and CICS after DB2.
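To make the idea of a task hierarchy concrete, here is a minimal Python sketch that derives a start order from a dependency table. It is only an illustration of the concept, not how any particular automation product stores or evaluates its hierarchy; the `DEPENDS_ON` table and the `start_order` function are names made up for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: each task maps to the tasks that
# must be available before it can be started.
DEPENDS_ON = {
    "JES2": [],
    "VTAM": ["JES2"],
    "DB2": ["VTAM"],
    "CICS": ["DB2"],
}


def start_order(depends_on):
    """Return the task names in an order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(" -> ".join(start_order(DEPENDS_ON)))
```

Reversing the resulting list gives a matching shutdown order, which is the other half of what a hierarchy is used for.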
JES2 has a feature called the automatic command facility. Using this facility, JES commands can be scheduled to occur at a certain time of day, or after a certain period. How does this help? Well, JES2 also has the $VS command for issuing z/OS commands.
So, for our above hierarchy, we could use the $VS command to start VTAM immediately once JES is up, and the JES2 automatic command facility to start DB2 after 2 minutes, and CICS after 5 minutes:
$VS,'S NET'
$TA,I=120,'$VS,''DB1S -START DB2'''
$TA,I=300,'$VS,''S CICS1'''
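In the above example, we're hoping that VTAM is up within two minutes, and DB2 by the five-minute mark. Nothing checks that this is actually the case: the commands are issued on the clock, whether or not the previous task is ready.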
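The difference between a fixed delay and what automation software does can be sketched in a few lines of Python. This is an illustration only, under the assumption that we have some way to ask whether a task is up; the `is_up` callable, the `wait_until_up` name and its parameters are all hypothetical.

```python
import time


def wait_until_up(is_up, timeout_s, poll_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll is_up() until it returns True or timeout_s seconds have passed.

    Returns True if the task came up in time, False otherwise.
    is_up is a hypothetical callable supplied by the caller.
    """
    deadline = clock() + timeout_s
    while True:
        if is_up():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

With a check like this, the next task in the hierarchy is started as soon as its prerequisite is ready, and a prerequisite that never comes up is reported rather than silently ignored.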
Commands to be issued once JES2 initialization has been completed can be added to the bottom of the JES2 initialization parameters.
Commands in JCL
Sometimes we need to issue z/OS commands at the beginning or end of a started task. For example, we may want to activate a VTAM major node before bringing up CICS.
This can be done easily within the JCL of the started task, using either a JCL command statement or the JCL COMMAND statement. For example:
// V NET,ACT,ID=APPCIC1
// COMMAND 'V NET,ACT,ID=APPCIC1'
IBM manuals prefer the COMMAND JCL statement.
If a command has to be issued by replying to an outstanding WTOR (for example, to shut down IMS Connect), then things are a bit more complicated: we'll still need to do this manually.
Some other products provide features for performing actions on startup. For example, VTAM has the ATCCONxx member with a list of major nodes to automatically activate on VTAM startup. CICS has the PLTPI with a list of programs to run on startup. WebSphere MQ commands can similarly be coded in the CSQINP* initialisation datasets.
Command Automation
The same JES automatic command facility can be used to schedule commands at a set time of the day, or after a set period. For example, suppose we code the following commands at the end of the JES initialization parameters:
$TA,T=00.00,I=86400,'$VS,''F PCICS1,CEMT P SHUT'''
$TA,T=00.05,I=86400,'$VS,''F PCICS1,CEMT P SHUT IMMED'''
$TA,T=00.10,I=86400,'$VS,''CANCEL PCICS1'''
$TA,T=04.00,I=86400,'$VS,''S PCICS1'''
These commands will shut down the CICS region PCICS1 at midnight every day (we specify a time with the 'T' parameter, and an interval of one day, or 86,400 seconds, with the 'I' parameter). Increasingly 'hard' shutdown commands follow at 5 and 10 minutes after midnight. These are issued whether or not the normal shutdown has already completed; if PCICS1 is already down, they simply fail. At 4am every morning, PCICS1 is restarted.
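The interval arithmetic and the doubled apostrophes are easy to get wrong by hand. As a minimal sketch, the following Python helper builds a daily automatic command in the same quoting style as the startup example earlier ('$VS,''...'''). The function names are invented for this illustration, and the output should be checked against the JES2 command reference before use.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400, the value used for the 'I' parameter


def quote_jes2(text):
    """Wrap text in apostrophes, doubling any apostrophes already inside it."""
    return "'" + text.replace("'", "''") + "'"


def daily_command(hhmm, zos_command):
    """Build a $TA command that issues zos_command via $VS every day at hhmm.

    hhmm is a 24-hour time such as "00:05".
    """
    hours, minutes = (int(part) for part in hhmm.split(":"))
    inner = "$VS," + quote_jes2(zos_command)
    return (f"$TA,T={hours:02d}.{minutes:02d},"
            f"I={SECONDS_PER_DAY}," + quote_jes2(inner))


if __name__ == "__main__":
    schedule = [
        ("00:00", "F PCICS1,CEMT P SHUT"),
        ("00:05", "F PCICS1,CEMT P SHUT IMMED"),
        ("00:10", "CANCEL PCICS1"),
        ("04:00", "S PCICS1"),
    ]
    for when, command in schedule:
        print(daily_command(when, command))
```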
UNIX Facilities
Another option is to switch to z/OS UNIX. UNIX has the at command, which can be used to schedule commands or scripts. So, we could issue the UNIX command
at -f restart.sh 00:00
to execute the shell script restart.sh at midnight. The problem here is that it's not possible to issue z/OS commands from UNIX without a couple of extra steps. Here are our options:
- Use the UNIX submit command within the script to submit a batch job. This job could use the JCL COMMAND statement to issue a command.
- Download the freeware oeconsol utility from the z/OS Tools and Toys GitHub, and use it to issue console commands from the script.
The UNIX crontab facility can also be used to issue commands and scripts on a recurring schedule: daily, weekly, monthly, etc.
These UNIX facilities need to be set up initially. We could have a started task that runs a UNIX script after every IPL, and this script could then set up any at or crontab definitions. This started task could call the BPXBATCH utility to execute our 'setup' script:
//SUBJOB PROC JOB=
//SUBMIT EXEC PGM=BPXBATCH,PARM='SH setaut.sh'
//STDOUT DD SYSOUT=*
//STDERR DD SYSOUT=*
Message Automation
Automation products provide a rich suite of features for automatically doing things when a message is received, or a command is detected. For example, we probably want to dump an SMF dataset once we see the SMF message indicating that it is full and no longer in use.
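At its core, this kind of message automation is a rule table: match an incoming message, then pick an action. The Python sketch below illustrates only that idea. The message IDs, the commands and the `action_for` function are all placeholders invented for this example, not real z/OS messages or any product's rule syntax.

```python
import re

# Hypothetical rule table: pattern matching the start of a console
# message, and the command to issue when it is seen.
RULES = [
    (re.compile(r"^ABC123I\b"), "S SMFDUMP"),    # placeholder: dataset full
    (re.compile(r"^XYZ900E\b"), "S ALERTOPS"),   # placeholder: raise an alert
]


def action_for(message):
    """Return the command for the first matching rule, or None if no rule matches."""
    for pattern, command in RULES:
        if pattern.search(message):
            return command
    return None


if __name__ == "__main__":
    print(action_for("ABC123I DATASET FULL, NO LONGER IN USE"))
    print(action_for("QRS555I NOTHING TO SEE HERE"))
```

Real products add much more on top of this, such as conditions on the issuing job or system, thresholds, and time-of-day restrictions.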
Outside of systems automation software, there's no way of automating processing of messages received without writing our own message processing exits. But there are some options to automate processing for some of the situations that cause messages.
Take our SMF archive as an example. IBM supply sample IEFU29 and IEFU29L exits to automatically issue a command when an SMF dataset or log stream is switched.
z/OS Automatic Restart Manager (ARM) is another nice feature. It can be configured to automatically attempt to restart a task that has failed. It has quite a nice range of features. For example, you can specify how many times to try to restart a failing task, and even restart it on another z/OS system in the case of a z/OS failure.
If z/OS has a serious problem, the z/OS automatic IPL function can do exactly that, optionally after performing a standalone dump. This is configured in the DIAGxx parmlib member.
z/OS has an auto-reply facility to automatically reply to WTORs. Auto-reply policies are set in the AUTORxx member. IBM supply a starter set of common WTORs requiring an automatic reply in the AUTOR00 member of parmlib.
The MPFLSTxx parmlib member can be used to highlight certain messages using the .MSGCOLR parameter. Perhaps not as good as an email or other alert from an automation product, but if operators are regularly checking the console, these will stand out.
Other subsystems have functions to automate the handling of messages, problems or commands. For example, policies can be set in CICS to automatically generate messages or events when something abnormal occurs. Events can be processed by other CICS programs, or even put onto a WebSphere MQ queue for processing by something else.
Message / Command Suppression
The z/OS MPFLSTxx parmlib member can be used to suppress messages from the console. This can also be used to limit the messages that automation software must process, potentially reducing the CPU and processing overhead.
Automation software can stop or change a command. For example, CICS should be shut down using the CEMT transaction, not the STOP or CANCEL command. So, automation software could 'trap' such STOP or CANCEL commands, and change them to a more suitable CEMT transaction. Unfortunately, there's no easy way to do this outside of automation software.
Summary
In our site with no automation software, they did a lot of things manually. Operations regularly checked the console for errors and issues. They rarely IPLed, and when they did, this was done using the IEACMDxx and COMMNDxx parameters, as well as manual commands.
Our site also used the freeware in File 332 of the CBTTape website to automatically submit jobs and issue commands at set times during the day. They could also have considered CBTTape File 770 that provides a basic message and command automation feature.
The reality is that almost all current z/OS systems will be sufficiently complex to require some commercial automated operations software. However, z/OS and related systems do provide some interesting options, and simpler sites may be able to do away with the cost of this commercial software.
David Stephens