Do We Really Need Automated Job Scheduling Software?
Most sites will have some automated job scheduling software like IBM Tivoli Workload Scheduler, BMC Control-M or CA ESP. This software automates some of the complicated job schedules: starting jobs as required, and reporting on job failures. But do we really need this software?
In this article, we're going to look at the job automation options provided by z/OS, JES and other software, and see how much this can replace commercial job scheduling software. We'll also ask a similar question about automated operations software like IBM Tivoli Systems Automation in our partner article.
Automated Job Scheduler Functions
Before we start looking for ways to perform automation with basic z/OS facilities, let's start with the functions that automated job schedulers give us. In a nutshell we get:
- Automated Job Submission - automatically submitting jobs after set criteria are met. These criteria could include a time of day, the arrival of a file, a subsystem being up or down, or other jobs completing successfully (or unsuccessfully).
- Manage Job Schedules - some job schedules can have hundreds, or even thousands of jobs. Many automated job schedulers provide tools to view and manage these schedules: from planning to monitoring them as they progress.
- Alerting - if a job fails, this software alerts the operator that there has been a problem. This alerting can be sophisticated, including email or SMS alerts, as well as status displays.
Automated Job Submission - After IPL
OK, so we want to automatically submit some jobs. Let's start with submitting jobs after an IPL has completed. z/OS doesn't have any facility to submit jobs after an IPL, but it does have a facility to issue commands, including commands that start started tasks. So, we could 'package' our jobs as started tasks, and use the z/OS parmlib members IEACMD00 and COMMNDxx to start them.
Perhaps a better way is to create a general job submission started task that will copy jobs from a dataset to the JES internal reader (i.e. submit them). Something like:
//SUBJOB PROC JOB=
//SUBMIT EXEC PGM=IEBGENER
//SYSUT1 DD DISP=SHR,DSN=JOB.PDS(&JOB)
//SYSUT2 DD SYSOUT=(A,INTRDR)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
So, we could start this task with the z/OS command S SUBJOB,JOB=JOB1, and the JCL in JOB.PDS(JOB1) would be submitted. Coding COM='S SUBJOB,JOB=JOB1' in COMMNDxx would issue this command at every IPL.
Automated Job Submission - Set Times
JES $VS and $TA
We often want to submit jobs at set times, or after set periods. For example, we may submit backups every night at midnight, or a job every hour to empty a WebSphere MQ queue.
JES2 has a feature called the automatic command facility. Using this facility, JES commands can be scheduled to occur at a certain time of day, or after a certain period. So, we could use this with the JES $VS command, and our SUBJOB started task to achieve this. For example, the JES2 command
$TA,I=120,'$VS,''S SUBJOB,JOB=BACKUP1'''
Would submit the job in JOB.PDS(BACKUP1) every 120 seconds (two minutes). The command
$TA,T=01.00,I=86400,'$VS,''S SUBJOB,JOB=BACKUP1'''
Submits our job BACKUP1 every 24 hours (86400 seconds) at 1am.
JCL SCHEDULE
A recently added JCL statement for JES2 is the SCHEDULE statement. Amongst other things, this can be used to control when a submitted job will run. For example, the JCL statement
// SCHEDULE STARTBY=('03:50','09/24/2018'),
// HOLDUNTL=('03:00','09/24/2018')
will hold the job until 3am on 24-Sep-2018. The preferred time to execute is 03:50 on the same day.
z/OS UNIX
Another option is to switch to z/OS UNIX. UNIX has the at command, which can be used to schedule commands or scripts. So, we could issue the command
at -f backup1.sh 23:00
to execute the shell script backup1.sh at 11pm. This script could use the submit command to submit jobs. For example:
submit job1.jcl
The UNIX crontab facility can also be used to issue commands and scripts at set times in the future: weekly, monthly and so on.
These UNIX facilities need to be set up initially. We could have a started task that runs a setup script after every IPL, and this script could set up any at or crontab definitions. Our started task would use the BPXBATCH utility to execute our setup script, something like:
//SUBJOB PROC JOB=
//SUBMIT EXEC PGM=BPXBATCH,PARM='SH setaut.sh'
//STDOUT DD SYSOUT=*
//STDERR DD SYSOUT=*
Automated Job Submission - Job Completion
JES Internal Reader
Many job schedules submit one job after another job has successfully completed. We could achieve this by submitting the second job from the JCL of the first job. An easy way is to use IEBGENER to copy a job from a dataset to the JES internal reader. For example:
//STEP2 EXEC PGM=IEBGENER
//SYSUT1 DD DISP=SHR,DSN=JOB.PDS(NEXTJOB)
//SYSUT2 DD SYSOUT=(A,INTRDR)
//SYSIN DD DUMMY
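We could be a little smarter, and only submit this second job if all previous steps in the first job were successful (return code < 8). The COND parameter names the condition for bypassing a step, so COND=(8,LE) skips the step if 8 is less than or equal to the return code of any previous step:
//STEP2 EXEC PGM=IEBGENER,COND=(8,LE)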
//SYSUT1 DD DISP=SHR,DSN=JOB.PDS(NEXTJOB)
//SYSUT2 DD SYSOUT=(A,INTRDR)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
The IF/THEN/ELSE JCL statements give us some more options to conditionally execute JCL. For example, we could have
//CHECK1 IF (NOT ABEND & RC < 8) THEN
//SUBMIT EXEC PGM=IEBGENER
//SYSUT1 DD DISP=SHR,DSN=JOB.PDS(NEXTJOB)
//SYSUT2 DD SYSOUT=(A,INTRDR)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
// ENDIF
This would only submit our second job if there was no abend, and all return codes were less than 8.
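The ELSE clause can be sketched in the same way. The fragment below is a hypothetical extension of the example above, not part of the original schedule: it submits NEXTJOB on success, and otherwise submits a recovery job. The member name RECOVER and the step name SUBERR are assumptions made for illustration. Because the expression tests ABEND, the intent is that the ELSE steps also get control after an abend, but this is worth confirming on your own system.

```jcl
//CHECK1 IF (NOT ABEND & RC < 8) THEN
//* No abend and all return codes below 8: submit the next job
//SUBMIT EXEC PGM=IEBGENER
//SYSUT1 DD DISP=SHR,DSN=JOB.PDS(NEXTJOB)
//SYSUT2 DD SYSOUT=(A,INTRDR)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
//ELSE1 ELSE
//* Otherwise submit a recovery job (RECOVER is a hypothetical member)
//SUBERR EXEC PGM=IEBGENER
//SYSUT1 DD DISP=SHR,DSN=JOB.PDS(RECOVER)
//SYSUT2 DD SYSOUT=(A,INTRDR)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
// ENDIF
```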
JES Job Groups
z/OS 2.2 introduced job groups for JES2 which give us more options. Jobs can be put into groups. Take a look at the following JCL:
//BACK1 JOBGROUP
//JOB1 GJOB
//JOB2 GJOB
// AFTER NAME=JOB1,WHEN=(RC=0)
//BACK1 ENDGROUP
//*
//JOB1 JOB (ACCT),MSGCLASS=A,CLASS=B
// SCHEDULE JOBGROUP=BACK1
//STEP1 EXEC PGM=IEFBR14
//*
//JOB2 JOB (ACCT),MSGCLASS=A,CLASS=B
// SCHEDULE JOBGROUP=BACK1
//STEP1 EXEC PGM=IEFBR14
JOB1 and JOB2 are in the same group: BACK1. JOB2 will only be executed once JOB1 has completed with a return code of 0.
Job grouping is quite powerful. JES commands can be used to hold, release and purge entire job groups. Jobs within the group can have sophisticated scheduling parameters.
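As a sketch of a slightly richer dependency, the group definition below adds a third job that waits for two others. BACK2, JOBA, JOBB and JOBC are hypothetical names. Coding two AFTER statements under one GJOB is an assumption here, and should be checked against the JCL reference for your z/OS release.

```jcl
//BACK2 JOBGROUP
//JOBA GJOB
//JOBB GJOB
//* JOBC waits for both JOBA and JOBB to end with return code 0
//JOBC GJOB
// AFTER NAME=JOBA,WHEN=(RC=0)
// AFTER NAME=JOBB,WHEN=(RC=0)
//BACK2 ENDGROUP
```

Each of the three jobs would then name the group on its own SCHEDULE JOBGROUP=BACK2 statement, in the same way as JOB1 and JOB2 above.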
Automated Job Submission - Dataset Release
Suppose we want a second job to start when a particular step of the first job has completed. Most automated job scheduling software can do this. There are a couple of options without this software.
Of course, we could simply have a step in our first job that submits the second job to the JES internal reader as we've previously seen. But let's look at a different option. Suppose our first job starts with
//JOB1 JOB (ACCT),MSGCLASS=A,CLASS=B,DSENQSHR=ALLOW
//ALLOC EXEC PGM=IEFBR14
//DD1 DD DISP=OLD,DSN=MY.DATASET
Our second job also tries to allocate the same dataset, but with DISP=SHR:
//JOB2 JOB (ACCT),MSGCLASS=A,CLASS=B,DSENQSHR=ALLOW
//ALLOC EXEC PGM=IEFBR14
//DD1 DD DISP=SHR,DSN=MY.DATASET
Normally, JOB2 would wait until all steps in JOB1 have completed, as any DISP=OLD/MOD/NEW allocation retains exclusive access for the entire job. But JOB1 has DSENQSHR=ALLOW. So, if a later step in JOB1 allocates our dataset as SHR, the ENQ will be downgraded to SHR, and JOB2 can start. Or in other words, we allocate our dataset SHR in JOB1 when we're ready for JOB2 to start.
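A fuller version of JOB1 might look like the sketch below. The step names are hypothetical, and IEFBR14 stands in for the real programs. It also assumes that the JES2 job class permits the ENQ downgrade, which depends on local configuration.

```jcl
//JOB1 JOB (ACCT),MSGCLASS=A,CLASS=B,DSENQSHR=ALLOW
//* Exclusive access while the dataset is being updated
//UPDATE EXEC PGM=IEFBR14
//DD1 DD DISP=OLD,DSN=MY.DATASET
//* From here on only shared access is needed, so the ENQ can be
//* downgraded to SHR and a waiting JOB2 can start
//READ EXEC PGM=IEFBR14
//DD1 DD DISP=SHR,DSN=MY.DATASET
//* Any remaining steps of JOB1 can run alongside JOB2
//LATER EXEC PGM=IEFBR14
```

The point of the design is that the position of the first SHR-only step in JOB1 decides when JOB2 is released.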
This concept of using a dataset allocation to control serialization is not new, but can be a powerful way of controlling job execution outside of automated scheduling software.
Automated Job Submission - Dataset Create
Suppose we want to submit a job once a dataset has been created. This is often used to automate processing when an external file is received.
What we could do is submit a job at the earliest time we would expect a dataset to be created like this:
//J1 JOB (ACCT),MSGCLASS=A,CLASS=B,DSENQSHR=ALLOW
// SCHEDULE HOLDUNTL='+00:05'
//ALLOC EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
ALLOC DSNAME(MY.DATASET) SHR
//RESUB EXEC PGM=IEBGENER,COND=(4,GT)
//SYSUT1 DD DISP=SHR,DSN=JOB.PDS(J1)
//SYSUT2 DD SYSOUT=(A,INTRDR)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
//*
//PROCESS EXEC PGM=xxx,COND=(8,LT)
(continue processing)
Let's look at this job.
The SCHEDULE statement causes the job to wait for 5 minutes after it is submitted before it runs. We're using IDCAMS to try to allocate our target dataset. If this allocation is successful, we've received our dataset, and continue processing. If it fails, the RESUB step resubmits the job, which waits another 5 minutes before checking again. So, we effectively check every 5 minutes until our dataset exists.
Alerting
Automated job scheduling software provides features to alert when there are problems: when a schedule has not started by a certain time, when a job has not completed by a certain time, or when a job has failed.
The NOTIFY parameter of the JOB statement can be used to send a message to a TSO user when the job completes. So, our job could have:
//JOB1 JOB (ACCT),MSGCLASS=A,CLASS=B,NOTIFY=USER1
What if we want to send an email? We could add a step to the end of each job that sends an email if the job fails or abends. For example,
//CHECK1 IF (ABEND | RC >4) THEN
//SUBMIT EXEC PGM=IEBGENER
//SYSUT1 DD DISP=SHR,DSN=JOB.EMAILS(JOB1FAIL)
//SYSUT2 DD SYSOUT=F
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY
// ENDIF
This step writes an email to the JES2 output queue, where (if configured correctly) it could be sent using the z/OS Communications Server SMTP server task.
This is great for alerting if a job abends or fails. But it doesn't alert if there is a JCL error, or if the job was cancelled before it started running.
In z/OS 2.3, the NOTIFY JCL statement was added for JES2. This needs some extra JES configuration, including a new JES2EDS started task, and z/OS Management Facility (z/OSMF). It lets us send an email when a job was successful or unsuccessful. Look at the following statement:
//NFY1 NOTIFY EMAIL='jobsch@my.firm.com',WHEN='(!RUN | RC!=0 | ABEND)'
This will send an email to the specified email address if the job has not run (it was cancelled or had a JCL error), had a return code that was not zero, or abended. Nice.
Job Schedule Display
Automated job scheduling software usually has some nice screen displaying the job schedule: things like what jobs are waiting for, what jobs will execute next. If we use the job grouping options we've talked about previously, JES provides some reasonable displays about jobs in each job group.
$HASP890 JOB(BACK1) 890
$HASP890 JOB(BACK1) JOB_GROUP_STATUS=SUSPENDED,ONERROR=STOP,
$HASP890 ERROR=RC>0004
$HASP890 ***************************************
$HASP890 JOB GROUP JOB IN ERROR LIST
$HASP890 LIST OF JOBS TO RESUBMIT
$HASP890 JOB NAME JOBID JOB STAT COMP STAT
$HASP890 ---------------------------------
$HASP890 J1 JOB00053 Q=HARDCPY IN ERROR
$HASP890 ***************************************
$HASP890 JOB GROUP JOB LIST
$HASP890 JOB NAME JOBID JOB STAT COMP STAT
$HASP890 ---------------------------------
$HASP890 J1 JOB00053 Q=HARDCPY IN ERROR
$HASP890 J2 JOB00056 PEND DEP PENDING
$HASP890 ***************************************
SDSF also has job group displays.
Summary
As you'd expect, we can't do everything job automation software can do. For example, it's really difficult to submit a job when multiple criteria have been met. This software also provides nice screens to manage complicated schedules, submit different jobs under different userids, and more. But without this software, there is still a lot of automated job management that can be done with the basic z/OS and JES services.
(Thanks to Peter Vels for spotting a couple of errors in our JCL. Fixed 17-May-2020)
David Stephens