management: Eight Ways to Make z/OS Administration Easier
It's a good time to be a z/OS systems programmer looking for work: there simply aren't enough experienced z/OS systems programmers to go around.
Because of this, it makes sense to take steps to reduce systems programmer workloads: the less work, the fewer systems programmers you need. In other words, make systems programmers' jobs easier.
But how can you do this? Here are my eight best ways of reducing the work of a systems programmer.
1. Standardize
At one site, there were multiple parallel sysplexes – a legacy of past acquisitions. However, each Sysplex had a different way of using SMP/E to install software products. At another site, there were no SMP/E standards, so the CICS SMP/E environment was totally different from the IBM MQ environment.
Smart sites with multiple z/OS systems and parallel sysplexes standardize as much as possible: every z/OS system should be as close as possible to being the same as other systems. For example, I've seen many sites that install using SMP/E on one systems programmer system, and then copy the libraries to other systems. There are even z/OSMF features to make this easier.
There are many other ways to standardize: from dataset naming conventions to JES job classes.
If z/OS systems are very similar, it's easier for systems programmers to manage them. It's also a resilience benefit: non-production systems should be as close as possible to production systems – errors or bugs have a greater chance of being caught early.
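A naming convention only helps if it is checked. As a rough sketch, the Python below validates dataset names against the basic z/OS rules (qualifiers of 1 to 8 characters, 44 characters in total) plus a site convention. The convention itself (`ALLOWED_HLQ`, `ALLOWED_ENV`) is made up for illustration; a real site would substitute its own.

```python
import re

# Basic z/OS rule: each qualifier is 1-8 characters, starts with a letter or
# national character (@, #, $), and continues with letters, digits,
# national characters or hyphens. The whole name is at most 44 characters.
QUALIFIER = re.compile(r"[A-Z@#$][A-Z0-9@#$-]{0,7}")

# Hypothetical site convention: the high-level qualifier names the owning
# subsystem, and the second qualifier names the environment.
ALLOWED_HLQ = {"CICS", "MQ", "DB2"}
ALLOWED_ENV = {"PROD", "TEST", "DEV"}


def check_dataset_name(name: str) -> list[str]:
    """Return a list of problems; an empty list means the name passes."""
    problems = []
    qualifiers = name.split(".")
    if len(name) > 44:
        problems.append("name longer than 44 characters")
    for qualifier in qualifiers:
        if not QUALIFIER.fullmatch(qualifier):
            problems.append(f"invalid qualifier: {qualifier!r}")
    if qualifiers[0] not in ALLOWED_HLQ:
        problems.append(f"unknown high-level qualifier: {qualifiers[0]!r}")
    if len(qualifiers) < 2 or qualifiers[1] not in ALLOWED_ENV:
        problems.append("second qualifier must be PROD, TEST or DEV")
    return problems


for dsn in ("CICS.PROD.LOADLIB", "PAYROLL.PROD.DATA", "MQ.1TEST.DATA"):
    print(dsn, check_dataset_name(dsn) or "OK")
```

A check like this could run against a dataset list from any system, so the same rules are applied everywhere.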
2. Share
Many sites I see have a separate master catalog for every z/OS system. So, a new alias must be defined in each master catalog – opening up the potential for master catalogs to differ.
It doesn't have to be this way. Smart sites have one master catalog shared by all z/OS systems in the parallel Sysplex. Less work to define aliases, less chance of them being different.
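For sites that still run separate master catalogs, the drift described above can at least be detected. The sketch below assumes the alias names have already been extracted for each system (for example, from catalog listing output); the system names and aliases are invented for illustration.

```python
def alias_drift(aliases_by_system: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each system, return aliases defined elsewhere but missing here."""
    all_aliases = set().union(*aliases_by_system.values())
    drift = {}
    for system, aliases in aliases_by_system.items():
        missing = all_aliases - aliases
        if missing:
            drift[system] = missing
    return drift


# Hypothetical alias lists, one per master catalog.
catalogs = {
    "SYSA": {"CICS", "MQ", "PAYROLL"},
    "SYSB": {"CICS", "MQ"},
    "SYSC": {"CICS", "PAYROLL"},
}

for system, missing in sorted(alias_drift(catalogs).items()):
    print(f"{system} is missing aliases: {sorted(missing)}")
```

With a single shared master catalog, this check becomes unnecessary, which is the point.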
There are many other ways of sharing assets between z/OS systems to reduce the work in administering them. For example, smart sites will have a shared SYS1.PARMLIB, using symbols as needed. They will allow SYSRES volumes to be shared.
Sharing as much as possible also helps to standardize things between z/OS systems.
3. Automate
Every site has some automation used to IPL and shut down systems. This will include some message automation to perform operations or alert when certain messages appear on the console.
However, few sites really use this to the fullest. For example, most sites I visit don't have automation rules for many of the possible serious z/OS error messages. Watson and Walker have developed a z/OS Health Checker rule (IMPORTANT MESSAGES Health Check) to alert when some of these occur. Smart sites will regularly review their message automation, adding rules for new messages, or improving actions to be performed when messages occur.
While we're talking about z/OS Health Checker, many sites also don't fully use it. I see many with outstanding exceptions that haven't been resolved. Most don't have automation alerting when exceptions occur. And almost none code their own checks.
z/OS also has many other brilliant automation features that many sites don’t use. For example:
- AUTOIPL – a feature to automatically IPL a system in a disabled wait state (and optionally perform a standalone dump).
- Predictive Failure Analysis (PFA) – early detection of 'soft' failures like insufficient CSA.
- System Status Detection (SSD) – allows a Sysplex to detect the failure of a z/OS system early.
- Automatic Restart Manager (ARM) – automatically restart a failing address space. Many sites rely on their automation product, but ARM can automatically restart a failing address space on another z/OS system if the current system fails.
- Message flood automation – automatically handle situations where a failing address space or hardware 'floods' a z/OS system with messages.
- AUTORxx PARMLIB – automatically reply to z/OS console messages requiring an operator reply.
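The regular review of message automation suggested earlier can be reduced to a simple coverage check: which important messages have no rule? The sketch below assumes the site keeps a list of messages it considers serious, and can export its automation rules as a message-ID-to-action mapping. The message IDs and actions shown are examples only, not a recommended list.

```python
def uncovered_messages(important: set[str], rules: dict[str, str]) -> list[str]:
    """Return important message IDs that have no automation rule, sorted."""
    return sorted(important - set(rules))


# Illustrative list of messages a site has decided are serious.
important_messages = {"IRA200E", "IEA404A", "IEA405E"}

# Hypothetical export of automation rules: message ID -> action.
automation_rules = {
    "IEA404A": "page on-call systems programmer",
    "IEF450I": "notify batch scheduler",
}

for msgid in uncovered_messages(important_messages, automation_rules):
    print(f"No automation rule for {msgid}")
```

Running something like this after each z/OS upgrade would highlight new or changed messages that still need a rule.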
4. Use Tools
Most of the z/OS systems programmers I work with are still doing it the same way they were in the 1980s: using the ISPF editor, and coding or checking things by hand.
Today, there are some amazing tools and utilities that can make a systems programmer's life easier. For example:
- z/OS Management Facility – now becoming essential for some z/OS systems programmer tasks, it includes a raft of features to make things easier. One of the most interesting is the workflow feature, allowing sites to create workflows that staff can follow to perform repeating tasks (like upgrading z/OS).
- Integrated Development Environments (IDEs) – most sites think of application programmers when implementing IDEs like IBM IDz, BMC Compuware Topaz and Zowe Explorer. However, systems programmers will benefit from some of these features.
There are some other tools that can really make things easier for systems programmers. For example, almost all sites will have monitors like IBM Omegamon, Broadcom SYSVIEW or BMC Mainview (now BMC AMI Ops). However, I often see that these haven't been fully installed or customized. For example, few sites have properly set up and customized features to alert when there is a problem.
5. Clean Up Stuff
At one site, I found almost half of the products installed were not used: some of these were old versions of products, others were products that were never deinstalled.
All this doesn't just consume disk space, it consumes systems programmer time. There are more products to review whenever upgrading or updating a z/OS system. There is more time dealing with vendors and managing agreements. There are additional risks that a user will use an old version.
Smart sites don't forget about 'hygiene', and clean things up as they go along.
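One way to make cleanup routine is to keep an inventory of installed products with the date each was last seen in use, and flag anything idle for too long. The sketch below assumes such an inventory exists; where the usage dates come from, the product names, and the one-year threshold are all assumptions for illustration.

```python
from datetime import date
from typing import Optional


def cleanup_candidates(last_used: dict[str, Optional[date]],
                       today: date,
                       max_idle_days: int = 365) -> list[str]:
    """Return products never used, or idle longer than max_idle_days."""
    candidates = []
    for product, used_on in last_used.items():
        if used_on is None or (today - used_on).days > max_idle_days:
            candidates.append(product)
    return sorted(candidates)


# Hypothetical inventory: product -> date last seen in use (None = never).
inventory = {
    "PRODUCT-A V1": None,
    "PRODUCT-A V2": date(2024, 5, 1),
    "PRODUCT-B": date(2021, 3, 15),
}

for product in cleanup_candidates(inventory, today=date(2024, 6, 1)):
    print(f"Review for removal: {product}")
```

The output is a list of candidates to review, not to delete automatically: some products are legitimately used only once a year.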
6. Educate Users and Juniors
At one site, there was a systems programmer who did everything. If someone rang with a problem, he'd simply fix it himself. If they had a question, he'd find out the problem, and fix that.
This is great, but adds to the workload of the systems programmer. Smart systems programmers educate their users. Rather than simply solve a problem, they talk about the problem with the user, and help them fix similar problems themselves. Maybe point them to documentation or presentations that explain issues. Ensure they have enough access to fix the problem.
Smart systems programmers work with their junior staff, empowering them to try and fix problems themselves.
7. Give Users Dashboards and Tools
Few sites have great dashboards for users and management. So, if there's a problem, it may not be obvious what it is. And so, they call a systems programmer.
Suppose everyone in the organization had a dashboard showing the health of the mainframe and its major components. So, if a user has a problem, they can quickly see if there's a problem with a mainframe subsystem or asset. If there is, they can contact the right person. If not, then they can look elsewhere.
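At its simplest, such a dashboard is a roll-up: one overall status, plus the components that need attention. The sketch below assumes each subsystem already reports a status of `ok`, `warning` or `critical`; the component names, status values and the way they are collected are hypothetical.

```python
# Higher number = worse status.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def overall_health(components: dict[str, str]) -> str:
    """Return the worst status among all components ('ok' if none given)."""
    return max(components.values(), key=SEVERITY.__getitem__, default="ok")


def needs_attention(components: dict[str, str]) -> list[str]:
    """Return the names of components that are not 'ok', sorted."""
    return sorted(name for name, status in components.items() if status != "ok")


# Hypothetical statuses gathered from existing monitors.
status = {"CICS": "ok", "Db2": "warning", "IBM MQ": "ok"}

print("Mainframe health:", overall_health(status))
print("Check with the owners of:", needs_attention(status))
```

Even this much tells a user whether to call the Db2 team or look elsewhere.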
If we're going to educate and empower users to fix their own problems, then it makes sense to give them all the tools to do so. In case you haven't realized, I love monitors like IBM Omegamon. However, few sites allow anyone other than systems programmers to use them, even though all these monitors can be customized to allow users to look (but not touch). Allowing application programmers, help desk staff and others to use these tools helps them help themselves.
8. Less Exits
When I started systems programming, every site had a bunch of Assembler exits: custom routines that would be called during normal z/OS processing. For example, every site had an exit to output information at the end of each job and job step (IEFACTRT). Most had an exit that would run when a job or step started to validate a jobname or other job information (IEFUJI and IEFUSI).
Today, there are far fewer reasons to write exits. z/OS parameters provide the functionality that once required exits. JES2 policies can now be created to replace exits. And products such as BMC ThruPut Manager can replace exits.
IBM also offers 'pre-canned' exits that can be used without compiling or assembling code: IEFACTRT is one example.
Despite these options, many sites continue to have those legacy exits in place. It makes sense to review all exits, and eliminate them where possible.
Less is More
Smart organisations will want to minimise the number of technical staff they need: particularly hard-to-find systems programmers. By making life easier for systems programmers, the number of systems programmers needed will reduce. If systems programmers have a reduced workload, they will have more time to find other ways to reduce their workload.