LongEx Mainframe Quarterly - November 2017

management: Why I Like DevOps: A Hypothetical

Right from the start, I need to say that I'm a little excited about DevOps. And this is strange, coming from a sceptical mainframe veteran who has seen so many hot topics come and go. What's perhaps more interesting is that I have not yet seen a site where DevOps is applied to the mainframe. But for me, DevOps tackles one of the biggest problems I see in mainframe sites around the world: collaboration and communication.

Let me show an example. I'll base it on a couple of clients I've seen in the past, but it's hypothetical. In this hypothetical, a business group needs a new feature from their legacy CICS/VSAM application to communicate with an external application. This feature will need some new CICS resources: transactions, programs, VSAM files etc. It will also be using new WebSphere MQ queues for input and output. And, as it's customer-facing, it needs to keep running reliably.

The Process Before DevOps

Planning

The business units provide the development teams with specifications of the requirements. The application will communicate with an external organization who have published a list of formats for incoming and outgoing messages.

Development

The application development teams start working on the problem. This change has to fit in with the existing CICS applications, but the developers have been around for a while, and they're good with this.

The developers have their own development CICS regions, and authority to do pretty much whatever they want. They define their own CICS resources (programs, transactions, VSAM files etc). Similarly, they send an email to the MQ administrators asking for a new queue to be added to the development queue manager, with specifications on what it should look like. Then they're off and programming.

Testing

After a while, the developers are done, and ready to move this change along the deployment road. Next step, Quality Assurance testing: a separate QA group. The QA group request the CICS administrators to add the new CICS resources. However, the new CICS programs are specified as threadsafe – this site has never used threadsafe programs before. So, the CICS group need to investigate and test this new feature. The project is put on hold for a month while this happens.

The QA group also ask the MQ group to add the new queue. But the queue name doesn't adhere to the MQ group's standards for this type of application (the original request for development didn't specify that it was an external link). So, the queue name must be changed, together with some of the queue specifications. Back to the developer to make these changes. More delays.

Now the QA team can start testing. They've got a QA environment that can send and receive the MQ messages, and begin their testing. However, they haven't discussed this with the developer, so they test as they've done in the past. All looks good.

Deployment

So, we're ready to implement this in production. The security team see the request for a new external WebSphere MQ link, and note that it is not encrypted. So, the project is delayed a month while the correct certificates are obtained and installed. This also has to be done in the QA environment, so back to QA for re-testing. More delays.

Once these issues have been solved, the changes are implemented in production over a weekend. And on Monday morning when workloads are at their highest, the performance of all applications in the CICS region is impacted. The new application cannot process incoming messages as fast as they're added to the MQ queues, so the queues fill up, and start rejecting messages. It takes the operations team until 5pm to back out the change, and for that day, service levels have not been met.

Diagnosis

The production problems were caused by two issues. The first was that the application had not been load tested in QA. So, no one knew that it couldn't process the full incoming workload, affecting the entire CICS region.

The second problem was that the new application had a bug, and overwrote storage in other programs, causing abends and storage violations. This wasn't picked up in testing.
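The first of these problems is easy to put numbers on. Here is a minimal sketch in Python of the arithmetic a load test would have exposed: if messages arrive faster than the application processes them, the queue fills in a predictable time. The function name and all the figures are my own hypothetical assumptions, not measurements from any site.

```python
def seconds_until_full(arrival_per_sec, processed_per_sec, max_depth, current_depth=0):
    """Return the seconds until the queue reaches max_depth, or None if it never fills."""
    backlog_growth = arrival_per_sec - processed_per_sec
    if backlog_growth <= 0:
        return None  # the application keeps up, so the queue does not fill
    return (max_depth - current_depth) / backlog_growth


# Hypothetical Monday-morning figures: 120 messages/s arriving,
# 100 messages/s processed, and a queue that holds 5000 messages.
print(seconds_until_full(120, 100, 5000))  # 250.0
```

With these made-up figures, the queue is full a little over four minutes after the peak starts, which is why the problem shows up so quickly on Monday morning.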

The Process with DevOps

OK, so how would this look different with DevOps? To be honest, no one really knows, as DevOps isn't really well defined, and often means different things to different people. But let's see how it could have been changed.

Planning

The business units provide the development teams with specifications. Every new change is discussed by a working group consisting of application developers, database administrators, MQ administrators, security consultants, systems programmers, QA testers and operations.

The development is discussed, and a plan decided. All teams are now aware of the change. As part of this, the security team identify the need for encryption, and commence work on the certificates.

Development

The development teams develop as before. They discuss this with the CICS systems programmers, and together the new CICS resource definitions are created. These are stored in the same SCM as the source code. The CICS systems programmer identifies the threadsafe issue, and work is started to confirm that it is OK. The CICS systems programmer also recommends some resource definition changes that will improve the performance of the application.

Similarly, MQ definitions are discussed with the MQ administrators. Standards are followed, queue names are correct, and the MQ administrators recommend queue definitions and programming hints to improve resilience and performance.
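A naming standard like this can also be checked automatically, long before anyone raises a request with the MQ group. The sketch below is in Python and is only an illustration: the first pattern reflects MQ's own rules for queue names (up to 48 characters from A-Z, a-z, 0-9, period, forward slash, underscore and percent), but the EXT.<partner>.<IN|OUT> site standard and the function name are hypothetical, made up for this example.

```python
import re

# MQ's own rule: queue names are 1 to 48 characters drawn from
# A-Z, a-z, 0-9, period, forward slash, underscore and percent.
MQ_NAME = re.compile(r"[A-Za-z0-9./_%]{1,48}")

# Hypothetical site standard: external links are named EXT.<partner>.<IN|OUT>
SITE_EXTERNAL = re.compile(r"EXT\.[A-Z0-9]+\.(IN|OUT)")


def check_queue_name(name, external):
    """Return a list of problems with a proposed queue name (empty if none)."""
    problems = []
    if not MQ_NAME.fullmatch(name):
        problems.append("not a valid MQ object name")
    if external and not SITE_EXTERNAL.fullmatch(name):
        problems.append("external queues must be named EXT.<partner>.<IN|OUT>")
    return problems


print(check_queue_name("EXT.ACME.IN", external=True))           # []
print(check_queue_name("DEV.NEWFEATURE.INPUT", external=True))  # one problem reported
```

The point is not the script itself, but that the MQ group's standards become something developers can run, rather than something they discover at the QA stage.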

The development systems (CICS and MQ) are the same as QA: same configuration. All parameter definitions are stored in an SCM system. Developers can rebuild their CICS and MQ definitions from these stored configuration definitions whenever they want. Test cases are also stored in this SCM, so developers must create and execute these test cases before the programs can be promoted.
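To give a feel for what rebuilding from stored definitions could look like, here is a small Python sketch that turns a queue definition, held as plain data, into an MQSC DEFINE QLOCAL command. The queue name, the attribute values and the function name are hypothetical, and how a definition is really stored in the SCM or submitted to the queue manager will differ from site to site.

```python
def render_qlocal(name, attributes):
    """Build an MQSC DEFINE QLOCAL command from a stored definition."""
    parts = [f"DEFINE QLOCAL('{name}')"]
    # Sort the attributes so the same definition always renders the same way
    for keyword, value in sorted(attributes.items()):
        parts.append(f"{keyword}({value})")
    # REPLACE lets the command be rerun against a queue that already exists
    parts.append("REPLACE")
    return " ".join(parts)


# Hypothetical definition, as it might be stored in the SCM
definition = {"MAXDEPTH": 5000, "DEFPSIST": "YES"}
print(render_qlocal("EXT.ACME.IN", definition))
# DEFINE QLOCAL('EXT.ACME.IN') DEFPSIST(YES) MAXDEPTH(5000) REPLACE
```

Because the command is generated from the stored definition, the development, QA and production queues can only differ if the definition in the SCM differs.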

Testing

QA systems are rebuilt every evening from definitions stored in the SCM – at the same time as new programs are deployed. These definitions are the same as development. These QA systems automatically and regularly perform functional and load testing of all applications in the test system. QA test teams liaise with developers and business units when modifying any testing scripts to cover the new functionality.

Monitoring definitions for the monitoring products used in z/OS are also stored in the SCM. Systems programmers and MQ administrators liaise with the developers, and create modifications to the monitoring for the new application. QA testing is monitored in the same way as production.

The QA monitoring discovers the performance issues and bugs. The CICS systems programmers look at the performance, and suggest improvements. The developers resolve and fix the storage issues. QA testing is repeated and passed.

Deployment

Deployment in production is the same as QA: CICS and MQ definitions are created from definitions in the SCM. Monitoring products are modified in a similar way.

Discussion

There are a lot of 'holes' in this hypothetical. But it gives an idea of how DevOps can improve the development and deployment of an application: a mainframe application.

You'll notice that I haven't really mentioned tools. Sure, I've mentioned the SCM, which should be a standard software tool used by anyone. I've also mentioned how CICS and MQ definitions can be automatically rebuilt from SCM definitions. Monitoring is also briefly covered, without going into detail. However, I skipped over the continuous testing and deployment ideas that are part of most DevOps discussions.

But for me, the biggest plus is the collaboration of all the teams. In my experience, this doesn't happen often in large mainframe sites dispersed around the world. In fact, as a consultant one of my most important jobs can be to act as a bridge between different groups.


David Stephens