LongEx Mainframe Quarterly - May 2020
We've recently been working with our partner, CPT Global, on a project to convert a client's CICS/VSAM files to RLS. In the previous articles, we discussed how we did it, and how we measured performance. But every project produces issues that surprise, and this was no exception. In this article, we look at five of the biggest, most unexpected issues we found when converting to RLS.

1. Empty Files

We wanted to minimize changes required. We also believed that LSR processing would be faster than RLS for read operations. We reviewed CICS statistics for a six-week period, and found some files that did not appear to be updated within CICS. So rather than convert these to RLS, we changed the FCT entries to read-only.

This worked well. However, we came across a potential issue with an empty VSAM file. Here's how it works:
To solve this, we advised our applications to 'prime' VSAM datasets that are not RLS, and could possibly be empty when opened by CICS.

2. Follow the Rules

You may not realise, but there are some rules with CICS/VSAM that were tolerated before RLS, but are now enforced. One we came across works like this:

Suppose a program performs an EXEC CICS READ FILE() UPDATE RIDFLD(), followed by an EXEC CICS DELETE FILE() RIDFLD(). This is not permitted in CICS. This can be resolved by either:
However, our application had programs that did exactly this, and got away with it. RLS changed that: our applications now received errors when attempting it. We scanned the client's source code to identify programs that might do this, and our client fixed them.

3. Backup While Open

We were backing up our VSAM files using a VSAM backup and recovery product, which in turn used IBM FlashCopy. Before RLS, our backup product would back up a file in seconds. After the file was converted to RLS, this took up to 10 minutes. During that time, CICS applications attempting to update the file failed with AFCK abends.

Here's what was happening. Our backup product was calling DFSMSdss in the background to perform the backup. By default, DFSMSdss insists on data integrity for RLS files. In our case, when it attempted to back up an RLS file, it would first perform an RLS QUIESCE of the file, then use IBM FlashCopy, and finally perform an RLS UNQUIESCE.

This took us (and our vendors) a while to figure out. After we realised that our backup utility was indirectly performing a QUIESCE, we suspected an outstanding, uncommitted RLS update. IBM suggested we look for transactions around the time of the error with a low syncpoint count, but a high service time. Going back through the SMF 110 records, we saw our candidate: a transaction that started before our problem, did only one syncpoint, had a high syncpoint wait time, and ended around the time our problem stopped.

The problem was a program that updated our VSAM RLS file, then performed an IBM MQ GETWAIT for up to 10 minutes. Our RLS update remained uncommitted for those 10 minutes, so SMSVSAM could not perform the QUIESCE. And of course, anything else wanting to update the file had to wait behind the QUIESCE request. We got away with this for non-RLS, as DFSMSdss allowed a 'dirty' backup of our dataset.
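The search through the SMF 110 records can be pictured as a simple filter. The following is only a sketch, assuming the transaction records have already been extracted into a simplified form: the `TxnRecord` fields, the `find_candidates` function, the thresholds, and the sample data are all hypothetical, and are not taken from the project or from any SMF reporting tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TxnRecord:
    """Hypothetical, simplified view of one CICS transaction from SMF 110 data."""
    tran_id: str
    start: datetime
    end: datetime
    syncpoints: int


def find_candidates(records, problem_start, problem_end,
                    max_syncpoints=1,
                    min_service=timedelta(minutes=5),
                    end_tolerance=timedelta(seconds=30)):
    """Return transactions that may have held an uncommitted update.

    A candidate started before the problem began, issued few syncpoints,
    ran for a long time, and ended close to when the problem stopped.
    All thresholds are assumptions chosen for illustration.
    """
    candidates = []
    for rec in records:
        started_before = rec.start < problem_start
        ended_near_recovery = abs(rec.end - problem_end) <= end_tolerance
        few_syncpoints = rec.syncpoints <= max_syncpoints
        long_running = (rec.end - rec.start) >= min_service
        if started_before and ended_near_recovery and few_syncpoints and long_running:
            candidates.append(rec)
    return candidates


# Made-up sample data: the backup stalled from 10:00 to 10:10.
problem_start = datetime(2020, 1, 1, 10, 0, 0)
problem_end = datetime(2020, 1, 1, 10, 10, 0)
records = [
    # Long-running, one syncpoint, spans the whole problem window.
    TxnRecord("TRNA", datetime(2020, 1, 1, 9, 59, 50), datetime(2020, 1, 1, 10, 10, 5), 1),
    # Short transaction that started during the problem.
    TxnRecord("TRNB", datetime(2020, 1, 1, 10, 2, 0), datetime(2020, 1, 1, 10, 2, 1), 1),
    # Long-running, but commits regularly.
    TxnRecord("TRNC", datetime(2020, 1, 1, 9, 0, 0), datetime(2020, 1, 1, 10, 10, 10), 40),
]

suspects = find_candidates(records, problem_start, problem_end)
print([rec.tran_id for rec in suspects])
```

With this sample data, only the first transaction matches all four conditions. In practice the thresholds would need tuning to the site's workload.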
Solution: we changed the CICS program to perform a syncpoint before the MQ GETWAIT.

4. A New FOR

Here's something that will surprise you (it surprised us at the time): we created a new File Owning Region (FOR) for our RLS files. Why would we do that?

We had several foreign applications that were accessing our VSAM files using CICS function shipping. The remote FCT definitions for these applications pointed to one CICS region. We had a couple of choices:
The FOR gave us a couple of advantages. If the FOR failed, it could be restarted quickly (it ran no application code). Our local application could not affect the FOR, so we improved resilience.

5. Skeletons in the Closet

As we went through the project, we found some 'skeletons in the closet': in other words, things that should not be done, or that should be cleaned up. Here are some examples:
Conclusion

Moving to VSAM RLS was not a trivial project. There were a lot of issues and problems that we found along the way. In this article, we've touched on five of them. But as you'd expect, there were many more smaller issues we found.