
LongEx Mainframe Quarterly - May 2020

technical: Converting CICS/VSAM to RLS: Five Issues We Found

We've recently been working with our partner, CPT Global, on a project to convert a client's CICS/VSAM files to RLS. In the previous articles, we discussed how we did it, and how we measured performance.

But every project produces issues that surprise, and this was no exception. In this article, we look at five of the biggest, most unexpected issues we found when converting to RLS.

1. Empty Files

We wanted to minimize changes required. We also believed that LSR processing would be faster than RLS for read operations. We reviewed CICS statistics for a six-week period, and found some files that did not appear to be updated within CICS. So rather than convert these to RLS, we changed the FCT entries to read-only.

This worked well. However, we came across a potential issue with an empty VSAM file. Here's how it works:

The VSAM file is recreated with no records
The first CICS region to open this file opens it successfully. However, it is opened for update, even though the FCT entry specifies ADD(NO) DELETE(NO) UPDATE(NO) READ(YES) BROWSE(YES).
Subsequent CICS regions fail to open this file, as it is already opened for update

To solve this, we advised our application teams to 'prime' (load an initial record into) any VSAM dataset that is not RLS and could be empty when opened by CICS.

2. Follow the Rules

You may not realise it, but some CICS/VSAM rules were only loosely policed before RLS: breaking them was tolerated. With RLS, they are enforced. One we came across works like this:

Suppose a program performs an EXEC CICS READ FILE() UPDATE RIDFLD(), followed by an EXEC CICS DELETE FILE() RIDFLD(). This is not permitted in CICS. This can be resolved by either:

Removing the RIDFLD parameter on the DELETE command
Performing an EXEC CICS UNLOCK FILE() command before the DELETE command
Performing an EXEC CICS SYNCPOINT command before the DELETE command

However, our application had programs that did exactly this, and got away with it. RLS changed that.

Our applications now received errors when attempting this. We did a scan of the client's source code identifying programs that may do this, and they were fixed by our client.
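As an illustration of the kind of source scan described above, here is a minimal sketch in Python. It is not the tool used on the project. It assumes the source text is free of sequence numbers and comment lines, that each command is delimited by EXEC CICS ... END-EXEC, and that programs can be read top to bottom (it ignores branching, so it only flags programs that may have the problem). The function names are hypothetical.

```python
import re

# One EXEC CICS ... END-EXEC command, possibly spanning several lines.
COMMAND = re.compile(r"EXEC\s+CICS\s+(.*?)\s*END-EXEC", re.IGNORECASE | re.DOTALL)
# The FILE(...) option; the argument may be a literal or a variable name.
FILE_OPT = re.compile(r"(?<![\w-])FILE\s*\(\s*([^)]*?)\s*\)", re.IGNORECASE)


def has_option(command_text, option):
    """True if the option appears as a keyword, not inside a longer name."""
    pattern = r"(?<![\w-])" + re.escape(option) + r"(?![\w-])"
    return re.search(pattern, command_text, re.IGNORECASE) is not None


def find_suspect_deletes(source):
    """Return (file, line) for each DELETE with RIDFLD that follows a
    READ UPDATE of the same file with no REWRITE, UNLOCK, DELETE or
    SYNCPOINT in between. Purely textual, so treat hits as candidates."""
    pending = set()   # files with a READ UPDATE still outstanding
    findings = []
    for match in COMMAND.finditer(source):
        text = match.group(1)
        words = text.split()
        if not words:
            continue
        verb = words[0].upper()
        if verb == "SYNCPOINT":
            pending.clear()
            continue
        file_match = FILE_OPT.search(text)
        if file_match is None:
            continue
        name = file_match.group(1).strip("'\"").upper()
        if verb == "READ" and has_option(text, "UPDATE"):
            pending.add(name)
        elif verb in ("REWRITE", "UNLOCK"):
            pending.discard(name)
        elif verb == "DELETE":
            if name in pending and has_option(text, "RIDFLD"):
                line = source.count("\n", 0, match.start()) + 1
                findings.append((name, line))
            pending.discard(name)
    return findings
```

Calling find_suspect_deletes() on the text of each program gives a list of file names and line numbers to review by hand. Because the scan is heuristic, a hit needs confirming and a clean result is not proof that the program is safe.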

3. Backup While Open

We were backing up our VSAM files using a VSAM backup and recovery product, which in turn used IBM FlashCopy. Before RLS, our backup product would perform a backup of a file in seconds. After the file was converted to RLS, this took up to 10 minutes. During that time, CICS applications attempting to update this file failed with AFCK abends.

Here's what was happening. Our backup product was calling DFSMSdss in the background to perform the backup. By default, DFSMSdss insists on data integrity for RLS files. In our case, when it attempted to back up an RLS file, it would first perform an RLS QUIESCE of the file, then use IBM FlashCopy, and finally perform an RLS UNQUIESCE.

This took us (and our vendors) a while to figure out. After we realised that our backup utility was indirectly performing a QUIESCE, we suspected an outstanding, uncommitted RLS update. IBM suggested we look for transactions around the time of the error with a low syncpoint number, but high service time. Going back through the SMF 110 records, we saw our candidate: a transaction that started before our problem, did only one syncpoint, had a high syncpoint wait time, and ended around the time our problem stopped.
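The search described above can be sketched as a simple filter. This Python example assumes the SMF 110 records have already been decoded by some other tool into dictionaries. The field names (tran, start, stop, syncpoints, syncpoint_wait) and the thresholds are hypothetical, chosen only for illustration. They are not SMF field names.

```python
from datetime import timedelta


def find_candidates(records, problem_start, problem_end,
                    max_syncpoints=1,
                    min_syncpoint_wait=timedelta(minutes=1),
                    end_tolerance=timedelta(seconds=30)):
    """Pick transactions that started before the problem, ended close to
    the time it stopped, took few syncpoints and waited a long time."""
    candidates = []
    for rec in records:
        started_before = rec["start"] < problem_start
        ended_near_end = abs(rec["stop"] - problem_end) <= end_tolerance
        few_syncpoints = rec["syncpoints"] <= max_syncpoints
        long_wait = rec["syncpoint_wait"] >= min_syncpoint_wait
        if started_before and ended_near_end and few_syncpoints and long_wait:
            candidates.append(rec)
    # Longest waits first: these are the most likely culprits.
    return sorted(candidates, key=lambda r: r["syncpoint_wait"], reverse=True)
```

The thresholds are deliberately parameters: what counts as a 'high' wait or 'around the time' depends on the workload, and would need tuning against real data.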

The problem was a program that updated our VSAM RLS file, then performed an IBM MQ GETWAIT for up to 10 minutes. Our RLS update remained uncommitted for those 10 minutes, so SMSVSAM could not perform the QUIESCE. And of course, anything else wanting to update the file had to wait behind the QUIESCE request.

We got away with this for non-RLS, as DFSMSdss allowed a 'dirty' backup of our dataset.

Solution: we changed the CICS program to perform a syncpoint before the MQ GETWAIT.
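To see why moving the syncpoint matters, here is a toy timeline in Python. It is a simplified model, not a description of SMSVSAM internals: it assumes a QUIESCE can only complete once every in-flight update has committed. Times are in seconds, and the 30-second offset for the backup request is an invented value.

```python
def commit_time(update_at, mq_wait, syncpoint_before_wait):
    """When the update is committed: straight away if the program takes a
    syncpoint before waiting on MQ, otherwise only after the wait ends."""
    return update_at if syncpoint_before_wait else update_at + mq_wait


def quiesce_delay(quiesce_requested_at, commit_times):
    """Seconds the QUIESCE waits for the last uncommitted update."""
    last_commit = max(commit_times, default=quiesce_requested_at)
    return max(0, last_commit - quiesce_requested_at)


# Update at t=0, MQ wait of up to 600 seconds, backup starts at t=30.
before_fix = quiesce_delay(30, [commit_time(0, 600, syncpoint_before_wait=False)])
after_fix = quiesce_delay(30, [commit_time(0, 600, syncpoint_before_wait=True)])
print(before_fix, after_fix)  # prints "570 0"
```

In this model the backup, and every update queued behind the QUIESCE, waits 570 seconds with the original program and not at all once the syncpoint comes first.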

4. A New FOR

Here's something that will surprise you (and us at the time): we created a new File Owning Region (FOR) for our RLS files. Why would we do that?

We had several foreign applications that were accessing our VSAM files using CICS function shipping. The remote FCT definitions for these remote applications pointed to one CICS region. We had a couple of choices:

Leave it as is: so remote applications access the file via one CICS region. However, if this region failed, then the remote applications would fail to access our file.
Investigate IPIC: it was possible that by using an IPIC connection to a DVIPA address, we could allow our foreign regions to access any CICS region that was active. However, this would have taken some testing to confirm, and there was no guarantee that this would work.
Open the file in RLS mode in the foreign CICS region. As we had batch jobs closing and opening our files, we didn't want our batch jobs to be reliant on foreign applications. Further, a couple of these foreign applications were in a different Parallel Sysplex, and RLS cannot be used across sysplexes.
Create an FOR: foreign regions would define a remote file pointing to our FOR, which would open our VSAM files in RLS mode.

The FOR gave us a couple of advantages. If the FOR failed, it could be restarted quickly (no application code). Our local application could not affect the FOR, so we improved resilience.

5. Skeletons in the Closet

As we went through the project, we found some 'skeletons in the closet': things that should not be done, or should be cleaned up. Here are some examples:

ESDS Recovery: Some of our application files were ESDS files with RECOVERY(ALL) specified in the CICS file definition. If a transaction inserting a record for this file failed, CICS would not be able to back it out: records cannot be deleted in an ESDS. Unless there is a backout exit specified (XFCLDEL - we didn't have one of these), the transaction backout would fail, and the file would probably be unusable (no more inserts, as there is an uncommitted insert pending). Solution: specify RECOVERY(NONE), or convert to a KSDS/RRDS.
Bad Shareoptions: we found a lot of files with Shareoptions(3,4) and (4,3). In other words, no CICS- or system-supplied integrity: multiple address spaces can update the VSAM dataset at the same time, and nothing protects it.
Unused Datasets: using CICS statistics, we found a lot of VSAM datasets that did not appear to be used. We suggested that it was easier to decommission VSAM files than convert them to RLS. In many ways, this project justified a lot of cleanup that makes maintaining the application easier.
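For the Bad Shareoptions item above, a check like this could be automated. The Python sketch below scans IDCAMS LISTCAT ALL output that has been saved as text. It assumes entry lines look like "DATA ------- name" and that share options appear as SHROPTNS(x,y). Check both assumptions against your own listings before relying on it. The function name is hypothetical.

```python
import re

# Entry header lines, e.g. "CLUSTER ------- PROD.CUST.KSDS" (assumed layout).
ENTRY = re.compile(r"(?:CLUSTER|DATA|INDEX)\s+-{2,}\s+(\S+)")
# Share options as printed in the ATTRIBUTES group (assumed layout).
SHROPTNS = re.compile(r"SHROPTNS\((\d),(\d)\)")

# The combinations called out in the article.
RISKY = {(3, 4), (4, 3)}


def find_risky_shareoptions(listcat_text, risky=RISKY):
    """Return (entry name, (cross-region, cross-system)) for each
    component whose share options are in the risky set."""
    current = None    # name of the entry currently being read
    findings = []
    for line in listcat_text.splitlines():
        entry = ENTRY.search(line)
        if entry:
            current = entry.group(1)
            continue
        option = SHROPTNS.search(line)
        if option and current is not None:
            pair = (int(option.group(1)), int(option.group(2)))
            if pair in risky:
                findings.append((current, pair))
    return findings
```

Share options are reported per component, so a single cluster will normally show up twice, once for its data component and once for its index.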

Conclusion

Moving to VSAM RLS was not a trivial project. We found a lot of issues and problems along the way. In this article, we've touched on five of them. But as you'd expect, there were many more, smaller issues as well.

David Stephens