Implementing CICS/VSAM RLS: How We Did It
Over the past couple of years, we've been working with a client and our partners, CPT Global, to convert their CICS/VSAM files to RLS. Why? They were setting up a CICSPlex, and needed to open and update these files from more than one CICS region.
Read the manuals, and RLS looks pretty easy. Change the storage class and FCT entries, and you're away. But of course, that's not what we found.
In this first article of three, we summarize what we did, and why. In our second article, we talk about VSAM performance, and how we compared performance before and after our RLS conversion. In a third article next issue, we will talk about some of the issues and problems that we found.
Part 1: Research
Our first step was to understand how the application used VSAM. What files, how many, what usage. In our case, the client had over 800 VSAM files defined to CICS. But it wasn't that bad. 100 of these were paths to alternate indexes. Four were entries to datasets that did not exist.
Looking at CICS usage statistics for a six-week period, another 152 weren't used, so our target list was smaller. These statistics also showed that some of these files were not updated within CICS. So, they didn't need to be converted to RLS, further reducing our list.
Sixteen of these datasets were IAM. Our client didn't have the IAM feature that emulates RLS, so these needed to be converted to VSAM. Our client also had some ESDS datasets; we talk more about these below.
From this list, we created 'blocks' of datasets to be converted. We chose the lesser-used (and less important) datasets first, moving towards the most heavily used.
We also researched what the applications did with these files. We found programs that opened and closed files within CICS. We found others that changed attributes of the files (between read-only and update). We found programs that inquired on the status of each file, and this affected processing. More on this in a moment.
We also looked at applications accessing our VSAM datasets remotely via function shipping, and determined if they also needed to convert their definitions to RLS. We checked batch access to these VSAM datasets, and considered if these needed to be converted to RLS. We looked at backups, and any issues that may cause us problems. We also found that some batch jobs were opening and closing VSAM files. We talk about what we found here in the third article in this series.
We were also concerned about performance. The Redpaper IBM CICS Performance Series: CICS and VSAM RLS, the 2015 SHARE presentation Getting the Most out of your VSAM Data Sets in CICS by Using RLS, and the IBM Support Pac CP13 were all useful. These indicated that we might even see a performance improvement from VSAM RLS.
Issue: Read Integrity
Many don't realise that VSAM LSR does not have read integrity. So, if a program PGM1 obtains a record for update (EXEC CICS READ FILE() UPDATE), another program PGM2 can continue to read that (old) record. If PGM1 updates the record, PGM2 will 'see' the new record, even before the syncpoint (i.e., the change is not committed).
Because of this, we did not use read integrity in our RLS files: no real change from pre-RLS. Or in other words, all FCT entries specified READINTEG(UNCOMMITTED).
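To make this concrete, here is a sketch of what an RLS file definition in the CSD looks like with these choices. The file and dataset names are illustrative, not the client's:

```
* Sketch: an RLS-mode file with uncommitted read integrity.
* FILE name, GROUP name, and DSNAME are hypothetical.
DEFINE FILE(CUSTMAST) GROUP(RLSGRP)
       DSNAME(PROD.CUSTOMER.MASTER)
       RLSACCESS(YES)
       READINTEG(UNCOMMITTED)
```

RLSACCESS(YES) opens the file in RLS mode; READINTEG(UNCOMMITTED) preserves the pre-RLS behaviour of reading records without waiting for uncommitted changes to complete.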
IBM recommend that RLS not be used with ESDS files. And with good reason: adds to an ESDS always go to the end of the dataset, so every program adding a record contends for a lock on the same 'next' empty record position, and performance suffers. Our client had a few ESDS files.
We solved this in two ways:
- For 'log' files, we created different datasets for each CICS region (same CICS file name).
- For other files, these were converted to KSDS or RRDS.
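The first approach can be sketched as two CSD FILE definitions with the same file name but different dataset names, one installed in each region (all names illustrative):

```
* Group installed in region CICSA:
DEFINE FILE(APPLLOG) GROUP(GRPA)
       DSNAME(PROD.APPLLOG.CICSA)
* Group installed in region CICSB:
DEFINE FILE(APPLLOG) GROUP(GRPB)
       DSNAME(PROD.APPLLOG.CICSB)
```

Application programs still refer to FILE(APPLLOG), but each region writes to its own ESDS, so there is no cross-region contention at the end of the dataset.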
Part 2: Preparation
There was a lot of preparation. Here's a summary:
- CICS SIT changes: RLS=YES, FTIMEOUT=30 (maximum time to wait for an RLS lock; we used the default of 30 seconds), RLSTOLSR=NO (don't include RLS files when calculating LSR resources - RLS doesn't use LSR, and we won't be switching files between RLS and non-RLS modes in CICS).
- Coupling Facility structures: review existing RLS lock and cache sizes.
- Alter files not going to RLS. Some of our files weren't updated in CICS, so didn't need to be converted to RLS. We changed the FCT entry of these to READ/BROWSE, so they could not be updated.
- Forward recovery log: change from DASD-only log dedicated to one CICS, to shared coupling facility logstream shared across Sysplex.
- CICS Log of Logs: change from CICS-specific DASD-only logstream to shared coupling facility logstream.
- DFSMS storage class and data class. More on this in a moment.
- Modify CICS programs and batch jobs altering file attributes/status. More on these as well.
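For reference, the SIT overrides from the first bullet look like this (a sketch using the documented parameter names; your values may differ):

```
* SIT overrides for RLS (sketch)
RLS=YES
FTIMEOUT=30
RLSTOLSR=NO
```

RLS=YES makes the region register with the SMSVSAM server at startup; FTIMEOUT and RLSTOLSR are shown at the values we used, which happen to be the defaults.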
Let's look at a couple of interesting issues we came across in our planning.
Issue: SMS Classes
We defined two DFSMS storage classes: one for normal use, and one used for files that were backed up using snapshot technology. All RLS storage classes shared coupling facility cache and lock structures.
We defined four new data classes:
- DCRLS: uncompressed, extended format, non-extended addressing.
- DCRLSC: As for DCRLS, but compacted.
- DCRLSE: As for DCRLS, but with extended addressing.
- DCRLSCE: As for DCRLSC, but with extended addressing.
All RLS datasets were:
- Extended format - we didn't see any reason not to make them extended format (and it's required for compacted datasets), though we did not use striping.
- RLS CF Cache: ALL - all data cached for large RLS datasets.
- Rec ACC BIAS: User (System Managed Buffering disabled).
- RLS Above the Bar: YES.
We considered using extended addressing for all files. However, we had programs that used the RBA address. These would continue to work until a doubleword was required for the RBA. We decided that we wanted notice of this (an abend), rather than just letting our programs use the wrong RBA without notice.
Issue: Altering File Status
We had a few CICS programs that would change file attributes. For example, files would periodically be disabled for batch processing, and then re-enabled. With RLS files open in multiple CICS regions, this needed to be done in all CICS regions: EXEC CICS SET FILE() isn't good enough.
We solved this by using the CPSM API: EXEC CPSM. This was the first time we used these APIs, so there were the usual 'teething problems': security, teaching the application teams how to use them, and ensuring that all CICS regions (including development ones) were in a CICSPlex with a valid (and correct) CPSM scope.
Every CICS region executed under the same RACF user ID. So, theoretically, one application could close files in another application's region with EXEC CPSM. We spent a lot of time ensuring application teams knew what they were doing. We also added code to every program issuing EXEC CPSM to check that the current CICS region was within the CPSM scope being used.
We had a similar issue in batch: batch jobs would close CICS files to do processing. These jobs used SDS CAFC. We changed these to use CPSM in batch (usually with a REXX).
Issue: RLS Quiesce
So, we had CICS regions that opened our VSAM datasets in RLS mode. We had batch streams that would close these files, and then process them in non-RLS mode. We briefly considered changing these batch jobs to RLS-mode, but decided that this was too hard. There were a lot of batch programs and utilities, and changing them to RLS-mode didn't buy us much, but would have been a lot of work.
Now, the 'normal' way of allowing a batch job to access an RLS dataset in non-RLS mode is to do an RLS quiesce. This can be done using the CPSM API, for example EXEC CPSM PERFORM SET ACTION QUIESCE. Then, when batch processing is completed, we could do an UNQUIESCE and everything would be fine.
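Outside CPSM, the same quiesce can also be driven from a terminal with CEMT (dataset name illustrative); SMSVSAM propagates the quiesce to every registered region:

```
CEMT SET DSNAME(PROD.CUSTOMER.MASTER) QUIESCED
  ...run batch against the dataset in non-RLS mode...
CEMT SET DSNAME(PROD.CUSTOMER.MASTER) UNQUIESCED
```

The quiesce waits for in-flight units of work against the dataset to complete, then closes the file in every CICS region so batch can open it in non-RLS mode.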
We had two issues with this:
- Every CICS region with RLS=YES in the SIT (and any other address space registered with RLS) must respond to the QUIESCE and UNQUIESCE. Our site had a lot of CICS regions for different applications, so an issue with a CICS region belonging to another application could affect our processing.
- Our application relied on CICS files being open. After an UNQUIESCE, the files would remain closed until the first access, causing our application problems.
Because of this, we chose to use EXEC CPSM PERFORM SET OBJECT CLOSE to close the file in all CICS regions, and EXEC CPSM PERFORM SET OBJECT OPEN to open them again.
Part 3: Implementation
We did this in waves: starting with a couple of files, and having waves of up to 60 files at times. Each wave was done like this:
- Get baseline performance data.
- Phase 1: make files 'RLS-ready': redefine with an RLS storage and data class. We also specified our new shared forward recovery logstream in the DEFINE CLUSTER definition.
- Performance check.
- Phase 2: change FCT to open in RLS mode.
- Performance check.
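The Phase 1 redefinition can be sketched as an IDCAMS DEFINE like the following. The dataset name, key, record sizes, and logstream name are illustrative, not the client's:

```
  /* Sketch: redefine a cluster as RLS-ready.              */
  /* Names, keys, and record sizes are hypothetical.       */
  DEFINE CLUSTER (NAME(PROD.CUSTOMER.MASTER) -
         INDEXED -
         KEYS(16 0) -
         RECORDSIZE(200 400) -
         STORAGECLASS(SCRLS) -
         DATACLASS(DCRLS) -
         LOG(ALL) -
         LOGSTREAMID(PROD.FWDRECOV.LOG))
```

LOG(ALL) marks the dataset as recoverable with forward recovery, and LOGSTREAMID names the shared forward recovery logstream, so recovery attributes travel with the dataset rather than the FCT entry.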
This was obviously done while our application was still running in a single CICS region. Once completed, we could start the second CICS region to share these files.
We were very concerned with performance. Our application used VSAM heavily, with some very high add/update rates. You'll have seen that we did performance checks throughout the process. We discuss how we did this in our partner article.
That wasn't so simple, was it? I was surprised at the number of issues that had to be considered. It's not just a simple storage class change and FCT change.
But it didn't end there. We were very concerned about performance, and spent a lot of time monitoring performance as the files were converted to RLS. We cover this in more detail in our article Comparing CICS/VSAM Performance With and Without RLS.
We also found a few problems that we didn't expect as we moved to RLS. We will talk more about these in our follow-on article next issue.