LongEx Mainframe Quarterly - August 2020

technical: Is It Crazy To Read a VSAM Dataset Open in CICS?

Many sites I've seen have some process that reads a VSAM dataset while it is open for update in CICS. They may be browsing from ISPF using something like IBM FileManager, or accessing from a batch job. This is generally viewed as 'bad.' But is it really? And if so, what can we do about it?

In this article, we put some myths to the test.

The Myths

Let's look at some of the 'myths' about CICS/VSAM datasets:

  • You can read an ESDS or KSDS open to CICS, but there may be 'missing' records.
  • This can be fixed using SHAREOPTIONS(4,3).
  • This can be fixed by closing the files in CICS to 'flush the buffers'. The file can be immediately re-opened.

Let's test these out, starting with an ESDS dataset.

ESDS

We wrote a CICS/COBOL program that inserted a record with the current date and time every second. We created an ESDS dataset:

DEFINE CLUSTER( -
    NAME(DZS.VSAM1) CYLINDERS(1 1) RECORDSIZE(70 70) -
    NONINDEXED VOLUMES(*) SHAREOPTIONS(2 3) -
    REUSE)

We then:

  1. Defined the dataset to CICS to use NSR with two data buffers (the minimum).
  2. Executed our CICS program from a 3270 session: it continually ran, happily inserting its records.
  3. Used IBM FileManager to regularly look at the contents from ISPF.

So, we saw something like:

****** ****  Top of data  ****
16/08/2020 03:52:11
16/08/2020 03:52:16
16/08/2020 03:52:17
16/08/2020 03:52:18
16/08/2020 03:52:19
16/08/2020 03:52:20
16/08/2020 03:52:21
16/08/2020 03:52:22
16/08/2020 03:52:23
****** ****  End of data  ****

So, if our first myth is true, we should see some missing records. Or at least a time lag between when the record was inserted, and when we saw it.

We didn't find this: there were no missing records. None. As our CICS program inserted a VSAM record, it immediately appeared in FileManager. We double-checked with an IDCAMS PRINT job:

//STEP1    EXEC PGM=IDCAMS
//DD1      DD DISP=SHR,DSN=DZS.VSAM1
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  PARM GRAPHICS(CHAIN(SN))
  PRINT INFILE(DD1)

Same results: records were immediately available. This doesn't match our first myth.

We tried different options, including:

  • Increasing the data buffers to 16.
  • Increasing the number of VSAM strings.
  • Using LSR rather than NSR.
  • Executing multiple CICS transactions to insert these records at the same time.

Results were the same: no missing records. So, what's going on?

HURBA

We finally managed to get missing records when we ran our transaction for a longer time. In fact, all records after the 263rd record were missing:

16/08/2020 21:24:3000000127
16/08/2020 21:24:3100000123
16/08/2020 21:24:3100000128
16/08/2020 21:24:3200000124
16/08/2020 21:24:3200000129
16/08/2020 21:24:3300000125
16/08/2020 21:24:3300000130
16/08/2020 21:24:3400000126
16/08/2020 21:24:3400000131
16/08/2020 21:24:3500000127
16/08/2020 21:24:3500000132
16/08/2020 21:24:3600000128
16/08/2020 21:24:3600000133
****** ****  End of data  ****

But they weren't just missing for a second or two. These were still missing after an hour. Same with IDCAMS PRINT: no records after the 263rd. We printed the dataset (remember, it was still open for update in CICS) using DFSMSdss (DFDSS):

//DFDSS    EXEC PGM=ADRDSSU
//SYSPRINT DD   SYSOUT=*
//DD1      DD   DISP=SHR,VOL=SER=VPMVSH,UNIT=3390
//SYSIN    DD   *
  PRINT DS('DZS.VSAM1.DATA') SHR INDD(DD1)

This showed our 'missing' data on disk. So, the data was actually written: we just couldn't see it outside of CICS. But why?

Looking at an IDCAMS LISTCAT of our VSAM dataset (while it was still open), we saw:

CISIZE-------------18432
HI-A-RBA----------829440
HI-U-RBA-----------18432

Our records were 70 bytes long, fixed length. Dividing our control interval (CI) size by our record length (18432/70), we get 263 (and a little bit). So, our 264th record went to a new CI. However, the High Used RBA (HURBA) from our IDCAMS LISTCAT is only one control interval long: it hasn't been updated for our second control interval. More on this in a moment.
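
The CI arithmetic can be checked with a short Python sketch. It assumes the usual VSAM CI layout (a 4-byte CIDF at the end of each CI, plus two 3-byte RDFs when every record in the CI has the same length). The function names are illustrative only and are not part of any VSAM interface.

```python
# Sketch: how many fixed-length records fit in one VSAM control interval.
# Assumption: each CI ends with a 4-byte CIDF, and a CI full of
# equal-length records needs two 3-byte RDFs.
CIDF_BYTES = 4
RDF_BYTES = 3


def records_per_ci(ci_size: int, record_length: int) -> int:
    """Return the number of fixed-length records that fit in one CI."""
    usable = ci_size - CIDF_BYTES - 2 * RDF_BYTES
    return usable // record_length


def ci_number_for_record(n: int, ci_size: int, record_length: int) -> int:
    """Return the 1-based CI that holds the n-th record (1-based)."""
    per_ci = records_per_ci(ci_size, record_length)
    return (n - 1) // per_ci + 1


if __name__ == "__main__":
    print(records_per_ci(18432, 70))             # records in one CI
    print(ci_number_for_record(263, 18432, 70))  # CI holding record 263
    print(ci_number_for_record(264, 18432, 70))  # CI holding record 264
```

With a CISIZE of 18432 and 70-byte records, record 264 is the first one to land in the second CI, which matches the 263rd record being the last one visible outside CICS.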

When we closed the VSAM dataset to CICS, our missing records instantly appeared in FileManager and IDCAMS PRINT. We looked again at our IDCAMS LISTCAT, and saw:

CISIZE-------------18432
HI-A-RBA----------829440
HI-U-RBA-----------36864

The HURBA had been updated: it now covered two control intervals (2 x 18432).

Discussion (Part 1)

We're really asking the question "when is a record physically written to a VSAM dataset?" VSAM documentation is a little vague about this. It is possible to use a VSAM feature called Deferred Write. Deferred Write is specified in the VSAM ACB control block, and records are not written until one of the following occurs:

  • The program issues a WRTBFR macro.
  • VSAM needs the buffer for a GET request.
  • The VSAM file is closed.

The problem is that you need to do this using native VSAM calls from assembler. Languages like COBOL have no such feature. It can also be done using batch LSR (BLSR), if it is enabled:

//VSAM1   DD DSN=VSAM.DSET1,DISP=OLD
//BLSR    DD SUBSYS=(BLSR,'DDNAME=VSAM1','DEFERW=YES')

It can also be activated using some VSAM buffering tools like CA Hyper-Buf. But how about CICS?

When we use CICS, we lose some of the 'bells and whistles' that are available with native VSAM. For example, we can't access datasets by control interval. CICS manuals don't provide too many details about what they do 'under the covers.' However, from our testing, it looks like CICS is not using Deferred Write: it's writing the records immediately. This is the default. Well, kind of.

If you look at the IBM documentation, you will see:

VSAM automatically defers writes for sequential PUT requests. It normally writes out the contents of a buffer immediately for direct PUT requests.

If we look at the Redbook VSAM Demystified:

What to do with the CIs updated in the buffers. Determine whether they are stored through in DASD immediately (store through algorithm) or later (store in algorithm)? For random, use the Defer Write option, where the installation decides what to do. The default is store through. For sequential, VSAM defers the write until half of the buffers are ready for the write.

Not very clear, is it? What we're seeing is that our ESDS inserts are immediately written to disk.

So, what happens with the second CI? It looks like FileManager and friends can't see the second CI: they're looking at the 'old' HURBA, and stopping there.

The solution would seem to be simple: close the file to CICS to update the HURBA. And our test confirms this. But we're not 'flushing the CICS buffers' as is often said. Rather, we are refreshing the HURBA.
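
The hypothesis above can be expressed as a toy Python model. It is only a sketch of the behaviour we observed, assuming an outside reader stops at the HURBA recorded in the catalog. The constants come from our LISTCAT output and test, and the function name is hypothetical.

```python
# Toy model of the hypothesis above, NOT a VSAM interface: an outside reader
# stops at the cataloged HURBA, while CICS keeps writing beyond it.
CI_SIZE = 18432       # CISIZE from the LISTCAT output
RECORDS_PER_CI = 263  # 70-byte records per CI, as observed in the test


def visible_records(records_written: int, hurba: int) -> int:
    """Records an outside reader sees if it stops reading at the HURBA."""
    visible_cis = hurba // CI_SIZE
    return min(records_written, visible_cis * RECORDS_PER_CI)


if __name__ == "__main__":
    # File still open in CICS: the HURBA covers only one CI.
    print(visible_records(200, CI_SIZE))
    print(visible_records(400, CI_SIZE))
    # After the file is closed to CICS: the HURBA covers two CIs.
    print(visible_records(400, 2 * CI_SIZE))
```

In this model, records in the first CI show up as soon as they are written, but everything beyond it stays hidden until the HURBA moves.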

KSDS

So, is it the same with a KSDS? We modified our test program to insert the date and time to a KSDS, using this date and time as the key. When we ran our program, we saw the same as ESDS: records were immediately available to FileManager and friends.

We modified our program to delete records. Same results: FileManager couldn't see the deleted records as soon as the delete was completed. We tried the long-running test, but didn't see any missing records: they were all immediately available, even after the 263rd record. We tried a few variations:

  • Reduced the data component allocation to one track to see what happened when another extent was used.
  • Inserted records with high and low-key values to 'force' new control intervals and control areas.
  • Ran a very long test (12,700 records inserted).
  • Set the CI size of the data component to 512 bytes, and of the index component to 1024 bytes, to force the index to use multiple CIs.

In every case, the records were immediately seen. We also used DFSORT to find the last record:

//STEP1    EXEC PGM=SORT
//SORTIN   DD DSN=DZS.VSAM1,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//SYSIN    DD *
    RECORD TYPE=F
    SORT FIELDS=(1,23,CH,D)
    OUTFIL ENDREC=1,OUTREC=(1,23)

We regularly ran this to see if there was a time lag between when the record was inserted and when DFSORT could see it. We found none.

Discussion (Part 2)

KSDS datasets are different to ESDS: we use the index to locate the record (or at least, we did in our program: we could have used the RBA directly - it's just harder). When viewing the dataset with FileManager, it shows all records in the dataset. So, it's probably doing a sequential read through the sequence set of the VSAM dataset. Same for our DFSORT job.

After our ESDS test, we might have expected not to see new index records once the index component started using a new CI. But this wasn't the case.

So, it looks like reading VSAM KSDS from batch may be OK.

Batch Open

In the above tests, we used IDCAMS and FileManager to view records. These open the VSAM dataset every time. But what if the dataset was already open before the CICS updates were made?

This is better-documented territory. We created a batch COBOL program to read a KSDS directly with a specific key (date and time) that was in the future. This program opened the file, and then continually checked for this record until it was found. We submitted this job, and then started our CICS transaction to insert records. We'd expect our batch program to find our target record as soon as our CICS program inserted it.

But our batch job never found the record, even a few minutes after our CICS transaction had inserted it. Our batch program had loaded the relevant VSAM CI into buffers when it first tried to find the record. Every subsequent time the program checked, it used the values in these buffers: it never went back to disk. When we reran the job (after the target record was inserted), it found the record.

We modified our program to do an open, then wait for a period (no VSAM reads). During this period, the CICS transaction inserted the target record. In this case, the batch program found the target record. It hadn't yet loaded the CI into buffers before the CICS transaction added the target record.
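
The two batch scenarios can be illustrated with a toy Python model. The class and function names are hypothetical, and this is not how VSAM buffering is actually implemented: it only assumes that the reader copies the CI on its first read and never goes back to disk.

```python
# Toy model of the two batch tests above. Names are hypothetical and the
# buffering is deliberately simplified to a single cached snapshot.
class SharedDataset:
    """Stands in for the CI on disk that CICS updates."""

    def __init__(self):
        self.records = set()

    def insert(self, key: str) -> None:
        self.records.add(key)


class BatchReader:
    """Opens the dataset, then buffers the CI on its first read."""

    def __init__(self, dataset: SharedDataset):
        self.dataset = dataset
        self.buffer = None  # nothing is buffered at open time

    def find(self, key: str) -> bool:
        if self.buffer is None:
            # The first read copies the CI from disk into the buffer.
            self.buffer = set(self.dataset.records)
        # Later reads use the buffer only; they never go back to disk.
        return key in self.buffer


def read_before_insert() -> bool:
    dataset = SharedDataset()
    reader = BatchReader(dataset)
    reader.find("TARGET")        # first check loads the (empty) CI
    dataset.insert("TARGET")     # CICS inserts the record afterwards
    return reader.find("TARGET")


def read_after_insert() -> bool:
    dataset = SharedDataset()
    reader = BatchReader(dataset)  # open, but no read yet
    dataset.insert("TARGET")       # CICS inserts during the wait
    return reader.find("TARGET")


if __name__ == "__main__":
    print(read_before_insert())
    print(read_after_insert())
```

The first function mirrors the job that never found its record, and the second mirrors the job that waited before its first read.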

SHAREOPTIONS(4,3)

In our myths, defining the VSAM dataset with SHAREOPTIONS(4,3) fixed our problems. When we repeated our tests with SHAREOPTIONS(4,3), it did exactly that: none of the problems described above occurred.

Of course, SHAREOPTIONS(4,x) is a two-edged sword. Buffers are always flushed, so the performance (both from CICS and the non-CICS process accessing it) will be terrible. Not to mention data integrity and serialization. We talk more about this in our partner article.
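
The performance cost can be pictured with a rough Python sketch. It assumes, as a simplification, that every direct request goes back to disk when buffers are always refreshed, and that a single buffered CI satisfies all requests otherwise. The function name and the counts are illustrative only.

```python
# Toy model: count simulated disk reads for repeated direct lookups of the
# same CI, with and without a buffer refresh on every request.
def count_disk_reads(lookups: int, refresh_every_read: bool) -> int:
    disk_reads = 0
    buffered = False
    for _ in range(lookups):
        if refresh_every_read or not buffered:
            disk_reads += 1   # go to disk and (re)load the buffer
            buffered = True
    return disk_reads


if __name__ == "__main__":
    print(count_disk_reads(1000, refresh_every_read=False))
    print(count_disk_reads(1000, refresh_every_read=True))
```

The reader always sees current data when every lookup refreshes the buffer, but each lookup then costs an I/O.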

Not a Complete Test

It's worth noting that our tests were not comprehensive. We didn't test RRDS or VRRDS datasets. We didn't try RBA access to VSAM datasets. We didn't test record updates, different record sizes, variable length records, alternate indexes, and more.

So, the results aren't absolute: but they raise interesting points.

Summary

Let's look at our myths again, and what we found:

  • You can read an ESDS or KSDS open to CICS, but there may be 'missing' records.
    PARTIALLY TRUE.

    There could be many missing records for an ESDS. However, our tests indicated that KSDS records are immediately available.

  • This can be fixed using SHAREOPTIONS(4,3).
    TRUE.

  • This can be fixed by closing the files in CICS to 'flush the buffers'. The file can be immediately re-opened.
    PARTIALLY TRUE.

    Only needed for ESDS. It only works if the non-CICS process opens the VSAM file after it has been closed to CICS. We're not 'flushing the buffers,' we're refreshing the HURBA.

Reading datasets that are open to CICS for update is risky. If a non-CICS process needs the latest records, simply reading the VSAM dataset is unlikely to be the best choice. In our partner article, we look at some better options.


David Stephens