LongEx Mainframe Quarterly - November 2020
At a client site recently, we had a production outage caused by a VSAM space issue. And it wasn't the first time. When I first started as a systems programmer, space-related abends were common. Since then, IBM and other vendors have spent a lot of time creating ways to reduce these abends. In fact, I believe that space-related errors should never, ever occur. But they still do. So, how can we completely eliminate B37, D37 and E37 abends and other space-related failures?

Possible Failures

Before we get started, let's review some of the more common abends and space-related errors. These can be summarized as:
We know the problems, so how can we avoid them?

1. Non-VSAM Extended Format

Extended Format datasets were introduced with DFSMS 1.1 in 1993. Changing the way that sequential datasets are stored, they offer benefits including:
Bottom line: I believe that all sequential datasets should be extended format by default. Non-extended format should only be used for datasets that are not eligible (such as VIO datasets, datasets accessed with EXCP, system datasets and GTF trace datasets).

2. VSAM Extended Format

VSAM also has an extended format. It offers fewer immediate benefits than the non-VSAM extended format, but IBM says that it “increases the performance and reliability of an I/O.” It also opens the door to some optional features that may improve resilience:
Bottom line: I believe that all VSAM datasets should be extended format unless there is a specific reason not to (for example, page datasets and catalogs cannot be extended format). CA Reclaim should be enabled by default for all VSAM datasets. Any VSAM dataset that is very large (say, greater than 3000 cylinders) should be compressed.

3. PDSE

PDSE datasets have been around for longer than extended format datasets. The latest PDSE (version 2) provides the following advantages:
Bottom line: I believe that PDSE should be used for all partitioned datasets by default, unless there is a reason not to.

4. Multi-Volume

There's an excellent chance that a single volume will run out of space. If a dataset has only one candidate volume, this will cause problems.

You may not have heard of it, but the DFSMS Data Class parameter Dynamic Volume Count (DVC) can be used to automatically increase the number of candidate volumes if needed. Suppose a user specifies two candidate volumes, but the allocation needs four. Normally this allocation would fail. However, a DVC of 4 (or higher) would automatically and dynamically add another two candidate volumes, and the allocation would succeed.

Bottom line: I believe that all VSAM datasets should be multi-volume. Critical non-VSAM datasets should also be multi-volume. Every data class should have a DVC parameter > 5 unless there is a good reason.

5. Large Datasets

Very large datasets are hard to manage. One site had a dataset with a primary space requirement of 6000 cylinders (3390 device). They regularly had problems, as few eligible volumes had a free extent of 6000 cylinders. DFSMS offers some features to help here:
Bottom line: I believe that Space Constraint Relief should be enabled for all data classes unless there is a good reason. Any dataset that is greater than 3000 cylinders should be compressed. Any dataset with space greater than 3 GB should have extended addressability.

6. Too Many Extents

We've already talked about how PDSE and other extended format datasets can increase the maximum number of extents possible. However, there are some other options to help.
7. Enforce Rules

We've specified some recommended rules above. But how can we ensure that people will follow them?

For a start, most of the above features can only be used for SMS managed datasets. It still surprises me to see some sites with non-SMS managed volumes. Once datasets are SMS managed, ACS routines can be coded to enforce many of these rules.

JCL check software such as ASG-Job/Scan, Broadcom CA JCLCheck or SEA JCLPlus+ can be configured with JCL rules that could be used to enforce some of these rules. Compuware ThruPut Manager may also be able to perform some enforcement.

Conclusion

You may have noticed that I haven't talked about disk volume space. This is also important, but most sites monitor disk space carefully.

I also haven't talked about other products like DTS SRS, BMC STOP-X37, and CA Allocate. These can certainly be used to reduce space-related abends, and can provide some sophisticated features to modify allocation parameters and create alerts when problems are detected. However, basic z/OS features can do most of what these products offer.

I believe that there are sufficient tools and features to eliminate space-related abends. So let's agree to implement these, and say goodbye to our space errors.
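
As an illustration of the kind of checks described under Enforce Rules, the bottom-line recommendations in this article can be written down as a simple rule check. The sketch below is in Python purely for readability; it is not an ACS routine and it calls no real z/OS or DFSMS interface. The `DatasetInfo` record, its field names and the `check_rules` function are all hypothetical, and the sketch assumes the attributes have already been gathered from somewhere (a catalog listing or a report, say). It also ignores the "unless there is a good reason" exceptions, which a real site would need to handle.

```python
from dataclasses import dataclass

LARGE_CYLINDERS = 3000  # "very large" threshold suggested in the article
MIN_DVC = 6             # the article recommends a DVC greater than 5


@dataclass
class DatasetInfo:
    """Hypothetical record of a dataset's attributes (not a real z/OS API)."""
    name: str
    organization: str  # assumed values: "SEQ", "VSAM", "PDS" or "PDSE"
    cylinders: int
    extended_format: bool = False
    compressed: bool = False
    space_constraint_relief: bool = False
    dynamic_volume_count: int = 0


def check_rules(ds: DatasetInfo) -> list[str]:
    """Return the article's recommendations that this dataset does not meet."""
    findings = []
    if ds.organization == "PDS":
        findings.append("use PDSE instead of PDS")
    if ds.organization in ("SEQ", "VSAM") and not ds.extended_format:
        findings.append("use extended format")
    if ds.cylinders > LARGE_CYLINDERS and not ds.compressed:
        findings.append("compress datasets larger than 3000 cylinders")
    if not ds.space_constraint_relief:
        findings.append("enable Space Constraint Relief")
    if ds.dynamic_volume_count < MIN_DVC:
        findings.append("set Dynamic Volume Count above 5")
    return findings


# Example: a large VSAM dataset (the name is made up for illustration)
example = DatasetInfo("PROD.PAYROLL.MASTER", "VSAM", cylinders=6000,
                      extended_format=True, dynamic_volume_count=2)
for finding in check_rules(example):
    print(f"{example.name}: {finding}")
```

Keeping the thresholds as named constants makes it easy to adjust them to a site's own standards. In practice, the same conditions would more likely be expressed in ACS routines or JCL checker rules than in a standalone script.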