

LongEx Mainframe Quarterly - August 2019
 

technical: Load Library Performance

Load libraries and program object libraries are always defined with an 'undefined' record format (RECFM=U). In our partner article, "RECFM=U: What Exactly Is It", we talk about how the LRECL definition for RECFM=U datasets is ignored.

But are there any other allocation issues for load and program libraries? What effect does blocksize have on their performance? And does the size of the library matter? The answer is "maybe." Let's see what that means.

Blocksize

Best practice for non-VSAM disk datasets is to use half-track blocking: the blocksize is the same as half a track. Everyone uses DASD that pretends to be 3390s, so this is around 27,000 bytes and change. Everyone agrees that this gives good performance while minimizing wasted DASD space. But in 2014, IBM's John Eells wrote a post on the IBM-MAIN list saying that 32,760 is the best blocksize for load modules. Who is right?

Let's do some benchmarking.

Blocksize Performance

We used IEFBR14 as our first load module: it is very small (only 8 bytes). We wrote an assembler program to do a LOAD followed by a DELETE of this program 5000 times. We used three load libraries (PDS) with blocksizes of 6144, 23440 (half-track blocking) and 32760. We specified these libraries in the STEPLIB of our JCL, so LLA (library lookaside) and VLF were not used (no CA PDSMAN, Quickfetch or PMO either). Here's what we found:

Blocksize   CPU Seconds   EXCPs     Elapsed Time (sec)
6144        0.13          10006     3
23440       0.13          10006     3
32760       0.13          10006     3

No change: same CPU usage (0.13 seconds), number of EXCPs (or I/Os: 10006), and same elapsed time (3 seconds). This is no surprise: IEFBR14 fits comfortably into one block in all three PDS datasets. We repeated the test, but using the ARCCTL module (size 4MBytes), and only 1000 iterations:

Blocksize   CPU Seconds   EXCPs     Elapsed Time (sec)
6144        4.65          1646000   243
23440       2.69          504000    145
32760       2.61          426000    133

Now we see some differences. Using 32760 almost halved the elapsed time, and also greatly reduced the CPU time. So it's true: 32760 is the best blocksize for load modules.
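
To put the table in per-load terms, the short Python sketch below divides the ARCCTL figures by the 1000 iterations and works out the percentage saved by moving from 6144 to 32760. The numbers are copied from the table above; the variable and function names are only illustrative.

```python
# Figures copied from the ARCCTL table above (1000 LOAD/DELETE iterations).
ITERATIONS = 1000
results = {
    6144: {"cpu": 4.65, "excps": 1_646_000, "elapsed": 243},
    23440: {"cpu": 2.69, "excps": 504_000, "elapsed": 145},
    32760: {"cpu": 2.61, "excps": 426_000, "elapsed": 133},
}


def per_load(blocksize):
    """Return (EXCPs per load, elapsed milliseconds per load)."""
    row = results[blocksize]
    return row["excps"] / ITERATIONS, row["elapsed"] * 1000 / ITERATIONS


def reduction(metric, baseline=6144, candidate=32760):
    """Percentage reduction of a metric when moving from baseline to candidate."""
    before = results[baseline][metric]
    after = results[candidate][metric]
    return round(100 * (before - after) / before, 1)


for blocksize in results:
    excps, ms = per_load(blocksize)
    print(f"{blocksize:>6}: {excps:.0f} EXCPs and {ms:.0f} ms per load")

for metric in ("cpu", "excps", "elapsed"):
    print(f"{metric}: {reduction(metric)}% lower at 32760 than at 6144")
```

On these figures, one load of ARCCTL costs about 1646 EXCPs at a blocksize of 6144, 504 at 23440 and 426 at 32760. Moving from 6144 to 32760 cuts EXCPs by about 74%, and CPU and elapsed time by about 44% and 45%.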

Blocksize Space

But doesn't a 32760 blocksize increase the DASD space we need? We took SYS1.LINKLIB from a z/OS 2.3 system, and copied it into three PDS datasets with 6144, 23440, and 32760 blocksizes. The result:

Blocksize   Tracks Used
6144        2895
23440       2640
32760       2620

Interestingly, 32760 used the least space of all the blocksize options. This is because the z/OS binder is smart. It will use 'small blocks' to fill up the remaining space in a track: not every block it writes will be 32760 bytes long.
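
The worry about wasted space comes from how fixed-size blocks fit on a track. Here is a rough Python sketch, assuming the track capacity formula commonly quoted for 3390 devices (1729 cells per track, 34-byte cells, no keys). The formula is reproduced from memory rather than taken from this article, so treat it as an assumption and check it against IBM's 3390 documentation before relying on it. It models blocks that are all the same size, which is not what the binder does.

```python
import math

CELLS_PER_TRACK = 1729  # assumed usable cells on a 3390 track


def blocks_per_track(blocksize):
    """Equal-sized blocks (no keys) that fit on one track, per the assumed formula."""
    segments = math.ceil((blocksize + 6) / 232)
    data_cells = 9 + math.ceil((blocksize + 6 * segments + 6) / 34)
    return CELLS_PER_TRACK // (10 + data_cells)


def bytes_per_track(blocksize):
    """Data bytes held on one track when every block is the same size."""
    return blocks_per_track(blocksize) * blocksize


for size in (6144, 23440, 27998, 32760):
    print(size, blocks_per_track(size), bytes_per_track(size))
```

Under that assumption a track holds two 27998-byte blocks but only one 32760-byte block, so a library written entirely in 32760-byte blocks would leave a large part of every track unused. The track counts in the table show that this does not happen in practice, which is consistent with the binder writing shorter blocks into the leftover space.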

Blocksize and Program Objects

So how do PDSEs and program objects compare? We first ran a test with our large ARCCTL module, comparing load times for a PDSE (blocksize=32760) with our three load libraries. Here's what we found:

Blocksize   CPU Seconds   EXCPs     Elapsed Time (sec)
6144        4.65          1646000   243
23440       2.69          504000    145
32760       2.61          426000    133
(PDSE)      0.32          100000    4

The PDSE is much faster. It also uses far fewer EXCPs and CPU seconds. But there's a catch. By default, a program does not load the entire program object from a PDSE. Program objects have different classes of text (the actual program). Only some classes are brought into memory when the module is loaded. Other classes are loaded when needed. So, for our PDSE, we're not loading the entire ARCCTL.

We can disable this by using the FETCHOPT=(PACK,PRIME) parameter of the binder. Doing this for ARCCTL, our test results become:

Blocksize   CPU Seconds   EXCPs     Elapsed Time (sec)
6144        4.65          1646000   243
23440       2.69          504000    145
32760       2.61          426000    133
(PDSE)      3.26          1285000   31

The PDSE is still a lot faster, but uses a little more CPU than our optimal 32760.

So how do blocksizes affect program objects in PDSEs? We ran a similar test to our first two tests above, with FETCHOPT=(PACK,PRIME). We chose the modules CSQLINK (length 424 bytes) and CSQUDMSG (length 264kbytes), and performed 5000 LOAD/DELETEs in our program. The results:

Module      Blocksize   CPU Seconds   EXCPs   Elapsed Time (sec)
CSQLINK     6144        0.42          5006    2
CSQLINK     23440       0.42          5006    2
CSQLINK     32760       0.42          5006    1
CSQUDMSG    6144        1.32          30000   7
CSQUDMSG    23440       1.32          30000   6
CSQUDMSG    32760       1.32          30000   7

No difference. This is because program objects internally use 4kByte blocks. Changing the dataset blocksize has no effect.

Directory Search

At the beginning of this article, we talked about how the size of a load library may also affect performance. To load a module, the directory of the PDS/PDSE must be searched to find the module location, and then the module is loaded. So, does the number of members in a PDS or PDSE make any difference?

Let's first look at load modules in PDS datasets. We compared our two modules (IEFBR14 and ARCCTL) in two PDS datasets, each with a blocksize of 32760. One PDS had 2 members, one had 4200 members. Here are the results:

Module      Blocksize   Members   CPU Seconds   EXCPs    Elapsed Time (sec)
IEFBR14     32760       2         0.13          10006    3
IEFBR14     32760       4200      0.14          10006    9
ARCCTL      32760       2         2.61          426000   133
ARCCTL      32760       4200      3.07          502000   209

A small CPU increase, and a larger elapsed time increase, when there were more members. This is because with PDS datasets, the directory is searched sequentially. PDSE datasets advertise a faster search time because of indexed directories. The results of loading IEFBR14 5000 times from two PDSEs (one with 2 members, one with 4200 members) are:

Members   CPU Seconds   EXCPs   Elapsed Time (sec)
2         0.32          25006   1
4200      0.32          25012   2

Almost identical.
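
The difference between a sequential and an indexed directory can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the member names, the figure of six entries per directory block, and the use of a binary search as a stand-in for the PDSE index (whose real structure is internal to z/OS). A real PDS directory search is done by the channel subsystem using a key search, not by code like this.

```python
ENTRIES_PER_BLOCK = 6  # hypothetical number of entries per PDS directory block


def build_directory(member_count):
    """Return sorted member names split into fixed-size directory blocks."""
    names = [f"MOD{i:05d}" for i in range(member_count)]
    return [names[i:i + ENTRIES_PER_BLOCK]
            for i in range(0, len(names), ENTRIES_PER_BLOCK)]


def sequential_blocks_read(blocks, name):
    """Count blocks examined when scanning from the start of the directory."""
    for count, block in enumerate(blocks, start=1):
        if block[-1] >= name:  # the last name in a block acts as its key
            return count
    return len(blocks)


def indexed_probes(blocks, name):
    """Count comparisons in a binary search over block keys (index stand-in)."""
    keys = [block[-1] for block in blocks]
    low, high, probes = 0, len(keys), 0
    while low < high:
        mid = (low + high) // 2
        probes += 1
        if keys[mid] < name:
            low = mid + 1
        else:
            high = mid
    return probes


for members in (2, 4200):
    directory = build_directory(members)
    target = directory[-1][-1]  # worst case: the last member in the directory
    print(members,
          sequential_blocks_read(directory, target),
          indexed_probes(directory, target))
```

In this model the worst-case sequential search grows in step with the number of directory blocks (700 blocks for 4200 members), while the indexed lookup needs only a handful of probes. That matches the pattern in the measurements: the PDS slows down as members are added, and the PDSE barely changes.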

Conclusion

The results are pretty clear. For load libraries, use a blocksize of 32760, and don't have too many load modules in one PDS. However, program objects in PDSEs will be faster, and there is no need to worry about blocksize (keep it above 4k) or the number of program objects in the PDSE.

In many cases these changes won't be significant. Everyone uses LLA (library lookaside), or products like CA PDSMAN, PMO and Quickfetch, to avoid the overheads of loads and directory searches. Program prefetch in IMS and residency in CICS do something similar. However, it's easy to adopt a standard blocksize (32,760) for load libraries. Most software vendors already do this, as do many sites.

A quick search at one site showed that almost 80% of all load PDS libraries had a blocksize of 32000 or higher. 7% of load libraries had a blocksize of 8k or lower.


David Stephens



LongEx Quarterly is a quarterly eZine produced by Longpela Expertise. It provides Mainframe articles for management and technical experts. It is published every November, February, May and August.

The opinions in this article are solely those of the author, and do not necessarily represent the opinions of any other person or organisation. All trademarks, trade names, service marks and logos referenced in these articles belong to their respective companies.

Although Longpela Expertise may be paid by organisations reprinting our articles, all articles are independent. Longpela Expertise has not been paid money by any vendor or company to write any articles appearing in our e-zine.


