Longpela Expertise

LongEx Mainframe Quarterly - February 2019

technical: Improving REXX Performance

REXX is sometimes seen as a forgotten issue. Word on the street is that IBM is not doing any more development on REXX. However, I often come across REXX routines that could use some tweaking to perform better, or use less CPU. Often these REXX execs run in batch, working through large files.

So, I thought we'd look at some of the ways to make REXX perform better, and do some testing to see if they're still relevant.

Baseline:

Let's start with a baseline REXX exec. This does nothing but loop around 10,000,000 times.

/* Rexx */
a=time('E')
Do 10000
   Do 1000
   End
End
Say time('E')
Exit 0

I'm using the time function to measure the elapsed time of the REXX.
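If you want to time several sections inside one exec, a possible approach is time('R'), which returns the elapsed time and resets the clock in one call. The following is only a sketch: the variable names and loop counts are made up for illustration, and it measures elapsed time only, not CPU.

```rexx
/* Rexx */
Call time 'R'                 /* start (reset) the elapsed-time clock  */
Do 1000000                    /* first section: an empty loop          */
End
empty = time('R')             /* seconds so far; clock restarts at 0   */
n = 0
Do 1000000                    /* second section: one addition per pass */
   n = n + 1
End
adding = time('R')            /* seconds spent in the second section   */
Say 'Empty loop:' empty 'secs'
Say 'With add  :' adding 'secs'
Exit 0
```

The difference between the two figures gives a rough idea of what the addition costs on your own system.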

This and all REXX execs in this article were executed in batch using IRXJCL:

//STEP1    EXEC PGM=IRXJCL,PARM=REXXB
//SYSTSPRT DD   SYSOUT=*
//SYSEXEC  DD   DISP=SHR,DSN=DZS.REXX
//SYSTSIN  DD   DUMMY

I've used the SMF Type 30 records to measure the CPU usage.

Performance of this REXX on our test system: CPU=0.36 secs, elapsed= 0.37 secs

Now we have our baseline. So, let's look at some traditional ways of improving performance:

Test 1: Do I vs Do

The idea here is that a Do var = num to num is less efficient than a Do Num statement. So, we've modified our baseline REXX with two such statements:

/* Rexx */
a=time('E')
Do i = 1 to 10000
   Do j = 1 to 1000
   End
End
Say time('E')
Exit 0

Performance: CPU=2.28 secs, elapsed= 12.5 secs. More than six times the CPU of our baseline.

Test 2: Do I To vs Do I For

In this case, we compare Do var = x to y against Do var = x for y (using for rather than to). In the code below, the function is the same as we're not modifying our loop variables. Here's our code:

/* Rexx */
a=time('E')
Do i = 1 for 10000
   Do j = 1 for 1000
   End
End
Say time('E')
Exit 0

Performance: CPU=1.78, elapsed= 1.81 secs. A little faster than Test 1, but slower than our baseline.

Test 3: One Operation

A basic REXX tactic is to limit operations within loops. Every operation takes CPU and time. To start with, let's see how expensive a simple arithmetic instruction is.

/* Rexx */
a=time('E')
Do 10000
   Do 1000
     a = a+1
   End
End
Say time('E')
Exit 0

Performance: CPU=3.57 secs, elapsed= 3.6 secs. More expensive than any of our plain loops; not a surprise.

Test 4: Comments

A traditional way of improving performance is to keep comments out of the main code, for example by putting them at the end. The idea is that even comments use some CPU and reduce performance. So, we've included four comment lines inside our loop.

/* Rexx */
a=time('E')
Do 10000
   Do 1000
/*                                                                  */
/*                                                                  */
/*                                                                  */
/*                                                                  */
   End
End
Say time('E')
Exit 0

Performance: CPU=1.04 secs, elapsed= 1.05 secs. So, comments aren't free: our four comment lines used a little under one-third of the CPU and elapsed time of the Test 3 loop with its simple arithmetic operation.
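To compare these results on a common footing, one option is to subtract the baseline CPU time and divide by the 10,000,000 iterations. The exec below is just a sketch of that arithmetic using the CPU figures quoted in this article; the Show routine and its variable names are hypothetical.

```rexx
/* Rexx */
iterations = 10000000         /* 10000 x 1000 passes in every test     */
baseline   = 0.36             /* CPU secs for the empty baseline loop  */
Call Show 'One a = a+1', 3.57          /* Test 3 CPU secs */
Call Show 'Four comment lines', 1.04   /* Test 4 CPU secs */
Exit 0

/* Print the CPU cost per loop pass, net of the baseline loop */
Show: Procedure Expose iterations baseline
   Parse Arg label, cpu
   micros = (cpu - baseline) * 1000000 / iterations
   Say left(label, 20) format(micros, , 3) 'microseconds per iteration'
   Return
```

On these figures, the addition works out to about 0.32 microseconds per pass, and the four comment lines to about 0.07 microseconds between them.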

As an aside, I've heard that you should avoid comments with asterisks inside them. For example:

/********************************************************/

A quick test shows no difference between such comments and comments with spaces. Similarly, a test with short comments like

/* */

had similar CPU and elapsed times. The length of the text inside the comment does not make a large difference.

Test 5: Single Line Compare

The idea is that all logical operations on one line are always evaluated, even if they don't need to be. So, if we have b = 1 | b = 2 | b = 3, all three are evaluated, even if the first operation matches (because b=1). If we nest them, we can improve performance.

Our first code has them on one line:

/* Rexx */
a=time('E')
b = 1
c = 0
Do 10000
   Do 1000
   if (b = 1 | b = 2 | b = 3) then c = 1
   End
End
Say time('E')
Exit 0

Performance: CPU=6.87 secs, elapsed= 6.96 secs.

Our second code nests them:

/* Rexx */
a=time('E')
b = 1
c = 0
Do 10000
   Do 1000
      If (b = 1) Then c = 1
      Else if (b = 2) Then c = 1
      Else if (b = 3) Then c = 1
   End
End
Say time('E')
Exit 0

Performance: CPU=5.79 secs, elapsed= 5.87 secs. A small reduction.

Let's try a Select statement to do the same thing:

/* Rexx */
a=time('E')
b = 1
c = 0
Do 10000
   Do 1000
      Select
         When b=1 Then c = 1
         When b=2 Then c = 1
         When b=3 Then c = 1
      End
   End
End
Say time('E')
Exit 0

Performance: CPU=5.74 secs, elapsed= 5.86 secs. SELECT has the same performance as the nested If.
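The evaluate-everything behaviour behind this test can be checked directly with a function that counts its own calls. This is a minimal sketch: Bump and calls are made-up names, and the result depends on the interpreter evaluating both sides of the | operator as described above.

```rexx
/* Rexx */
calls = 0
b = 1
c = 0
/* b = 1 is already true, so Bump() is not needed for the result */
If b = 1 | Bump() = 2 Then c = 1
Say 'Bump was called' calls 'time(s)'
Exit 0

/* Internal function sharing the caller's variables: counts each call */
Bump:
   calls = calls + 1
   Return 2
```

If the exec reports one call, the second comparison was evaluated even though the first had already matched, which is the cost the nested If and Select versions avoid.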

Test 6: Inline Subroutine

Often, we use inline subroutines. But do these affect our performance? Our test code is:

/* Rexx */
a=time('E')
Do 10000
   Do 1000
      Call SUBREXX
   End
End
Say time('E')
Exit 0

SUBREXX:
   Return 0

Performance: CPU=9.06 secs, elapsed= 9.17 secs. So inline subroutines are about three times the cost of one arithmetic instruction.

Test 7: External Subroutine

The idea is that external subroutines are expensive. So, our SUBREXX subroutine from Test 6 is moved to a separate member in the PDS. Performance: CPU=960.26 secs, elapsed=2910.94 secs. Ouch!

I've actually lied here. I did this test with one-tenth of the iterations in the loop, because I knew the overhead would be so great. So, I didn't really run a REXX in a hard loop for almost 50 minutes. The performance was actually 96.03 seconds CPU, 291.1 seconds elapsed for 1,000,000 iterations. I've multiplied the results by 10 so we can compare with other tests. Not surprisingly, EXCPs went through the roof: every call did an EXCP to the PDS holding the REXX.

So external subroutines are really, really expensive. Some REXX environments such as IBM NetView provide facilities to load REXX execs into memory to reduce this overhead.

Test 8: EXECIO

The idea is that performing one I/O at a time is slower and more expensive than performing many I/Os in one operation, and putting the results in a stem or the data stack. Our first REXX does what I've seen a lot of programs do: read records one at a time. The record is put into the data stack:

/* Rexx */
a=time('E')
eof = 0
Do While eof=0
   "EXECIO 1 DISKR INFILE"
   If RC=2 Then eof = 1
End
Say time('E')
Exit 0

Note that the REGION of our job needed to be large enough to store all our records. Our sample file is RECFM=FB, LRECL=132, BLKSIZE=23440 with 250,000 records.

Performance: CPU=0.98, Elapsed=1.16 seconds.

We now use one EXECIO statement to load all records into a stem variable.

/* Rexx */
a=time('E')
eof = 0
"EXECIO * DISKR INFILE (FINIS STEM inrec."
Say time('E')
Exit 0

Performance: CPU=0.27, Elapsed=0.43 seconds. About one-quarter of the CPU, and one third of the elapsed time.
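If the file is too large to hold in the REGION at once, a middle path is to read it in chunks. The following is an untested sketch, not something measured for this article: the chunk size of 5000 and the variable names are assumptions, and INFILE is the same DD used in the tests above.

```rexx
/* Rexx */
chunk = 5000                        /* hypothetical records per read   */
total = 0
Do Until readrc <> 0
   "EXECIO" chunk "DISKR INFILE (STEM inrec."
   readrc = rc                      /* 0 = full chunk, 2 = end of file */
   Do i = 1 To inrec.0              /* inrec.0 = records actually read */
      total = total + 1             /* placeholder for per-record work */
   End
End
"EXECIO 0 DISKR INFILE (FINIS"      /* close the file                  */
If readrc > 2 Then Say 'EXECIO failed, RC='readrc
Say total 'records read'
Exit 0
```

This keeps the number of EXECIO calls low while limiting the storage needed to one chunk of records at a time.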

Conclusion

We've shown that some of the traditional techniques for making REXX more efficient are still relevant. In particular, avoiding external subroutines is a large step towards improving performance. Some of the other ideas provide small savings, but these will only be noticeable if the instructions are inside a large loop. And this in itself is the key point: every instruction costs something, but it's inside loops that the costs add up. Minimizing the number of loop iterations (for example, eliminating records before a loop), and the amount of work (and comments) inside loops will make a large difference to your REXX performance.

David Stephens

LongEx Quarterly is a quarterly eZine produced by Longpela Expertise. It provides Mainframe articles for management and technical experts. It is published every November, February, May and August.

The opinions in this article are solely those of the author, and do not necessarily represent the opinions of any other person or organisation. All trademarks, trade names, service marks and logos referenced in these articles belong to their respective companies.

Although Longpela Expertise may be paid by organisations reprinting our articles, all articles are independent. Longpela Expertise has not been paid money by any vendor or company to write any articles appearing in our e-zine.

© Copyright 2019 Longpela Expertise | ABN 55 072 652 147
