technical: Using HEAPZONES to Fix C Storage Overlays on z/OS
It's not hard to create a storage overlay in a C program: even easier when that C program runs on z/OS. If you're tracking a storage overlay down, one of the z/OS tools is the Language Environment (LE) HEAPZONES parameter. In this article, we'll see how it works.
Storage Overlays
Suppose we have the following C program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *str1;
main() {
str1 = malloc(15);
strcpy(str1,"This is a test string");
printf(str1);
free(str1);
}
C programmers will instantly see our problem: we're copying a 21-character string (22 bytes with the terminating null) into a 15-byte area. When we run this in a batch job, we get a user abend from Language Environment:
+CEE3798I ATTEMPTING TO TAKE A DUMP FOR ABEND U4094 TO DATA SET: DZS.D174
IGD100I 03DE ALLOCATED TO DDNAME SYS00001 DATACLAS ( )
IEA822I COMPLETE TRANSACTION DUMP WRITTEN TO DZS.D174.T0101429.DZSRUNP
+CEE3797I LANGUAGE ENVIRONMENT HAS DYNAMICALLY CREATED A DUMP.
IEA995I SYMPTOM DUMP OUTPUT 031
USER COMPLETION CODE=4094 REASON CODE=0000002C
TIME=01.01.42 SEQ=00141 CPU=0000 ASID=0036
PSW AT TIME OF ERROR 078D1400 85DDF9A6 ILC 2 INTC 0D
NO ACTIVE MODULE FOUND
NAME=UNKNOWN
DATA AT PSW 05DDF9A0 - 00181610 0A0DA7F4 001C1811
Language Environment has detected that some of its control information has been overwritten, and abended with a dump. The SYSOUT DD doesn't tell us much more:
CEE0802C Heap storage control information was damaged.
The traceback information could not be determined.
This is going to be hard to diagnose, and it isn't the only symptom a storage overlay can produce. We may get S0C1 or S0C4 abends if the damage isn't detected by Language Environment. If running in CICS, we may see ASRA or ASRB abends, or user abends like U4038, U4087, U4088, U4092 or U4094. We may also see other nasty messages from programs that are victims of our storage overlay.
Introducing HEAPZONES
Introduced in z/OS 2.1, HEAPZONES is a lightweight way of detecting storage overlays from programs writing 'over the end' of variables. It works with heap storage: think malloc and friends. So, let's tweak our program:
#pragma runopts(HEAPZONES(8,MSG,16,MSG))
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *str1;
main() {
printf(str1);
str1 = malloc(15);
strcpy(str1,"This is a test string");
printf(str1);
free(str1);
}
We've added a #pragma statement to the top. This tells Language Environment to add a buffer zone (a 'heap check zone') of 8 bytes for 31-bit storage, or 16 bytes for 64-bit storage, to the end of each heap allocation, and to issue a message if that zone is overwritten. Now if we run our program, it ends with a return code of zero! We also see something different written to our SYSOUT:
CEE3716I The heap check zone following the storage at address 1AF39D40 for length
X'00000010' has been overlayed at address 1AF39D50. Each byte in the zone from 1AF39D50 to
1AF39D54 should contain the value X'55'.
CEE3717I Control information in a heap check zone has been damaged.
The value at address 1AF39D54 should be greater
than 1AF39D40 and less than or equal to 1AF39D50.
From entry point main at compile unit offset +000000C8 at entry
offset +000000C8 at address 1AF00A80.
1AF39D40: E38889A2 4089A240 8140A385 A2A340A2 |This is a test s |
1AF39D50: A3998995 874B004F |tring..| |
So, Language Environment has checked this buffer zone at our free() call, and found that it was overwritten. It's also telling us the text that stomped over our buffer zone.
Suppose we have two variables that we use as follows:
#pragma runopts(HEAPZONES(8,TRACE,16,TRACE))
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *str1, *str2;
main() {
printf(str1);
str1 = malloc(15);
str2 = malloc(50);
strcpy(str1,"This is a test string.");
printf(str1);
free(str1);
free(str2);
}
This still works. Our buffer zone is added to the end of each allocation. Nice.
HEAPZONES Options
So, we've achieved two things. Our program now executes with a zero return code, and we've got information that there's been an overlay. Suppose we want our program to abend when this happens, but give us the same information. We could tweak our #pragma statement to look like:
#pragma runopts(HEAPZONES(8,ABEND,16,ABEND))
Now, when we run our program, it abends with a U4042 abend:
+CEE3798I ATTEMPTING TO TAKE A DUMP FOR ABEND U4042 TO DATA SET: DZS.D175.
IGD100I 03DE ALLOCATED TO DDNAME SYS00001 DATACLAS ( )
IEA822I COMPLETE TRANSACTION DUMP WRITTEN TO DZS.D174.T0239105.DZSRUNP
+CEE3797I LANGUAGE ENVIRONMENT HAS DYNAMICALLY CREATED A DUMP.
IEA995I SYMPTOM DUMP OUTPUT 450
USER COMPLETION CODE=4042 REASON CODE=00000003
TIME=02.39.10 SEQ=00199 CPU=0000 ASID=0036
PSW AT TIME OF ERROR 078D1400 85DDF9A6 ILC 2 INTC 0D
NO ACTIVE MODULE FOUND
NAME=UNKNOWN
DATA AT PSW 05DDF9A0 - 00181610 0A0DA7F4 001C1811
Note that this is different to our 'old' abend of a U4094. U4042 with a reason code of 3 is always an abend because of HEAPZONES.
We could also simply ignore the overlay by coding something like:
#pragma runopts(HEAPZONES(8,QUIET,16,QUIET))
Our job ends with a return code of zero, and there are no Language Environment messages in SYSOUT. But this isn't recommended: we're just hiding the error. A final option is to specify TRACE:
#pragma runopts(HEAPZONES(8,TRACE,16,TRACE))
This is the same as specifying MSG, but we also get some extra diagnostic information in the form of a traceback:
Traceback:
DSA Entry E Offset Statement Load Mod
1 CEEVHMSG +00000846 CEEPLPKA
2 CEEV#FH +0000023C CEEPLPKA
3 main +000000C8 DZSC
4 EDCZMINV +000000C2 CEEEV003
5 CEEBBEXT +000001C6 CEEPLPKA
This helps us find the area in our program where LE found the overlay: our free() call is at offset X'C8' from the beginning of our main function.
Some Gotchas
There are some things to remember when using HEAPZONES. For a start, consider this program:
#pragma runopts(HEAPZONES(8,MSG,16,MSG))
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *str1;
main() {
printf(str1);
str1 = malloc(15);
strcpy(str1,"This is a test string");
printf(str1);
}
We've removed our free() function call. If we run this job again, it simply ends with a zero return code and no messages: the same result as coding QUIET in the pragma statement. This is because Language Environment still adds our buffer zone at the end of the allocation, but without a free() function call, LE never checks it for overlays.
Have a look at this program:
#pragma runopts(HEAPZONES(8,TRACE,16,TRACE))
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char str1[15];
main() {
strcpy(str1,"This is a test string.");
printf(str1);
}
Rather than using malloc, we've defined the variable as a character array. Now we have a similar situation to before when we omitted the free() function: zero return code as HEAPZONES obtained a buffer, but no messages as we're not freeing the area.
Let's look at another program:
#pragma runopts(HEAPZONES(8,TRACE,16,TRACE))
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *str1, *str2;
main() {
printf(str1);
str1 = malloc(15);
strcpy(str1,"This is a test string. A really, really long one!");
printf(str1);
free(str1);
}
We get our original U4094 abend, with no nice messages from HEAPZONES.
This is because we're copying a 49-character string (50 bytes with the terminating null) into a 15-byte area. We've specified a buffer zone of 8 bytes in length, but 15+8 (23) bytes is still less than the string we're copying in, so the copy runs past the buffer zone and damages Language Environment's control information again. If we specified a larger buffer zone so that our string wouldn't run over the edge, we'd be fine.
So, it would be tempting to define a bigger buffer zone (the maximum size is 1024 bytes) to handle this. However, this would greatly increase the storage used. So, we need to balance our storage usage against the size of the strings that may overlay our storage.
There are a few other issues with HEAPZONES:
- Turning it on will affect performance, and not in a good way. It will also increase storage used, so best to only use it when you need it.
- IBM recommend that HEAPZONES and HEAPCHK are used separately – not together.
- You can't use the LE options HEAPZONES and RPTSTG(ON) at the same time.
Finally, you can't specify HEAPZONES at the system level or region level. So, we can't specify it in our PARMLIB CEEPRMxx member, or for an entire CICS region. For our batch job, we could specify the LE options using the CEEOPTS DD:
//STEP1 EXEC PGM=DZSC
//STEPLIB DD DISP=SHR,DSN=DZS.LOAD
//SYSPRINT DD SYSOUT=*
//CEEOPTS DD *
HEAPZONES(8,TRACE,16,TRACE)
We could also create a CEEUOPT module, and bind it with our module. If we can't code pragma statements, this will be the way to go for CICS and IMS programs.
Not a Solution for Everything
HEAPZONES is an excellent tool in our toolbox for diagnosing storage overlays: perfect when our programs are running 'off the edge' of our variables. Or in other words, perfect for C. But HEAPZONES comes at a cost, so it should only be used if necessary, and removed when problems have been diagnosed. Remember also that this isn't the only tool in our toolbox. We talk about some more in our partner article.
David Stephens