

LongEx Mainframe Quarterly - August 2021
 

technical: The Problem With z/OS Strings and C

The C programming language doesn't handle strings very well. Sure, there are lots of string functions defined in the string.h header file. But strings are definitely not a real strength of C/C++. And there are many articles explaining why. However, these string issues become a little worse with z/OS. Let's see how.

String Issues Recap

There are really two ways to work with strings. The first is to use a character array. We can define our string str1 using the code:

char str1[9];

We can then work with this string, using commands like:

strcpy(str1,"Hello");
printf("String holds: %s", str1);

The other option is to use pointers to areas of memory:

char *str2;
str2 = malloc(100);

We can now do the same things with str2 as we did with str1. The big issue is that C strings must end in the NULL character: x'00'. So, this statement

strcpy(str1,str2);

will start at the beginning of str2, and copy each character to str1 until it finds an x'00' in str2. In fact, there are many C string functions that will continue processing characters in a string until a NULL character is found, including strcpy(), strcat(), gets(), streadd(), strecpy(), and strtrns().

This sounds fine. However, suppose we have this code:

#include <string.h>

char str1[5],str2[15];
int main(void) {
   strcpy(str2,"Ten Chars!");
   strcpy(str1,str2);   /* problem: copies 11 bytes into a 5-byte array */
   return 0;
}

This will compile fine, but there's a problem when it runs: we're copying 10 characters plus a terminating NULL (str2) into a five-byte variable (str1). C/C++ will happily copy these 11 bytes, overwriting storage past the end of our five-byte array.

If we're lucky, we'll get a Language Environment or similar abend when storage is freed, specifying that storage has been overlaid. If we're less lucky, we won't get a message, and other storage we're relying on will be overlaid. We could get incorrect or corrupted output, z/OS abends, or worse.
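One defensive pattern is to check the length before copying at all. The following is a minimal sketch of that idea; copy_checked() is a hypothetical helper name, not a library function, and it assumes the source really is a NULL terminated C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, including
   the terminating NULL. Returns 0 on success, or -1 if src is too
   long (dst is left untouched in that case). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t needed;

    assert(dst != NULL && src != NULL);
    needed = strlen(src) + 1;          /* characters plus the NULL */
    if (needed > dst_size)
        return -1;                     /* would overlay storage */
    memcpy(dst, src, needed);
    return 0;
}
```

With the variables from the example above, copy_checked(str1, sizeof(str1), str2) would return -1 rather than overlaying storage, and the caller can decide what to do about it.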

Most C programmers will be yawning at this stage: this is nothing new. But z/OS makes this problem a whole lot worse.

z/OS and Strings

z/OS strings are rarely null terminated. If you're programming in COBOL, PL/I or assembler, you have fixed length strings: you know the length of your variable. The following shows declarations of strings in COBOL, PL/I and Assembler:

COBOL:
01  STR1  PIC X(10).
PL/I:
dcl str1 char(10);
Assembler:
STR1  DS  CL10

So, if C programs will be passing data to and from these traditional programming languages, they will work with a lot of fixed length fields (with no terminating nulls).
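To make this concrete, here is a rough sketch of moving data between a fixed length field and a C string. It assumes the field is padded with trailing blanks, as a COBOL PIC X(10) field typically would be. The names field_to_cstr() and cstr_to_field() are hypothetical, not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a fixed-length, blank-padded field (no terminating NULL) into a
   C string, dropping trailing blanks. dst must hold at least
   field_len + 1 bytes. Returns the length of the resulting string. */
size_t field_to_cstr(char *dst, const char *field, size_t field_len)
{
    size_t len = field_len;

    assert(dst != NULL && field != NULL);
    while (len > 0 && field[len - 1] == ' ')
        len--;                        /* ignore trailing padding */
    memcpy(dst, field, len);
    dst[len] = '\0';
    return len;
}

/* Copy a C string into a fixed-length field, padding with blanks and
   truncating if the string is too long. No NULL is written. */
void cstr_to_field(char *field, size_t field_len, const char *src)
{
    size_t len;

    assert(field != NULL && src != NULL);
    len = strlen(src);
    if (len > field_len)
        len = field_len;              /* truncate rather than overlay */
    memset(field, ' ', field_len);
    memcpy(field, src, len);
}
```

Converting at the boundary like this keeps the rest of the C program working with normal strings, at the cost of an extra copy for every field.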

But it gets worse. Many z/OS specific functions expect or return fixed length fields. For example, the cs() function performs the same as a z/OS assembler Compare and Swap instruction: it compares two values, and moves a third into one of them if they are equal. All three parameters of the cs() function must be exactly 4 bytes in length (no terminating NULL).

C programs can call other z/OS APIs, including Language Environment callable functions. The Language Environment function CEELOCT can be used to get the current date or time. This returns the date and time in a 17-byte string of the format YYYYMMDDHHMISS999: no terminating NULL.
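As an illustration, the sketch below splits a 17-byte field in the YYYYMMDDHHMISS999 layout into numbers. It does not call CEELOCT itself: it only assumes a buffer already holds the 17 characters. The struct ts_parts and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TS_LEN 17   /* YYYYMMDDHHMISS999, no terminating NULL */

struct ts_parts {
    int year, month, day, hour, minute, second, millis;
};

/* Convert 'len' digits starting at p to an int, via a small
   NULL terminated scratch buffer. */
static int digits_to_int(const char *p, size_t len)
{
    char tmp[8];

    assert(len < sizeof(tmp));
    memcpy(tmp, p, len);
    tmp[len] = '\0';
    return atoi(tmp);
}

/* Split a 17-byte timestamp field into its components. */
void parse_timestamp(const char ts[TS_LEN], struct ts_parts *out)
{
    assert(ts != NULL && out != NULL);
    out->year   = digits_to_int(ts,      4);
    out->month  = digits_to_int(ts + 4,  2);
    out->day    = digits_to_int(ts + 6,  2);
    out->hour   = digits_to_int(ts + 8,  2);
    out->minute = digits_to_int(ts + 10, 2);
    out->second = digits_to_int(ts + 12, 2);
    out->millis = digits_to_int(ts + 14, 3);
}
```

Each piece is copied by length into a scratch buffer that does have a NULL, so no string function ever runs off the end of the 17-byte field.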

If your C program is running under IMS, all DL/I calls require and return fixed length fields. Similarly, if you're running under CICS, all EXEC CICS commands receive and return fixed length fields. Same with IBM MQ. No terminating nulls in any of these.

When performing I/O to sequential, PDS and VSAM files, all functions will need to specify the length of the record being written (without the terminating NULL). If working with z/OS control blocks, all are fixed length format (no terminating nulls). You can see some examples of C programs accessing z/OS control blocks in our Control Blocks for Beginners articles.

Manually Adding a NULL

OK, so what can we do about this? Ideally, writing code in a format that removes issues with strings is the way to go. When using C with fixed length strings, we have two choices:

  • Add a null character to the end of each and every string.
  • Use string functions that specify a length.

Let's take the first option. The following code calls a CICS API to get the name of the CICS transaction, and stores it in our variable txn1. We could manually add a NULL at the end of our variable like this:

char txn1[5];  /* 4 chars plus one NULL */
EXEC CICS ASSIGN TRAN(txn1);
txn1[4] = '\0';

This is a bit clumsy, and prone to error: it's too easy to forget to add the NULL, or assume it's been done somewhere else. A better way is to initialize our variable when we declare it:

char txn1[5] = "    " ;  /* 4 spaces */
EXEC CICS ASSIGN TRAN(txn1);

We've initialized txn1 with spaces, and C is kind enough to add a NULL to the end.

This becomes harder when we use malloc() and calloc() to get our storage. Consider the following code:

char *txn1;
txn1 = malloc(5);
EXEC CICS ASSIGN TRAN(txn1);

Again, we could manually add a terminating NULL:

txn1[4] = '\0';

Or we could initialize our variable using the memset() function after the malloc() call, but before the EXEC CICS ASSIGN:

memset(txn1, '\0',5 );
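Another way to get the same effect is to allocate one byte more than the field needs, and ask for zeroed storage in the first place. The sketch below uses calloc() for this. Because EXEC CICS commands can't be shown in portable C, fill_tran_id() is a hypothetical stand-in for any API that fills a 4-byte field without adding a NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TRAN_LEN 4   /* length of the fixed-length field */

/* Hypothetical stand-in for an API that fills a 4-byte field and
   does not add a terminating NULL. */
void fill_tran_id(char *field)
{
    assert(field != NULL);
    memcpy(field, "TXN1", TRAN_LEN);
}

/* Allocate TRAN_LEN + 1 zeroed bytes, so the last byte is already a
   terminating NULL once the API has filled the first TRAN_LEN bytes.
   Returns NULL if the allocation fails; the caller frees the result. */
char *get_tran_id(void)
{
    char *txn = calloc(TRAN_LEN + 1, 1);

    if (txn == NULL)
        return NULL;
    fill_tran_id(txn);
    return txn;
}
```

Keeping the field length in one #define means the allocation and the copy can't drift apart.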

Where NULL Won't Work

In many situations, we simply can't use a NULL terminated C string. Let's take an example. When working with MQ, we'll need to populate message blocks. For example, when opening an MQ queue, we need to create a message descriptor. This has many fields, including some strings:

struct tagMQMD {
   ...
   char ReplyToQ[48];
   char ReplyToQMgr[48];
   char UserIdentifier[12];
   ...
};

These strings are right next to each other, and could take up the entire length. So, the ReplyToQ could be a string of 48 characters: no room for a terminating NULL.
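Fields like this can still be formatted safely, as long as the length is passed along with the data. The sketch below uses the precision form of the %s conversion ("%.*s"), which stops after the given number of characters even if no NULL is found. The struct reply_info is a hypothetical cut-down layout for illustration, not the real MQ structure, and format_field() is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cut-down descriptor: adjacent fixed-length fields,
   none with room for a terminating NULL. */
struct reply_info {
    char ReplyToQ[48];
    char ReplyToQMgr[48];
};

/* Format one fixed-length field into out, which must hold at least
   field_len + 1 bytes. The precision in "%.*s" stops sprintf() at
   field_len characters, so it never runs into the next field.
   Returns the number of characters written (excluding the NULL). */
int format_field(char *out, const char *field, size_t field_len)
{
    assert(out != NULL && field != NULL);
    return sprintf(out, "%.*s", (int)field_len, field);
}
```

The same precision trick works directly in printf(), for example printf("%.48s", md.ReplyToQ), where md is whatever descriptor variable the program uses.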

Field Lengths

The length of a field is also an issue. If we use the C strlen() function, it will return the length of a string: the length up to (but not including) the terminating NULL. If we have a fixed-length string with no terminating NULL, then we can't use the strlen() function. A better option is to use the sizeof operator. But this has flaws as well.

Suppose we define a 4-character string as follows:

char str1[5];  /* 4 chars plus one NULL = 5 */

We need to define it with a length of 5, as we need to allow one byte for our terminating NULL. If we want the length of the string in str1, we would use strlen(str1). But if we want the length of the field (4), we need to specify:

str1_len = sizeof(str1) - 1;  /* Take one for the end NULL */

Not really elegant programming.
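One way to tidy this up a little is to hide the arithmetic behind a macro, and to use a length function that never reads past the end of the field. The following is a sketch only: FIELD_LEN and field_strlen() are hypothetical names, and the macro only works on true arrays, not on pointers returned by malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Usable length of a character array that reserves one byte for the
   terminating NULL. Only valid for true arrays, not pointers. */
#define FIELD_LEN(arr) (sizeof(arr) - 1)

/* Length of the data in a fixed-length field: stops at the first NULL
   or at field_len, whichever comes first, so it never reads past the
   end of the field. */
size_t field_strlen(const char *field, size_t field_len)
{
    const char *end;

    assert(field != NULL);
    end = (const char *)memchr(field, '\0', field_len);
    return end ? (size_t)(end - field) : field_len;
}
```

For the str1 declared above, FIELD_LEN(str1) gives 4, and field_strlen(str1, sizeof(str1)) gives the length of whatever is currently stored, capped at the size of the array.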

Length-Based Functions

Rather than trying to manually insert NULLs, an alternative is to simply stop using any function that assumes a NULL terminated string, or at the very least, doesn't specify a maximum length. Let's take an example: we've been using the strcpy() function to copy one string to another. An alternative is the strncpy() function. We could have code like:

char str1[5],str2[15];
strcpy(str2,"Ten Chars!");
strncpy(str1,str2,sizeof(str1));

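strncpy() will copy characters from str2 into str1 until it encounters a NULL, or the number of characters specified in the third parameter is reached. So in this programming example, we won't have any storage overlays: strncpy() will stop when we've reached the length of str1. Note that when the limit is reached first, as it is here, strncpy() does not add a terminating NULL: str1 ends up holding five characters and no NULL.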
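If the result still needs to be used as a normal C string, a small wrapper can guarantee the terminator. The following is a minimal sketch; copy_truncate() is a hypothetical name, not a library function, and it assumes that silently truncating the source is acceptable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bounded copy that always leaves dst NULL terminated. dst_size is the
   full size of dst, including the byte reserved for the NULL. The
   source is truncated if it does not fit. */
void copy_truncate(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);   /* leave room for the NULL */
    dst[dst_size - 1] = '\0';          /* strncpy() may not add one */
}
```

Called as copy_truncate(str1, sizeof(str1), str2) with the variables above, this leaves the four characters "Ten " plus a NULL in str1.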

There are many 'bounded' string functions that specify such a maximum length. The following table shows some 'normal' C string functions, and a better bounded alternative:

String Function    Better Bounded Function
strcpy()           strncpy()
strcat()           strncat()
strcmp()           strncmp()
strcasecmp()       strncasecmp()

Unfortunately, IBM's XL C doesn't offer a snprintf() function. However, memcpy() is often the function of choice for fixed-length fields. It simply copies a set number of bytes from one variable to another.

Even using these functions leaves some room for problems. Different fields will have different lengths. Manually entering string lengths can lead to errors.
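One way to reduce the risk is to let the compiler supply the lengths with sizeof, rather than typing them by hand. The macros below are a sketch of that idea. FIELD_COPY and FIELD_EQUAL are hypothetical names, and both assume their arguments are true arrays: given a pointer, sizeof would return the size of the pointer, not the field.

```c
#include <assert.h>
#include <string.h>

/* Copy between two fixed-length array fields, taking the length from
   the destination with sizeof rather than typing it by hand. Both
   arguments must be true arrays (not pointers), and the source must
   be at least as large as the destination. */
#define FIELD_COPY(dst, src)                        \
    do {                                            \
        assert(sizeof(src) >= sizeof(dst));         \
        memcpy((dst), (src), sizeof(dst));          \
    } while (0)

/* Compare two array fields of the same size; non-zero means equal. */
#define FIELD_EQUAL(a, b) \
    (sizeof(a) == sizeof(b) && memcmp((a), (b), sizeof(a)) == 0)
```

If the declared size of a field changes later, every FIELD_COPY and FIELD_EQUAL that uses it picks up the new length when the program is recompiled.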

IBM MQ gives us some interesting options for fixed length variables. In their include file, they define different types for different length character fields:

typedef char MQCHAR;
typedef MQCHAR MQCHAR4[4];
typedef MQCHAR MQCHAR8[8];
typedef MQCHAR MQCHAR16[16];
typedef MQCHAR MQCHAR48[48];
typedef MQCHAR MQCHAR128[128];

Each field is declared using the relevant type definition. For example, MQ queue names are always 48 characters long. So, a queue name variable would be declared as:

MQCHAR48 QName;

Every field also has a length defined. So, the include file includes the definition:

#define MQ_Q_NAME_LENGTH               48

To initialize a queue name variable, the following code can be used:

memset(QName, '\0', MQ_Q_NAME_LENGTH );

To copy a queue name to another variable, we can use:

strncpy(DestVar, QName, MQ_Q_NAME_LENGTH );

The Best Solution

Spending some time thinking about strings now will save a lot of pain and time later. Storage overlay problems are very hard to diagnose and fix. We discuss some options in our articles on storage overlays and LE HEAPZONES.

If working with C on z/OS, it makes sense to bite the bullet, and accept that you'll be working with fixed length strings. Once this is done, the best solution is to eliminate the use of any string function that requires a terminating NULL character (such as strcpy(), strcmp(), strcat()), and use other options. Spending time on smart declarations can make it easier to see the length of each fixed-length variable, and work with it.

Thanks to Peter MacDonald for pointing out a couple of errors in the original article.


David Stephens



LongEx Quarterly is a quarterly eZine produced by Longpela Expertise. It provides Mainframe articles for management and technical experts. It is published every November, February, May and August.

The opinions in this article are solely those of the author, and do not necessarily represent the opinions of any other person or organisation. All trademarks, trade names, service marks and logos referenced in these articles belong to their respective companies.

Although Longpela Expertise may be paid by organisations reprinting our articles, all articles are independent. Longpela Expertise has not been paid money by any vendor or company to write any articles appearing in our e-zine.


© Copyright 2021 Longpela Expertise  |  ABN 55 072 652 147