technical: Eight Ways to Improve Your Assembler Programs
A couple of years ago I had to work on an existing High Level Assembler (HLASM) program that wasn't performing very well. By not performing well, I mean it ran like a total dog. But as I started to look at the code to get a feel for what it was doing, my heart fell. No comments, no subroutines, no error handling, and no real structure. The code was a complete mess. This meant that a one-week job was now going to take at least three times as long.
It's easy to write bad assembler code. In many cases, a module that starts life as a 'quick and dirty' fix is modified and changed, to the point that it soon gets out of control. Or a programmer who hasn't coded for a few years needs to create something in a hurry. But let's take a step back and have a think about this. When we write assembler code, we really want code that:
- Is efficient.
- Is well documented - so you (or the person following you) know what it does, why it does it, and how it’s used.
- Is easy to diagnose – someone may have problems with it later, and that someone may be you.
- Is easy to modify – there’s a good chance it will need to be modified at some time.
The good news is that writing quality assembler code doesn't have to take any more work. In fact, the chances are that it will save you work, both now when you do the coding, and later when you need to go back and work on it some more.
So, here are my top eight easy ways to quickly improve the quality of your assembler code.
1. Become a Writer
In assembler, comments are free. They have no effect on the running of the module, so it makes sense to use them to make your program easier to understand. You can't do this with the odd comment or two. You need to become not just a writer of code, but also a non-fiction writer. This way you can explain clearly and in detail exactly what is going on.
To give you an idea of the number of comments needed, I usually have more lines of comments than lines of code.
I have a standard way of putting my comments together. I separate my comments into four different types:
a. Headings
These are blocks of comments that give an overall explanation of a module, subroutine, or section. My headings look like:
* Heading text
At the very beginning of a program, I will have a heading with a lot of information, including:
- Program name.
- What it does, and why.
- What parameters are passed in.
- What information is returned.
- Register usage (eg base register, workarea register).
- Return codes.
- A clear explanation of what goes on inside.
- Any external references (external programs called, subsystem functions used etc.).
- Binding information (31 bit or 64 bit, APF authorised or not, re-entrant or not).
- Change history – who did anything to it, what they did, and when they did.
I'll have similar information for subroutines.
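As a rough illustration, a program heading that covers the items listed above might look something like the sketch below. Everything in it is a hypothetical placeholder: the program name MYPROG, the register assignments, the return codes, and the change history entry are invented for the example and are not taken from a real module.

```hlasm
************************************************************
* Program:    MYPROG  (hypothetical name)
*
* Function:   Counts the records in an input dataset so the
*             overnight sort step can be sized in advance.
*
* Input:      R1  -> standard parameter list
* Output:     R15 =  return code
*
* Registers:  R12 =  base register
*             R13 -> workarea
*
* Return codes:
*             0   =  successful
*             8   =  invalid or missing parameter
*             12  =  abend trapped by recovery routine
*
* Logic:      1. Validate the input parameter
*             2. Obtain the workarea
*             3. Read and count the records
*             4. Release the workarea and return
*
* External:   None
*
* Binding:    AMODE 31, re-entrant, not APF authorised
*
* Changes:    yyyy-mm-dd  Initials  What was changed
************************************************************
```

A block like this costs nothing at run time, and it answers most of the questions the next person to open the module will ask.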
b. Section Comments
These divide the code into easy-to-analyse sections. They look like:
I'll usually have one of these for every 20 - 30 lines or so of assembler code.
c. Full Line Comments
A single line explaining something. They look like:
* --- Comment -----------------
The dashes make it stand out from the code.
d. Inline Comments
These explain a line of code. They are on the same line as the instruction, and start in column 40 (or as close as I can make it). For example:
LA R3,1(R3) Move to next character
You'll notice that my comment above doesn't say what the line does, but why it does it. So it doesn't say "increment R3" (you can see that from the code). And I like to have all of my inline comments start at the same column to make my code look more professional.
2. Don't Write in Assembler
During my first year at University, we had a lecturer who obsessed about structured code. By structured code, he meant:
- Code that starts at the top, and works down. No jumping around the code.
- Code that is broken into small subroutines – making it easier to read.
In our first year, we wrote everything in Pascal (this shows how old I am). In our second, we wrote first in Pascal, and then converted the program to assembler (it was PDP-11 assembler at the time). Doing this resulted in an assembler program that was (more or less) structured, and also much easier to read and understand. Twenty-five years later, I still use this technique for my larger assembler programs. I’ll even write the high-level version of the code in comments to the right.
LOOP1 DS 0H For i = 1 to 10 until string [i] = ‘0’ do
BCT R1,LOOP1 Next i
ENDLP1 DS 0H
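The fragment above shows only the top and bottom of the loop. A fuller sketch of the same idea, with the high-level version in the comments, might look like the following. It assumes the usual register equates (R0 to R15) are defined, and STRING is a hypothetical 10-byte field invented for the example.

```hlasm
         LA    R3,STRING          i = 1
         LA    R1,10              Maximum of 10 characters
LOOP1    DS    0H                 For i = 1 to 10 do
         CLI   0(R3),C'0'           If string[i] = '0' then
         BE    ENDLP1                 exit loop
         LA    R3,1(,R3)            Move to next character
         BCT   R1,LOOP1           Next i
ENDLP1   DS    0H                 End for
```

When the loop ends, R1 is zero if no '0' was found. Otherwise R3 points to the matching character.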
3. Check Everything
This sounds silly, but check for errors. Always. Any time you call a program or service, check to see if it worked and handle things if it didn't. Always. I've lost count of how many times I've seen programs that just call a module and assume it works.
Similarly, if a value is passed in from another program or a user, check it. Always. Have error handling routines that handle invalid or missing values.
I even go one step further. I like to always have an ESTAE routine to trap abends, particularly if I'm hopping through control blocks. What this routine does depends on the routine, but usually it will output a summary dump (you can specify this in the SETRP macro in your recovery routine), recover with a message (saying that an abend has been detected), and exit with a return code indicating that an error has occurred. And yes, I always test my ESTAE routines.
Because this single point will make such a huge improvement to your code, I'll say it again. Check everything. Always.
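As a minimal sketch of what this checking can look like, the fragment below validates a parameter and then tests the return code from a called program. It assumes the standard batch convention where R1 points to an address list, and the first entry points to a halfword length followed by the parameter text. The labels NOPARM, CALLERR, PARMLIST and SUBPGM are hypothetical names for this example.

```hlasm
* --- Validate the input parameter ---------------------------
         LTR   R1,R1              Any parameter list at all?
         BZ    NOPARM             No - report missing parameter
         L     R2,0(,R1)          R2 -> parameter (length + text)
         LH    R3,0(,R2)          R3 = parameter length
         LTR   R3,R3              Anything in it?
         BNP   NOPARM             No - report missing parameter
*
* --- Call another program and check the result --------------
         LA    R1,PARMLIST        R1 -> parms for called program
         L     R15,=V(SUBPGM)     R15 -> called program
         BASR  R14,R15            Call it
         LTR   R15,R15            Did it work?
         BNZ   CALLERR            No - handle the failure
```

The error routines themselves would issue a message and set a non-zero return code. The point is simply that no call or input is trusted without a test.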
4. Write Re-Entrant Code
I automatically write all my assembler code as re-entrant. This has several advantages:
- The size of my programs is smaller (it doesn't include my working storage)
- I don't have to think whether my module needs to be re-entrant (or refreshable), or not.
- Even if my routine does not need to be re-entrant now, there will be no problems (or work) if it needs to be in the future.
- If my routine is later used in a way I didn’t think of, there will be no re-entrancy problems.
- In today’s mainframe processors, re-entrant code can run faster.
I find writing re-entrant code is no more work than non-re-entrant code. You can find out more about re-entrant code from the article Understanding Re-Entrant Programming.
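The core of the technique is to keep every modifiable field out of the program itself and in storage obtained at run time. The sketch below shows one way this might be done with the STORAGE macro and a DSECT. The field names, the NOSTOR label and the workarea layout are hypothetical, and the fragment assumes the usual register equates.

```hlasm
         STORAGE OBTAIN,LENGTH=WORKLEN,COND=YES  Get workarea
         LTR   R15,R15            Did we get it?
         BNZ   NOSTOR             No - exit with an error
         LR    R13,R1             R13 -> workarea
         USING WORK,R13           Address workarea
         XC    WCOUNT,WCOUNT      Initialise record count
*
* --- Main processing uses only fields in the workarea -------
*
         STORAGE RELEASE,LENGTH=WORKLEN,ADDR=(R13)
*
WORK     DSECT                    Workarea (one per caller)
WSAVE    DS    18F                Savearea for programs we call
WCOUNT   DS    F                  Record count
WMSG     DS    CL80               Message build area
WORKLEN  EQU   *-WORK             Length of workarea
```

Because each caller gets its own copy of WORK, two tasks running the same copy of the module cannot overwrite each other's variables.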
5. Use Subroutines
Most assembler programs I see are just one program – no matter how many lines of code. I've seen assembler routines that need three or more base registers. A far better idea is to break up large programs into smaller subroutines. This gives you a lot of advantages, including:
- Fewer base registers.
- Easier to keep track of register usage.
- Easier to understand your program.
- If code is used in more than one place in your program, you only code once.
- Easier to steal code. If you have another program that needs similar code, it's much easier to use a subroutine, rather than try to use part of a program.
- Easier to solve problems. Let's say you have an abend in your program. If the abend occurs in your subroutine, you have only 20 or 30 lines of assembler to debug. Without subroutines, you debug the entire module.
I have a standard way of separating code into subroutines. I call a subroutine by:
LA R1,PARMS R1 -> Parms
L R15,=A(SUB1) R15 -> Subroutine SUB1
BASR R14,R15 Jump to subroutine
MAINEND DS 0H
PR End of main program
The subroutine code looks like:
* Perform some processing
* Input: R1 - Parms
* R13 - Workarea
* Output: R1 - Updated parms
* Register Usage:
* R10 - Input parms
* R12 - Subroutine addressability
* R13 - Workarea
SUB1 DS 0H
BAKR R14,0 Save caller's environment
LR R12,R15 R12 = Base Register
LR R10,R1 R10 -> Input parms
USING SUB1,R12 Addressability
USING WORK,R13 Address workarea
SUB1END DS 0H
XR R15,R15 Zero return code
LR R1,R10 R1 -> Updated parms
PR Return to caller
#SUB1NAME DC C'SUB1'
You'll see that the subroutine is completely separate from the main program. It has separate addressability, and the main program passes parameters via R1. In this example I use BAKR and PR, so no savearea needed. The subroutine and main routine share a workarea (rather than call Getmain/STORAGE for every subroutine), but I try to separate the areas used by the main program and each subroutine in the workarea. I also like to use labels that make it obvious whether it's for the main or subroutine (for example, all the labels in my subroutine above start with SUB1). Finally, I also like to have independent ESTAEs for my subroutines.
6. Think About Labels
I like a standard naming convention for my labels, so I quickly know what they are, and how they can be used. This reduces the chances of errors creeping into my code. For example:
All my equates start with a @, so I can't mistake an equate for a data constant.
All my data constants start with a #, so I can't use a constant as a variable or equate.
All my subroutine labels start with four characters that identify the subroutine. This way I can't accidentally use a subroutine variable or label in a different subroutine.
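A few lines following this convention might look like the sketch below. The names and values are made up for illustration, and SUB1 stands for whichever subroutine owns the labels.

```hlasm
@MAXLEN  EQU   80                 Equate: maximum record length
#BLANKS  DC    CL8' '             Constant: eight blanks
SUB1CNT  DS    F                  Variable used only by SUB1
SUB1LOOP DS    0H                 Branch label inside SUB1
```

With these prefixes, an instruction such as `MVC #BLANKS,0(R3)` stands out immediately as a mistake, because it writes to something that is named as a constant.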
7. Consider Standard Macros
Many assembler programmers use a standard set of macros to make their programs easier to understand. Find out more from the following articles:
IBM's HLASM Toolkit product includes a set of macros for structured programming. You can find some freeware versions in file 438 on the CBTTAPE website.
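To give a flavour of what these macros look like, here is a small sketch using the IF and DO macros as I understand them from the HLASM Toolkit. Treat the exact syntax as an assumption and check it against the Toolkit documentation for your release. WRETCODE is a hypothetical fullword in the workarea, and the scan assumes a '0' character is known to be present.

```hlasm
         COPY  ASMMSP             Structured programming macros
*
* --- Save the return code if the call failed ----------------
         IF (LTR,R15,R15,NZ) THEN
         ST    R15,WRETCODE       Save the failing return code
         ENDIF
*
* --- Scan forward to the first '0' character ----------------
         DO WHILE=(CLI,0(R3),NE,C'0')
         LA    R3,1(,R3)          Move to next character
         ENDDO
```

The macros generate the compare and branch instructions for you, so the source reads top to bottom without hand-coded branch labels.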
8. Test Everything
Everyone knows that you need to test your code. However, it's easy to just test the basic program operation. Don't fall into the trap - test EVERYTHING. Here's an idea of what I test:
- If the program accepts parameters, I test if no parameters are passed.
- If the parameters need to be in a certain format, I test if the parameters are not in that format, or are invalid.
- I test all ESTAE routines by temporarily inserting an ABEND macro in the program.
- If I call another routine, I test my error handling. Ideally, I temporarily insert some code that will cause the other routine to fail.
- If I call a service, I test my error handling. For example, if I call the STORAGE macro, I may temporarily insert an instruction just before this macro that will cause the STORAGE macro to fail.
- I will test my re-entrant program to make sure it really is re-entrant (see the Re-Entrant Programming for Beginners article for more information).
Often I will write another program that I use for testing. This test program will:
- Call my program with every possible combination of input parameters, and will check the output of my program.
- Call my program a few thousand times. This checks that I've released any storage that I've obtained, and there aren't any other things I've forgotten.
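The repeated-call part of such a test program can be very small. The sketch below assumes the program under test is link-edited with the driver under the hypothetical name MYPROG, follows standard linkage (so R5 is preserved across the call), and returns zero in R15 when it works. TESTPARM and TESTFAIL are also hypothetical labels.

```hlasm
         L     R5,=F'5000'        R5 = number of test calls
TESTLP   DS    0H                 For i = 1 to 5000 do
         LA    R1,TESTPARM        R1 -> test parameter list
         L     R15,=V(MYPROG)     R15 -> program under test
         BASR  R14,R15            Call it
         LTR   R15,R15            Return code zero?
         BNZ   TESTFAIL           No - report the failure
         BCT   R5,TESTLP          Next i
```

If the program forgets to release storage on each call, a loop like this will usually show it up as steadily growing storage use or an eventual out-of-storage failure.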
That assembler program that ran like a dog really did turn out to be a nightmare. There were absolutely no comments, the code hopped around everywhere with branch statements, and it used undocumented ways of getting IMS information. I spent an entire week just trying to find out what the program did, and how it did it.
If the original programmer of that routine had followed these eight hints, my job would have been far easier. And what's more, these hints are easy to implement in your programs. If you use them, you'll have fewer errors, and more robust code that is easier to understand and debug.
Thanks to Richard Corak for finding an error in one of our code fragments.