Opinion: Why You Should Let Your ML1 Go
Ever since I started working with mainframes in the late 1980s, migration level 1 (ML1) has been an essential part of any DASD administrator's toolkit. In DFHSM (now called DFSMShsm) terms, ML1 is a group of disk devices where archived files are stored in a compressed format. DFHSM daily space management looks for datasets that have been unused for a set period of time, then moves (migrates) and compresses them to a different disk volume: an ML1 volume.
This worked well for ages. When disk space was expensive, migration freed up that space and saved money. Archived datasets could also be stored on tape (migration level 2, or ML2), but in the days before automated tape libraries those tapes were mounted manually by an operator, so a recall from ML2 could take a few minutes. ML1 was a nice 'halfway house'.
A Closer Look at ML1 Compression
Let's look a little closer at ML1. You can actually configure DFHSM not to compress datasets on ML1 - perhaps when migrating datasets from (expensive) high-speed disk to (cheaper) low-speed disk - but most sites compress all ML1 datasets. This compression uses the Huffman Frequency Encoding algorithm, based on a 1952 paper by David Huffman. Loosely speaking, it creates an optimal code in which more frequently occurring characters (particularly spaces) are stored in fewer bits.
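Huffman's idea is easy to sketch in code. The following Python toy is my illustration, not DFHSM's actual implementation: it builds a code table from character frequencies and shows that the most frequent character - in record-style data, usually the space - gets one of the shortest codes.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table: frequent characters get shorter codes."""
    freq = Counter(text)
    # Heap entries: (combined frequency, tie-breaker, {char: code-so-far})
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two least-frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        # Merging prepends '0' to one subtree's codes and '1' to the other's
        merged = {ch: "0" + code for ch, code in c1.items()}
        merged.update({ch: "1" + code for ch, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

# A toy 'record' padded with trailing blanks, like fixed-length mainframe data
sample = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG          "
codes = huffman_codes(sample)
encoded_bits = sum(len(codes[ch]) for ch in sample)
original_bits = 8 * len(sample)   # assuming 8 bits per character uncompressed
print("code for space:", codes[" "])
print(f"compressed size: {encoded_bits / original_bits:.0%} of original")
```

The trailing blanks are why space-padded records compress so well: the single most common character ends up with a very short code.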
So how much compression do we get out of this? I've always used a rough rule of thumb of 50% - meaning the size of the dataset is halved when compressed - but this really depends on the contents of the dataset. When writing this article, I went back and crunched some numbers. Let's take some examples:
| Dataset | Size Before Migration (MBytes) | Size After Migration (MBytes) | Compression |
|---|---|---|---|
| CSV data file | 17.83 | 10.38 | 42% |
| COBOL Source Library | 0.54 | 0.13 | 76% |
You can see why I used 50%: it's a nice midpoint. But in reality the figure varies widely. DFHSM writes SMF records (if the correct parameters are set) that record all its activity, including counts of bytes read and written. I took one day's SMF records and looked at the ML1 compression statistics: interestingly, the average compression was 70%.
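For reference, the compression percentages in the table are just one minus the after/before ratio. A quick Python check using the table's figures (rounding the published sizes can shift a figure by a point):

```python
# Compression = 1 - (size after migration / size before migration),
# using the MByte figures from the table above
datasets = {
    "CSV data file": (17.83, 10.38),
    "COBOL Source Library": (0.54, 0.13),
}
for name, (before, after) in datasets.items():
    print(f"{name}: {1 - after / before:.0%} compression")
```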
Huffman coding isn't the only way of compressing ML1. DFHSM also provides the Small Dataset Packing (SDSP) facility, which stores migrated datasets in a pre-allocated VSAM dataset and is designed for small datasets less than one track in size. Many sites I've seen don't use SDSP.
Another, more recent option is the zEDC feature, which offloads compression from general purpose processors. For those lucky enough to have zEDC and the other prerequisites, this is a great option. However, most sites haven't gone down this path yet.
The Cost of Compression
OK, so we get something like a 50% reduction in space using compression with ML1. But what does this cost? CPU.
DFHSM ML1 processing costs CPU. It costs CPU to move datasets between ML0 (normal DASD), ML1 and ML2. It costs CPU to update and maintain the DFHSM control datasets with the migrated datasets and where they are located. But in particular, it costs a lot of CPU to compress these ML1 datasets - it's all done on General Purpose (GP) processors in software (unless using zEDC).
So is this worth it? Let's do a 'back of the envelope' calculation, starting with prices. Prices for CPU and DASD will vary greatly from site to site, depending on many factors. But let's take some numbers that aren't too ridiculous:
- DASD costs 6 US cents per GByte of DASD per day
- CPU costs US$120 per CPU hour
The same DFHSM SMF records I used earlier showed that moving data between primary and ML1 (migrate or recall) costs about 4 CPU seconds per (uncompressed) GByte. Suppose that, on average, our data stays on ML1 for 14 days; after that, it is either recalled to primary or migrated to ML2. So every GByte of data on ML1 (measured in uncompressed GBytes) costs 8 CPU seconds (once to compress onto ML1, once to decompress off ML1) every 14 days - or, on average, 8/14 = about 0.6 CPU seconds per GByte-day.
But ML1 itself consumes DASD - something like 50% of the original size - so in fact we're only saving 0.5 GByte-days of DASD per GByte-day: 3 cents. That 3 cents of DASD savings has cost about 2 cents of CPU (US$120 / 3600 × 0.6 CPU seconds). We've lost roughly two-thirds of any DASD savings to CPU costs.
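The whole back-of-the-envelope calculation can be reproduced in a few lines of Python, using only the figures quoted above (6 cents per GByte-day of DASD, US$120 per CPU hour, 4 CPU seconds per GByte moved, 14 days of ML1 residency, 50% compression):

```python
# Back-of-envelope ML1 cost check, using the figures from the text
DASD_COST_PER_GB_DAY = 0.06   # US$ per GByte of DASD per day
CPU_COST_PER_HOUR = 120.0     # US$ per CPU hour
CPU_SEC_PER_GB_MOVE = 4.0     # CPU seconds to migrate or recall one GByte
DAYS_ON_ML1 = 14              # average residency before recall or move to ML2
ML1_FOOTPRINT = 0.5           # compressed data still occupies ~50% of its size

# Two moves (once onto ML1, once off it) spread over the residency period
cpu_sec_per_gb_day = 2 * CPU_SEC_PER_GB_MOVE / DAYS_ON_ML1
cpu_cost = CPU_COST_PER_HOUR / 3600 * cpu_sec_per_gb_day    # US$ per GByte-day
dasd_saving = DASD_COST_PER_GB_DAY * (1 - ML1_FOOTPRINT)    # US$ per GByte-day

print(f"DASD saved: {dasd_saving * 100:.1f} cents per GByte-day")
print(f"CPU spent:  {cpu_cost * 100:.1f} cents per GByte-day")
print(f"CPU consumes {cpu_cost / dasd_saving:.0%} of the DASD saving")
```

Without rounding 8/14 up to 0.6 seconds, the ratio lands a few points lower, but the conclusion is the same: most of the DASD saving goes straight back out as CPU cost.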
Now I know what you're thinking: most sites perform DFHSM daily space management in periods of low activity, so if you're paying for peak CPU usage (and almost everyone is), this DFHSM CPU overhead is effectively free. However, I've seen many sites that don't always have such a 'low' period, so daily space management sometimes runs during peak. Others use interval migration to keep enough free space on disk, which can also kick in during peak periods. And DFHSM recalls happen on demand, so they too may occur during peak.
But the thing is, we can get our 6 US cents per GByte-day of savings without using ML1. Let's look at two alternatives.
Alternative 1: Straight to ML2
When ML1 was first conceived, there were no fast tapes and no automated tape libraries, which meant a user could wait minutes to recall a dataset from ML2. So it made sense to have a 'halfway house' on disk for archived datasets. Today's tape subsystems are fantastic. Automated tape libraries speed up tape mounts, and Virtual Tape Subsystems (VTS) can eliminate them altogether with a disk front-end, effectively doing the job of ML1 without the CPU costs. There's also a good chance that writing to ML2 isn't much slower than writing to ML1, and recalls from ML2 should be satisfied in seconds, not minutes. So a smart option to consider is to simply remove ML1 and migrate everything straight to ML2.
On the same system whose SMF records I used earlier, a recall from ML2 cost about one CPU second per GByte - there's no software compression for ML2. That's 25% of the cost of ML1. Further, there's no second movement of archived data from ML1 to ML2: more CPU savings.
Alternative 2: Compress Primary Disk with zEDC
The whole purpose of ML1 is to save DASD space, and using the zEDC feature to compress eligible datasets on normal disk achieves this directly. Some high-performance users won't want the response-time overhead of compression, even with zEDC, but other users won't notice the difference. So using zEDC to compress suitable normal datasets saves the DASD space and lets you get rid of ML1.
ML1 has had its day: today the benefits no longer justify the costs. Many sites will continue to run ML1 as a legacy from days gone by, though others I have seen are already moving away from it. Working through the numbers as we've done here will probably show that ML1 is no longer worth it.