The Best Mainframe DASD and Dataset Innovations
I remember when I first walked into our mainframe machine room. It was 1989, and I'd just finished university. As I walked in, I saw this huge line of blue and white boxes disappearing down the back of the machine room – our 3380 DASD 'farm'. Today, the lines are a lot smaller, and that's just the beginning of the changes I've seen with DASD and datasets. So what are the best 10 I've seen since I started?
At the time when I first saw that line of 3380s, our company was just finishing up implementing catalogs. Most of our application data was cataloged, but some applications and datasets weren't there yet.
But think about what this meant. Every time you wanted to allocate any uncataloged dataset, you needed to know and specify the volume serial number. Every time! The idea of a catalog to keep track of datasets and volumes has been around since OS/360, where the SYSCTLG dataset kept a list of datasets and their volume location. These CVOL catalogs weren't pretty: you used the batch utility IEHPROGM to catalog and uncatalog datasets, and IEHLIST to list catalog entries. VSAM catalogs came with VSAM datasets in the 1970s, and cataloged VSAM datasets only (you still needed CVOLs for non-VSAM). VSAM catalogs owned VSAM datasets, and you couldn't have VSAM datasets in different catalogs on the same volume.
In reality, many shops (including ours) cataloged their VSAM (they had to), but not much of their non-VSAM. This changed when the Integrated Catalog Facility (ICF catalogs) replaced both CVOL and VSAM catalogs in 1979. ICF catalogs were the way they should be – much easier to use and manage. So by 1989, all our catalogs were ICF.
2. Automated File Archiving and Management
We'd also just finished implementing IBM's Hierarchical Storage Manager (HSM). This brilliant product would automatically archive old data, remove unneeded temporary datasets, delete expired datasets, and even back up datasets. But that's not all. Archived datasets (to compressed disk or tape) were automatically brought back if needed. And it was so easy to restore deleted datasets using the TSO HRECOVER command. Just brilliant.
OK, today this doesn't seem so special. But then it was. Originally released in the 1970s, DFHSM wasn't the only product in this area. In 1989 we were actually moving away from CA-ASM2 for dataset archival, and other products like CA-Disk were also around. But most of these were batch oriented. DFHSM was automatic.
In 1989, we were using 3880 disk controllers with cache. And this cache was amazing. First introduced in 1981, it would store recent data read from, or written to, DASD in memory kept in the DASD controller. Subsequent reads for this data could be satisfied without a disk read. And it made a big difference. But IBM went further in 1991 with the 3990 controller.
The 3990 added Cache Fast Write (CFW), and DASD Fast Write (DFW). DFW allowed writes to DASD to be 'satisfied' at the cache level. Once received in cache, the DASD subsystem would acknowledge the write, and do the actual physical disk write a little later. This improved dataset write times as seen by z/OS (then MVS). The 3990 would write the cache data to non-volatile storage, so the data would not be lost during a power failure or other cache failure. CFW was the same, but without the non-volatile storage part.
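The difference between DFW and CFW is easier to see in a toy model. What follows is only a sketch in Python, not how a real 3990 works internally: the class, its method names and the power_failure step are all invented for illustration. It shows a write being acknowledged before any disk write happens, and what survives a power loss in each case.

```python
class CachedController:
    """Toy model of a cached DASD controller; not real 3990 behaviour."""

    def __init__(self):
        self.disk = {}        # physical disk: track -> data
        self.cache = {}       # volatile cache: track -> data
        self.nvs = {}         # non-volatile storage: DFW writes not yet on disk
        self.pending = set()  # tracks changed in cache but not yet on disk
        self.disk_reads = 0

    def read(self, track):
        if track in self.cache:            # cache hit: no disk read needed
            return self.cache[track]
        self.disk_reads += 1               # cache miss: go to the disk
        data = self.disk.get(track)
        self.cache[track] = data
        return data

    def fast_write(self, track, data, non_volatile=True):
        """DFW-style write if non_volatile is True, CFW-style otherwise."""
        self.cache[track] = data
        self.pending.add(track)
        if non_volatile:
            self.nvs[track] = data         # second copy survives a power loss
        return "acknowledged"              # returned before any disk write

    def destage(self):
        """Do the deferred physical writes."""
        for track in self.pending:
            self.disk[track] = self.cache[track]
        self.pending.clear()
        self.nvs.clear()

    def power_failure(self):
        """Lose the volatile cache, then recover whatever NVS still holds."""
        self.cache.clear()
        self.pending.clear()
        self.disk.update(self.nvs)
        self.nvs.clear()


controller = CachedController()
controller.fast_write(1, "payroll record")                      # DFW
controller.fast_write(2, "sort work data", non_volatile=False)  # CFW
controller.power_failure()
print(controller.disk)  # only the DFW write reached the disk
```

In this model the CFW write is simply gone after the power failure, which is why it only suits data that can be recreated.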
We thought long and hard about DFW when we bought our first 3990 controller, and we were nervous. But once in, it really improved our performance, and we never had a problem.
In these early days, we didn't have a lot of cache to play with. So we tended to use it for the most important volumes, leaving the less-important to fend for themselves. Later models increased the amount of cache, to the point that today you would need a very good reason to have an uncached volume.
I remember a four-week period when we were getting a 3380 head crash every few days. We'd just moved our data centre, and had some old 3380 DASD. One such head crash would stop four devices (4 volumes per head disk assembly, or HDA). So we'd call in IBM, they'd replace the HDA, and we'd restore the four devices from backups. This was bad enough. However at a couple of points we lost critical volumes, bringing down all systems – no working z/OS system to restore from. Great disaster recovery experience, but stressful.
This would never have happened if we used RAID DASD. The IBM RAMAC DASD subsystem announced in 1994 was such a device, with emulation of 3380 or 3390 disks on a matrix of cheaper 3.5 inch PC-style disks. If one failed, there was enough redundancy for the subsystem to continue processing with no problems.
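How can a subsystem lose a disk and keep going? One common approach is parity. The sketch below is Python and assumes a simple XOR parity scheme across three data disks; I'm not claiming this is exactly what RAMAC did internally, and the block contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding one 8-byte block (hypothetical contents).
data_disks = [b"PAYROLL1", b"LEDGER02", b"STOCK003"]

# A fourth disk holds the parity of the other three.
parity = xor_blocks(data_disks)

# The second disk fails. XOR the survivors with the parity to rebuild it.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_disks[1])
```

The same calculation lets the subsystem answer reads for the failed disk on the fly, so processing continues while the disk is replaced.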
However IBM were playing catch-up. StorageTek announced their Iceberg disk subsystem in 1992, though it took them another two years to release it. EMC weren't far behind, announcing the Symmetrix system in 1993.
Interestingly, IBM later decided to use StorageTek Icebergs as their own, rebranding them as RAMAC Virtual Array (RVA) in 1996. And this was smart. The StorageTek Shared Virtual Array (SVA) architecture was the first to totally decouple the logical 3380/3390 architecture from the physical disk. Rather than setting aside 1GBytes for a 3380 disk device, the SVA would only use the storage if needed.
RAID greatly improved the reliability of DASD subsystems. But it also provided an additional advantage. Now disk suppliers could use a common platform for both mainframe and non-mainframe disk. They just needed an emulation layer so it could pretend to be mainframe DASD. Today all DASD subsystems can be configured on the floor to be mainframe or non-mainframe DASD.
StorageTek's SVA added another new feature: Snapshot. This could create a copy of a dataset or volume instantly, without consuming any extra space. Pretty cool huh?
IBM and StorageTek parted ways in 2000 when IBM announced their Shark (Enterprise Storage Server). Based on IBM's Seascape architecture, the Shark was a small step backwards, without the RVA technology. IBM have brought back Snapshot functionality in their DASD line – now called FlashCopy. EMC and HDS supply something similar (TimeFinder and ShadowImage respectively).
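Both ideas fall out of one design choice: the logical device is just a table of pointers to physical blocks. Here is a rough Python sketch of that idea. The class and method names are invented, and a real virtual array also has to reclaim old blocks, which this toy ignores.

```python
class VirtualArray:
    """Toy model: logical tracks map to physical blocks allocated on write."""

    def __init__(self):
        self.physical = []  # physical block store (append only in this toy)
        self.volumes = {}   # volume name -> {track number: physical index}

    def define_volume(self, name):
        self.volumes[name] = {}  # defining a volume uses no physical space

    def write(self, volume, track, data):
        self.physical.append(data)  # new data always goes to a new block
        self.volumes[volume][track] = len(self.physical) - 1

    def read(self, volume, track):
        index = self.volumes[volume].get(track)
        return None if index is None else self.physical[index]

    def snapshot(self, source, target):
        # Copy the pointers only; no data blocks are copied.
        self.volumes[target] = dict(self.volumes[source])

    def blocks_used(self):
        return len(self.physical)


array = VirtualArray()
array.define_volume("PROD01")
array.write("PROD01", 0, "old data")
array.write("PROD01", 1, "more data")
used_before = array.blocks_used()

array.snapshot("PROD01", "COPY01")        # instant: just a pointer copy
used_after = array.blocks_used()          # same as before the snapshot

array.write("PROD01", 0, "new data")      # source changes, copy does not
print(used_before, used_after, array.read("COPY01", 0))
```

Extra space is only consumed once the source or the copy is updated after the snapshot.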
6. Solid State Disk
Solid state disks (SSDs) are a great idea, providing amazing disk performance at a higher price. And this is brilliant when you really need the extra performance: MQ and DB2 logs, z/OS logger DASD logs, JES checkpoints and more. However SSDs – or disk without the moving disk parts – have been around for a long time. StorageTek started the ball rolling in 1979 with their RAM-based 4305 solid state disk. This worked like flash drives today with one difference: the disk was wiped when the disk unit was switched off (or lost power).
Memorex followed in 1983 with the 3864, and National Advanced Systems (NAS, later HDS) was a couple of years behind in 1985.
However solid state disks were expensive, and lost their popularity until 2009. This was the year that both IBM and EMC released flash SSDs that could be used on a mainframe. Today their usage increases as storage providers make it easy to mix and match the (still) cheaper spinning disks with SSD.
7. Fibre Mainframe Channel
Moving data centres is fun. There's excitement, tension and a lot of stress. You also get to see a side of the hardware you've never seen before as boxes are broken up and lifted in and out of machine rooms. But the boxes are the easy part – it's all the connections between them that are the hard work.
Before 1991, you connected your DASD to the mainframe using Bus and Tag channels. A pair of copper wires as thick as a baseball bat, bus and tag channels had a transfer rate of around 3-4Mbytes per second up to a couple of hundred feet. That's what we had when we moved our data centre in 1993. I remember helping our hardware engineer at 2am lay down these monsters under the false floor as we tried to put our mainframe back together. My arms hurt for a week after the move.
ESCON changed this, replacing the big copper cables with a single fibre cable. It increased transfer speeds to 17MB per second, and distances up to more than 43km. It also eliminated that upper body workout hardware engineers used to get when moving them around.
FICON was the next generation of fibre cable, with current transmission speeds of up to 1600MB per second of bidirectional throughput on a 16Gbps link, at distances of up to 100km, for FICON Express16S. Greater distances are possible with a FICON switch, or through a Request Price Quotation (RPQ). High Performance FICON (zHPF) is the latest hardware/software feature to squeeze even more from FICON channels.
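To put those numbers in perspective, here's a back-of-the-envelope calculation in Python. It assumes 4MB per second for Bus and Tag, 17MB per second for ESCON and roughly 1600MB per second for FICON Express16S, and a hypothetical 1000MB of data. Real throughput depends on protocol overhead, distance and the devices at each end, so treat these as best-case figures.

```python
# Nominal channel rates in megabytes per second (best case, no overhead).
RATES_MB_PER_SEC = {
    "Bus and Tag": 4.0,
    "ESCON": 17.0,
    "FICON Express16S": 1600.0,
}


def transfer_seconds(size_mb, rate_mb_per_sec):
    """Time to move size_mb megabytes at a steady rate."""
    return size_mb / rate_mb_per_sec


for name, rate in RATES_MB_PER_SEC.items():
    print(f"{name}: {transfer_seconds(1000, rate):.1f} seconds")
```

On these assumptions, a transfer that ties up a Bus and Tag channel for over four minutes takes under a second on FICON.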
8. Disk Mirroring
In 1990, my mainframe site commenced disaster recovery tests. Assuming that our data centre was no longer available, we would get all our offsite tape backups, board a plane and fly 800km to Sydney, and proceed to restore our backups at another data centre. This worked, but took us over 48 hours. 48 hours can be too long.
IBM, EMC and HDS all solved this problem with disk mirroring technologies – echoing any changes to local disks on remote disks. With this technology, that 48 hour recovery time can be shrunk to minutes. Our article “A review of Disk Mirroring: XRC, PPRC and Friends” looks at disk mirroring in more detail.
9. System Managed Storage
VOL=SER=VOL001. I must have seen that code so many times in JCL. Not to mention hard-coding volume serial numbers when allocating a dataset using ISPF 3.2 or something similar. Catalogs removed this coding for existing datasets. Eligible Device Tables (EDTs) also helped, so you could replace your VOL=SER= code with UNIT=PROD1 (assuming your systems programmer had defined an esoteric unit name called PROD1 in the EDT, covering some DASD volumes).
But coding the VOL=SER wasn't the hard part. The hard part was looking through all your disk devices for enough room for your datasets. Or keeping a list of those devices, and how full they are. Painful.
So System Managed Storage (SMS) was great. Forget about VOL=SER=, forget about keeping lists of DASD. In fact, forget about LRECL=80, RECFM=FB or DSORG=PS. Rather, let the DASD administrator look after all of that. Introduced with MVS/ESA V3.1 in 1988, SMS was a renaissance. Today we forget how hard it was to manage DASD before SMS. But now DASD administrators can not only better control DASD, they can stop users from mis-allocating datasets, using volumes not designed for them, and allocating un-cataloged datasets.
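The chore that went away was the search for space. As a very rough illustration, here is that search in Python. This is not the real SMS volume selection logic, which weighs many more factors; the pool, the volume serials and the free track counts are all made up.

```python
def pick_volume(pool, tracks_needed):
    """Return the volser with the most free tracks that can fit the request.

    pool maps volume serial -> free tracks (hypothetical figures).
    """
    candidates = {
        volser: free for volser, free in pool.items() if free >= tracks_needed
    }
    if not candidates:
        raise RuntimeError("no volume in the pool has enough free space")
    return max(candidates, key=candidates.get)


# A hypothetical storage pool looked after by the DASD administrator.
production_pool = {"PROD01": 120, "PROD02": 900, "PROD03": 450}

print(pick_volume(production_pool, 500))
```

The point is who runs the search: before SMS it was you, with a list of volumes and their free space. Afterwards the system did it at allocation time.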
Didn't expect this one, did you? For number 10, I considered Parallel Access Volumes (multiple UCBs and paths to a device for improved performance), and Extended Address Volumes (EAV) – more space on logical 3380/3390 devices. I also gave serious thought to ISPF 3.4 for dataset management. But I couldn't leave out DFDSS, or DFSMSdss as it's called today. Innovation also have a similar product: FDR/ABR – but I've never used it.
Before DFDSS, moving datasets was done with terrible utilities like IEHMOVE and IEBGENER. But these couldn't move VSAM. So add IDCAMS. All are difficult to code, slow and just painful.
Enter DFDSS. Simply move or copy any dataset you want – VSAM, sequential, tape or disk. It doesn't matter. And it will do it faster than anything else. But that's just the start.
Want to defragment a pack? Done. Want to copy an entire volume, or more than one? Done. Want to back up a volume or dataset – no problem. Want to restore a volume when you don't have a working system up – sure, use the standalone version. Want to back up several datasets at the same time (not one after the other)? Can do.
DFDSS has also kept up to date. So it uses FlashCopy automatically, and can work with CICS BWO (Backup While Open) to back up datasets while they're open to CICS.
DFDSS just has to be the most under-rated utility today.