LongEx Mainframe Quarterly - November 2019
In a client site a while ago, we did some analysis on how they could convert their CICS/VSAM to CICS/Db2. Or in other words, move their data from legacy VSAM datasets to Db2 databases. But why would they want to do that? Data AccessTalk to any Db2 DBA, and they'll be happy to tell you all the reasons why we should use Db2. But in this case, the primary reason was access to the data from other platforms. Let's take a look at this. Accessing Db2 data from remote platforms is relatively easy. There are gateway solutions such as Db2 Connect and z/OS Connect. Db2 offers a RESTful interface directly from Db2 v11. And of course, anyone can access Db2 directly using Db2 DDF, and something like the drivers for JDBC and SQLJ supplied by IBM. Once you connect to Db2, then you're away; using the same SQL, SQLJ or JDBC that you'd use for a local database. Db2 can be setup to do all the hard conversions like EBCDIC to ASCII, big-endian to little-endian, and even code pages. VSAM is a little harder. There is no way to access VSAM datasets directly: we need some kind of gateway. There are a couple of products providing gateway solutions out there, but there's work setting them up for each VSAM dataset. Even if we can access the VSAM dataset, if we want to use something familiar like SQL or JDBC, then we need to map our VSAM dataset. Let's take an example: suppose we have a VSAM dataset with a fixed length record with the following format:This is all stored as bytes in one VSAM record. VSAM doesn't know how many fields are in the record, where they are (except the key for KSDS), or if they're character, integer, or something else. The applications that access this record must map the record contents using something like the C definition here: typedef struct recformat { char date[11]; char custid[5]; unsigned short itemid; unsigned short quantity; float price; unsigned short staffid; } recformat; recformat *vsamrec; So, our gateway will need to do this mapping as well. Gateway solutions will have some way of taking in COBOL copybooks or C includes, and using them to map each VSAM record. We'll also have to define any conversions that have to happen, field by field. Let's look at date fields in our example record. You'll see that our date field is 11 characters of text in the format dd-mmm-yyyy (for example, 12-Nov-2019). This will need to be converted into some SQL-friendly date format. Locking and SyncpointsAnother problem we have is with locking. Our VSAM datasets are opened to a production CICS region. So, if we want to access it from outside the CICS region, we'll need to use VSAM RLS. A similar problem with syncpointing. If we want to update a VSAM record, we need to think about locking the record, and managing syncpoints. VSAM RLS will handle the locking, and is fine if this is the only update in a unit of work. But if we're updating other things (like a database) and want two-phase commit functionality, we'll need a syncpoint manager: probably DFSMStvs. It gets a little ugly. Db2 has none of these problems. It handles all the locking, and has some fancy functions to reduce any waits for locks, and lockout problems. It can handle all the syncpoints, working with CICS or anything else that plays with z/OS RRS. PerformanceBut that's not the only benefit our client will get from converting to VSAM (as any DBA will tell you, at length). If you look at our example, you'll see a few ID fields: customer ID, staff ID, item ID. There will be other VSAM datasets holding these IDs, and information about them.If we want to produce a web screen with the details for this VSAM record, including the customer's name and address (from customer ID), staff member's name (from staff ID), and text describing the item (from item ID); we'll need three more queries to three different VSAM datasets. Drives DBAs crazy. With Db2, we can put together an SQL statement that will get all the information for us in one query. Each of the tables will have relationships, and Db2 will do all the hard work. And it will do it fast: much faster than separate queries to VSAM datasets. When we access a VSAM dataset, we always get the entire record. Suppose we have a record that is 24kBytes in length, and we want one 10-byte field from it. With VSAM we'd need to read the entire 24kBytes of the record, and then parse it to get our field. If we update the field, we have to write the entire 24kBytes out again. With Db2, it's a little more elegant: we'd only get the 10-bytes we want. So in many cases, we'd expect better performance from Db2 than VSAM. Other ReasonsIn our site, we did some digging, and found programs using different COBOL copybooks to map the same VSAM record. This is a resilience issue: and makes maintaining the VSAM file harder. In another example, a change to the format of a VSAM dataset was needed, but it was too hard to change all applications using the VSAM dataset at the same time. In this case, a duplicate of the origingal dataset was created with the new format, leaving the original format in the old one. Both had to be updated whenever a change to a field was made. Ugly. Db2 gets around these issues. It also introduces strong 'typing': all programs will read a field in the same format. There's also no doubt that Db2 will provide application programmers (both on and off the mainframe) with a superior way of accessing and manipulating data. For example, adding a new index on a Db2 table is easy. With VSAM, we need to define an alternate index, and change the program to use it. Once converted to Db2, the data is no longer chained to z/OS. If we were to retire our CICS programs, we could possible move it to a Db2 subsystem on another platform without changing a whole lot. It gives us options. So, Why VSAM?So, with all of these advantages, why would we use VSAM at all? The easy answer is that when many of these applications were created, Db2 wasn't around. Or in the case of some program products, the vendor didn't want to make Db2 a prerequisite. But that's not the whole story. The reality is that VSAM is great when you need a file, with some extra bells and whistles. For example, I've seen CICS/Db2 applications that still use a VSAM file for a log. VSAM is a lot easier and straightforward to setup and use, so if our data format and requirements are simple, then VSAM is an attractive option. VSAM is also very fast. If we're only accessing a single VSAM dataset (we don't need to get data from others), VSAM will often give Db2 a good run for its money. ConclusionOur client wanted easier access to their data, and with Db2 they'll get it. They'll also get a lot of other benefits that come from using a mature DBMS over a file system. Of course, the fun will be converting from one to the other. But that's another article. |