LongEx Mainframe Quarterly - May 2023
Often when people talk about the z/OS Workload Manager (WLM), they're really talking about things that the z/OS System Resource Manager (SRM) does. So, what does SRM really do? And how is it different to WLM? Let's take a look.

SRM: CPU Gatekeeper

SRM is a z/OS internal feature that determines which tasks are given resources, and how long they can use them. By resources, we mean CPU, I/O and storage (memory). CPU is usually the resource of interest, so what does SRM do with it?

Suppose we have a z/OS system with four CPUs (we'll assume they're GP CPUs, but this also applies to zIIPs). z/OS systems have many address spaces, and many tasks within those address spaces. There will be times when more than four tasks want to use a CPU (they're not doing I/O or waiting for a user to hit a button: they're ready to do work). So, some of these tasks will need to wait: we'll have a queue of ready work.

SRM determines which work will get access to each CPU: it manages this queue. It does this using a number called the dispatching priority. This is a number from 191 to 255, with 255 being the highest. Take a look at SDSF, and you can see the dispatching priority of each address space (all tasks in the same address space have this dispatching priority). For example, on our system CEA has the highest possible dispatching priority (255, or 'FF' in hex), and DBCGDBM1 has 201 ('C9' in hex).

SRM also does other things to control which tasks do, and don't, get access to the CPU. It is the component that swaps address spaces in and out. It can increase the dispatching priority of a task that holds an ENQ if other tasks are waiting for that ENQ.

And it's not just CPU that SRM looks after. If z/OS is short on storage, SRM is the component that detects this, and does things like swapping out address spaces to fix it.

Service Units

SRM is also responsible for counting service units.
Service units are intended as a measurement of computer resource usage that can be compared between different processor configurations or models. They were also originally designed to consider I/O and storage (memory) usage, but today we're usually only concerned with CPU.

SRM keeps a tally of the service units used by each task. To calculate the service units, SRM multiplies the CPU seconds by a constant. IBM publishes the SRM constants used for each processor model. For example, a z16 3931-404 processor has around 13,233 service units per 'real' second.

CPU service units (Millions of Service Units, or MSUs) are used for z/OS software pricing, and when setting capacity or workload processing limits. They are also recorded in SMF records and can be used as a measurement of computer resource usage that doesn't need to be changed every time the mainframe hardware is upgraded.

WLM: Who Gets the High Priority Ticket

WLM is the z/OS feature that assigns the dispatching priority. It provides a mechanism that allows z/OS systems programmers to divide work into classes and assign goals to each class. It monitors each class to see if it is meeting its goals and modifies the dispatching priority as appropriate. If it can't allow all classes to meet their goals, z/OS systems programmers can assign an importance to each workload: WLM will work with high importance workloads first.

WLM also provides a lot of information used by z/OS and PR/SM. In HiperDispatch mode, PR/SM uses WLM data to determine whether to park a processor or not. PR/SM also uses WLM information when implementing capacity caps.

Difference Not Important

In a nutshell, SRM manages the resources, and WLM sets the rules. But it's not always as simple as that. IBM often isn't clear about what each component does. Messages and module names can help: SRM modules and messages usually start with IRA, WLM with IWM. But, at the end of the day, it isn't important which component does what.
Both are included for free with z/OS, and both must be running in every z/OS system. Together, they do an amazing job of managing the mainframe's resources, allowing z/OS systems to run comfortably at 100% CPU usage. Many books and articles use SRM and WLM interchangeably or specify WLM/SRM. And that's just fine.
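
To make the service unit arithmetic from the Service Units section a little more concrete, here is a minimal Python sketch. It only illustrates the "CPU seconds multiplied by a constant" idea: the function names are hypothetical, the z16 3931-404 figure is the approximate value quoted in this article, and the second constant is an invented number for an imaginary faster processor, not a published IBM value. It is not how SRM is implemented.

```python
# Approximate figure quoted in the article for a z16 3931-404 (service units per CPU second).
SU_PER_SECOND_Z16_3931_404 = 13_233

# ASSUMPTION: an invented constant for an imaginary faster model, for illustration only.
SU_PER_SECOND_HYPOTHETICAL_FASTER = 26_466


def cpu_service_units(cpu_seconds: float, su_per_second: float) -> float:
    """Convert CPU seconds on a given processor model into CPU service units."""
    if cpu_seconds < 0 or su_per_second <= 0:
        raise ValueError("cpu_seconds must be >= 0 and su_per_second must be > 0")
    return cpu_seconds * su_per_second


def cpu_seconds_for(service_units: float, su_per_second: float) -> float:
    """Convert CPU service units back into CPU seconds on a given processor model."""
    if service_units < 0 or su_per_second <= 0:
        raise ValueError("service_units must be >= 0 and su_per_second must be > 0")
    return service_units / su_per_second


if __name__ == "__main__":
    # A task that used 2.5 CPU seconds on the z16 3931-404.
    used = cpu_service_units(2.5, SU_PER_SECOND_Z16_3931_404)
    print(f"Service units consumed: {used:,.1f}")

    # The same amount of work expressed as CPU seconds on the imaginary faster model.
    faster = cpu_seconds_for(used, SU_PER_SECOND_HYPOTHETICAL_FASTER)
    print(f"Equivalent CPU seconds on the hypothetical model: {faster:.2f}")
```

The point of the round trip is that the service unit figure stays the same when the hardware changes, while the CPU seconds needed to deliver it do not. This is why service units are a convenient unit for limits and reporting that should survive a processor upgrade.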