Workload Manager
In IBM mainframes, Workload Manager (WLM) is a base component of the MVS/ESA mainframe operating system and its successors, up to and including z/OS. It controls access to system resources for the work executing on z/OS, based on administrator-defined goals. Workload Manager components also exist for other operating systems; for example, IBM Workload Manager is also a software product for the AIX operating system.

z/OS Workload Manager

On a mainframe computer, many different applications execute at the same time. Executing work is expected to show consistent execution times and predictable access to databases. On z/OS, the Workload Manager (WLM) component fulfills these needs by controlling each piece of work's access to system resources according to specifications supplied by the system administrator.

The system administrator classifies work into service classes. The classification mechanism uses work attributes such as transaction names, user identifications or program names that specific applications are known to use. In addition, the system administrator defines goals and importance levels for the service classes that represent the application work. The goals define performance expectations for the work. A goal can be expressed as a response time, as a relative speed (termed velocity), or as discretionary if no specific requirement exists. The response time is the time from when a work request enters the system until the application signals to WLM that execution has completed. WLM then tries to ensure either that the average response time of a set of work requests stays within the expected time, or that a given percentage of work requests meets the end user's expectations.
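
The classification step described above can be pictured as a list of rules that map work attributes to service classes. The following is only an illustrative sketch and not WLM's actual rule format: the `Rule` type, the attribute keys, the trailing-`*` prefix matching, and the service class names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical classification rule (not the real WLM rule syntax)."""
    attribute: str      # e.g. "transaction_name", "user_id", "program_name"
    pattern: str        # exact value, or a prefix ending in "*"
    service_class: str  # service class assigned when the rule matches


def matches(pattern: str, value: str) -> bool:
    """Exact match, or prefix match when the pattern ends in '*'."""
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return value == pattern


def classify(work: dict, rules: list, default: str = "DEFAULT") -> str:
    """Return the service class of the first rule that matches the work."""
    for rule in rules:
        value = work.get(rule.attribute)
        if value is not None and matches(rule.pattern, value):
            return rule.service_class
    return default


# Hypothetical rules and service class names, for illustration only.
rules = [
    Rule("transaction_name", "PAY*", "ONLINE_HI"),
    Rule("user_id", "BATCH01", "BATCH_LOW"),
]
print(classify({"transaction_name": "PAYROLL"}, rules))
```

Work that matches no rule falls through to a default service class in this sketch; goals and importance levels would then be attached to each service class rather than to individual work requests.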

Defining a response time goal also requires that the application communicate with WLM. If this is not possible, a relative speed measure, named execution velocity, is used to describe the end user's expectation to the system.
This measurement is based on system states, which are sampled continuously. The system states describe when a work request is using a system resource and when it must wait for one because other work is using it. The latter is named a delay state. The execution velocity is the number of using states divided by the number of all productive states (using plus delay states), multiplied by 100. This measurement does not require the application to communicate with the WLM component, but it is also more abstract than a response time goal.
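
The execution velocity calculation described above can be written out directly. This is a minimal sketch: the function name and the idea of passing in sample counts are assumptions for illustration, while the formula itself is the one given in the text.

```python
def execution_velocity(using_samples: int, delay_samples: int) -> float:
    """Execution velocity = using / (using + delay) * 100."""
    total = using_samples + delay_samples
    if total == 0:
        # No productive states were sampled, so the ratio is undefined.
        raise ValueError("no using or delay samples collected")
    return 100.0 * using_samples / total


# Work that was sampled using a resource 30 times and delayed 70 times.
print(execution_velocity(30, 70))
```

A velocity of 100 means the work was never found waiting for a resource; lower values mean a larger share of the samples found it delayed.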

Finally, the system administrator assigns an importance to each service class, which tells WLM which service classes should get preferred access to system resources when the system load is too high for all work to execute. The service classes and goal definitions are organized in service policies, together with other constructs for reporting and further control, and are saved as a service definition for access by WLM. The active service definition is stored on a couple data set, which allows all z/OS systems of a Parallel Sysplex cluster to access it and work towards the same performance goals.

WLM is a closed-loop control mechanism: it continuously collects data about the work and the system resources, compares the collected and aggregated measurements with the goals in the service definition, and adjusts the work's access to system resources if the goals have not been achieved. This mechanism runs continuously at predefined time intervals. To compare the collected data with the goal definitions, a performance index is calculated.
The performance index for a service class is a single number that tells whether the goal was met, overachieved or missed. WLM modifies the access of the service classes based on the achieved performance index and the importance. To do so, it uses the collected data to project whether a change is possible and what its result would be. The change is made if the projection shows that it benefits the work with respect to the defined goals. WLM uses a data history ranging from 20 seconds to 20 minutes to obtain a statistically relevant basis of samples for its calculations. In addition, in each decision interval a change is made for the benefit of only one service class, to keep the system controlled and predictable.
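
The text above does not give the performance index formula. The sketch below assumes the convention commonly documented for WLM: achieved response time divided by goal for response time goals, and goal velocity divided by achieved velocity for velocity goals, so that a value below 1 means overachieved, 1 means met, and above 1 means missed. The `ServiceClass` type, the importance numbering (1 as most important) and the `pick_receiver` selection rule are simplifying assumptions for illustration, not WLM's actual algorithm.

```python
from dataclasses import dataclass
from typing import Optional


def pi_response_time(achieved_seconds: float, goal_seconds: float) -> float:
    """Assumed convention: achieved / goal (above 1 means the goal is missed)."""
    return achieved_seconds / goal_seconds


def pi_velocity(achieved_velocity: float, goal_velocity: float) -> float:
    """Assumed convention: goal / achieved (above 1 means the goal is missed)."""
    if achieved_velocity == 0:
        return float("inf")  # the work made no progress at all
    return goal_velocity / achieved_velocity


@dataclass
class ServiceClass:
    name: str
    importance: int  # assumption: 1 is the most important level
    pi: float        # performance index from the latest interval


def pick_receiver(classes: list) -> Optional[ServiceClass]:
    """Pick the one class to help in this interval (hypothetical rule):
    the most important class missing its goal, worst index first on ties."""
    missing = [c for c in classes if c.pi > 1.0]
    if not missing:
        return None
    return min(missing, key=lambda c: (c.importance, -c.pi))


classes = [
    ServiceClass("ONLINE_HI", 1, pi_response_time(1.5, 1.0)),
    ServiceClass("BATCH_LOW", 4, pi_velocity(10.0, 30.0)),
    ServiceClass("REPORTS", 3, pi_velocity(60.0, 40.0)),
]
receiver = pick_receiver(classes)
print(receiver.name if receiver else "all goals met")
```

Selecting a single receiver per call mirrors the statement that one change is made per decision interval; the projection step, which decides whether the change is actually beneficial, is left out of this sketch.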

WLM controls the work's access to the system processors, the I/O units and the system storage, and it starts and stops processes for work execution. Access to the system processors, for example, is controlled by a dispatch priority, which defines a relative ranking among the units of work that want to execute. The same dispatch priority is assigned to all units of work that were classified to the same service class. The dispatch priority is not fixed and is not simply derived from the importance of the service class. It changes based on goal achievement, system utilization and the work's demand for the system processors. Similar mechanisms exist for controlling all other system resources. This way of controlling access to system resources is named goal-oriented workload management. It contrasts with resource-entitlement-based workload management, which defines a much more static relationship for how work can access the system resources. Resource-entitlement-based workload management is found on larger UNIX operating systems, for example.

A major difference from workload management components on other operating systems is the close cooperation between z/OS Workload Manager and the major applications, middleware and subsystems executing on z/OS. WLM offers interfaces that allow the subsystems to tell WLM when a unit of work starts and ends in the system, and to pass classification attributes that the system administrator can use to classify the work. In addition, WLM offers interfaces that allow load-balancing components to place work requests on the best-suited system in a Parallel Sysplex cluster. Additional instrumentation helps database and resource managers signal contention situations to WLM, so that WLM can help the delayed work by promoting the holder of resource locks and latches.

Over time, z/OS Workload Manager became the central control component for all performance-related aspects of the z/OS operating system. In a Parallel Sysplex cluster, the z/OS Workload Manager components work together to provide a single-image view for the applications executing on the cluster. On a System z with multiple virtual partitions, z/OS WLM can interoperate with the LPAR Hypervisor to influence the weighting of the z/OS partitions and to control the amount of CPU capacity that the logical partitions can consume.

Literature

  • Paola Bari et al.: System Programmer's Guide to: Workload Management. IBM Redbook, SG24-6472

See also

  • Unit Control Block, for a description of how WLM controls dynamic Parallel Access Volumes (PAVs)
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 