Parallel sysplex

One of the most distinguishing features of the z/OS operating system is the way you can cluster z/OS systems in a Parallel Sysplex. Parallel Sysplex, or Sysplex for short, is a feature of z/OS that was built in the 1990s to enable extreme scalability and availability.

In the previous post we highlighted the z/OS Unix part. Here we will dive into the z/OS Parallel Sysplex.

A cluster of z/OS instances

With Parallel Sysplex you can configure a cluster of z/OS operating system instances. In such a sysplex you can combine the computing power of multiple z/OS instances on multiple mainframe boxes into a single logical z/OS server.

When you run your application on a sysplex, it actually runs on all the instances of the sysplex. If you need more processing power for your applications in a sysplex, you can add CPUs to the instances, but you can also add a new z/OS system to the sysplex.

This makes a z/OS infrastructure extremely scalable. A sysplex also isolates your applications from failures of software and hardware components. If a system or component in a Parallel Sysplex fails, the software will signal this. The failed part will be isolated while your application continues processing on the surviving instances in the sysplex.

Special sysplex components: the Coupling Facility

For a parallel sysplex configuration, a special piece of software is used: a Coupling Facility. This Coupling Facility functions as shared memory and as a communication vehicle for all the z/OS members forming a sysplex.

The z/OS operating system and the middleware can share data in the Coupling Facility. The data that is shared consists of the things that members of a cluster should know about each other, since they are acting on the same data: status information, lock information about resources that are accessed concurrently by the members, and cached copies of shared data from databases.
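
To make the idea of shared lock information a little more concrete, here is a toy model in Python. It is only a sketch: the class `ToyCouplingFacility`, its methods, the member names `SYSA` and `SYSB` and the resource name are all hypothetical, invented for illustration, and do not correspond to any real Coupling Facility interface.

```python
class ToyCouplingFacility:
    """Toy stand-in for shared lock state; not a real z/OS interface."""

    def __init__(self):
        self.locks = {}  # resource name -> member currently holding the lock

    def obtain_lock(self, member, resource):
        """Grant the lock if it is free or already held by this member."""
        holder = self.locks.setdefault(resource, member)
        return holder == member

    def release_lock(self, member, resource):
        """Release the lock only if this member holds it."""
        if self.locks.get(resource) == member:
            del self.locks[resource]


cf = ToyCouplingFacility()
first = cf.obtain_lock("SYSA", "ACCOUNT.ROW.42")   # SYSA gets the lock
second = cf.obtain_lock("SYSB", "ACCOUNT.ROW.42")  # SYSB is refused
cf.release_lock("SYSA", "ACCOUNT.ROW.42")          # SYSA is done
third = cf.obtain_lock("SYSB", "ACCOUNT.ROW.42")   # SYSB can now proceed
print(first, second, third)
```

The point of the sketch is that both members consult one shared place to find out who holds a lock, rather than each keeping a private view.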

A Coupling Facility runs in a dedicated special operating system, in an LPAR of its own, to which even system administrators do not need access. In that sense it is a sort of appliance.

A sysplex with Coupling Facilities is depicted below. There are multiple Coupling Facilities to avoid a single point of failure. The members in the sysplex connect to the Coupling Facilities. I have not included all the required connections in this picture, as that would make the view cluttered.

A parallel sysplex

Middleware exploits the sysplex functions

Middleware components can make use of the sysplex features provided by z/OS, to create clusters of middleware software.

Db2 can be clustered into a so-called Data Sharing Group. In a Data Sharing Group you can create a database that can process queries on multiple Db2 for z/OS instances on multiple z/OS systems.

Similarly, WebSphere MQ can be configured in a Queue Sharing Group, CICS in a CICSPlex, and IMS in an IMSPlex. Other software, such as WebSphere Application Server, IDMS, Adabas and other middleware, uses parallel sysplex functions to build highly available and scalable clusters.

This concept is illustrated in Figure 15. Here you see a cluster setup of CICS and Db2 in a sysplex. CICS and Db2 each form one logical middleware instance.

A parallel sysplex cluster with Db2 and CICS

You can see that the big benefit of parallel sysplex lies in its generic facilities to build scalable and highly available clusters of middleware solutions. You can achieve similar solutions on other operating systems, but there every middleware component needs to supply its own clustering features to achieve such a scalable and highly available configuration. This often requires additional components and leads to more complex solutions.

How is this different from other clustering technologies?

What is unique about a parallel sysplex is that it is a clustering facility that is part of the operating system.

On other platforms you can build clusters of middleware tools as well, but these are always specific solutions and technologies for that piece of middleware. The clustering facilities are part of the middleware. With parallel sysplex, clustering is solved in a central facility, in the z/OS operating system.


An extension to Parallel Sysplex is Geographically Dispersed Parallel Sysplex, GDPS for short. GDPS provides an additional solution to assure your data remains available in case of failures. With GDPS you can make sure that even in the case of a severe hardware failure, or even a whole data centre outage, your data remains available in a secondary data centre, with minimal to no disruption of the applications running on z/OS.

In a GDPS configuration, your data is mirrored between storage systems in the two data centres. One site has the primary storage system; the storage system in the other data centre receives a copy of all updates. If the primary storage system, or even the whole data centre, fails, GDPS automatically makes the secondary storage device the primary, usually without disrupting any running applications.
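
The role swap described above can be pictured with a small toy model in Python. This is purely illustrative: the class `MirroredStorage`, the site names and the `fail_over` method are hypothetical and say nothing about how GDPS or the storage systems actually implement mirroring.

```python
class MirroredStorage:
    """Toy model of two mirrored storage systems; names are hypothetical."""

    def __init__(self):
        self.sites = {"site1": {}, "site2": {}}
        self.primary, self.secondary = "site1", "site2"

    def write(self, key, value):
        """Apply an update on the primary and copy it to the secondary."""
        self.sites[self.primary][key] = value
        self.sites[self.secondary][key] = value

    def read(self, key):
        """Applications always read from whichever site is primary."""
        return self.sites[self.primary][key]

    def fail_over(self):
        """Swap roles, as happens when the primary site is lost."""
        self.primary, self.secondary = self.secondary, self.primary


storage = MirroredStorage()
storage.write("order-1", "shipped")
storage.fail_over()                    # simulate losing the primary site
value_after = storage.read("order-1")  # the data is still available
print(storage.primary, value_after)
```

Because every update was already copied to the other site, the application can keep reading the same data after the swap.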

A very short summary of replication solutions (for Db2)

Some time ago I did a short summary presentation on my experience with replication solutions for Db2 on z/OS. The pictures and text are quite generic, so I thought it might be worthwhile sharing the main topics here. The picture below summarizes the options visually:

Queue replication

Synchronizes tables. The synchronization process on the capture side reads the Db2 transaction log, and puts the updates for which a “subscription” is defined on a queue. On the apply side, the tool retrieves the updates from the queue and applies them to the target database.
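
As a rough illustration of this capture, queue and apply flow, here is a minimal Python sketch. Everything in it is an assumption made for the example: the log record layout, the table names and the `capture` and `apply_changes` functions are hypothetical, and an in-memory `deque` stands in for the message queue.

```python
from collections import deque

# Hypothetical log records: (table, operation, key, row data).
transaction_log = [
    ("ORDERS", "INSERT", 1, {"status": "new"}),
    ("AUDIT", "INSERT", 7, {"note": "not subscribed"}),
    ("ORDERS", "UPDATE", 1, {"status": "paid"}),
]
subscriptions = {"ORDERS"}   # tables that have a subscription
queue = deque()              # stands in for the message queue
target = {"ORDERS": {}}      # stands in for the target database


def capture(log):
    """Read the log and queue only updates to subscribed tables."""
    for record in log:
        if record[0] in subscriptions:
            queue.append(record)


def apply_changes():
    """Take updates off the queue in order and apply them to the target."""
    while queue:
        table, operation, key, row = queue.popleft()
        if operation == "DELETE":
            target[table].pop(key, None)
        else:
            target[table][key] = row


capture(transaction_log)
apply_changes()
print(target)
```

The update to the `AUDIT` table never reaches the queue because no subscription exists for it, which is the filtering role the capture side plays.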

SQL replication

Also synchronizes tables. In this case the capture process stores the updates in an intermediate or staging table, from which the apply process takes the updates and applies them to the target tables.
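
The staging-table idea can be sketched in a few lines of Python. Note the assumptions: SQLite (from the Python standard library) merely stands in for Db2 here, and the table names `staging` and `target_orders` as well as the `capture` and `apply_changes` functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging (seq INTEGER PRIMARY KEY, op TEXT, id INTEGER, status TEXT);
    CREATE TABLE target_orders (id INTEGER PRIMARY KEY, status TEXT);
""")


def capture(op, row_id, status=None):
    """Capture side: record a change in the staging table."""
    conn.execute("INSERT INTO staging (op, id, status) VALUES (?, ?, ?)",
                 (op, row_id, status))


def apply_changes():
    """Apply side: replay staged changes in order, then clear the staging table."""
    rows = conn.execute("SELECT op, id, status FROM staging ORDER BY seq").fetchall()
    for op, row_id, status in rows:
        if op == "DELETE":
            conn.execute("DELETE FROM target_orders WHERE id = ?", (row_id,))
        else:
            conn.execute("INSERT OR REPLACE INTO target_orders VALUES (?, ?)",
                         (row_id, status))
    conn.execute("DELETE FROM staging")
    conn.commit()


capture("INSERT", 1, "new")
capture("UPDATE", 1, "paid")
capture("INSERT", 2, "new")
capture("DELETE", 2)
apply_changes()
result = conn.execute("SELECT id, status FROM target_orders ORDER BY id").fetchall()
staged_left = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
print(result, staged_left)
```

Compared with the queue-based variant, the intermediate store here is an ordinary table, so the apply side reads it with plain SQL.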

Data Event Publishing

Takes the updates to the tables for which a subscription is defined and produces a comma-delimited or XML message from them, which is put on a queue. The consumer of the message can be any user-defined program.
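
To give a feel for the two message styles, the following Python sketch renders one change record as a comma-delimited line and as XML. The record, its field names and the element names are invented for illustration; the actual message layout produced by the product is different and is defined in its documentation.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical change record; the field names are made up for illustration.
change = {"table": "ORDERS", "operation": "UPDATE", "id": "1", "status": "paid"}


def to_delimited(record):
    """Render the change as a single comma-delimited line."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(record.values())
    return buffer.getvalue().rstrip("\r\n")


def to_xml(record):
    """Render the change as a small XML document."""
    root = ET.Element("change")
    for name, value in record.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


delimited_message = to_delimited(change)
xml_message = to_xml(change)
print(delimited_message)
print(xml_message)
```

Either string could then be put on a queue for a consuming program to parse.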

Change Data Capture

CDC provides a flexible solution that can push data updates to multiple forms of target media, whether tables, messages or an ETL tool.


After my short summary, we dug a little into the requirements for the specific problem this team was aiming to address. They needed:

  • A lean operational database for the primary processes.
  • Ad-hoc and reporting queries on archival tables, where data in the archive tables can be kept for several years.
  • Support for a relatively large amount of data: tens to hundreds of millions of database updates per day, with a peak of tens of millions in an hour.
  • Freedom in the target database management system, which was not decided yet; it could be Db2 or Oracle.

So the solution should replicate the operational database to an archive database, and because the archive data must be very current, it demands near-real-time synchronization.
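
To get a feel for what these volumes mean per second, here is a small back-of-the-envelope calculation in Python. The two input numbers are assumed sample values picked from within the ranges mentioned in the requirements, not figures from the actual project.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60

# Assumed sample values within the ranges mentioned in the requirements.
updates_per_day = 100_000_000
peak_updates_per_hour = 30_000_000

average_rate = updates_per_day / SECONDS_PER_DAY
peak_rate = peak_updates_per_hour / SECONDS_PER_HOUR

print(f"average: {average_rate:,.0f} updates/second")
print(f"peak:    {peak_rate:,.0f} updates/second")
```

With these sample values the average is roughly 1,157 updates per second and the peak roughly 8,333 per second, so the replication has to sustain several times the average rate during the peak hour.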

We focused a bit on the Queue Replication solution. The target DBMS for Queue Replication can be Db2, Oracle or SQL Server (and a few more). Furthermore, in my experience this solution can support:

  • High volumes in peak periods: millions of rows inserted or updated in a short period of time.
  • Latency can remain within seconds, even in peak periods. This does require tuning of the solution, such as spreading messages over queues.
  • For selected tables you can specify that deletes are suppressed, which allows historical data to build up.
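
The two tuning ideas mentioned above, spreading messages over queues and suppressing deletes, can be pictured with the following Python sketch. It is a simplification under stated assumptions: the change records, the queue names `QUEUE.A` and `QUEUE.B`, and the static table-to-queue assignment are all hypothetical and are not how the product is actually configured.

```python
# Hypothetical change records: (table, operation, key).
changes = [
    ("ORDERS", "INSERT", 1),
    ("PAYMENTS", "INSERT", 9),
    ("ORDERS", "DELETE", 1),
    ("PAYMENTS", "UPDATE", 9),
]
# Assumed static assignment of tables to queues, to spread the load.
queue_for_table = {"ORDERS": "QUEUE.A", "PAYMENTS": "QUEUE.B"}
# Tables for which deletes are not replicated, so history accumulates.
suppress_deletes = {"ORDERS"}

queues = {"QUEUE.A": [], "QUEUE.B": []}
for table, operation, key in changes:
    if operation == "DELETE" and table in suppress_deletes:
        continue  # the row stays in the archive although the source deleted it
    queues[queue_for_table[table]].append((table, operation, key))

print(queues)
```

The delete on `ORDERS` is dropped, so the archived row survives, while changes to the two tables travel over separate queues.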

There are a few concerns in the Queue Replication solution:

  • Data model changes will require coordination of changes in source, Queue Replication configuration and target data model.
  • Very large transactions (not committing often enough) may be a problem for Queue Replication (and also a very bad programming practice).

Hope this helps with your replication needs.

Db2 SQL in batch

Again a simple solution for a common problem: how to run a Db2 query from a batch job. Here we use DSNTEP2, the sample program that is provided for this purpose with the Db2 installation.

In the STEPLIB, specify the names of your Db2 runtime libraries.

In the SYSTEM (xxxx) clause, specify your Db2 subsystem.

The SQL in the SYSIN DD statement can be supplied in-stream, or from a dataset as below.

 DSN SYSTEM (xxxx)                                       
//SYSIN    DD *