Some time ago I did a short summary presentation on my experience with replication solutions for Db2 on z/OS. The pictures and text are quite generic, so I thought it might be worthwhile sharing the main topics here. The picture below summarizes the options visually:
Queue Replication
Synchronizes tables. The capture process reads the Db2 transaction log and puts the updates for which a "subscription" is defined on a queue. On the apply side, the tool retrieves the updates from the queue and applies them to the target database.
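The capture/apply flow can be illustrated with a minimal sketch. This is purely conceptual: the real product reads the Db2 log and uses message queues between separate processes, whereas here an in-process queue, the subscription set and the record layout are all invented for illustration.

```python
import queue

# Hypothetical subscription definition: only changes to these tables replicate.
SUBSCRIBED_TABLES = {"ORDERS"}

def capture(log_records, q):
    """Capture side: put only updates for subscribed tables on the queue."""
    for rec in log_records:
        if rec["table"] in SUBSCRIBED_TABLES:
            q.put(rec)

def apply_changes(q, target):
    """Apply side: drain the queue and apply updates to the target (a dict here)."""
    while not q.empty():
        rec = q.get()
        target.setdefault(rec["table"], {})[rec["key"]] = rec["row"]

# Stand-in for a stretch of transaction log: one subscribed, one ignored update.
log = [
    {"table": "ORDERS",  "key": 1, "row": ("widget", 3)},
    {"table": "SCRATCH", "key": 9, "row": ("ignored", 0)},  # no subscription
]
q = queue.Queue()
target = {}
capture(log, q)
apply_changes(q, target)
print(target)  # only the subscribed ORDERS update reaches the target
```

The point of the subscription filter is that only the tables you ask for travel over the queue; everything else in the log is skipped on the capture side.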
SQL Replication
Also synchronizes tables. In this case the capture process stores the updates in an intermediate (staging) table, from which the apply process picks them up and applies them to the target tables.
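The staging-table variant can be sketched the same way. The snippet below uses SQLite as a stand-in database; the table names, column layout, and clean-up step are all invented for illustration and do not reflect the product's actual staging-table format.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Illustrative schemas: a staging table fed by capture, and a target table.
con.execute("CREATE TABLE staging (tab TEXT, pk INTEGER, qty INTEGER)")
con.execute("CREATE TABLE target_orders (pk INTEGER PRIMARY KEY, qty INTEGER)")

# Capture side: changes land in the staging table.
con.execute("INSERT INTO staging VALUES ('ORDERS', 1, 3)")
con.execute("INSERT INTO staging VALUES ('ORDERS', 2, 5)")

# Apply side: read the staged updates, apply them, then clear the staged rows.
staged = con.execute("SELECT tab, pk, qty FROM staging").fetchall()
for tab, pk, qty in staged:
    con.execute(
        "INSERT INTO target_orders (pk, qty) VALUES (?, ?) "
        "ON CONFLICT(pk) DO UPDATE SET qty = excluded.qty",  # upsert semantics
        (pk, qty))
con.execute("DELETE FROM staging")
con.commit()
print(con.execute("SELECT pk, qty FROM target_orders ORDER BY pk").fetchall())
```

Because the staging table is an ordinary database table, the apply process can run on its own schedule and simply query for work, which is the main structural difference from the queue-based option.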
Data Event Publishing
Takes the updates to the tables for which a subscription is defined and produces a comma-delimited or XML message, which is put on a queue. The consumer of the message can be any user-defined program.
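Since the consumer can be any user-defined program, a sketch of one is easy. The message layout below (operation code, table name, then column values) is invented for illustration; the product's actual delimited and XML formats differ.

```python
import csv
import io

# Hypothetical comma-delimited change message: one change per line,
# laid out as operation, table, column values (layout is illustrative).
message = "ISRT,ORDERS,1001,widget,3\nUPDT,ORDERS,1002,gadget,7\n"

def consume(msg):
    """Parse each delimited line into a small change record."""
    changes = []
    for op, table, *cols in csv.reader(io.StringIO(msg)):
        changes.append({"op": op, "table": table, "columns": cols})
    return changes

for change in consume(message):
    print(change["op"], change["table"], change["columns"])
```

In practice the consumer would pull such messages off the queue and do whatever the application needs: feed a warehouse, trigger downstream processing, or audit changes.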
Change Data Capture
CDC provides a flexible solution that can push data updates to multiple forms of target media, whether tables, messages or an ETL tool.
After my short summary, we dug a little into the requirements for the specific problem this team was aiming to address. They needed:
- A lean operational database for the primary processes.
- Ad-hoc and reporting queries on archival tables, where data in the archive tables can be kept for several years.
- Relatively large data volumes: support for tens to hundreds of millions of database updates per day, with peaks of tens of millions in an hour.
- The target database management system was not decided yet; it could be Db2 or Oracle.
So the solution should replicate the operational database to an archive database, and the data must be very current, demanding near-real-time synchronization.
We focused a bit on the Queue Replication solution. The target DBMS for Queue Replication can be Db2, Oracle or SQL Server (and a few more). Furthermore, in my experience this solution can support:
- High volumes in peak periods: millions of rows inserted/updated in a short period of time.
- Latency can remain within seconds, even in peak periods – this does require tuning of the solution, such as spreading messages over queues.
- For selected tables you can specify "suppress deletes", which allows for building up historical data.
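The "spreading messages over queues" tuning point can be sketched as a routing function. The queue-naming convention, the number of queues, and the hashing scheme below are all assumptions for illustration; the actual configuration is done in the replication tooling, not in application code.

```python
import zlib

NUM_QUEUES = 4  # illustrative; the real number comes out of tuning

def route(table_name, key):
    """Stable hash of table and key to one of several send queues.

    Changes for the same key always land on the same queue, so their
    relative order is preserved while load spreads across queues.
    """
    h = zlib.crc32(f"{table_name}:{key}".encode())  # stable across runs
    return f"REPL.QUEUE.{h % NUM_QUEUES + 1}"

print(route("ORDERS", 1001))
```

The design point is the trade-off: more queues means more parallel apply throughput in peak periods, but ordering is only guaranteed within a queue, so related changes must hash to the same one.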
There are a few concerns in the Queue Replication solution:
- Data model changes will require coordinated changes to the source data model, the Queue Replication configuration and the target data model.
- Very large transactions (not committing often enough) may be a problem for Queue Replication (and also a very bad programming practice).
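The large-transaction concern has a standard remedy: commit in batches on the source side so replication never has to buffer one enormous unit of work. The sketch below uses SQLite and an invented table and batch size purely to illustrate the pattern.

```python
import sqlite3

def load_in_batches(con, rows, batch_size=1000):
    """Insert rows with an intermediate commit every batch_size rows,
    so no single transaction grows unboundedly in the log."""
    for i, (pk, qty) in enumerate(rows, start=1):
        con.execute("INSERT INTO orders (pk, qty) VALUES (?, ?)", (pk, qty))
        if i % batch_size == 0:
            con.commit()  # keeps each logged transaction small
    con.commit()          # commit the final partial batch

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (pk INTEGER PRIMARY KEY, qty INTEGER)")
load_in_batches(con, ((i, i * 2) for i in range(1, 2501)), batch_size=1000)
print(con.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 2500
```

Choosing the batch size is a balance: frequent commits keep the replication pipeline (and restart/undo work) small, while overly frequent commits add their own overhead.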
Hope this helps with your replication needs.