Big Data Disaster Recovery: 4 Reasons Why DB2 Cloning is Excellent

Big Data disaster recovery is a big issue.  Of course, any Big Data business sponsor vows that the entire Big Data database is necessary and demands it be included in the disaster recovery plans.  That requirement makes two things important parts of your disaster recovery planning: establishing a Big Data disaster recovery sync point, and understanding the Big Data transaction processing details well enough to determine the caching requirements.

Thankfully, DB2 Cloning technologies, used mostly in the DB2 for z/OS environment, make Big Data disaster recovery possible.  The following four Big Data disaster recovery cloning techniques can help you set up a solution that takes the pain out of disaster recovery execution.

  1. Within DB2 for z/OS there is the simple yet powerful BACKUP SYSTEM utility.  In combination with DFSMS, BACKUP SYSTEM creates a backup image of the entire DB2 subsystem, all of its system components, and its various application databases.  The utility leverages the work you have already done using standardized storage groups within the DB2 system setup and within your application database definitions.  The storage devices, their frame arrays, standardized storage group definitions, and properly defined DFSMS constructs are all vital components of a Big Data disaster recovery solution.  These storage frames can be mirrored locally or globally to provide a comprehensive, first-class solution capable of near-instantaneous Big Data disaster recovery.
  2. Using proper DFSMS definitions, storage frame designs, and DB2 Cloning facilities, Flash Copying or cloning of storage volumes can also be done quickly.  The DB2 Cloning process facilitates managing your DB2 system and application databases at the broad storage-volume level.  At this level, Flash Copying and cloning of storage volumes can quickly clone a DB2 system, an entire infrastructure, or only a distinct application database for a quick point-in-time backup.  Terabytes of volumes, defined properly within your storage framework, can be cloned quickly to capture a system or database Big Data disaster recovery image.
  3. Another Big Data disaster recovery option is Partial Subsystem Cloning (PSSC).  PSSC is great for Big Data because it provides a way to quickly replicate an entire DB2 system, or only a portion of its databases, into another cloned DB2 subsystem.  Again, using a proper DFSMS framework and standardized storage group definitions, terabytes of system resources, databases, and associated application flat files can be cloned quickly.  PSSC and the previously mentioned methods are especially beneficial because almost any point in time can serve as your Big Data disaster recovery point.  Also, the actual Flash Copy cloning process is very fast, cloning many terabytes with no processing downtime.
  4. A more granular technique for Big Data disaster recovery is database table space (TS) cloning.  TS cloning requires the same DFSMS and storage group discipline, and the backups can be taken with an instantaneous Flash Copy.  The technique is great for the backup process, but if the backup is to be used as input into another QA system, the Flash Copy data needs its internal DB2 identifiers translated to the target DB2 environment.  Depending on the amount of Big Data, this TS translation process can be prohibitively time-consuming.  If your situation requires only a portion of the data, however, this technique works well as a Big Data disaster recovery method.
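For technique 1, the backup itself reduces to a single utility statement.  A minimal sketch of a BACKUP SYSTEM job follows; the job name, subsystem ID (DSN1), and utility ID are placeholder assumptions for illustration:

```jcl
//BACKSYS  JOB ...
//STEP1    EXEC DSNUPROC,SYSTEM=DSN1,UID='BACKSYS'
//SYSIN    DD *
    BACKUP SYSTEM FULL
/*
```

BACKUP SYSTEM FULL copies both the data and log copy pools, while the DATA ONLY option limits the copy to the database copy pool.  At the disaster site, the companion RESTORE SYSTEM utility recovers the subsystem from the system-level backup to your chosen recovery point.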
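The volume-level cloning in technique 2 can be driven through DFSMSdss (program ADRDSSU), which exploits FlashCopy when the storage hardware supports it.  A hedged sketch, with placeholder volume serials SRC001 and TGT001:

```jcl
//CLONE    JOB ...
//STEP1    EXEC PGM=ADRDSSU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  COPY FULL INDYNAM(SRC001) OUTDYNAM(TGT001)  -
       COPYVOLID ALLDATA(*) ALLEXCP           -
       FASTREPLICATION(REQUIRED)
/*
```

FASTREPLICATION(REQUIRED) makes the job fail rather than fall back to a slow traditional copy, which is the behavior you want when cloning terabytes during a live processing window.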

Flash Copy and cloning of data sets, coordinated with your DFSMS storage framework and your DB2 systems, is the state of the art for Big Data disaster recovery scenarios.  Given these four techniques, you can quickly implement the right solution for your DB2 system, sync point, table space, or Big Data disaster recovery requirements.
Dave Beulke is a system strategist, application architect, and performance expert specializing in Big Data, data warehouses, and high performance internet business solutions.  He is an IBM Gold Consultant, Information Champion, President of DAMA-NCR, former President of the International DB2 Users Group, and frequent speaker at national and international conferences.  His architectures, designs, and performance tuning techniques help organizations better leverage their information assets, saving millions in processing costs.

I will present twice at the International DB2 Users Group (IDUG) EMEA conference in Barcelona, Spain, October 13-17, 2013.  The first presentation covers Big Data design considerations, the new BLU technology, Hadoop considerations, UNION ALL views, and Materialized Query Tables.  That session, “Data Warehouse Designs for Big Data,” is Wednesday, October 16th at 9:45 in the Montjuic room.

This presentation details designing, prototyping, and implementing a 22+ billion-row data warehouse in only six months using an agile development methodology.  This complex analytics Big Data warehouse architecture took this federal government agency’s processes from 37 hours down to seconds.  For more information on the conference go to

The second presentation is “Agile Data Performance and Design Techniques,” Tuesday, October 15, 16:30-17:30.  Would you like to have enough time to look at all the DB2 database performance options and design alternatives?  In this presentation you will learn how Agile application development techniques can help you, your architects, and your developers get the optimum database performance and design.

Learn how Agile development techniques can quickly get you to the best database design.  This presentation will take you through the Agile continuous iterations with releases that reflect the strategy of optimum database design and application performance.  Using these techniques you and your developers will be able to uncover the performance considerations early and resolve them without slowing down the Agile time boxing processes and incremental development schedule.  For more information on the conference go to


I will also be presenting at the Information on Demand (IOD) conference in Las Vegas November 3-7, 2013.  I will be presenting “Big Data Disaster Recovery Performance” Wednesday November 6, at 3 pm in the Mandalay Bay North Convention Center – Banyan D.

This presentation will detail the latest techniques and design architectures to provide the best Big Data disaster recovery performance.  The various hardware and software techniques will be discussed highlighting the Flash Copy and replication procedures critical to Big Data systems these days.
