Big Data Analytics: Union ALL View – Uniqueness through INSTEAD Triggers

The discussion continues about the building of a petabyte table by putting multiple tables under the definition of a UNION ALL View.  This provides unlimited growth potential for the logical big data table to all types of regular and analytical users.  Sometime you want to have uniqueness across the entire petabyte table and this can be an issue if you don’t know about INSTEAD Triggers within DB2.

In the example at the end of this post we have the UNION ALL View definition with check constraints on each of the individual yearly tables included in the UNION ALL View.  These check constraints make sure the data matches the year of the table name.  This eliminates the issue of making sure the correct data gets into the right tables.  This also helps parallelism because we can set up multiple partitions per day for faster access, utilities and reporting improving overall performance tremendously for all areas of the application.

One of the issues with setting up a UNION ALL View petabyte table is the ability to keep the uniqueness and relationships across the database.  When you have all these yearly individual tables sometimes the data uniqueness must be guaranteed across all the underlying tables.  When this is the case INSTEAD OF INSERT and UPDATE triggers can be used to verify uniqueness within all the tables within UNION ALL View definition.

In our example if we wanted to guarantee that the PROD_ID data was unique across the many years we could implement the following INSTEAD OF INSERT trigger.  The INSTEAD OF INSERT trigger will execute before the data is put into the table and test the values to make sure it is unique across all the underlying UNION ALL View tables.  So in our example we are guaranteeing that the PROD_ID is unique within both of the underlying tables.

– -#SET TERMINATOR ~
CREATE TRIGGER
PROD.I_MST_PROD_MST_ALL_1
INSTEAD OF INSERT
ON PROD.MST_PROD_MST_ALL
REFERENCING
NEW AS INPUT
FOR EACH ROW
MODE DB2SQL
BEGIN ATOMIC
INSERT INTO PROD.MST_PROD_MST_CV
SELECT
INPUT.PROD_ID,
INPUT.DEPT_KEY,
INPUT.METER_NBR,
INPUT.METER_VALU,
INPUT.PRS_VALU,
INPUT.DGE_VALU,
INPUT.SFT_VALU,
INPUT.CRT_TS
FROM SYSIBM.SYSDUMMY1 A
WHERE NOT EXISTS
(
SELECT 1
FROM PROD.MST_PROD_MST_V2012 A2012
WHERE A2012.PROD_ID   = INPUT.PROD_ID
)

; END
~
– -#SET TERMINATOR ;

Another consideration with working with UNION ALL Views is that when inserting through the UNION ALL View the new row to be inserted must only map to one of the base tables.  By defining the CHECK CONSTRAINTs on primary key column CRT_TS, the timestamp column guides the insert data to the appropriate single base table.

ALTER TABLE PROD.MST_PROD_MST_2012
ADD CONSTRAINT
RANGKEY1 CHECK ( CRT_TS >= ‘2012-01-01-00.00.00.000000’
AND   CRT_TS < ‘2013-01-01-00.00.00.000000’);

ALTER TABLE PROD.MST_PROD_MST_2013
ADD CONSTRAINT
RANGKEY1 CHECK ( CRT_TS >= ‘2013-01-01-00.00.00.000000’
AND   CRT_TS < ‘2014-01-01-00.00.00.000000’);

UNION ALL Views and INSTEAD OF INSERT Triggers can provide many flexible options for designing petabyte data warehouse and operational tables on both DB2 for z/OS and DB2 for LUW.  Through these UNION ALL Views it allows the database architect a full range of design options to grow a table to an unlimited big data size and partition, segregate and spread the data through as many base tables as the application or architect needs.

DROP  TABLESPACE PSIDB088.MSTTS001;
DROP  TABLESPACE PROD.MSTTS001;
COMMIT;
—–
CREATE  TABLESPACE MSTTS001
IN PSIDB088
USING STOGROUP PSISG
PRIQTY 1440
SECQTY 720
FREEPAGE 30
PCTFREE 19
SEGSIZE 64
BUFFERPOOL BP1
CCSID EBCDIC
COMPRESS YES
LOCKMAX SYSTEM
LOCKSIZE ANY
MAXROWS 255;

CREATE TABLE PROD.MST_PROD_MST_2012 (
PROD_ID   INTEGER NOT NULL WITH DEFAULT 0,
DEPT_KEY SMALLINT NOT NULL WITH DEFAULT 0,
METER_NBR INTEGER NOT NULL WITH DEFAULT 0,
METER_VALU INTEGER NOT NULL WITH DEFAULT 0,
PRS_VALU  CHAR(1) NOT NULL WITH DEFAULT ‘*’,
DGE_VALU  CHAR(1) NOT NULL WITH DEFAULT ‘*’,
SFT_VALU  CHAR(1) NOT NULL WITH DEFAULT ‘*’,
CRT_TS    TIMESTAMP NOT NULL
WITH DEFAULT ‘1900-01-01 01:01:01.000000’,
CONSTRAINT P_KEY_1 PRIMARY KEY (CRT_TS))
IN PROD.MSTTS001;

CREATE UNIQUE INDEX PROD.X2012001
ON  PROD.MST_PROD_MST_2012 (
CRT_TS ASC
)
CLUSTER
USING STOGROUP PSISG
PRIQTY 720
SECQTY 360
ERASE NO
FREEPAGE 0
PCTFREE 10
GBPCACHE CHANGED
BUFFERPOOL BP1
CLOSE YES
;

CREATE UNIQUE INDEX PROD.X2012002
ON  PROD.MST_PROD_MST_2012 (
PROD_ID ASC
)
USING STOGROUP PSISG
PRIQTY 720
SECQTY 360
ERASE NO
FREEPAGE 0
PCTFREE 10
GBPCACHE CHANGED
BUFFERPOOL BP1
CLOSE YES
;

CREATE TABLE PROD.MST_PROD_MST_2013 (
PROD_ID   INTEGER NOT NULL WITH DEFAULT 0,
DEPT_KEY SMALLINT NOT NULL WITH DEFAULT 0,
METER_NBR INTEGER NOT NULL WITH DEFAULT 0,
METER_VALU INTEGER NOT NULL WITH DEFAULT 0,

PRS_VALU  CHAR(1) NOT NULL WITH DEFAULT ‘*’,
DGE_VALU  CHAR(1) NOT NULL WITH DEFAULT ‘*’,
SFT_VALU  CHAR(1) NOT NULL WITH DEFAULT ‘*’,
CRT_TS    TIMESTAMP NOT NULL
WITH DEFAULT ‘1900-01-01 01:01:01.000000’,
CONSTRAINT P_KEY_1 PRIMARY KEY (CRT_TS))
IN PROD.MSTTS001;

CREATE UNIQUE INDEX PROD.X2013001
ON  PROD.MST_PROD_MST_2013 (
CRT_TS ASC
)
CLUSTER
USING STOGROUP PSISG
PRIQTY 720
SECQTY 360
ERASE NO
FREEPAGE 0
PCTFREE 10
GBPCACHE CHANGED
BUFFERPOOL BP1
CLOSE YES
;

CREATE UNIQUE INDEX PROD.X2013002
ON  PROD.MST_PROD_MST_2013 (
PROD_ID ASC
)
USING STOGROUP PSISG
PRIQTY 720
SECQTY 360
ERASE NO
FREEPAGE 0
PCTFREE 10
GBPCACHE CHANGED
BUFFERPOOL BP1
CLOSE YES
;

ALTER TABLE PROD.MST_PROD_MST_2012
ADD CONSTRAINT
RANGKEY1 CHECK ( CRT_TS >= ‘2012-01-01-00.00.00.000000’
AND   CRT_TS < ‘2013-01-01-00.00.00.000000′) ;

ALTER TABLE PROD.MST_PROD_MST_2013
ADD CONSTRAINT
RANGKEY9 CHECK ( CRT_TS >=’2013-01-01-00.00.00.000000’
AND   CRT_TS < ‘2014-01-01-00.00.00.000000’);

COMMIT;

CREATE VIEW PROD.MST_PROD_MST_ALL AS (
SELECT
PROD_ID,
DEPT_KEY,
METER_NBR,
METER_VALU,
PRS_VALU,
DGE_VALU,
SFT_VALU,
CRT_TS
FROM PROD.MST_PROD_MST_2012
UNION ALL
SELECT
PROD_ID,
DEPT_KEY,
METER_NBR,
METER_VALU,
PRS_VALU,
DGE_VALU,
SFT_VALU,
CRT_TS
FROM PROD.MST_PROD_MST_2013);
COMMIT;

CREATE VIEW PROD.MST_PROD_MST_V2012 AS (
SELECT
PROD_ID,
DEPT_KEY,
METER_NBR,
METER_VALU,
PRS_VALU,
DGE_VALU,
SFT_VALU,
CRT_TS
FROM PROD.MST_PROD_MST_2012);
COMMIT;

CREATE VIEW PROD.MST_PROD_MST_CV  AS (
SELECT
PROD_ID,
DEPT_KEY,
METER_NBR,
METER_VALU,
PRS_VALU,
DGE_VALU,
SFT_VALU,
CRT_TS
FROM PROD.MST_PROD_MST_2013);
COMMIT;

– -#SET TERMINATOR ~
CREATE TRIGGER
PROD.I_MST_PROD_MST_ALL_1
INSTEAD OF INSERT
ON PROD.MST_PROD_MST_ALL
REFERENCING
NEW AS INPUT
FOR EACH ROW
MODE DB2SQL
BEGIN ATOMIC
INSERT INTO PROD.MST_PROD_MST_CV
SELECT
INPUT.PROD_ID,
INPUT.DEPT_KEY,
INPUT.METER_NBR,
INPUT.METER_VALU,
INPUT.PRS_VALU,
INPUT.DGE_VALU,
INPUT.SFT_VALU,
INPUT.CRT_TS
FROM SYSIBM.SYSDUMMY1 A
WHERE NOT EXISTS
(
SELECT 1
FROM PROD.MST_PROD_MST_V2012 A2012
WHERE A2012.PROD_ID   = INPUT.PROD_ID
)

; END
~
– -#SET TERMINATOR ;

COMMIT;

SELECT *
FROM PROD.MST_PROD_MST_ALL;

INSERT INTO PROD.MST_PROD_MST_2012
VALUES (
002,  –INTEGER
002,  —SMALLINT
002,  —INTEGER
002,  —INTEGER
‘B’,  —CHAR
‘B’,  —CHAR
‘B’,  —CHAR
‘2012-02-02-22.00.00.000000’  — TS
);
COMMIT;

SELECT *
FROM PROD.MST_PROD_MST_ALL;

COMMIT;

INSERT INTO PROD.MST_PROD_MST_CV
VALUES (
099,  —INTEGER
009,  —SMALLINT
009,  —INTEGER
009,  —INTEGER
‘Z’,  —CHAR
‘Z’,  —CHAR
‘Z’,  —CHAR
‘2013-12-31-11.00.00.000000’  — TS
);

COMMIT;

SELECT *
FROM PROD.MST_PROD_MST_ALL;

INSERT INTO PROD.MST_PROD_MST_ALL
VALUES (
002,  –INTEGER
004,  —SMALLINT
004,  —INTEGER
004,  —INTEGER
‘D’,  —CHAR
‘D’,  —CHAR
‘D’,  —CHAR
‘2013-04-04-11.00.00.000000’  — TS
);

COMMIT;

SELECT *
FROM PROD.MST_PROD_MST_ALL;

_______________________________________________________

I look forward to supporting the DB2 community through the local DB2 User Groups.  I will be at the St. Louis DB2 Users Group September 11 presenting my “Performance Discoveries and Recommendation” speech detailing the many system, database, application and SQL performance issues and discoveries I have found at clients over the many years of DB2 zOS and DB2 LUW performance health checks.

Also I am finalizing coming to Dallas and Austin, Texas October 10th and 11th and look forward to presenting my “Agile Big Data Analytics: Implementing a 22 Billion Row Data Warehouse” and “Java DB2 Developer Performance Best Practices” speeches.

I will be talking more about Big Data, UNION ALL Views and Materialized Query Tables during my presentation at the Information on Demand (IOD) conference in Las Vegas October 21st through 25th through my speech “Agile Big Data Analytics: Implementing a 22 Billion Row Data Warehouse” Monday, October 22, 10:15 – 11:15 am in the Mandalay Bay North Convention Center – Islander C.  This presentation details the designing, prototyping and implementing a 22+ billion row data warehouse in only six months using an agile development methodology.  This complex analytics big data warehouse architecture took processes for this federal government agency from 37 hours to seconds.

Also I look forward to supporting the International DB2 Users Group (IDUG) conference in Berlin, Germany November 5th-9th with two topics “Data Warehouse Designs for Performance” and “Java DB2 Developer Performance Best Practices” on Tuesday November 6th.

Please come by any of these presentations and say, “Hi.”

______________________________________________________

Dave Beulke is an internationally recognized DB2 consultant, DB2 trainer and education instructor.  Dave helps his clients improve their strategic direction, dramatically improve DB2 performance and reduce their CPU demand saving millions in their systems, databases and application areas within their mainframe, UNIX and Windows environments.

Leave a Reply

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>