Big Data: Managing Your Management’s Expectations

Big data continues to be all the rage for government agencies and for private and public companies, as new vendors scramble to make their products work and established vendors rebrand last year’s products to handle bigger data than their competitors.  The problem is that new and old vendors alike are still missing the mark on making big data systems reliable, scalable and available.

Every vendor talks about the three Vs of big data (volume, variety and velocity) and about discovering the hidden gems of unknown data or patterns that will make the company millions.  Unfortunately, many of these vendors are making only minor improvements, and their solutions are not truly scalable.  Other new big data vendors are recreating existing products with new marketing slogans instead of breakthrough technology.  The following are three reality check items that you should discuss with your management to bring expectations in line with reality.

Is the big data discussion focused on big data technology or on what big data can bring as a solution?  As a veteran of IT trends over the years, I’ve discovered that the value of every new technology is exaggerated at first, but the technology truly holds value if implemented properly.  Past trends proved that there is real value, real IT savings, and optimization available through client/server, Internet e-commerce, and data warehousing, and the same will be true of the big data solutions now dominating the conversation.  The companies that succeeded with those past trends were not the ones that simply implemented trendy technologies, but the ones that thought about what the client/server distributed model, the Internet, and data warehousing business analytics could truly do for the company.  Ask the same question about what big data can do to bring value.  The big data trend is not about the new technology or vendor.  Is the new DBMS making a difference for your processing?  Or can your existing systems also provide scalable, reliable, massively parallel processing?  Ask yourself for how many of the trendy vendors, with their proprietary application interfaces, you will be able to find skills or support in five years.  How many of the vendors from those previous IT waves were still around after five years?  It will be the same with all these new big data vendors.  Using existing infrastructure might make more sense for your big data proof of concept.

Are the solution discussions based on competitors, past big data IT experiences, or big data case studies from your business sector?  As I mentioned a couple of weeks ago in this blog, Jeff Jonas talked about his experiences dealing with fantasy ideas for big data systems.  Just because big data is rumored to contain that missing piece of data doesn’t mean the data truly exists.  Research all the different types of big data available to you to make sure that what management wants is even possible.

Big data systems are being built around all types of structured and unstructured data: pictures, videos, emails, web logs and other non-conventional data are all inputs.  With both traditional and non-conventional big data, the standard cleansing, validation and integrity checks are still necessary to achieve reliable results.  Define use cases for these critical preprocessing activities, and then a post-processing use case that states the expected profit or value, so that everyone understands the goals for your big data processing.  Hearing everyone’s profit ideas for the big data system makes for a very interesting and diverse discussion.
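To make the cleansing, validation and integrity checks concrete, here is a minimal Python sketch of a preprocessing pass.  It is only an illustration under stated assumptions: the record layout (event_id, customer_id, email), the preprocess function and the CheckResult class are hypothetical names invented for this example, and the rules themselves would come from your own use cases.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    clean: list      # records that passed every check
    rejected: list   # (record, reason) pairs for later review


def preprocess(records, known_customer_ids):
    """Cleanse, validate and integrity-check raw records (hypothetical layout)."""
    clean, rejected = [], []
    seen_ids = set()
    for rec in records:
        # Cleansing: trim whitespace and normalize case on the text field.
        email = str(rec.get("email", "")).strip().lower()
        event_id = rec.get("event_id")
        customer_id = rec.get("customer_id")

        # Validation: required fields are present and plausibly formed.
        if event_id is None:
            rejected.append((rec, "missing event_id"))
            continue
        if "@" not in email:
            rejected.append((rec, "invalid email"))
            continue

        # Integrity: keys are unique and every customer reference resolves.
        if event_id in seen_ids:
            rejected.append((rec, "duplicate event_id"))
            continue
        if customer_id not in known_customer_ids:
            rejected.append((rec, "unknown customer_id"))
            continue

        seen_ids.add(event_id)
        clean.append({**rec, "email": email})
    return CheckResult(clean, rejected)
```

Keeping the rejected records together with a reason, rather than silently dropping them, gives the business side something to review when the results look thinner than expected.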

So when a new DBMS or other technology is mentioned for your big data solution, understand that there are additional items that need discussion before the purchase order is cut.


I look forward to supporting the DB2 community through the local DB2 User Groups.

I am coming to Dallas and Austin, Texas, on October 10th and 11th and look forward to presenting my “Agile Big Data Analytics: Implementing a 22 Billion Row Data Warehouse” and “Java DB2 Developer Performance Best Practices” presentations.  Check the website for Dallas for more information.  The Austin one should be updated soon.

I will be talking more about Big Data, UNION ALL Views and Materialized Query Tables at the Information on Demand (IOD) conference in Las Vegas, October 21st through 25th.  My presentation, “Agile Big Data Analytics: Implementing a 22 Billion Row Data Warehouse,” is on Monday, October 22, 10:15 – 11:15 am in the Mandalay Bay North Convention Center – Islander C.  It details designing, prototyping and implementing a 22+ billion row data warehouse in only six months using an agile development methodology.  This complex big data analytics warehouse architecture cut processes for this federal government agency from 37 hours to seconds.

I also look forward to supporting the International DB2 Users Group (IDUG) conference in Berlin, Germany, November 5th-9th, with two topics, “Data Warehouse Designs for Performance” and “Java DB2 Developer Performance Best Practices,” on Tuesday, November 6th.

On December 4th, 5th and 6th I will be presenting at the Minneapolis, Milwaukee and Chicago DB2 User Groups.

Please come by any of these presentations and say, “Hi.”


Dave Beulke is an internationally recognized DB2 consultant, DB2 trainer and education instructor.  Dave helps his clients improve their strategic direction, dramatically improve DB2 performance and reduce their CPU demand, saving millions in their systems, databases and application areas within their mainframe, UNIX and Windows environments.
