Big Data: 3 Criteria for Invaluable Analytics

In last week’s blog post on company and government analysis of our digital footprints, I mentioned several aspects of the Big Data digital horizon. The last point in that post, about the other data aggregators, is highlighted in the preview of the movie “Terms and Conditions May Apply,” which details these data harvesting activities and privacy issues. (You can see a preview by clicking here.) A comment in the preview notes that each person can potentially have over 1,500 data points gathered by various organizations.

While it may be a concern for the individual that a big bad company or government may have 1,500 data points about him or her in their system, the company has a different issue.  There are many ways that Big Data can be misunderstood.

When starting any Big Data project, the following three criteria are critical to developing analytics that the business finds invaluable.

First, discover, understand, and evaluate any existing data points and analytics. There is never time to do things right, but always time to do them over. As old data warehouses get repurposed or expanded into Big Data projects, take the time to understand the context, value, and existing return on investment of these earlier data points and their associated analytics.

Even if there isn’t an IT data warehouse, there is usually a process or methodology within marketing, sales, or customer service that uses analytics. Research the company’s existing analytical processes, data points, and formulas to find the veteran company data steward with deep industry expertise who understands the context of the analytical data points.

Expanding into a new Big Data system can magnify the business’s analytical processes and bring greater success than was previously possible. Alternatively, the expansion can highlight problems in existing processes. Hopefully, there is documentation about the existing data points and analytics being used. Find and augment any documentation on the existing data points, understand their context and value, and learn why existing decisions have worked so they can be expanded and applied properly within your new Big Data efforts.

Next, expand the data points. Big Data is about using additional new data sources. The aggregation of standard data is supplemented with structured, unstructured, or unconventional data sources, from social data to tweets to machine data. Gather and analyze more data points so that you can understand and leverage your business’s Big Data situation more easily and quickly.

If possible, segment, partition, and split your existing data points into finer-grained data points. This lets you test the validity of more of your assumptions and discover a more precise tipping point for your Big Data business processes and analytical activities.
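
As a rough illustration of what a finer grain can reveal, here is a minimal Python sketch. The order records, the region and channel fields, and the `total_by` helper are all made up for this example; they are not from any particular system. The same data is totaled once per region and then again per region and sales channel.

```python
from collections import defaultdict

# Hypothetical order records: (region, channel, order_amount).
ORDERS = [
    ("East", "web", 120.0),
    ("East", "web", 80.0),
    ("East", "store", 300.0),
    ("West", "web", 50.0),
    ("West", "store", 150.0),
    ("West", "store", 250.0),
]


def total_by(records, key_func):
    """Sum order amounts, grouped by whatever grain key_func returns."""
    totals = defaultdict(float)
    for record in records:
        totals[key_func(record)] += record[2]
    return dict(totals)


# Coarse grain: one data point per region.
by_region = total_by(ORDERS, lambda r: r[0])

# Finer grain: the same data split by region and channel.
by_region_channel = total_by(ORDERS, lambda r: (r[0], r[1]))

print(by_region)
print(by_region_channel)
```

In this made-up data the two regions look similar at the coarse grain, while the finer grain shows that West depends far more heavily on store sales than East does. That is the kind of assumption the split lets you check.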

Finally, weigh, analyze, and refine your data point valuations. Within all of your analysis, certain data points are more important than others. If possible, develop a Big Data relevance or importance rating to weight each of your data points. By evaluating and scoring each data point, your Big Data system can make more detailed and reliable decisions. These weights can also build business confidence and provide sound return on investment criteria for bringing more data points into your Big Data system.
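
One simple way to picture such a rating is a weighted average. The Python sketch below is only an assumption about how a relevance rating might be applied: the data point names, the weight values, and the `weighted_score` function are hypothetical, and each signal is assumed to be already normalized to a 0-to-1 scale.

```python
# Hypothetical relevance weights for three data points (higher = more important).
WEIGHTS = {
    "purchase_history": 0.5,
    "support_tickets": 0.3,
    "social_mentions": 0.2,
}


def weighted_score(signals, weights):
    """Weighted average of the signals that have a known weight.

    Signals without a weight are ignored, and the result is divided by the
    total weight actually used, so a missing data point does not drag the
    score toward zero.
    """
    used = {name: value for name, value in signals.items() if name in weights}
    total_weight = sum(weights[name] for name in used)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(value * weights[name] for name, value in used.items())
    return weighted_sum / total_weight


# Assumed normalized signals (0.0 to 1.0) for one customer.
customer = {
    "purchase_history": 0.9,
    "support_tickets": 0.4,
    "social_mentions": 0.1,
}

score = weighted_score(customer, WEIGHTS)
print(round(score, 2))
```

Keeping the weights in one table, separate from the scoring logic, makes it easier to review and refine them as you learn which data points actually drive good decisions.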

Big Data-driven analytics are only as good as your understanding of the context, weighting, and grain of your data points and of the analytic calculations performed. Build on your existing knowledge processes, refine and expand your data points, and understand the weight or impact of each data point to implement the best Big Data system and analytics possible.


Dave Beulke is a system strategist, application architect, and performance expert specializing in Big Data, data warehouses, and high performance internet business solutions. He is an IBM Gold Consultant, Information Champion, President of DAMA-NCR, former President of International DB2 User Group, and frequent speaker at national and international conferences. His architectures, designs, and performance tuning techniques help organizations better leverage their information assets, saving millions in processing costs.
