Big Data: Leverage the New Fantasy Football Data Model?

Big Data analytics is all the fashion these days, but there are many issues with how analytics are used, and developing the appropriate analytics takes a correct and thorough data model. As I discussed in my previous Big Lies, Big Damn Lies and Statistics blog post, there are many ways analytics can be distorted, misused or interpreted incorrectly.

Recently there have been a lot of issues with Big Data systems and their analytics. Google, whose MapReduce and Google File System papers inspired Hadoop, had a worldwide system outage. The 12 predictive analytics screw-ups article in ComputerWorld highlights even more analytics issues in numerous situations. The article is wonderful because it reiterates a number of the concepts and ideas about Big Data analytics that I have recently blogged about.

One of the most interesting articles I read was about the 11 points that Nate Silver made recently at the Joint Statistical Meetings. Mr. Silver’s comments and analysis are always good because they give us all a reality check and help put our Big Data analytics efforts into the proper perspective. Also, check out the Twitter comments at #jsm2013. The analytics, theorems, and counterfactuals discussed at the JSM conference were filled with complex analytical formulas, visualization theories, and graphic display controversies, on topics ranging from the scientific to the comical, such as “How Many Licks to the Tootsie Roll Center of a Tootsie Pop.”

Big Data analytics is also being applied to everything from Bill James’ baseball sabermetrics, or the “Moneyball” movement, to the new NFL fantasy football data analytics model detailed here. I am using that NFL data model as a starting point for my own NFL data model to help me win my pool. So remember to factor in all the injury reports when setting your team for the coming games, and realize that big lies, big damn lies and statistics can sometimes lead to errors, whether you are counting Tootsie Roll licks or making fantasy football picks, and that you can still lose to the novice who wins by picking the blue team over the green team.

____________________________________________________
Dave Beulke is a system strategist, application architect, and performance expert specializing in Big Data, data warehouses, and high performance internet business solutions. He is an IBM Gold Consultant, Information Champion, President of DAMA-NCR, former President of the International DB2 Users Group, and frequent speaker at national and international conferences. His architectures, designs, and performance tuning techniques help organizations better leverage their information assets, saving millions in processing costs.

________________________________________________________

As President of the Washington, DC DAMA National Capital Region chapter, I also want to let you know about the great speakers and topics that we have for our September 19th DAMA Day. Register today!

Gartner

The Roadmap for Successful Big Data Adoption

Svetlana Sicular – Research Director – Data Management Strategies

Companies pursuing big data solutions are unsure what to expect and when. They want to know where they are and how to move forward. The common big data adoption patterns are becoming apparent. This presentation outlines a roadmap with typical stages and milestones of a big data journey, from initiation to a data-driven enterprise.  The roadmap will help you to develop a successful big data strategy. You will avoid common mistakes caused by big data myths. The actions necessary to advance along the stages will guide you on the way to information centricity where big data will become the new normal.

A Holistic Approach to Big Data

Raul F. Chong – Senior Big Data and Cloud Program Manager – IBM Cloud Computing Center of Competence and Evangelism

By now, everyone in the IT industry has probably heard about Big Data, and why tackling the big data problem is important. While there is no standard definition of Big Data, most people equate it with Hadoop technology. In this presentation we take a holistic approach and describe not just Hadoop-related data as big data, but also data in motion (real-time analytics) and data in place (data exploration), and explain how these different kinds of big data are related and used. We also describe the most common big data use cases we’ve been hearing from customers working on Big Data problems in recent years and provide demos of the technology that can be used to tackle them. The presentation is geared towards both management and technical audiences.

Big Data Within the Large Enterprise; Navigating Implementation and Governance

John Adler – Data Management Group

Madina Kassengaliyeva – Think Big Analytics

Companies are transforming their businesses by leveraging the power of Big Data. At the same time, organizations have to operate within their existing business and meet operational and environmental demands and constraints, which are frequently dictated by IT and Data Governance bodies. The transformation beyond traditional business intelligence to a truly data-driven organization requires a Big Data roadmap (tools, processes, and training to build deep data science & engineering expertise) that is aligned with existing organizational structures and enables new capabilities. So how do large, policy-driven organizations adjust to capture the value of Big Data?

In this presentation we will discuss governance as part of a larger Big Data roadmap and use case studies to walk through a number of Big Data projects within large companies. We will discuss how they navigated existing oversight processes and structures to enable the organization to innovate on top of Big Data platforms while meeting its governance requirements. We will review key decisions and strategies: whether to go organizationally wide or narrow, how to align with current and future projects, how to develop stewardship and metrics, and how to align with organizational priorities and culture.

Vendor Presentation

Embarcadero Technologies, Inc. is a leading provider of award-winning tools for application developers and database professionals so they can design systems right, build them faster and run them better, regardless of their platform or programming language. Ninety of the Fortune 100 and an active community of more than three million users worldwide rely on Embarcadero products to increase productivity, reduce costs, simplify change management and compliance, and accelerate innovation. Founded in 1993, Embarcadero is headquartered in San Francisco, with offices located around the world. Flagship database suites include ER/Studio and DB PowerStudio.

For more information please visit www.embarcadero.com.

________________________________________________________

I will also be talking more about Big Data design considerations, the new BLU technology, Hadoop considerations, UNION ALL views and Materialized Query Tables during my presentation at the International DB2 Users Group (IDUG) EMEA conference in Barcelona, Spain, October 13-17, 2013. My presentation, “Data Warehouse Designs for Big Data,” is Wednesday, October 16th at 9:45 in the Montjuic room.

This presentation details the design, prototyping, and implementation of a 22+ billion row data warehouse in only six months using an agile development methodology. This complex analytics big data warehouse architecture cut processing times for a federal government agency from 37 hours to seconds. For more information on the conference go to www.idug.org.

__________________________________________________________

I will also be presenting at the Information on Demand (IOD) conference in Las Vegas, November 3-7, 2013. I will be presenting “Big Data Disaster Recovery Performance” on Wednesday, November 6, at 3 pm in the Mandalay Bay North Convention Center – Banyan D.

This presentation will detail the latest techniques and design architectures to provide the best Big Data disaster recovery performance. It will discuss the various hardware and software techniques, highlighting the FlashCopy and replication procedures that are critical to Big Data systems these days.
