Oh boy, I love this time of year! Lots of candy and cookies to eat, and time to gaze into my crystal ball and, like every other industry “expert”, make outrageous predictions for 2015. So without further fanfare, let’s jump into it!

The Data Lake Gains Traction

There will certainly be plenty of hype around the data lake this year, and I’m going to do my share of contributing to that hype. I’ll be speaking at the Santa Clara Strata event in February where I’m talking about the role of the data lake within the Big Data MBA framework; that is, how the data lake helps organizations optimize key business processes and uncover new monetization opportunities.

However, my take on the data lake is a bit different than that of others. Although I think long-term the data lake has huge potential to empower and/or disrupt the traditional data warehouse market, I think the immediate benefits are much more mundane: 1) provide a line of demarcation between the data warehouse and the newly christened analytic sandbox, and 2) off-load ETL processing from your expensive, SLA-constrained data warehouse (see Figure 1). In fact, I think the data warehouse managers will be the biggest beneficiaries of the data lake in 2015 since ETL processing consumes 60 to 70% of the processing cycles on some of today’s largest data warehouses. Boring!

Figure 1: Modern Data / Analytics Environment

More Relevant Real-world Business Success Stories…

We’ll start hearing more big data “business success” stories, and not just the same old stories from the same old companies. We’ll start hearing from municipalities and state governments, casinos and resorts, schools and universities, energy providers, credit unions, small retailers, high-tech manufacturers, distributors and wholesalers, health care providers and payers, and other organizations.
And these success stories will start with small successes – improving marketing campaign effectiveness, increasing customer store visits, improving customer / employee / teacher / nurse retention, predictive maintenance. And these small successes will build upon each other. For example, what you learn from improving marketing campaign effectiveness will impact how you improve customer retention and product design.

But then again, maybe these companies won’t talk. Why give away trade secrets in how they are gaining insights about their customers, products, employees and operations that help them to optimize key business processes and uncover new monetization opportunities? So then again, maybe we’ll be stuck with the same old stories from the same old companies…

…But Also a Couple of Colossal Hadoop Failures
No matter the reason, failure is important because you can sometimes learn more from failure than from success. We just need to capture, triage and share these failures if everyone is to benefit.

More Native Hadoop Tools and Products

Silicon Valley and the VC community are working hard to make the data scientist obsolete, even before we’ve come to realize how valuable these folks are. “Business Objects Killers” and “Tableau Killers” and “SAS Killers” are lurking everywhere, and these start-ups are doing two things that may make them viable options: 1) they are building upon open source technologies (standing on the shoulders of others) and 2) they are building tools and products that run natively on Hadoop and HDFS, rather than treating Hadoop as yet another data source.

If I hear one more RDBMS or Business Intelligence vendor announce “Don’t worry, our products will interoperate with Hadoop,” I think I’ll throw up. Here’s what I think of that “let’s just interoperate with Hadoop” strategy…

Data Governance Moves Front and Center

I love the industry pundits who quickly jump on the “What about data governance?” issue when we talk about big data and the data lake. Well, what about it? Of course we know it’s important and of course smart organizations never forgot about it (remember, I said smart organizations). As the volume of data grows in the data lake, governance becomes even more of a critical tool for answering the data “What is it?”, “Where is it?”, and “Who has access to it?” questions.

However, the data governance discussion takes on a new wrinkle when you contemplate data in the data warehouse versus data in the data lake. As my friend Rachel Haines, who writes and speaks about data governance in a big data world, points out, organizations are going to realize that there needs to be different “degrees” of data governance:
As Rachel says, in the big data world, the goal for the smart organization should be “Just-enough Data Governance”. Why waste cycles governing data when that data might not even be used by the organization? But once the value of that data has been ascertained, then the appropriate degrees of governance need to be determined and applied.

The Rise of the CDMO

What is the CDMO? It’s the “Chief Data Monetization Officer” and I think CDMO is a much better moniker for the organization’s data champion than “Chief Data Officer” or CDO. The more I talk about it, the more I don’t like the title Chief Data Officer; it misses the primary responsibility of the CDMO role, which is to lead the organization in identifying, valuing, acquiring, analyzing and monetizing the organization’s data assets.

To be successful, the CDMO will have to become proficient at identifying and valuing both internal and external (public, third party, open data) data sources in order to uncover new monetization opportunities. If the role is only managing data, well, that’s what the Chief Information Officer did. Organizations need a senior executive whose 100% focus is on how to leverage data in order to create competitive differentiation and drive a more compelling, more profitable customer relationship. That’s all.

Data Scientist Shortage Shrinks, But…

Universities, colleges and large organizations are scrambling to fill the data scientist resources gap. Online or on-premise, the number of data science classes and associated degrees and certifications is exploding. And while these educational organizations scramble to teach advanced statistics, data mining and predictive algorithms, analytic tools, and visualization techniques, these data scientists for the most part will continue to fall short of expectations for one simple reason – they just don’t and likely won’t ever understand the business as well as the business stakeholders and Subject Matter Experts (SMEs).
That’s why our Big Data Vision Workshops and Proof of Value Labs couple the data science team with the business SMEs. The SMEs live in the business, so they are in the best position to lead the data science team by:
Open Source Software Gets Bigger, and Profits More Elusive

Open source software is wonderful, unless you are in the software business. It’s really hard for a software company to figure out how to make money when the base software platform is free and openly modifiable by anyone. And 2015 will show that many start-ups haven’t figured out how to make money in this market either.
Now I’m seeing several start-up companies trying to create some unique software capability that sits on top of Hadoop. However, the challenge is that software advantages are fleeting in an open source community driven by giants that generate open source software, such as Facebook, Google, eBay, Yahoo and many, many others. The bottom line is that the best ideas can and will eventually be replicated by the open source community. So that leaves two ways to make money on open source products like Hadoop:
There are a lot of software start-up companies that believe that there are other options. I hope that they prove me wrong.

My Next “Big Data MBA” Book?

A lot has happened since I released my book “Big Data: Understanding How Data Powers Big Business” in October 2013. Lots of new learnings and new approaches have surfaced that can help organizations identify not only where and how to start their big data journeys, but more importantly can help them identify business opportunities to leverage customer, product and operational insights to optimize key business processes and uncover new monetization opportunities.

Maybe I’ll find the time on airplane flights (or in the terminals weathering yet another flight delay), sitting in hotel lobbies or grabbing a coffee at one of my local watering holes to undertake that next edition – The Big Data MBA! That edition would further focus on helping to empower the business stakeholders and leveraging big data to power an organization’s value creation processes.

I also plan on continuing to find time to teach my Big Data MBA course. I find teaching both exhilarating and educational…for me. These classes give me a chance to apply learnings from customer engagements and fine tune approaches and methodologies. And hopefully everyone wins in that case.

Here’s to a BIG #BigData 2015!
