Advanced Analytics Platforms – Big Changes in the Leaderboard
Summary: The Magic Quadrant for Advanced Analytic and ML Platforms is just out and there are some big changes in the leaderboard. Not only are there some surprising upgrades but some equally notable long falls.
The Gartner Magic Quadrant for Advanced Analytic and ML Platforms came out on February 22nd and there are some big changes in the leaderboard. Not only are there some surprising upgrades (Alteryx, KNIME, H2O.ai) but also some equally notable long falls for traditional players (IBM, Dataiku, and Teradata).
Blue dots are 2018, gray dots 2017.
For those of you keeping track, Gartner split this field in 2017 so that “Advanced Analytic & Machine Learning Platforms” (Machine Learning added just this year) is separate from “Business Intelligence and Analytic Platforms”. There’s significant overlap in the players appearing in both, including SAS, IBM, Microsoft, SAP, Alteryx, and a number of others. The Business Intelligence quadrant is also where you’ll find all the data viz offerings like Tableau and Qlik.
From a data science perspective, though, we want to keep our eye on the Advanced Analytics platforms. In prior years the changes had been much more incremental. This year’s big moves seem out of character, so we dug into the details to see whether the scoring had changed or whether the nature of the offerings was the key.
Has the Scoring Changed?
We read the 2018 and 2017 reports side-by-side looking for any major changes in scoring that might explain these moves. We didn’t find any. The scoring criteria and eligibility for each year remain essentially unchanged.
Of course, raters are always influenced by changes in the market, and such impacts can be subtle. In the narrative explanation of markets and capabilities we found only a few hints at how scoring might have been affected.
New Emphasis on Integrated Machine Learning
We all want our platforms and their components to operate seamlessly. Last year the criteria were perhaps a bit looser, with Gartner looking for “reasonably interoperable” components. This year there is much more emphasis on a fully integrated pipeline, from accessing and analyzing data through to operationalizing models and managing content.
Machine Learning is a Key Component – AI Gets Noticed but Not Scored
It was important that ML capability be either included in the platform or easily accessed through open source libraries. To their credit, Gartner does not fall into the linguistic trap of conflating Machine Learning with AI. They define the capabilities they are looking for as including “support for modern machine-learning approaches like ensemble techniques (boosting, bagging and random forests) and deep learning”.
They acknowledge the hype around AI but draw a relatively firm boundary between AI and ML, with ML as an enabler of AI. Note for example that deep learning was included above. I’m sure we’re only a year or two away from seeing more specific requirements around AI finding their way into this score.
Gartner is also looking for automation features that handle some portion of the modeling process, such as feature generation or hyperparameter tuning. Many packages contain some limited forms of these.
While some of the majors like SAS and SPSS have introduced more and more automation into their platforms, none of the pure-play automated machine learning (AML) platforms are yet included. DataRobot gets an honorable mention, as does Amazon (presumably referring to their new SageMaker offering). I expect that within one or two years at least one pure-play AML platform will make this list.
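To make the idea of automated hyperparameter tuning concrete, here is a minimal sketch in plain Python that is not tied to any vendor’s platform. It grid-searches one hyperparameter, the neighborhood size k of a toy nearest-neighbors regressor, against a held-out validation set. The data, the candidate grid, and the function names (knn_predict, validation_mse, grid_search_k) are all made up for illustration; commercial platforms wrap far more elaborate versions of this loop.

```python
# Illustrative sketch only: a hand-rolled grid search over one hyperparameter.
# All names and data are hypothetical, not taken from any platform in the report.

def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def validation_mse(train, valid, k):
    """Mean squared error of the k-NN predictions on the validation set."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in valid]
    return sum(errors) / len(errors)


def grid_search_k(train, valid, candidates):
    """Score every candidate k and return the best one plus all scores."""
    scores = {k: validation_mse(train, valid, k) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy data: y = x^2, trained on integer x, validated on the half-steps between.
train = [(float(x), float(x * x)) for x in range(0, 11)]
valid = [(x + 0.5, (x + 0.5) ** 2) for x in range(0, 10)]

best_k, scores = grid_search_k(train, valid, candidates=[1, 2, 3, 5, 8])
print("best k:", best_k)
for k, mse in sorted(scores.items()):
    print(f"k={k}: validation MSE={mse:.4f}")
```

On this toy data the search settles on k=2, since averaging the two neighbors on either side of each validation point tracks the curve most closely. The point is only that the choice is made by a loop and a score rather than by hand.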
Acquisitions and Consolidations
Particularly among challengers, adding capability through acquisition continues to be a key strategy, though none of these deals seemed to move the needle much this year.
Notable acquisitions called out by Gartner for this review include DataRobot’s acquisition of Nutonian, Progress’ acquisition of DataRPM, and TIBCO Software’s acquisition of Statistica (from Quest Software) and Alpine Data.
Several of these consolidations took previously ranked players off the table, presumably making room for new competitors to be ranked.
Big Winners and Losers
So if the difference is not in the scoring, it must be in the detail of the offerings. The three moves that really caught our eye were the rise of Alteryx and H2O.ai into the Leaders box and the rapid descent of IBM out of it.
From its roots in data blending and prep, Alteryx has continuously added to its on-board modeling and machine learning capabilities. In 2017 it acquired Yhat, which rounded out its capabilities in model deployment and management. Then, thanks to the capital infusion from its 2017 IPO, it upped its game in customer acquisition.
Alteryx’s vision has always been an easy-to-use platform that allows line-of-business (LOB) managers and citizen data scientists (CDS) to participate. Gartner also reports very high customer satisfaction.
Last year Gartner described this as a ‘land and expand’ strategy, moving customers from self-service data acquisition all the way over to predictive analytics. This win has been more like a Super Bowl competitor working its way upfield with good offense and defense. Now Alteryx has made the jump into the ranks of top contenders.
H2O.ai moved from visionary to leader based on improvements in their offering and execution. This is a coder’s platform and won’t appeal to LOB or CDS audiences. However, thanks to being mostly open source they’ve acquired a reputation as thought leaders and innovators in the ML and deep learning communities.
While a code-centric platform does not immediately bring to mind ease of use, the Deep Water module offers a deep learning front end that abstracts away the back-end details of TensorFlow and other DL packages. H2O.ai may also be on the verge of creating a truly easy-to-use DL front end, which they’ve dubbed ‘Driverless AI’, and they boast a best-in-class Spark integration.
Their open source business model, in which practically everything can be downloaded for free, has historically crimped their revenue, which continues to rely largely on subscription-based technical support. However, Gartner reports that H2O.ai has 100,000 data scientist users and a strong partner group including Angoss, IBM, Intel, RapidMiner, and TIBCO, which, along with its strong technical position, makes its revenue prospects stronger.
Just last year IBM pulled ahead of SAS as the enthroned leader of all the vendors in this segment. This year saw a pretty remarkable downgrade on both ability to execute and completeness of vision. Even so, with its huge built-in customer base, it continues to command 9.5% of the market in this segment.
Comparing last year’s analysis with this year’s, it seems that IBM has just gotten ahead of itself with too many new offerings. Gartner’s core rating is still based on the solid SPSS product, but Gartner notes that it seems dated and expensive to some customers. Going back to 2015, IBM had expanded the Watson brand, which used to be exclusive to its famous Question Answering Machine, to cover, confusingly, a greatly expanded group of analytic products. Then in 2016 IBM doubled down on the confusion by introducing their DSx (Data Science Experience) platform as a separate offering primarily aimed at open source coders.
The customers that Gartner surveyed in 2017 for this year’s rating just couldn’t figure it out. Too many offerings got in the way of support and customer satisfaction, though Gartner looked past the lack of integration to give some extra points for vision.
IBM could easily bounce back if they clear up this multi-headed approach, show us how it’s supposed to be integrated into one offering, and then provide the support to make the transition. Better luck next year.
About the author: Bill Vorhies is Editorial Director for Data Science Central and has practiced as a data scientist and commercial predictive modeler since 2001. He can be reached at: