Dining Preferences of the Cloud and Open Source: Who Eats Who?


The hyperclouds at work

TL;DR: If the cloud didn’t eat Hadoop Inc., Pivotal and Red Hat, what explains their diminished prospects?

Marc Andreessen’s “software is eating the world” has given rise to an entire technological food chain, with a succession that includes “open source is eating software”, “cloud is eating open source,” and the most recent supposition that “multi-cloud is eating cloud”.

The new food chain?

Not everyone is happy with their place in the chain. Who wouldn’t prefer to be an apex predator or keystone species? In particular, some reject the tidy sequence above and insist open source is actually “eating” cloud (and presumably also eating multi-cloud on a prix fixe menu, perhaps accompanied by a nice pinot).

I don’t get the “open source eating cloud” argument but keep hearing it. Admittedly, “eating” is not the most precise term, allowing different interpretations. Nevertheless, attempts to understand how the exponents of “open source is doing the eating” score this contest quickly get fuzzy and even metaphysical (“Sure, the clouds may take most of the revenue, but it is a moral victory for open source…”).

The public clouds are taking (dare we say “eating”?) open source software and operating that software as a service. One can say the public clouds are powered by open source (though they have plenty of proprietary software too), but that still seems like the clouds are the ones doing the consuming. From an economic perspective (which is what all the industry think pieces and analogies are about), the clouds seem to make a better business from open source than the companies built around particular projects. If you squint, open source could be seen as a very generous charitable donation to some of the largest and wealthiest corporations on the planet.

Our dining dichotomy stems from open source and clouds playing fundamentally different games. Open source enthusiasts and companies are focused on specific pieces of software and how that sausage gets made. The public clouds transcend software and operate on a vastly more expansive plane of existence where software is an important but not the sole ingredient of a service.

The public clouds knit together transoceanic cables, slabs of concrete, a reliable flow of electrons, millions of CPUs, exabytes of disk, software runtimes aplenty, legal standing and an army of people providing 24×7 operations and support, all integrated into a transactable utility accessible by anyone with a credit card. Software people often fail to appreciate that cloud services are so much more than just an instance of software, and operations is its own competency.

A huge part of the value of cloud is orthogonal to the underlying software: it lets customers get out of low value/high complexity operations (an attribute which applies equally to both saintly open source and perniciously proprietary software). Open source software often skews to the complex, sometimes to the very complex (oh, hi, Kubernetes!), making it all the more attractive to package and deliver as a service.

The unexpected and asymmetric competition from the clouds confounds open source companies, who must confront the fact the competitive advantage of knowing their software better than anyone else isn’t the insurmountable moat they had hoped. It is never fun to wake up and discover your product is now just a feature of a broader offering, but this is what is happening with software. Claiming open source is eating the cloud is like coffee bean farmers claiming they’re eating Starbucks: it willfully (or just out of delusion) ignores the vast majority of what the customer is buying.

The argument for open source winning our eating contest seems to boil down to a tautological assertion that, “at the end of the day”, victory is inevitable, because – ah – the self-evident and glorious properties of open source!

I was triggered to write this by an earnest young IBM employee who was courageously defending his employer’s hyperbolic contention that hybrid cloud “changes everything about the cloud market”. Not many IBM employees will go to bat for their company, especially when not on the clock, so I have to applaud his effort (however hopeless the task). But his argument was to simply repeat the incantation that “open source will eat the cloud”, essentially a religious faith in some inherent righteousness and superiority of open source, without regard for the broader customer problem, respective value propositions, positions in the value chain, or underlying economics.

So with this post I invite hostile fire from all directions (and expect – actually even hope – to equally annoy the open source purists, the open-but-not-actually-open open source revanchists, as well as every company named herein) in an effort to better understand this debate. (Disclosure: I worked at Micro$oft during the heyday of proprietary licensed software so may just not be able to even begin to comprehend the most basic precepts of open source).


Andy Jassy (left)

Most of the current debate focuses on Amazon and a few open source companies they have startled, like gazelles on the savannah, specifically Elastic and MongoDB. All while chronically prefacing their messaging with “customers tell us…”, AWS is offering its own services that are built on (Elastic) or are compatible with (MongoDB) popular open source projects, thereby competing with the relatively successful commercial open source companies associated with those projects. In the case of Elastic, AWS has generously created a new open source distribution of the features that Elastic had held back as proprietary software.

The prey have responded with both pluckily defiant blog posts and a frenzy of license engineering to impede AWS’ ability to use their ostensibly open source software. Others, like Cockroach Labs and Redis Labs, have followed with their own new licenses. This has renewed an existential and philosophical debate about open source: is it about free speech or does it also include the right to a free moat for key project contributors? In the end, the high priests of open source do not seem to be endorsing the “open except for people who compete with us” approach.

And it is not just AWS that is putting open source software to work. Both Google and Microsoft have many services built on open source software (and are open sourcing some of their own software). Some of their efforts are simply to drive the contrast with AWS, who have assumed the role of the new open source Antichrist (much to the amusement of former Antichrist Microsoft, who meanwhile is creating services via deep partnerships with the likes of open source companies Databricks and HashiCorp).

The emergence of the cloud has also forced many open source companies to take their own service offerings more seriously. Both Elastic and MongoDB have successful cloud services that they run on the big public clouds, where they have the opportunity to walk their talk that no one is better at operating their own software. It has even been argued that AWS’s entry has been a boon for these companies’ services.

But the fundamental question is whether customers prefer a “better” individual service from the OSS companies that created a particular piece of software or is the version from the public clouds “good enough”? The public clouds may not have written the original software, but they can offer it at global scale (because CAPEX) combined with a single pane of glass to manage all your services, a single bill for all services, deeper and easier integration with complementary services, and a lower cost of customer acquisition. As I framed it previously, the question is “whether commercial open source companies can withstand and/or deserve to withstand the immense and feature-crushing gravitational pull of the public cloud black holes.” There are more serious discussions about this topic, some by people who write even longer than I do (albeit with fewer GIFs and amusing alt text).

“Fully Displaced”

Peter Levine at A16Z argues we have gotten a tad overwrought on this topic:

“I also think we have over-rotated on the threat from public cloud vendors. While these vendors may host open source projects, to date, there isn’t a single open source company I am aware of that has been fully displaced by a cloud provider.”

“Fully displaced” is a gentle euphemism and leads me (at long last!) to my contribution to this discussion. Rather than focusing on the possible fate of the prey currently being pursued across the cloudy savannah (negotiations continue with David Attenborough to narrate the audio version of this post), let’s look at the scoreboard to see what is happening in some of the games between open source and the cloud that started earlier. Those other games aren’t over yet, but the outcomes look increasingly clear (cloudy, actually).

The fact is some of the very largest OSS companies have recently lost sales momentum, relevance, valuation, and/or their independence. And judging from the size and shape of the bite marks on their bodies, it looks like the work of the new apex predator, the cloud (and here ends the metaphor mixing).

Proponents of OSS winning the eating contest need an explanation for these companies’ diminished prospects, particularly during boom times for software companies and as stock markets hit all-time highs. They have not (yet) been “fully displaced”, but it is worth looking at the predicaments of the Hadoop companies, Pivotal, and, what was until recently the biggest of all open source companies, Red Hat.

The Hadoop Industrial Complex

Not that long ago, Hadoop and its commercial flag bearers were a big deal. Cloudera, Hortonworks and MapR collectively raised over $1.5 billion in capital ($1B, $248M, $280M respectively). That includes Intel’s wacky secondary investment in Cloudera – somehow premised on the idea that what we really needed were x86 instructions specifically for Hadoop – so the net investment was more like half that.

Cloudera and Hortonworks both IPOed, collectively raising another $335 million. Yet disappointing financial results forced them to move in together, and they quietly merged earlier this year, while their founders slipped out the side door. Their combined value dropped from $5.2 billion at the time the merger was announced to around $2.5 billion at this writing. Still-private MapR was subsequently sold for scrap to HPE, who bragged they got “a very good deal” (insert your own MicroStrategy joke).

Hadoop has left behind a trail of tears with customers who spent vast sums to construct “data lakes”, yet struggled to successfully deploy and manage them, much less find a business return snorkeling in those lakes. Meanwhile, the big data business has moved to the cloud, by virtue of being both cheaper and easier. As Mathew Lodge said, “Ironically, there has been no Cloud Era for Cloudera.”

Alternative hypothesis (i.e. it wasn’t the cloud eating open source): Hadoop was just overhyped. The claim it would replace a variety of focused and mature database technologies made for a good TAM story, but Hadoop turned out to be a jack of all data trades but master of none. Dumping your valuable data in a lake also wasn’t a great metaphor. The lesson from Hadoop is, once again, we should be wary of grand technology promises about the “next big thing” that sweeps away everything that came before it.


Pivotal

(Disclosure: I worked on Cloud Foundry at VMware until it was spun off into Pivotal. It was open source from the outset, despite my troglodyte presence. Fortunately, I departed before it got foundationed).

Pivotal was an amalgam of acquisitions including Pivotal Labs, SpringSource and Greenplum, plus VMware’s investment in those businesses as well as building Cloud Foundry from scratch. Pivotal raised $555 million in an April 2018 IPO at a price of $15. It had a peak value of $7.4 billion and subsequently missed a couple of quarters, citing “sales execution” problems and a “complex technology landscape” (which I submit is investor relations-ese for “cloud”). The valuation had fallen to $2.25 billion when it received an incestuous bailout offer from – wait for it – VMware, at – wait for it – $15 per share. Legal papers with allegations of self-dealing are probably already being served. Ironically, Cloud Foundry was originally designed as a service, but found itself ensnared in the complexities of selling to enterprises who would have to deploy and manage their own services. Instead, those enterprises opted for the cloud.

Alternative hypothesis: Pivotal got eaten by “Dockernetes” aka containers (ironically because Google was pissed off about Hadoop, but that is another story) which is of course open source, so the cloud had nothing to do with the company’s fleeting tenure as a public company. This alternative history is popular with senior Pivotal management who were busy selling consulting services when perhaps they should have been selling cloud services.

Red Hat

Last but not least is Red Hat, the longtime poster child for open source. Red Hat was the original (and for a long time the only) existence proof that you could build a good business around open source, and also the most successful. Yet the poster child is gone, off the table, and soon to be another footnote in the IBM middleware museum. Why? Because of the cloud.

Red Hat had billions in revenues, nice margins, billions in the bank, double digit growth, and until last year, the valuation of a high-flying growth stock. IBM paid $34 billion for Red Hat, the largest software acquisition ever. I argue elsewhere that IBM overpaid (Watson presumably helped set the price), but the fact Red Hat management and shareholders took IBM’s money underscores they didn’t believe the company had a future in the cloud era. That they were happy to get out at a price that represented their all-time high stock price just six months previously suggests little confidence in their ability to get back to that valuation, much less surpass it. And they took cash, forgoing any participation in the possibility of a Red Hat-led renaissance that reverses IBM’s ongoing and inexorable decline (likely a wise move).

To reuse some previous prose:

Red Hat has its own challenges (and at the acquisition price, has found a wonderful resolution that leaves IBM and its shareholders holding that bag). Red Hat may look like a gem to IBM (anything that isn’t shrinking would), but they too have a cloud relevance problem. The fact Red Hat is the poster child for commercial open source is an orthogonal irrelevance. Red Hat faces a very traditional technology industry problem: generational obsolescence. The bulk of their revenues come from “infrastructure-related offerings”, namely the Red Hat Enterprise Linux (RHEL) server operating system. As computing shifts from customer data centers to the public cloud, RHEL is not moving along with it. You may have heard that the cloud runs on Linux. It does, it just doesn’t run on RHEL. AWS, Azure and Google don’t pay Red Hat for Linux (they do let customers run RHEL as a guest operating system if desired, but the case for paying grows ever more tenuous – if the hyper-scale clouds don’t need it, why do you?). This shrinking TAM finally started to bite in 2018 as Red Hat’s core growth slowed and they missed Wall Street estimates for two straight quarters, which is considered problematic for a growth stock, as evidenced by a third of their valuation disappearing. Those misses combined with their visibility going forward are likely the catalyst for Red Hat deciding it was finally time to give IBM a call.

Alternative hypothesis: Red Hat management, who were telling everyone who would listen that they had a great cloud strategy, looked around the industry to see who they could partner with to supercharge their really, really strong cloud strategy, and picked IBM… (ok, that is really a stretch, but I had to put something here).

The End of History Usually Isn’t

“Success is a lousy teacher” has been attributed to a lot of people, including Bill Gates. The current open source situation is eerily similar to that of Microsoft in the early years of the 21st century. The company had a good thing going and was very much enjoying the status quo. But as open source and SaaS shook up that cozy world, the company resisted change, preferring the previous world order.

Some open source reactions to the rise of cloud are eerily similar. It is perturbing when you think you have the perfect model and something then disrupts it. Open source is not the End of History for software. End of History arguments are deeply unsatisfying, especially for technology, as they are almost always followed closely by new if unexpected history. The fact open source was a tenuous business strategy (relying on a loose affinity between projects and software companies) and not a business model is now being laid bare by the cloud.

Just as Microsoft and the previous generation of software developers ultimately had to accept and embrace change (some made it, some didn’t), so too does the world of commercial open source. Clinging to what you thought was an ideal and eternal model, in the face of contrary evidence, isn’t a good strategy. Adapt or die.

Open source is here to stay as a development model. It is hard to imagine any kind of infrastructure or developer software that isn’t open source. But there is work to do on the accompanying business strategy. The next great open source endeavor may be to make multi-cloud a reality, at least for key workloads. But the new associated business models will have to embrace services as the primary delivery model and make a serious commitment to a level of integration that is the hallmark of cloud services.

Thanks to my reviewers for helping drag me into the 21st century. The rest of you please tell me what I still don’t get.

17 responses

  1. As someone involved with RHEL 2005-2011, it was very noticeable that the company got hooked on per processor Enterprise subscription margins, and considered cloud vendors as a “sell to” rather than “sell through” channel of distribution. That immediately exhausted their emerging market share vs Canonical (Ubuntu got something like 70% share there). Buying CentOS back in was very much after that horse had bolted.

    This in turn is getting substituted by cloud vendors’ own Linux distributions and software stacks built on these. Now the open source folks will have to think in line with the IBM fundamental question: “What makes you Special”.

  2. […] Charles Fitzgerald / Platformonomics: Red Hat’s big sale, disappointing exits of Hadoop-based startups and Pivotal show how public c…  —  TL;DR: If the cloud didn’t eat Hadoop Inc., Pivotal and Red Hat, what […]

  3. Pretty much nailed it.

  4. Seems like eating is the wrong analogy. “Modern-man” did not eat the Neanderthal; they merged (so to speak), voluntarily or otherwise. It is really just a matter of ecosystem evolution. Quantum computing enabled machine learning will make both immaterial but no less important to the evolutionary process.

  5. @patrick

    Agree “eating” is both vague and inaccurate, but it is what has been used. Ecosystem evolution is a much better description.

  6. Open Source is ultimately anti-capitalist and as such, will inevitably succumb to capitalist economic forces. Capitalist predators are doing the eating. That’s too bad because open source benefits and has benefitted everyone, as demonstrated by the software stack underpinning the Internet we have now. Even some at your dear Microsoft recognized the incredible value and benefit of open source.

  7. @intosh

    Open source looks like a zero marginal cost complement to me, and thus perfectly consistent with “capitalism”. Certainly true for the hyper clouds which is kind of my argument. But fun to see Marxist analysis applied in this day and age in a (seemingly) non-ironic way.

  8. The reason for Hadoop’s demise is not all due to the cloud. In the literature (Designing Data-Intensive Applications, a book that came out in 2017, has an entire Chapter 10 dedicated to this), MapReduce and YARN have been replaced by other technologies / better algorithms. In most cases, this is Apache Spark, which came from Databricks, which, admittedly, is primarily cloud (although my evidence to support this theory is centered in other products from Databricks such as Zeppelin), but regardless, it’s due to both algorithm improvements, retooling away from Hadoop and towards Spark, and Hadoop itself becoming a nearly-compliant SQL derivative through Hive, Incoop, Impala, Drill, Presto, Hawk, Stinger, et al, without ability to easily add graph algorithms across a feature-complete graph database. Scaling and algorithm choice can be levered for performance in many ways, including storing compressed JSON, or Snappy/Parquet/etc, data partitioning, et al.

    Spark has had 50 percent or more growth over the past few years, with Hadoop slowing down from 15 percent growth rates in 2018 to slightly less in 2019.

    It may be that you mean that HDFS was largely supplanted by IaaS storage and storage-access features such as S3 and AWS Athena? Even this isn’t entirely true, but could be the focus of a more detailed review of why things happened the way that they did, citing sources.

  9. Damn you’re smart. 🙂

  10. @bill

    Be careful not to confuse smartass with smart…

  11. Was a great read. Thank you!

  12. Charles, have you heard of Spinnaker, the OSS software delivery orchestration project from NFLX & GOOG that powers multi-cloud? I.e., Spinnaker sits “above the clouds.”

    From https://www.theinformation.com/articles/big-customers-pressure-aws-to-step-up-open-source-support :

    “ The case of Spinnaker offers a window into AWS efforts to balance one of its founding principles—doing what is best for customers—with its own business interests, which can diverge occasionally. The open source tool can be used to make it easier for customers to run their online applications inside the data centers of multiple cloud providers. For AWS, by far the dominant player in the cloud market, there are risks to throwing its support behind a tool that can make it easier for customers to shift business to hungry rivals such as Google and Microsoft.”

    Armory is commercializing Spinnaker for G2K. Example: http://go.armory.io/JPMorgan

  13. Clouds tend to eat software companies that drive core CPU intensive workloads, regardless of whether they are open source or not. Databases, data warehouses, and cluster management software are all in this category. Who gets away? Monitoring companies. There have been many monitoring unicorns with healthy exits: AppDynamics, New Relic, and more recently Datadog, SignalFx and IOPipe. Want to escape the cloud vendor wrath? Start a monitoring company.

  14. I don’t think eating is the right analogy, parasitism is, or at least it’s perceived that way by Sentry, MongoDB, Redis Labs et al. Never mind that parasites are usually much smaller than the hosts they infest. The cloud hyperscalers have no intention of acquiring the open core software startups, however.

    I have very little sympathy for the open core companies. One of the greatest benefits of open source is that you are not locked into the vendor for support, and that also goes for hosting. The likes of AWS are simply much better at operating services at scale; after all, that’s their core competence.

  15. @fazal

    I didn’t choose the eating metaphor — agree it is imprecise.

    It is hard to have a lot of sympathy for the “open-but-not-open” open core companies. They want to have their proverbial cake and eat it too…

  16. @alan

    Monitoring is pretty crowded which brings its own issues.

  17. Charles, well-researched and -written piece. Would like to add to it because the truth runs deeper than pluckily defiant blog posts and a frenzy of license engineering.

    MongoDB may or may not be exactly like the female lion in the photo above, but as a former analyst and now as an employee I see it as more predator than prey and one built for speed, power, and survival.

    MongoDB Atlas lets you implement a multi-cloud strategy with our partners AWS, Microsoft, and Google by letting you intelligently place data, literally at the push of a button, where you need it — in AWS, GCP, Azure, or a combination of all three — to support workload isolation, distributed processing, privacy, and more. MongoDB Atlas Data Lake lets you query data in-place in S3 buckets so you can apply the power of Atlas against low-cost S3 data storage. Data Lake querying of Azure and GCP is on our roadmap to support the rest of what is becoming, if not already, the modern storage layer.

    DocumentDB is Amazon’s imitated version of MongoDB’s API containing no MongoDB server code running on Amazon’s proprietary Aurora product. DocumentDB is closest to MongoDB 2.6 (2014) and contains none of the newer features in MongoDB 3.6-4.0-4.2 like transactions, materialized views, retryable reads/writes, and aggregation stages that enable expressive data handling. It fails 61% of MongoDB correctness tests, which means existing apps will break. All of which can be avoided by using the real MongoDB to write less app code, innovate faster, and run your business in real time.

    So in closing =:-D I love “pluckily defiant blog posts.” If I use it, I’ll do so in its fully-functional entirety and with full attribution.
