When did you first hear about Hadoop?
In the early summer of 2008, I re-encountered Hadoop. I had first heard about this technology when Google published its research papers about it in 2004. But like everyone else at the time, I totally wrote it off. It didn’t look anything like the systems we built. No query language, no transactions. We knew how to build big data processing, and this looked nothing like that. What we had not understood back in 2004 was that Google wasn’t trying to solve transaction processing problems; it was solving scale-out data problems. So when I left Oracle in 2008, no longer planning to be a database guy, I came across this technology again.
How did that develop into you creating Cloudera?
I had a chance, just by networking, to talk to folks at Facebook and Yahoo! about how the software was being used. And the scales fell from my eyes. At the time, there were three other guys – one at Facebook, one at Yahoo! and one at Google – who had actually been working with Hadoop on building the software, and were planning to start a company out of it. Each of them had decided that there was commercial opportunity around the software. I believed the same thing, so Christophe [Bisciglia], Jeff [Hammerbacher], Amr [Awadallah] and I had each decided there would be a company. There were going to be four companies, with each of us the CEO of our own company.
As the only one without any practical experience of Hadoop, how did you go about acquiring these highly skilled partners?
I did what you do when you’re a business guy: I walked around the Valley and talked to everybody. I found those three guys as a result. Two of them were working at Accel Partners, where a partner had basically decided to fund a Hadoop company and had recruited these guys to kind of incubate it. So I found my way to those guys, and my mission that summer was to be sure that four companies didn’t come into existence – just one. There had been some prior relationships, but the four of us didn’t all know each other. We spent the summer dating, and then in August convinced ourselves that it could be good to get married. Then in September we really sat down and wrote a business plan and put together a pitch deck to take out to raise funding. The guys at Accel wanted to take the deal. They came in with great terms, so it was off to the races.
That was around the time the recession hit. What effect did that have?
In the middle of October 2008 we raised our Series A round. Three days later, Lehman Brothers declared bankruptcy and nuclear war descended on the financial markets. The steel doors just slammed shut and you could not raise venture capital for anything. Even if you knew how to turn lead into gold, you could not get that funding. So the crash was a disaster, except that we had $5 million in the bank three days ahead of that and then nobody could get funded to compete with us. So I think we spotted some trends, but we were just incredibly lucky. We turned our luck into good execution, but it never hurts to be standing in the middle of a market that’s exploding around you with no competition.
The big data market has become extremely congested with everybody trying to gain a slice – how have you managed to maintain that early lead you achieved?
I think our depth of expertise in Hadoop has been a big deal. The fact that Doug Cutting (Hadoop co-founder) works for Cloudera, and that the people who drove its adoption at Facebook and Yahoo! were among our founders, meant that we had a much clearer view of the opportunity than anyone else in the market. We could go out and win a bunch of customers without any competition, and learn from them.
When Hadoop was born, what Google invented was a way to store any kind of data very cheaply at scale. It invented this one processing model called MapReduce that allowed for large-scale, parallel analytics over all that data really fast. It was transformative and powerful – no such capability had ever existed prior. But from the very beginning at Cloudera we believed that the platform would do more. One big storage repository with an engine that could analyse its data begged to have other engines added. We believed it to be the case early on, and Cloudera from the very beginning drove that vision forward.
Speaking of vision, where is the big data trend heading in 2014?
Our customers are now deploying this system at the centre of their data centres. Once you’ve got all the data in there then you’ve got interactive query, so you can do self-service BI and you can run analytic and search workloads. It’s begun to be deployed as the hub of data in the data centre connected to a data warehouse to deliver derived results there. It is now the central repository in most of our large enterprise customers. Our vision now is that we believe the enterprise data hub will be the primary repository for enterprise data of the future, and we’re just now seeing that transition in the market.