In theory, I am on vacation this week – except that I have not managed to get away from email yet. On top of that, I caught up on a lot of reading and, as usual, have some thoughts to share. As always – these are just my personal opinions, not those of my employer.
The good(?) and bad thing about big data is that it has no rigid definition – it is more or less what we want it to be at that point in time. I like to think of “big” in terms of a customer's status quo. If the customer only reports out of their ERP data, and now wants to combine CRM and SCM data with it, or maybe combine social data with it – it is already BIG for them. A stock exchange, on the other hand, has already figured out how to manage lots of fast and furious data. It is hard to tell them data is BIG now – data was always big for them. It might be a bit “bigger” now, but you get the point: “big” is relative to the status quo.
At the moment, there are very few big data projects (as a % of all projects) that add outrageous business value. It's like a white tiger – a majestic animal with great abilities and all, but rarely seen in the wild. But they are not a myth and they really exist. I saw one just this week and took a photo. I know many friends who tend to look at big data as a white elephant – and I now like to think of the current situation of big data as a white tiger 🙂
So where is it all trending? I have some opinions/guesses – not backed by any scientific research. So take them with a pinch (or pound) of salt.
1. Vendor Consolidation
I definitely think there is some consolidation about to happen in 2 to 5 years' time. This could take many forms – established firms like Oracle, IBM, MS, Intel, SAP etc. could buy some Hadoop/NoSQL type companies. Some of the newer players could merge. The big ones have enough cash in their war chest, and those that don’t have cash might not mind a bit of stock dilution to be successful in the brave new world. Big data is what is going to sell more analytics, more hardware and more databases in the future – so the incumbent big players in those areas will have every incentive to jump in as soon as they feel a big data startup is up for grabs.
Oracle might be the one exception to the rule here. A lot of big data startups were founded by people who can’t stand Oracle for some reason or other. So they might not sell to Oracle (but then we know what happened to PeopleSoft a few years ago, so who knows). At the end of the day, almost everyone has a price at which they will sell – and these big vendors might pay such a premium given the long-term gains. Oracle, IBM etc. have an enviable customer base and some very driven leaders – I won’t underestimate them for a second.
2. Partnerships
I am a guy who heads a team focused on partnerships, and if I didn’t think partnerships were an integral part of this big data story, I would not have taken my current job. I am firmly convinced that no one vendor can solve an end-to-end business problem for a customer. Most customers don’t want to spend a lot of time and money on integration work. While no integration is exactly seamless – customers expect a significant part of the integration to be available off the shelf.
Not only that, they probably don’t want the hassle of chasing multiple vendors to put a solution together. So I expect to see a lot of OEM/reseller licensing in the big data world, where even competitors sell each other’s products. It is not new – IBM and Oracle, SAP and Oracle etc. are all relationships based on co-opetition. I think we will see this become mainstream across the board in less than 5 years. I also think that some companies will shy away from this model – and probably perish in the process.
Even the vendors who buy out big data startups cannot avoid this partnership thingy – customers will force the issue if they shy away.
3. Open source will become the norm for software business models
I will go out on a limb and say that in 5 to 7 years, open source will be the normal way a company thinks of software. Even the closed source vendors will embrace it more openly (pun intended). The reason for this is simple – as the newer big data vendors take a stance that they don’t need to maximize any given transaction, and are willing to give customers more value than they pay for, it becomes harder and harder for closed source, maintenance-revenue-based businesses to keep up their current model. Even if the newer vendors get bought out by closed source vendors, customer expectations will not change much, I think.
So why should it take 5 to 7 years? It takes time to make a dent in the universe 🙂
a. Open source big data vendors need time to mature their code base. They need to build the tooling around their products that enterprises need. Remember the time before, say, Oracle 8i? I think open source vendors will go through that kind of evolution, except that they have to do it in 5 years instead of 20.
b. Also, for the incumbent vendors to take the new companies seriously, those companies need to be at least a $500M revenue business (or on a trajectory to that kind of revenue). I saw this first hand with SAP Hana – Oracle and IBM did not say anything against Hana for a while, until SAP started showing big numbers in the hundreds of millions of dollars. I fully expect the same to play out against open source too.
c. Unlike ERP etc. – I don’t think there will be one big mothership of a big data project at most customers. Instead, I think there will be a lot of parallel projects of smaller size. Some will go live and some will get killed as better use cases evolve. I think big data will drive a trend towards “disposable apps” in the enterprise. But it ain’t happening overnight.
And all this will take time to shake out – which is why I am pegging it at 5 to 7 years. But I have no doubts that open source will dent the universe!










