Big Data: Switching Between Top Dog And Fire Hydrant


Most of you know I am a huge fan of big data. In my mind – this is the new top dog in the enterprise software world. Companies can leap into a totally new world of insights, and in all probability big data will revolutionize how the business world makes use of information. I have not met a customer this year who does not have plans for some big data initiative. And no wonder – the vendor side of the world is all excited too. It is heartening for me to see both customers and vendors excited at the same time – that should curtail the hype cycle for big data.

Big data is capable of changing everything for the better – our notions of what a platform does and what an application does – and of rapidly advancing data science in the academic world. The technology and the theory of big data are changing fast, on both the open source side and the commercial vendor side. There are a variety of options available to customers now for all aspects of big data. And while choice is good – it also tends to increase confusion. It is important to make sure that the bets on big data are made with eyes open, and not based on hype.

Big data has real benefits – and while the technology is evolving, there is already plenty available to make good use of for many use cases – ergo, you can use it today. CFOs will always have budget to spend on making more money or on reducing cost. But they won’t be happy to write a check with neither possibility in sight. So let’s try to keep big data in its top dog persona, and try to avoid its evil twin, the fire hydrant.

There are two general ways to use big data:

1. Find answers to questions you already have by sifting through a lot of data

2. Keep looking at a lot of data, and see if you can spot something – without having a very specific question to ask upfront. Once you spot something, you start asking questions like in option 1.

Option 1 is mostly just an extension of traditional BI – but with more data, coming in at a higher speed and probably of more types than we have dealt with before. 

As I think through all those past BI projects (and the associated blood, sweat and tears) – I can say with some certainty that most of the data pumped into data warehouses was never used. The vast majority of the customers got most of their answers from less than 30% or so of the data available to them. Of course it can be argued that they probably did not even know that the other 70% existed. But these are big companies with excellent top line and bottom line, and they believe they have a mature BI platform. So let’s say 50% of the data is useful instead of the 30% I estimated. Yet, if you ask for the requirements for the next project at these customers – I will bet you dollars to doughnuts, they will ask for everything just like they did in the past. The fact that they don’t use half their data is of no consequence to them. That is how the BI world rolls. Now with big data – there is a chance that more of the previously unused data will be used to enhance the quality of insights.

Any kind of big data solution built for a customer who follows the option 1 route needs very cheap storage, or a way to store only the useful information, or a BI solution that can sift through a lot of unnecessary data quickly to find answers. This also means BI practitioners need even more due diligence in figuring out requirements so that wastage is kept to a minimum. Easier said than done.

In our familiar world of data warehouses – there is some data duplication across layers for various reasons (performance, transformations and so on). When you think of big data – don’t assume redundant data just goes away magically. On the contrary, many big data solutions (including Hadoop) need redundant copies of data. Storage is getting cheaper – but you will need a LOT of it if you keep agreeing to user requirements like before. There are also trade-offs between using a SAN and using several machines networked together with their own (cheap) disks.
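To put a rough number on that redundancy, here is a back-of-the-envelope sketch. It assumes three copies of every block (the default replication factor in HDFS) and adds a made-up 25% allowance for temporary and intermediate files. The function name and the allowance are my own illustration, not anything a vendor publishes, so plug in your own numbers.

```python
def raw_storage_tb(data_tb, replication_factor=3, overhead_fraction=0.25):
    """Estimate raw disk (in TB) needed to hold `data_tb` of user data.

    replication_factor: copies kept of each block (3 is the HDFS default).
    overhead_fraction:  assumed extra space for temporary/intermediate
                        files; 0.25 is an illustrative guess, not a standard.
    """
    if data_tb < 0 or replication_factor < 1 or overhead_fraction < 0:
        raise ValueError("sizes must be non-negative and replication >= 1")
    return data_tb * replication_factor * (1 + overhead_fraction)


# 100 TB of "requirements" turns into 375 TB of raw disk under these assumptions.
estimate = raw_storage_tb(100)
```

Halving the requirements up front saves far more disk than any later tuning will, which is why the due diligence matters.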

What about performance? You can of course get excellent performance – but that depends on the question. If a system needs to use massively parallel processing – the question you ask should let the system split the data it looks at into many chunks, look at all the chunks in parallel, and then add up the results.

If you ask “How many ice creams of each flavor did we sell today?” – and you have a million transactions, you can easily do that chunking and aggregation. However, not all questions can be answered that way. What if I ask “What is the correlation between the brand of ice cream sold and the rise and fall of the local daytime temperature?” This question is hard to split into many chunks because there are more variables. So while it can be computed, it is a fairly serial process. Now of course you can try some other way of solving this, such as looking at a smaller set of data (whether you cook one pot of rice or one barrel, you check if it is done in both cases using just a spoonful). Or, if you knew upfront that this question would be asked, you can use the old data warehousing technique of pre-calculating data in useful forms and waiting for the question to be asked, and so on.
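The flavor question can be sketched in a few lines of Python. This is only an illustration of the split, count and add-up idea: threads on one machine stand in for the many machines of a real cluster, and the function names and the `(flavor, quantity)` record layout are made up for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n_chunks):
    """Split a list into at most n_chunks contiguous slices of similar size."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def count_flavors(chunk):
    """Per-chunk step: total quantity sold per flavor, for this chunk only."""
    totals = Counter()
    for flavor, quantity in chunk:
        totals[flavor] += quantity
    return totals


def flavors_sold(sales, n_chunks=4):
    """Count each chunk independently, then add up the partial results."""
    chunks = split_into_chunks(sales, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(count_flavors, chunks))
    grand_total = Counter()
    for partial in partials:
        grand_total.update(partial)  # Counter.update adds counts together
    return grand_total


# Hypothetical transactions: (flavor, quantity sold)
sales = [("vanilla", 2), ("chocolate", 1), ("vanilla", 3),
         ("mango", 4), ("chocolate", 5)]
totals = flavors_sold(sales, n_chunks=3)
```

This works because no chunk needs to know anything about any other chunk until the final add-up. Questions that do not decompose so neatly lose that property.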

Essentially – depending on the question you ask, you might need a combination of big data solutions (say, Hana and Hadoop) to get a good answer in an acceptable time. You can reuse and build on a lot of skills you already have in the shop today. But walk into option 1 knowing all the trade-offs you have to live with. And I haven’t even scratched the surface of all the things you need to consider.

What about option 2? No predefined questions – but you look at data and see if there is anything useful there. The good news is that the technology is already there to do this. The bad news is that you need a lot of hardware, consulting and so on to get it done. Well, there is one more thing to keep an eye on – in the wrong hands, it is fairly easy to bark up the wrong tree when you are dealing with this kind of big data. False positives and false negatives galore. You might be chasing solutions to non-existent problems. An interesting side question while we are at it – have you ever run into a data scientist who said he/she has enough data? I have not – they all would like even more data. I am told they exist somewhere 🙂

What about disaster recovery? Better start getting used to recovering a few petabytes of data. High availability is probably not a big issue, since it is part and parcel of the big data design in most cases. And of course option 2 has to deal with all the issues of option 1 – it is just that you might not know upfront the extent of the stuff you have to deal with.

Big data will be a fun ride – but keep your seat belts fastened low and tight across your lap for any little bumps along the way.

 

What do I want to be when I grow up?


When I was three – I was pretty clear what I wanted to become. I wanted to be a station master for Indian Railways, wearing a white uniform and controlling the huge big trains with green and red flags. My grandparents bought me a pair of flags and I used to wave them around in cars and trains.

A couple of years later, I wanted to become a trainer of elephants.

Two things sparked it – first the movie Guruvayoor Keshavan, and second the fact that my grandfather had a couple of elephants at home, one of which was practically my mom’s pet before she got married. And then I saw my first circus, and this career ambition of becoming an elephant trainer got expanded to the next “rational” thing – I wanted to be a circus ringmaster! Someone who can train not just elephants, but also horses, lions, dogs and so on to do cool stuff. I never cared for the human artists who did daredevil acts – I thought that was boring 🙂

This fascination for training elephants continued for a while, till I saw how elephants are trained using inhumane techniques. And to add to that, I also read about how elephants are hunted in India and Africa. That was it – I became dead against the idea of domesticating elephants and other wild animals. They are wild animals who should be allowed to live free.

After briefly considering being a policeman, I decided that what I “really” wanted to be was a doctor. That lasted till tenth grade, when my biology teacher pretty clearly let me know that I had no future in that field. The only part of biology I could relate to at that time was genetics – pretty much every other part made no sense, and I gave up the idea of being a doctor.

Meanwhile, I had developed a BIG fascination for dog shows – both obedience and breed shows. And as luck would have it, I also had two amazing dogs in quick succession: a female yellow Labrador and a male German Shepherd. With the former, I did some decent winning in the breed ring, and with the latter I pretty much became a top contender in obedience. The dog show bug bit me so hard that my parents were seriously worried that I would drop out of school and become a professional handler or trainer.

Well, that didn’t happen – and I have an uncle to thank. He was a dog show judge who convinced me that if I became an engineer and earned good pay – I could buy any dog I wanted. And so I became an engineer, and with my first salary I bought a German Shepherd from Germany.

I found in three months flat that I hated the shop floor life. I ran away from that job and did my MBA. And halfway through, I was sure that I didn’t want to become an investment banker either. I wanted to be a programmer. And so I joined TCS, India’s largest tech shop. And a few months later, I landed in the USA through them.

In the first half of my career, I shifted employers frequently. Either I got bored with repetition, or I couldn’t stand my manager. In the second half – I was happy staying with one employer for long. And from being a programmer – I went on to do other things like functional consulting, business intelligence, CRM, project management and sales. And finally, when I was going through the partnership appointment process at IBM, I left and joined SAP to start afresh as an engineer again.

It is not that I didn’t enjoy sales – I was always making sales way above my quota when I had a sales role in the past. I just never could decide if I am a seller or a builder. Hopefully it will become clearer when I grow up. There are only two things that have been consistent in my career – an undying love for enterprise technology, and treating people the best I can. The love for technology is natural – it has remained rather flat, without ups and downs. The part about treating people well is something that I learned along the way, and continue to do. Well, there is another “constant” that I enjoy – work takes me to several different countries, and it is nothing short of fascinating to make friends with people of other cultures.

Tonight I found my little daughter playing with our new puppy, Ollie.

I snapped a photo and put it on Facebook, and a friend commented that maybe my daughter will take Ollie to junior handling at dog shows. Nothing would please me more than if she does.

That is what triggered this post – it reminded me of my own childhood, growing up with my dogs, and of how my ideas of what to aim for in life changed every so often.

Who knows what is ahead when I grow up? Maybe everything I learned along the way will finally make me qualified to be a circus ringmaster after all – or maybe a full-time dog trainer? I certainly don’t want to be a station master at the Railways anymore – I hear they don’t wave red and green flags these days.

And I certainly hope that my daughter grows up chasing (and changing) her own dreams, and balancing them with the realities of life. I can only hope that she will have a dog by her side throughout her adventures.


Watching Leadership Transitions at IBM and SAP


I “grew up” in IBM – and one thing IBM does really well is transitioning responsibilities from one leader to another. Although not at very close quarters, I watched how smoothly Sam Palmisano handed over to Ginni Rometty. And what I still remember the most from that time is Sam saying Ginni got the job because she earned it fair and square, not because she was a woman. I had watched Ginni with a lot of admiration as she moved from running global services to global sales and then on to CEO.

No one that I know – employees, customers or partners of IBM – ever mentioned to me that they were worried about the transition. There was no chaos or confusion.

Ginni was well groomed for the job – and inherited an awesome team. She made several leadership changes too, at all levels of IBM. The proof of the pudding is in the eating – and here Ginni has had a mixed year with some bad quarters, unlike her predecessor who went from strength to strength (from a financial point of view, that is). Yet, I never heard anyone asking – “What would Sam have done?”

As Sanjay Poonen recently wrote in his SCN blog – there is no success without successors. I have a faint memory of a teacher mentioning this in my high school class in the context of sports – so I am guessing this is a biblical phrase. I can’t agree more – the hallmark of a great leader is how little he will be missed when he walks away from his current role. Anyone who knows the guy taking over from Sanjay – Steve Lucas – would get what I am saying here.

My managers in IBM groomed me well too (and I am eternally grateful) for my next steps up, and I was always encouraged to watch out for my team. It was extremely gratifying to see my pal Gagan Reen take over the innovation team after I left and never miss a beat. And I know he is grooming others too. Words cannot express how cool it is to watch the cycle propel itself by paying it forward.

This is not to say all transitions are smooth. Before I joined IBM, I worked in a small company that got acquired. It was anything but smooth. I went from a position where my boss and I could talk on any topic at any point in time, to needing an appointment with his EA to get an audience. Long story short, neither my boss nor I stayed there for long. I later figured out that almost all of the acquired team dispersed because the transition wasn’t well managed. I also heard that the big company that bought us did not get much repeat business from our old customers.

And then I joined SAP about seven months ago. When I was considering the offer, I happened to run into Sanjay Poonen. Sanjay invited me to sit in on his leadership class for a day in Palo Alto, at the co-innovation lab area. In my entire career – there are only two classes that I found useful for me. Sanjay’s was one. The other was done by Bill Smilie at IBM (a program called Cornerstone). It was clear to me that Sanjay and Bill were both very good at mentoring the next generation of leaders through practical examples. Neither one preaches – it is learning by open discussion. If anyone who reads this has an opportunity to learn from these two guys – don’t think twice about it, just DO IT!

A big reason I decided to join SAP was the faith I have in the leadership team. I had known Bill, Jim, Vishal, Sanjay, Rob Enslin, Steve Lucas and many other leaders from before – either via IBM or via the blogger program at SAP. So I had no doubts that I would be joining a company that had a tremendous executive bench. Of course, I had no clue that people like Sanjay and Jim would be leaving the company so soon to begin their next adventures.

But having seen how smooth transitions can happen – I figured quickly that SAP won’t miss a beat either. Sanjay and Jim are leaving SAP in great hands.

And there is plenty of continuity – development is under the best possible guy – Vishal. And Hasso is deeply involved in technology direction. These are no ordinary people – they combine vision and execution very well. Bill has been Co-CEO for a while now – and is more than capable of running the show solo. And the strategy that SAP is executing now was formed by the leadership team that included Bill. That should ensure stability of plans. Look at the transition plan – Jim will continue for almost a year. The CFO transition also has a long timeline. In short – this is as well planned as it could be.

So as a relatively new employee – I am as comfortable about this as I was when Ginni took over from Sam when I worked at IBM. In fact, I am even a little more comfortable – given I know these SAP leaders a lot better than I knew the IBM leaders. And it is not just the very top levels – SAP has solid leadership talent right below these folks as well.

And just as no customer expressed any concern about IBM leadership changes – I am confident no one will ask me about SAP leadership changes either.

Last but not least – Sanjay and Jim, I wish you the very best.