The fascinating world of advanced analytics 

I had a lengthy conversation with an old client yesterday about his plans to start some cool new projects. That is what triggered this blog post.

Most of you already know that I am not too big on categorizing this topic into predictive, prescriptive, cognitive and so on. I never cared about the old “is it analytics, is it BI, is it just reporting” debate either. All I care about is whether data can be used to solve problems somehow. So when I say advanced analytics – all I mean is using data in slightly more sophisticated ways than reporting past performance.

On the software side – volume is no longer as big a challenge as it once was. Storage (and RAM and CPU and …) is getting cheaper and compression is getting better. It’s still not trivial. For example, when a semiconductor chip is produced, one optical scan can produce 10 to 30 TB of data per wafer. Even at current prices, storing all of it is expensive.
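To put rough numbers on the wafer example, here is a back-of-the-envelope sketch. Only the 10 to 30 TB per wafer range comes from the text above. The wafer count, the retention period and the one-scan-per-wafer simplification are made-up assumptions for illustration, not figures from any real fab.

```python
def scan_storage_tb(wafers_per_day, tb_per_wafer, days_retained):
    """Raw storage in TB if every wafer gets one optical scan and all of it is kept."""
    return wafers_per_day * tb_per_wafer * days_retained


# Hypothetical inputs, chosen only to show the order of magnitude.
WAFERS_PER_DAY = 1_000
DAYS_RETAINED = 30

low_tb = scan_storage_tb(WAFERS_PER_DAY, 10, DAYS_RETAINED)
high_tb = scan_storage_tb(WAFERS_PER_DAY, 30, DAYS_RETAINED)

if __name__ == "__main__":
    # 1 PB is treated as 1,000 TB here.
    print(f"Roughly {low_tb / 1000:,.0f} to {high_tb / 1000:,.0f} PB for one month of scans")
```

Even if my assumed wafer count is off by a factor of ten either way, keeping every scan adds up fast.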

The hard parts are still velocity and variety. Everyone can eventually get to the same result – but competitive advantage goes only to the first few who can see the result. Even within that set – only a small number can actually act on the information quickly. Now if the raw data hits you really fast – there are real challenges.

Software (whether it is an app or a database) is rarely optimized for reading and writing at the same time when the incoming data is variable. If you need to put a lot of data into your system at high speed, as in some IoT scenarios – there are databases that are optimized for it. But those databases need extra fittings to make that data available for analysis in real time. There are others that can do sophisticated analysis, but they don’t always allow data to be loaded at the speed it arrives. Essentially, a lot of compromises and data duplication are still daily struggles for many of us. Granted, it is getting better – but it’s not there yet.

At many of my customers – even after all the software puzzles are solved, we hit a wall on network bandwidth. Cloud is sexy and all – but every hop takes a toll.

After we figure out the right way to put all the data into the places we need – then the analysis starts. Between COTS and open source, there is no shortage of software that can do the job. But the idea of democratized advanced analytics is still a distant dream.

There are many aspects to this problem:

1. There just aren’t enough people who understand statistics (err, data science I mean) to begin with.

2. Generic data scientists won’t cut it. Sales data from an automobile company cannot be analyzed the same way as sales data from an aerospace company. Industry knowledge is key. That makes it even harder to get the right people on the job – and hence you need teams that have both data scientists and industry experts for the foreseeable future.

3. There is hardly any consistency in legal matters across the world. So now we also need lawyers in such teams to make this work in a way that no one goes to jail 😉

4. Legal does not always mean ethical (there is a surprise). So now we need an ethicist (such people do exist) to help answer some questions on what data and what analysis is ethical. Then you might need an MBA to figure out the solution with all the constraints applied 😉

5. Even if you have all these people and all the right software, you still need to convince the customer that it’s a big production and it comes at a price, although for all the right reasons. Then you need to explain why it is not a good idea to get two data scientists from the body shop to create a model.

6. Even if the customer sees the value and spends the money – after the team shows the model, it could look really simple and the customer will again ask “why didn’t I just hire two junior data scientists to get it done?”. (The sausage making, that is fitting the models, is usually darn tiring grunt work, and it is not fun to watch unless you are a data scientist yourself.)

7. Neither models nor data stay static. Unforeseen things can come up and there might not be a way to predict meaningfully using past data and analysis. For example – look up why Nate Silver could not predict Donald Trump becoming the GOP nominee.

8. Every prediction comes with caveats – some trivial and some complex. Trivial ones can usually be ignored and an automatic action triggered (for example, ABS kicking in on a car when some conditions are met). The complex ones need significant explanation – and that is not easy unless the recipient of the information understands some basic statistics. There are software vendors who claim their wares can make predictive analytics available to lay users. What they don’t do is explain what caveats apply when those users see the results of their analysis.

9. Like with everything else, things go wrong all the time in analysis too. Complex analysis is really difficult to debug today.

10. Even if all these challenges are overcome, and you tell the customer there is a 90% chance of door 1 being the one to open to find the pot of gold, it could still be that door 2 was the right answer that time. So now you have to explain why that happens.
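The last point in the list is the easiest one to show with a toy simulation. This is only a sketch under a big assumption: that the 90% figure is perfectly calibrated, so the favored door really is right nine times out of ten. The function name, the trial count and the seed are mine, picked for illustration.

```python
import random


def simulate_misses(p_correct=0.9, trials=10_000, seed=42):
    """Fraction of trials in which the favored door turned out to be wrong.

    Assumes the forecast is perfectly calibrated: each trial is an
    independent draw that comes out right with probability p_correct.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    misses = sum(1 for _ in range(trials) if rng.random() >= p_correct)
    return misses / trials


if __name__ == "__main__":
    rate = simulate_misses()
    print(f"The favored door was wrong in {rate:.1%} of trials")
```

The miss rate lands near 10%, which is the whole point: a single miss on door 2 does not by itself mean the model was wrong. It is what a good 90% prediction looks like about one time in ten.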

I can go on, but I am sure you get the rough idea already. If not – buy me a beer and I will give you some examples of real-life situations I have dealt with 🙂


Sapphire Now 2016 – an event report

To begin with, I am so tired now! I thought I had an easy schedule, but between lack of sleep and walking around OCCC, I am exhausted. I skipped the concert, ordered in some food and thought I would post this blog before sleeping.

It was great to meet friends from all over the world . It’s absolutely an annual pilgrimage for many of us . Steve Wozniak on the ASUG keynote stage was a great bonus – what a great guy !

First off – SAP needs to be congratulated on its tremendous customer focus this year. Having an iconic customer like Nestlé on the keynote stage with a strong endorsement – that was huge!

I did have a chance to talk to several SAP customers this year across the event, the various parties and at the Hilton bar. Now I clearly know why SAP leaders hit on the “empathy” theme in their keynotes – customers are looking for more action (perhaps less talk too) from SAP on implementation management, integration etc. Surprisingly, no one said anything about SAP support. Maybe 3rd party support is not as big a deal now as I expected it to be.

The “surprise” for me honestly was that SAP leaders themselves seemed to be surprised by what they heard from the CIOs they spoke to. This frustration has been there for a while in the customer ecosystem.

Integration is a complex topic for SAP for many reasons, such as:

1. SAP has a bloated portfolio. How many ERP versions are there today? Three S4 things, A1, B1, BByD, Anywhere … And some are probably overlapping with each other. Sure, it can be argued that the customer base can be segmented in a way that makes a case for each ERP product. But that is mostly an excuse in my mind. If they don’t rationalize the portfolio – real integration will remain a dream.

2. SAP has a complex, matrixed organization. It’s not always easy to find one owner for any integration. Someone at executive board level will need to be a taskmaster to pull this off, and it will take some time to get done.

3. There are way too many acquired products – which don’t share technology or metadata with core SAP products. It won’t be trivial to build meaningful and seamless integrations.

Nevertheless, I think SAP absolutely needs to be congratulated for taking on this challenge head on. I wish Bernd and others the best.

Just as integration is a challenge, so is tackling customization. SAP has had many different ways to customize and enhance its products over time. If the engineering team can find an automated way to extract those into an abstraction layer – SAP can move its entire ERP customer base to the cloud in one shot without any of them losing their supposedly unique differentiators. It will be a moonshot R&D project for SAP and I am not sure if that is something they will take on now or in the future.

I spent some time with Nayaki Nayyar on IoT. It’s probably the first time I heard a clear articulation of what SAP wants to do in this space. Good start, and I will be keeping an eye on it to see what unfolds.

Carsten Thoma, the president of SAP Hybris, was impressive in his vision and his pragmatic approach. Integrating legacy CRM, SD etc. with SAP Hybris is not trivial. New channels keep cropping up and this team has a lot on its hands. For those of you who understand pricing engines – think of the effort to rationalize three or four disparate approaches! What impressed me the most was the idea of a customer profile that SAP Hybris can generate on the fly across channels. Carsten confirmed they use SAP HANA and MongoDB for it in the back end.

Alex and Dinesh hit it out of the park with what they showed me about Ariba. Brilliant UX, and probably the product that convinced me that SAP can deliver on simplicity. The guided buying scenario, triggered by a free form search – I loved it. Dinesh showed us how the APIs are exposed – and I hope they open them up to every developer in the world, and not just the registered SAP partners. My other wish was that suppliers be allowed to push campaigns and promotions to their catalogs.

Talking about UX – I did not get as big a kick as I expected out of the digital boardroom solution. Granted, it looks very pretty – but it seemed to be more about visualization and less about insights and action. Also, if you are showing cross enterprise data – should it not have search as the primary interaction medium? Also – it seemed like there were a lot of “clicks” on the touch screen when I saw the demo – instead of “touch”. I really think SAP should rethink the design from the ground up.

I absolutely loved the web tool to help plan S4 migrations that was presented during Hasso’s keynote. Excellent investment by SAP!

Talking about overlapping products – the BI portfolio is a classic example of SAP historically resisting rationalization. This year, Steve Lucas announced they are going to categorize everything into two bundles: one for cloud self-service and the other for enterprise. That is a good, logical move.

And they are bringing back the “BusinessObjects” brand in a big way! I thought it was a bit odd to double down on the BOBJ brand to fight the likes of Tableau – but we will see how that plays out in the market. Irrespective of branding, I hope these two buckets are a good start in streamlining the portfolio. Success can be declared when in future we don’t see sold out sessions on “how to pick the best SAP BI tool” at events like Sapphire 🙂

I failed miserably in my quest to visit every single booth on the show floor this year. I visited 12 on Tuesday, 8 on Wednesday and 2 on Thursday. I was not exactly thrilled with what I saw in the booths that I did manage to visit. There were way too many PowerPoints, and the few demos that were there were mostly about dashboards. Clearly SAP’s design thinking philosophy hasn’t yet caught on with several partners.

Many thanks to SAP for having me – it was great to be back here after a gap of a few years!

Going to SAP Sapphire Now as an outsider for the first time 

I am sure someone will correct me immediately that it’s the big ASUG show too 😉

For a few years now, I have been away from the SAP field in general. That is weird in itself, given SAP dominated every part of my job from the time I left business school till I left SAP Labs. I have kept in touch with several friends from SAP land (and Bill McDermott did assure me a few weeks ago that I will always be family to SAP – thanks Bill), but I have lost track of SAP products and technologies. Although Sapphire is a mega sales event, for me – this trip is mostly for education. Well, that and some networking at the Hilton bar.

There are three things I remember most fondly about the time at SAP Labs . 

1. Putting a free trial of BW and BO on HANA on AWS .

2. On my last day in the office, debugging “Simple Finance” along with Hasso and realizing we were probably the only available people at that point in building 1 who knew how to work with SAP FI 😉

3. Inviting IBM to let Watson and HANA play together

I am sure there were many more good things, just that I can’t seem to remember them at the moment. There were a bunch of severe disappointments too, but I will write those off as valuable learnings.

While engineers and researchers at both SAP and IBM thought bringing these two technologies together would be awesome, for the most part nothing much happened in terms of actual integration. I left SAP and went to MongoDB, and later returned to IBM. By that time Watson had become a real business and my team was involved in selling and delivering it.

In my second term at IBM, the focus has been away from enterprise applications and more on big data, cognitive, IoT etc. Other than occasional conversations with my friends leading our SAP practice, I had no idea how HANA and S4Hana and HCP and all have progressed. And then lo and behold – there I see the announcement that SAP and IBM are partnering on HANA and Watson.

As excited as I am about my wish finally coming true, the most gratifying thing for me was that this was spearheaded on the technical side by my buddy (some might say protege) Gagan Reen. Gagan was the first to jump in when I had the crazy idea to POC HANA for a TechEd 5 years ago. He is still just as passionate about SAP technologies as he was when I first met him.

Seeing him and others that I mentor grow into well respected leaders beats every other career accomplishment I might have had. Now it is even more gratifying to see these leaders paying it forward and grooming another set of leaders.

Special thanks to Mike Prosceno and Stacey Fish (absolutely the best in the business) for getting me to Sapphire again this year. I can’t wait to catch up. It’s been a while since I caught up with SAP mentors, bloggers and analysts. All I can hope is that the Hilton bar has enough beer stocked 😉

And for the first time ever, I plan to visit every vendor booth at Sapphire.

Here are the things I most want to learn this time:

1. Details of the new Apple partnership beyond the PR message that came out

2. What is new with HCP ? 

3. Details of cognitive solutions on S4Hana beyond what I know today 

This is going to be a blast !

Election 2016 might shake our faith in data science

First off – I am about as independent and undecided as one can get in this country. In general I am fiscally conservative and socially liberal. I am not a big fan of Clinton, Sanders or Trump – and can’t even decide who I dislike the least. But – as the primaries wind down, I am starting to follow the election with great interest. This is not merely political interest or for entertainment – it just gets me all geeked out on the potential of data science to help these candidates.

Trump is the “presumptive” candidate for the GOP. That does not say anything about his chance of winning. He is not anywhere close on any dimension to the kind of GOP candidates that went before him in past elections. So the idea of “red states” and “blue states” as it exists today does not really matter in figuring out how he can win.

I am not sure if his campaign has used extensive data analysis so far. On TV, it seemed like he just used his big personality (and the free media coverage it drew) as the primary weapon, and it threw off the conventional campaigns of his competitors. Even the granddaddy of all data scientists who cover elections – Nate Silver – was thrown off his game by the Trump campaign. But now that he is the presumptive candidate – his campaign will probably take a more data-driven approach in the general election.

The modeling approaches could get very complex for figuring out what Trump should do to attract votes. I read somewhere that his big voter base is white males without a college degree. Since education levels have generally increased over the years, it will be interesting to see how the percentages work in each state. The general theory is that the black community does not like Trump because he challenged Obama’s birth certificate. But if all things remain equal, even if he takes a small chunk of black community votes – he might carry a state. But then women apparently don’t like him either – which adds to the complexity of a predictive model. Past polling data and all kinds of analysis that the RNC must have done – my bet is that it won’t be of much use and new models will need to be created and tweaked.

It’s not any easier to predict what works best for Clinton to get to the magic number of 270 in the general election. The math looks to be in her favor to win her party’s primaries. That said – Bernie Sanders has an extremely loyal base, especially amongst young voters. Even if Bernie himself endorses Clinton at the convention, as I think he will – I am not sure whether his supporters will care for Clinton. In that case – will they vote for her, stay home or, god forbid, vote for Trump? Also – just as GOP data scientists will have to find what exactly works for Trump – the Clinton camp will need to find messages that work against him. Given no history exists for a candidate like Trump – this exercise should wear out a lot of keyboards.

While Trump is famous for his gut instinct driving his primary wins, there is one aspect that makes me think a bit of data analysis has helped his cause. His idea of tagging Bush as “Low energy”, Rubio as “Little”, Ted Cruz as “Lying” and finally Clinton as “Crooked” seems to me like a data-driven strategy. Maybe it did not start that way and his gut instinct gave him the idea, beginning with Jeb Bush. But my best guess is that his team picked up on it and tested the other adjectives before he used them effectively in his debates and stump speeches. Of course I can only guess – and I would love to see what comes out after the election cycle is over and someone writes a book.

I am sure a book or two will be written – this election will put a lot of data science and its practitioners to the test. I can’t wait.