And Vijay Says...

Making workflows sexy again with machine learning

Since I grew up in the ERP space, workflows are near and dear to my heart. I have set up a lot of workflows myself, and I have been subject to the tyranny of bad workflows a lot too. Over time “collaboration” became a thing, but classic workflows still largely rule our work life. The first time I directly set up a workflow was for a purchase order scenario in the late 90s – and I remember the client VP took me and a colleague to a fancy dinner to thank us. It solved the biggest pain for him in routine business, and for two young consultants that was like winning backstage tickets to a rock concert 🙂

So why do people use workflows? The “useful” reason is that some decisions are complex and can’t be taken by one person – because of skills, legal and other reasons. There are many “useless” reasons too. What is not talked about often in polite company is that lack of trust in fellow human beings is a big reason for the zillion workflows we all live with.

I have several friends who specialize in workflow and collaboration systems – and they take great care of their clients in setting up the most efficient and effective workflows. What often gets missed is that life changes all the time, but workflows mostly don’t change with it. And this can lead to comic, tragic and tragi-comic situations!

For example – let’s say there is an executive who runs a business whose annual revenue has a lot of zeros on the right side. But if she damages her phone and needs a new one, she will need approval from her manager to get it.

If a company trusts her to handle millions of dollars worth of business, shouldn’t she have an automatic approval for a phone? Sure she should – and one call to the CIO can probably get this workflow fixed right away for her and everyone like her in the company. But it’s not just her – what if this is a non-executive employee who has a critical job function like door-to-door delivery, where the ability to reach a customer by phone is paramount? Sure he needs it too – and another call to IT (but this time from an upline manager in escalation mode) can fix that problem quickly as well. But how many variations can happen in a workflow before it reaches the “this crap cannot be sustained” mode? It takes very little time, and I have lived through that nightmare a few times when I was a young consultant. And however carefully we craft the design of workflows – we won’t be able to predetermine all the options that become necessary across the enterprise as markets evolve and businesses adapt to keep up.

It’s probably never going to get fixed completely – but machine learning can help solve a lot of these painful problems . Even if an automatic fix is probably hard, given legal and financial policies don’t move at the speed of innovation in STEM, we can make a tangible impact with meaningful insights .

The data about existing workflows is easy to get. That is enough information for an algorithm to start finding patterns. Then it’s a matter of introducing other data sets and seeing what we can learn – like say weather, sales data, budgets etc. In our example of the executive – if an algorithm learns that there is a huge business impact when she loses her phone, it can trigger an order automatically. This is much more sustainable than a deterministic “if exec, then auto approve” rule. Why? Say the same exec moves to a non-P&L job and has a desktop that gives her access all day while a new phone gets ordered. A learned pattern can pick up that the impact has dropped, while the hard-coded rule would keep auto-approving.
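To make the idea a bit more concrete, here is a minimal sketch of learning from past approvals instead of hard-coding an “if exec” rule. Everything in it is an assumption for illustration: the `HISTORY` records, the role names and the `min_rate` / `min_samples` cutoffs are made up, and a simple frequency table stands in for whatever real model you would actually train.

```python
from collections import defaultdict

# Hypothetical history of past requests: (job_function, request_type, approved).
HISTORY = [
    ("field_delivery", "phone", True),
    ("field_delivery", "phone", True),
    ("field_delivery", "phone", True),
    ("pnl_executive", "phone", True),
    ("pnl_executive", "phone", True),
    ("pnl_executive", "phone", True),
    ("desk_analyst", "phone", True),
    ("desk_analyst", "phone", False),
    ("desk_analyst", "phone", False),
]

def learn_approval_rates(history):
    """Count how often each (job_function, request_type) pair was approved."""
    counts = defaultdict(lambda: [0, 0])  # key -> [approved, total]
    for job_function, request_type, approved in history:
        counts[(job_function, request_type)][1] += 1
        if approved:
            counts[(job_function, request_type)][0] += 1
    return {key: (ok / total, total) for key, (ok, total) in counts.items()}

def suggest(rates, job_function, request_type, min_rate=0.95, min_samples=3):
    """Suggest auto-approval only when history is both large enough and consistent."""
    rate, total = rates.get((job_function, request_type), (0.0, 0))
    if total >= min_samples and rate >= min_rate:
        return "auto-approve"
    return "route to manager"

rates = learn_approval_rates(HISTORY)
print(suggest(rates, "pnl_executive", "phone"))
print(suggest(rates, "desk_analyst", "phone"))
```

The suggestion is keyed on the current job function and not on seniority, so when the same person moves to a different kind of job, the answer can change without anyone calling IT.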

Humans cannot keep track of all the workflows that are set up over time . No one needs an extra notification or email if they can help it . So machine learning can also be used to keep track of how the workflow landscape evolves over a period of time and suggest meaningful ideas to the workflow admin on options to optimize .

If an hour a week gets saved for a given employee by eliminating useless workflows and making existing ones smarter, that adds up over a year to more than a week’s vacation that you can give that person at no extra cost in productivity. How cool will that be?
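The back-of-the-envelope math behind that claim, assuming the saving holds for all 52 weeks and a standard 40-hour work week (both are my assumptions, not measured numbers):

```python
HOURS_SAVED_PER_WEEK = 1
WEEKS_PER_YEAR = 52    # assumption: the saving accrues every week of the year
WORK_WEEK_HOURS = 40   # assumption: a standard 40-hour work week

hours_saved_per_year = HOURS_SAVED_PER_WEEK * WEEKS_PER_YEAR
work_weeks_saved = hours_saved_per_year / WORK_WEEK_HOURS

# 52 hours a year is more than one 40-hour work week.
print(hours_saved_per_year, work_weeks_saved)
```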

There are really no data scientists in the wild !

There are statisticians, there are mathematicians, there are engineers, there are machine learning programmers, and there are many other types of experts out there – but there are really no data scientists out there in the wild! What exists are data science teams, and many are generally awesome. That is my conclusion after trying really hard to become a data scientist myself over the last few months. I am not giving up quite yet – but I am at a stage where I need to express my opinion on the matter for what it’s worth 🙂


I thought I had good odds to be a decent data scientist, at least on paper. I think I am a good programmer and while I still think in C when I am coding , I can work on R and Python without sweating it too much . I am an engineer and have a degree in business – and I was convinced that I have more than an elementary capability in math and stats . I can do most data engineering work to get bad data into a shape that a machine can crunch. I spent a lot of time in BI – which made me believe I can visualize data really well. And so on . Yet , I didn’t become a data scientist despite my honest efforts to become one, and I think I now know why .

Between my engineering/business background and a couple of decades in consulting – three things come naturally to me when I am faced with solving problems:

1. The classic MECE approach

2. Thinking about it from the client view and working back to what I can do

3. Trying to get to a solution from first principles so that I trust the output

On the flip side, when I cannot do a good job on any of these three things, I get extremely frustrated. And in this effort to become a data scientist, I stumbled on all three. I am also close to questioning the idea of calling this domain data science. It has more of an art feel to it – it’s like a halfway point between an architect and an engineer, a bit weird. This could be an emotional response, so I am not going to make a fuss about it in this post.

As I played with it for a while – I understood that a few things need to come together for data science to work effectively for my clients, not necessarily in the linear fashion I call them out below.

Define a problem in a way that it can be solved – some kind of designer/consultant type skill which I am generally good at, I thought. Turns out you just keep redefining the problem as you learn more.
Create an abstraction – what programmers call “logic” or “algorithm” , and what math geeks call “model” . This needs a lot of “rinse and repeat” as I figured. I could have saved a lot of trouble if I started plotting data in simple dimensions first – a lesson I won’t ever forget.
Find, clean and assemble data to feed into the model – the data engineering skill, and it becomes a challenge when data is really big. Analyzing data makes you wonder about your sampling strategy throughout. There are always gaps and it will make you say “maybe” or “it depends” as the answer to most questions.
Figure out your model is crap, and explore alternatives endlessly. I realized I had forgotten how common substitutions worked in integral calculus, and it took a lot of googling to get me comfortable on a first principles basis that what I was doing was sensible math. On the bright side my linear algebra skills are still sharp – but clearly that is not enough.
Figure out what is worse – false negatives or false positives – and have a strategy to not have too many, and to explain the few you will always get. This needs extremely good domain/industry knowledge, and the kind of assumptions you make can be comical when you run them by a real expert.
Finally – you figure out a half decent solution, fully knowing you can’t be really sure. At this point – you need to figure out a way to tell the story, usually with visualization. Voila – your entire perspective on how to tell a story with data will change quickly. I always loved D3, but now we are soul mates.
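The false negatives versus false positives question from the list above can be sketched in a few lines. This is only an illustration under made-up assumptions: the `SCORED` data, the candidate cutoffs and the cost numbers are all hypothetical, and in real life the costs are exactly what the domain expert has to give you.

```python
# Hypothetical model scores (higher = more likely positive) with true labels.
SCORED = [
    (0.95, 1), (0.80, 1), (0.70, 0), (0.60, 1),
    (0.40, 0), (0.30, 1), (0.20, 0), (0.10, 0),
]

def confusion(scored, threshold):
    """Return (false_positives, false_negatives) at a given cutoff."""
    fp = sum(1 for score, label in scored if score >= threshold and label == 0)
    fn = sum(1 for score, label in scored if score < threshold and label == 1)
    return fp, fn

def best_threshold(scored, cost_fp, cost_fn, candidates):
    """Pick the candidate cutoff with the lowest total error cost."""
    def total_cost(threshold):
        fp, fn = confusion(scored, threshold)
        return fp * cost_fp + fn * cost_fn
    return min(candidates, key=total_cost)

CANDIDATES = [0.25, 0.50, 0.75]

# When a miss costs 10x a false alarm, the cheapest cutoff is the low one.
print(best_threshold(SCORED, cost_fp=1, cost_fn=10, candidates=CANDIDATES))
# When a false alarm costs 10x a miss, the cheapest cutoff is the high one.
print(best_threshold(SCORED, cost_fp=10, cost_fn=1, candidates=CANDIDATES))
```

Same model, same data, and the “right” cutoff flips depending on which mistake hurts more. That is why this step needs someone who knows the business.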

It is nearly impossible for one human being to be great at all these things – the best case is that you get to be really good at one or two, and have a solid appreciation of the rest. In other words – a bunch of experts in these areas can be brought together to form a great data science team. But it is just about impossible for one person to have all these skills and hence be called a data scientist.

I also feel I should express my “amusement” about machine learning on two aspects before I end this rant.

Depending on whose book you read, or who you talk to – you will think machine learning has two distinct flavors. One is a math flavor, and the other is a programming flavor. I have more developer friends than math geek friends – so I mostly got a math flavored “black box” answer every time I had that conversation. But the books I studied were mostly written by stats majors.
The fact that a model is the right one does not mean that it performs well in production. You can sample ( I am staying away from my endless fights with bias, even for “simple” cases) and take smaller data sets to make your model work . But then you get the idea of running your logic against big hairy data – and suddenly you realize that your “black box” algorithms don’t all scale to work in parallel mode. I am now stuck in a debate with myself on whether a code rewrite , or a different math approach is better to crunch all the data I want.
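A tiny example of why some logic runs in parallel easily and some does not. This is my own illustration, not one of the algorithms I was fighting with: a mean can be rebuilt exactly from per-chunk summaries, while a median cannot be rebuilt from per-chunk medians. The data and the two-way split are made up.

```python
from math import isclose
from statistics import mean, median

def partial_sum_count(chunk):
    """Per-chunk summary that each worker could compute independently."""
    return sum(chunk), len(chunk)

def combined_mean(chunks):
    """Merge per-chunk (sum, count) summaries into the exact global mean."""
    summaries = [partial_sum_count(c) for c in chunks]  # the parallelizable step
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    return total / count

data = [1, 2, 3, 4, 100, 200, 300, 400, 500]
chunks = [data[:4], data[4:]]

# The mean merges cleanly across chunks.
print(isclose(combined_mean(chunks), mean(data)))

# The median does not: a median of chunk medians differs from the true median here.
median_of_medians = median([median(c) for c in chunks])
print(median_of_medians, median(data))
```

When the statistic you need behaves like the median, you are choosing between rewriting the code around a distributed algorithm and switching to math that merges cleanly.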

It’s clear that stats majors and CS majors should really talk more and not let me be the one worrying about these kinds of problems. I am happy to buy the pizza and beer for you folks 🙂

PS : my dear friend Sameer who is the chief of Kahuna , showed this blog to his data science leader Andrew – and here is his counterpoint . You should absolutely read this too – debates and strong opinions are good things !

When machines think on our behalf !

I don’t really think machines will displace humans in significant numbers for a long time – but I do think we have an interesting time ahead of us where we let machines think on our behalf quite a bit .

Every company out there has rebranded itself as an AI company. The first generation of this falls broadly into two categories:

1. Telling an AI system what we want to do – order a coffee , close the curtain – pretty much call an available API to do something

2. Using AI to learn and do something better – like switching from carpet-bombing marketing campaigns to better targeting

But this is just a temporary phase. Why do you want to ask your AI wizards to order coffee for your home – isn’t it better to let the machine reorder coffee when the stock drops to some level? Should it even ask you to confirm these kinds of routine activities, or just do it without asking? For about half the things I need routinely, I am totally cool with having a machine do it without asking me anything, especially about coffee. I get mad at myself when I forget to pick up coffee and I don’t have much left in the kitchen when I need my coffee. I am sure I am not the only one who is ready to offload some routine activity to machines.

So this poses some interesting challenges, like – if my AI system is the one ordering groceries for me without my input, how do other coffee vendors ever get my business?

My wife already thinks I spend way too much on coffee. So she may be able to tell the AI system to limit my purchases to say $50 a month. So now my AI thingy needs to be coupon shopping and stuff to stay within budget – but that is easy, machines can do this math stuff better than us anyway.
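Here is a rough sketch of what that “reorder without asking, but stay in budget” logic could look like. The vendor names, prices, stock levels and the `choose_reorder` function are all hypothetical; only the $50 monthly cap comes from the story above.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    vendor: str
    price: float  # price after any coupon, in dollars

def choose_reorder(stock_grams, reorder_below_grams, spent_this_month, monthly_cap, offers):
    """Return the cheapest affordable offer if stock is low, else None."""
    if stock_grams >= reorder_below_grams:
        return None  # enough coffee left, nothing to do
    remaining = monthly_cap - spent_this_month
    affordable = [o for o in offers if o.price <= remaining]
    if not affordable:
        return None  # over budget: this is where a human would get asked
    return min(affordable, key=lambda o: o.price)

offers = [Offer("RoasterA", 18.0), Offer("RoasterB", 14.5), Offer("RoasterC", 22.0)]
pick = choose_reorder(
    stock_grams=80,
    reorder_below_grams=150,
    spent_this_month=30.0,
    monthly_cap=50.0,
    offers=offers,
)
print(pick)
```

Notice that the only thing a vendor can influence here is the price the machine sees, which is pretty much the marketing question that comes next.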

This makes me wonder: what is the future of marketing itself?

Simple – brands stop marketing to me and instead they (as in their AI systems) will market to AI systems! And brands will do whatever they can to convince my AI system to feed me their coffee first, to increase the chance of my business becoming a reorder situation!

Well , guess what – this means we are in the “my AI is smarter than your AI” world at that point . The bright side – email spam reduces significantly for me as a human , and I have some more time on my hands .

But this is not without its share of dilemmas – for example, what if the AI provider for me and the coffee company are one and the same, or if they are two companies that share my data? Am I going to be put in a situation where I am negotiating against myself? So we do need some clear guidelines established on ethics, legality, security and even morality before we get to that point.

We have a good grip on what happens when AI does smart stuff and humans are on the other side – like customer service, sales etc. But the thing that excites me the most is when both sides of a transaction are AI systems. I am betting it won’t take even 5 years for us to see this go mainstream. Are you ready?