And Vijay Says...

So you want to join a large company , eh ?

In my couple of decades in the corporate world – I have done stints at both large and small companies . I have also hired a lot of people over the years and watched their careers in those companies . These days when I hire – a lot of applicants tell me “it’s a really large company and that worries me” . So here is an attempt to provide some color commentary on this large company thing to help you think it through .

Why do you want to work for a large company at all ?

The truth is that while the company is large – YOU are probably going to be working in a team that is not that large . The better question to ask is about the team you will be a part of . If you don’t like the team’s mission or the people in it – walk away and don’t look back .

Large companies mostly do things at larger scale . What they occasionally lack in speed , they make up in scale . Scale comes in many flavors and not everyone can deal with scale very well . For example – in the last 5 years , I had to run portfolios that were an order of magnitude larger than previous ones . I had to unlearn and relearn a lot to make it work and it was not easy . I have seen this work both ways – some people feel stifled at smaller companies because they want to change the world and they can’t find an opportunity to do so where they are . Some others at large companies beat their heads against a wall because they can’t move their ideas at the speed they want despite having access to vast resources . Choose wisely !

I often get asked “wouldn’t larger companies be really political?” . My answer is “absolutely – but not any worse than smaller companies”. Politics is everywhere and you need to learn to live with it and navigate it . Also what is politics for you will be routine for someone else . Don’t sweat too much on that front . My own experience with small companies as an employee – which obviously is not a valid sample – is that favoritism and other political shenanigans are alive and well there , and more magnified because of the smaller number of people .

Another common question is “wouldn’t I be lost in this big ocean?” . And my answer is “yes – unless you show real results”. Large organizations are unwieldy to manage and hence get matrix management structures . It is very easy to get lost in the system and it’s no fun to work that way . BUT – if you are good at what you do , and can show real results , the system favors you by design and you will get noticed quite quickly . If you are average – you will be the tree that fell in the forest that no one ever heard . So if you are not sure of your abilities to consistently deliver above average results , and if it’s important for you to get recognized – you should rethink the idea of joining a large company .

Here is another one “I have heard the only way to succeed is to have a godfather in the higher ranks”. Well – having a godfather certainly doesn’t hurt . But your real question is how do you get one ? Bosses like team members who make them successful . When you see a top executive giving special attention to someone – don’t just assume sinister things are at play or that the employee is sucking up . While those things all happen from time to time – in the majority of cases , that employee has gone above and beyond in making their boss successful and is just getting noticed for good work . Also – sucking up to the boss is rarely a sustainable strategy . A VP of sales cannot “hide” a poor performing director of sales for very long. Unfortunately in very large teams where metrics are not clear – you may run into these bad scenarios . I have witnessed it a few times in large engineering and marketing teams .

Large companies are often blamed as slow and bureaucratic . There is absolutely merit in that allegation . However , it has a good side too . Large companies are predictable in the sense that they rely on policies and procedures a lot . The policies themselves might be terrible and outdated – but you know what they are upfront . Also – if you run into problems like say a bad performance appraisal , or a commission dispute, you can rest assured that there is a well defined process to rectify that and at least in my personal knowledge – it mostly favors employees . There are exceptions and those usually get the most publicity .

One last point before this flight lands – the reason large companies want to hire you is usually that they think there is something special about you that they value . They are usually not looking for one more of what they already have . So find that out while you assess your future employer – if you think you have to morph into something you are not , this might not work out well for you or the company despite all the money and titles . I learned this lesson the hard way and hopefully you don’t have to 🙂

Making workflows sexy again with machine learning

Since I grew up in ERP space , workflows are near and dear to my heart . I have set up a lot of workflows myself and I have been subject to the tyranny of bad workflows a lot too . Over time “collaboration” became a thing but classic workflows still largely rule our work life . The first time I directly set up a workflow was for a purchase order scenario in the late 90s – and I remember the client VP took me and a colleague to a fancy dinner to thank us . It solved the biggest pain for him in routine business and for two young consultants – that was like winning backstage tickets to a rock concert 🙂

So why do people use workflows ? The “useful” reason is that some decisions are complex and can’t be taken by one person – because of skills , legal and other reasons . There are many “useless” reasons too . What is not talked about often in polite company is that lack of trust in fellow human beings is a big reason for the zillion workflows we all live with .

I have several friends who specialize in workflow and collaboration systems – and they take great care of their clients in setting up the most efficient and effective workflows . The problem is that life changes often, but workflows mostly don’t change with it . And this can lead to comic and tragic and tragi-comic situations !

For example – Let’s say there is an executive who runs a business, which has annual revenue that has a lot of zeros on the right side . But if she damages her phone and needs a new phone , she will need approval from her manager to get one .

If a company trusts her to handle millions of dollars worth of business , shouldn’t she have an automatic approval for a phone ? Sure she does – and one call to the CIO can probably get this workflow fixed right away for her and everyone like her in the company . But it’s not just her – what if this is a non executive employee who has a critical job function like door to door delivery where the ability to reach a customer by phone is paramount ? Sure he needs it too – and another call to IT (but this time from an upline manager in escalation mode ) can fix that problem quickly as well . But how many variations can happen in a workflow before it reaches the “this crap cannot be sustained” mode ? It takes very little time and I have lived through that nightmare a few times when I was a young consultant . And however carefully we craft the design of workflows – we won’t be able to predetermine all the options that become necessary across the enterprise as the market evolves and businesses adapt to keep up .

It’s probably never going to get fixed completely – but machine learning can help solve a lot of these painful problems . Even if an automatic fix is probably hard, given legal and financial policies don’t move at the speed of innovation in STEM, we can make a tangible impact with meaningful insights .

The data about existing workflows is easy to get . That is enough information for an algorithm to start finding patterns . Then it’s a matter of introducing other data sets and seeing what we can learn – like say weather , sales data , budgets etc . In our example of the executive – an algorithm that learns that there is a huge business impact if she loses her phone can trigger an order automatically . This is a much more sustainable way than a deterministic “if exec , then auto approve” rule . Why ? Say the same exec moves to a non P&L job and has a desktop where she has access all day while a new phone gets ordered . The static rule would still auto approve , while an algorithm that learns from actual impact would adapt .
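To make the contrast with a hard-coded rule a bit more concrete, here is a minimal sketch. Everything in it is an assumption made up for illustration: the history records, the two features (`is_pnl_owner`, `has_desktop_fallback`), the 0.7 threshold and the function names. It is a simple smoothed frequency estimate, nowhere near a production model, but it shows how the same executive can get a different answer once her situation changes.

```python
from collections import defaultdict

# Hypothetical history of lost-phone incidents:
# (is_pnl_owner, has_desktop_fallback, had_business_impact)
HISTORY = [
    (True, False, True),
    (True, False, True),
    (True, False, True),
    (True, True, False),
    (True, True, False),
    (False, False, True),
    (False, False, False),
    (False, True, False),
]

def fit_impact_rates(history):
    """Count how often a lost phone hurt the business per feature combination."""
    counts = defaultdict(lambda: [0, 0])  # features -> [impact_count, total]
    for is_pnl, has_fallback, impacted in history:
        key = (is_pnl, has_fallback)
        counts[key][0] += int(impacted)
        counts[key][1] += 1
    return dict(counts)

def impact_probability(rates, is_pnl, has_fallback):
    """Laplace-smoothed estimate of P(business impact | features)."""
    impacted, total = rates.get((is_pnl, has_fallback), (0, 0))
    return (impacted + 1) / (total + 2)

def should_auto_order(rates, is_pnl, has_fallback, threshold=0.7):
    """Skip the approval step only when the learned impact is high enough."""
    return impact_probability(rates, is_pnl, has_fallback) >= threshold

RATES = fit_impact_rates(HISTORY)

# P&L owner with no fallback device: the estimate is high, so order automatically.
print(should_auto_order(RATES, is_pnl=True, has_fallback=False))
# Someone with desktop access all day: the normal approval path is fine.
print(should_auto_order(RATES, is_pnl=True, has_fallback=True))
```

The point of the design is that nothing mentions the job title. The decision follows the observed impact, so it moves when the data moves.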

Humans cannot keep track of all the workflows that are set up over time . No one needs an extra notification or email if they can help it . So machine learning can also be used to keep track of how the workflow landscape evolves over a period of time and suggest meaningful ideas to the workflow admin on options to optimize .
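As a rough illustration of what such a suggestion to the workflow admin could look like, the sketch below scans a workflow log and flags approval steps that practically never say no. The log format, the workflow names and the cutoffs (`min_requests`, `min_approval_rate`) are all hypothetical, and simple counting stands in for whatever a real system would learn.

```python
from collections import Counter

# Hypothetical workflow log: (workflow_name, decision)
LOG = (
    [("phone_replacement", "approved")] * 6
    + [("travel_over_budget", "approved")] * 3
    + [("travel_over_budget", "rejected")] * 3
    + [("laptop_upgrade", "approved")] * 2
)

def rubber_stamp_candidates(log, min_requests=5, min_approval_rate=0.95):
    """Return workflows that are almost always approved, busiest first."""
    totals = Counter(name for name, _ in log)
    approved = Counter(name for name, decision in log if decision == "approved")
    candidates = []
    for name, total in totals.items():
        rate = approved[name] / total
        # Require enough volume so a couple of approvals do not look like a pattern.
        if total >= min_requests and rate >= min_approval_rate:
            candidates.append((name, total, rate))
    return sorted(candidates, key=lambda c: c[1], reverse=True)

for name, total, rate in rubber_stamp_candidates(LOG):
    print(f"{name}: {total} requests, {rate:.0%} approved")
```

With this made-up log only `phone_replacement` is flagged. `laptop_upgrade` is always approved too, but two requests are too few to suggest anything.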

If an hour a week gets saved for a given employee by eliminating useless workflows and making existing ones smarter , that is more than a week’s vacation that you can give that person at no extra cost in productivity . How cool will that be ?
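The arithmetic behind that claim, as a quick sketch. The 48 working weeks a year and the 40 hour work week are my assumptions, only the hour a week comes from the text above.

```python
HOURS_SAVED_PER_WEEK = 1      # from the text
WORKING_WEEKS_PER_YEAR = 48   # assumption: 52 weeks minus holidays and leave
HOURS_PER_WORK_WEEK = 40      # assumption: a standard full-time week

hours_saved_per_year = HOURS_SAVED_PER_WEEK * WORKING_WEEKS_PER_YEAR
weeks_of_time_back = hours_saved_per_year / HOURS_PER_WORK_WEEK

print(f"{hours_saved_per_year} hours a year, or {weeks_of_time_back:.1f} work weeks")
```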

There are really no data scientists in the wild !

There are statisticians , there are mathematicians , there are engineers , there are machine learning programmers , and there are many other types of experts out there – but there are really no data scientists out there in the wild ! What exists are data science teams and many are generally awesome . That is my conclusion after trying really hard to become a data scientist myself over the last few months . I am not giving up quite yet – but I am at a stage where I need to express my opinion on the matter for what it’s worth 🙂

I thought I had good odds to be a decent data scientist, at least on paper. I think I am a good programmer and while I still think in C when I am coding , I can work on R and Python without sweating it too much . I am an engineer and have a degree in business – and I was convinced that I have more than an elementary capability in math and stats . I can do most data engineering work to get bad data into a shape that a machine can crunch. I spent a lot of time in BI – which made me believe I can visualize data really well. And so on . Yet , I didn’t become a data scientist despite my honest efforts to become one, and I think I now know why .

Between my engineering/business background and a couple of decades in consulting – three things come naturally to me when I am faced with solving problems

1. The classic MECE approach

2. Thinking about it from the client view and working back to what I can do

3. Trying to get to a solution from first principles so that I trust the output

On the flip side, when I cannot do a good job on any of these three things, I get extremely frustrated. And in this effort to become a data scientist, I stumbled on all three. I also am close to questioning the idea of calling this domain data science . It has more of an art feel to it – it’s like a halfway point between an architect and an engineer, a bit weird. This could be an emotional response, so I am not going to make a fuss about it in this post.

As I played with it for a while – I understood that a few things need to come together for data science to work effectively for my clients, not necessarily in the linear fashion I call them out below.

Define a problem in a way that it can be solved – some kind of designer/consultant type skill which I am generally good at, I thought. Turns out you just keep redefining the problem as you learn more.
Create an abstraction – what programmers call “logic” or “algorithm” , and what math geeks call “model” . This needs a lot of “rinse and repeat” as I figured. I could have saved a lot of trouble if I started plotting data in simple dimensions first – a lesson I won’t ever forget.
Find, clean and assemble data to feed into the model – the data engineering skill, and it becomes a challenge when data is really big. Analyzing data makes you wonder about your sampling strategy throughout. There are always gaps and it will make you say “maybe” or “it depends” as the answer to most questions.
Figure out your model is crap, and explore alternatives endlessly. I realized I had forgotten how common substitutions worked in integral calculus and it took a lot of googling to get me comfortable on a first principles basis that what I was doing was sensible math. On the bright side my linear algebra skills are still sharp – but clearly that is not enough.
Figure out what is worse – false negatives or false positives, and have a strategy to not have too many and to explain the few you will always get. This needs extremely good domain/industry knowledge and the kind of assumptions you make can be comical when you run them by a real expert.
Finally – you figure out a half decent solution, fully knowing you can’t be really sure. At this point – you need to figure out a way to tell the story, usually with visualization. Voila – your entire perspective on how to tell a story with data will change quickly. I always loved D3, but now we are soul mates.
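The false negative versus false positive step in the list above is easier to see with numbers. Here is a minimal sketch, with made-up scores, labels and costs: once you decide which mistake hurts more, the sensible cutoff for the very same model moves. Deciding those costs is exactly where the domain expert comes in.

```python
# Hypothetical model output: (score, actually_positive)
SCORED = [
    (0.95, True), (0.85, True), (0.70, False), (0.60, True),
    (0.40, False), (0.30, True), (0.20, False), (0.05, False),
]

def confusion_counts(scored, threshold):
    """Return (false_positives, false_negatives) at a given score cutoff."""
    fp = sum(1 for score, actual in scored if score >= threshold and not actual)
    fn = sum(1 for score, actual in scored if score < threshold and actual)
    return fp, fn

def cheapest_threshold(scored, cost_fp, cost_fn, thresholds=(0.25, 0.5, 0.75)):
    """Pick the cutoff with the lowest total cost of mistakes."""
    def total_cost(threshold):
        fp, fn = confusion_counts(scored, threshold)
        return fp * cost_fp + fn * cost_fn
    return min(thresholds, key=total_cost)

# Missing a real case is ten times worse than a false alarm: use a low cutoff.
print(cheapest_threshold(SCORED, cost_fp=1, cost_fn=10))
# False alarms are ten times worse than misses: use a high cutoff.
print(cheapest_threshold(SCORED, cost_fp=10, cost_fn=1))
```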

It is nearly impossible for one human being to be great at all these things – the best case is that you get to be really good at one or two, and have a solid appreciation of the rest. In other words – a bunch of experts in these areas can be brought together to form a great data science team. But it is just impossible to have one person have all these skills and hence be called a data scientist.

I also feel I should express my “amusement” about machine learning on two aspects before I end this rant.

Depending on whose book you read, or who you talk to – you will think machine learning has two distinct flavors. One is a math flavor, and the other is a programming flavor. I have more developer friends than math geek friends – so I mostly got a math flavored “black box” answer every time I had that conversation. But the books I studied were mostly written by stats majors.
The fact that a model is the right one does not mean that it performs well in production. You can sample ( I am staying away from my endless fights with bias, even for “simple” cases) and take smaller data sets to make your model work . But then you get the idea of running your logic against big hairy data – and suddenly you realize that your “black box” algorithms don’t all scale to work in parallel mode. I am now stuck in a debate with myself on whether a code rewrite , or a different math approach is better to crunch all the data I want.
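A toy example of why some logic splits cleanly across machines and some does not. This is my own illustration with made-up data, not any particular library's algorithm: a mean can be rebuilt exactly from per-chunk partial results, while the tempting shortcut for a median (the median of the chunk medians) quietly gives a different answer.

```python
from statistics import median

def partial_stats(chunk):
    """What each worker would compute on its own slice of the data."""
    return sum(chunk), len(chunk)

def combine_mean(partials):
    """Merge per-chunk (sum, count) pairs into the exact overall mean."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

# Hypothetical data, already split across three workers.
CHUNKS = [[1, 2, 3], [4, 100], [5, 6, 7, 8]]
ALL_DATA = [x for chunk in CHUNKS for x in chunk]

parallel_mean = combine_mean([partial_stats(c) for c in CHUNKS])
exact_median = median(ALL_DATA)
median_of_medians = median(median(c) for c in CHUNKS)

print(parallel_mean == sum(ALL_DATA) / len(ALL_DATA))  # the mean survives the split
print(exact_median, median_of_medians)                 # the median shortcut does not
```

So the choice between a code rewrite and a different math approach often comes down to whether the quantity you need can be put back together from pieces.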

It’s clear that stats majors and CS majors should really talk more and not let me be the one worrying about these kinds of problems . I am happy to buy the pizza and beer for you folks 🙂

PS : my dear friend Sameer who is the chief of Kahuna , showed this blog to his data science leader Andrew – and here is his counterpoint . You should absolutely read this too – debates and strong opinions are good things !