Granularity of measurement is directly proportional to fear of failure!


It’s December, and it’s that time of the year when I am torn between planning for next year and closing out the current year on a strong note. As I sat down with a bunch of paper, drawing up crazy ideas of what we will do next year, I realized I am falling prey to my biggest strength and weakness – the desire to measure everything at the most granular level.

I grew up in BI, where if you had granular data, you could generate very sophisticated insights. And I have spent countless hours modeling such data, and writing ETL scripts to get that data into a shape I like. Over time, this has become second nature to me even though I don’t get to do the fun stuff with data any more. Most of my decisions are made on aggregate data today, and the only time I need granularity is when things go wrong and I have to “debug” to find out what the heck happened.

Granular data comes with a high operational overhead – in terms of management itself, and a lot of data wrangling. You need codes to tag every bit of data – and I am not kidding when I say that sometimes I come across more attributes about the data than there are data records 🙂

So as I sat here staring at the stack of paper on my desk and the array of spreadsheets on my MacBook, I came to the sad realization that my penchant for granularity is simply a reflection of my fear of failure. Over time, I have taken on more and more responsibility at work – and the measurements have also become more and more complex and time consuming. This is true for pretty much every employer I have worked for and every client for whom I have designed solutions.

Which brings me to the “scale of failure” issue. As your responsibilities increase, the number of ways in which you think you can fail also increases. And to compensate, we try to measure across multiple dimensions, and matrix the organization some more. At some point, you will absolutely realize that the operational overhead is not worth the trouble (do you need more checkers and double checkers than people in the field?) but by then you are also a creature of habit who cannot get away from the mess you created for yourself. And finally you also make everyone else on your team miserable – because you force them to tag more and more attributes and, unlike you, they might not even know why they are tagging them. I am not kidding – there are many things I was told to do 10 years ago that I had no clue why I had to do. It would have lessened my grief significantly if my bosses had at least explained why they made me do it then 🙂

Tagging data and multidimensional operational reports have another consequence that was perhaps never intentionally designed – the sinister idea of taking credit for someone else’s work. At some scale of business – especially in sales type work – it is hard to pinpoint everything that led to a given sale. You will have a direct sales team, and assorted overlay teams that all think they were the ones who drove the business. So they will enthusiastically start tagging more attributes to show their value add, and we end up with “everyone is a winner” type scenarios. Even then we won’t typically stop this madness.

So my current plan is – I am going to sacrifice some granularity in our measurements. I want my team and me to design our work around the idea that we are aiming for success instead of the absence of any kind of failure. I would rather fail responsibly and quickly, and stop doing things that don’t pan out, than worry about it every step of the way for all activities forever. Let’s see how that goes 🙂

 

 

Boss Vijayasankar 3/12/04 – 12/3/2016


He was the biggest teddy bear of a pup when I picked him up from Marjorie Blake’s house in Bakersfield on May 8, 2004. He turned Dhanya, who was mortally scared of all dogs, into the biggest dog lover overnight.


He helped raise our daughter Shreya – she used to call him Chetta (big bro) till she was about 5 🙂



He was a great big brother to Hobo, when he came home for Shreya’s 4th birthday


He loved his toys – and food. Learned everything he needed for competitive obedience in about 4 weekends and two packs of hot dogs



He was a sage by the time Shreya was born. He got to be a puppy again as Hobo grew up. Hobo was ten times stronger, and Boss was a hundred times smarter and wiser 🙂


Along came little Ollie – Boss was already 9 by then but young enough to show the kid the ropes.

He kept alive my dream that Shreya would some day choose to handle in the show ring 🙂


He loved water – and was the ultimate goldfish. I must have tossed a few thousand oranges into that pool in the last decade 🙂


Boss never met a stranger – he loved everyone. But for me he was my best buddy, my shadow. If I slept in – he would come find me and wake me up without fail. He loved riding shotgun with me in the SUV


He was the “boss” of the gang – from day 1


He grew older gracefully, and we celebrated every birthday


He was diagnosed with hemangiosarcoma in September 2016. He underwent surgery to get a tumor removed. The surgeon gave us three months with him, and we tried our best to make every day with him the best he had


And today, December 3rd, 2016 – he had all the ice cream, eggs and bacon he could eat. And before the vet took over – he had a giant slice of chocolate cake


And he went to sleep on my lap, just as he did the first day I met him and brought him home on a United Airlines flight to Phoenix.

You will always live in my heart, Bossappai – you were and always will be the boss. Till we meet again, buddy!

Is data science doomed with Trump being elected?


Ever since Trump won the election, the question I have faced the most from family and friends is “is predictive analytics dead?”. I also got asked if Watson would have picked the correct winner. The more savvy questions were about how Clinton missed the trends in places like Wisconsin and Michigan.

Here are my thoughts – and please treat them as my personal opinions only, as always!
To begin with – the analytics was not all wrong, and did many things right. It also did many things wrong. Rather than saying data science is dead, I think it’s really just cloudy, and some work needs to be done to make it less cloudy.
The thing we forget the most about data science is that it is all about odds. When Nate Silver said Trump had a 35% chance of winning – he meant exactly that! Clinton having about a 2/3 chance of winning should not have been interpreted as “Clinton will win”! This is a problem I face every day with my clients too, on all kinds of predictive scenarios. It’s not as binary as we would like it to be in most cases.
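
To make the point about odds concrete, here is a minimal Python sketch. It is not how any forecaster actually builds a model; it only replays a single hypothetical contest many times with the underdog's chance fixed at 35%, the figure quoted above. The function name, the number of trials, and the seed are assumptions chosen for illustration.

```python
import random


def simulate_upsets(underdog_probability: float, trials: int, seed: int = 0) -> float:
    """Return the share of simulated contests won by the underdog.

    Each trial is an independent coin flip weighted by underdog_probability.
    This is an illustration of what "a 35% chance" means, not a poll model.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    upsets = sum(1 for _ in range(trials) if rng.random() < underdog_probability)
    return upsets / trials


# 0.35 is the probability quoted in the post; 100_000 trials is arbitrary.
share = simulate_upsets(0.35, 100_000)
print(f"Underdog won {share:.1%} of simulated contests")
```

The share that comes out sits close to one in three. An outcome that shows up that often is not a surprise, and it is certainly not evidence that the forecast was broken.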

That said, the predictive models had all given significant odds to Clinton, and now we know something was wrong with them. So yes – data science in politics should absolutely take some significant blame for what it missed.

First – all analytics about people is hard. I wrote about it a few weeks ago here.

Models are based on history, plus assumptions that give them context. It’s not uncommon in this business for calibration to go out of whack – usually because the context changes, but the model continues to depend on old assumptions. Since all public analysis of this election trended the same way – I guess we can safely say that “establishment thinking” about polls needs an overhaul.

Then there is the actual data itself that comes from polls, and the bias (like selection bias, confirmation bias, etc.) that gets associated with it. I often post Twitter polls to get a pulse on topics I care about – and I have to keep selection bias in mind when I look at the results. People who collected and analyzed the data should have been way more careful about bias.

Pollsters need to know the markets they are polling. Respondents don’t always literally say what they mean. This is nothing new – any kind of market research would have run into this scenario and there are ways to get around it. When I have done collection and analysis about foreign markets using folks who are technical experts, but largely ignorant of those markets – I have always had poor results. I have a feeling that a lot of polling was “lazy” this election season. For example – if your call list only has landline numbers, you won’t know what I have to say (I haven’t had a landline for quite some time, and I am hardly alone in that).
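
The landline example can be sketched in a few lines of Python. Every number below is invented purely for illustration: two hypothetical voter groups of equal size, with made-up landline ownership rates and made-up support levels for a hypothetical candidate A. None of it comes from real polling data. The sketch only shows how a call list that reaches one group more easily than the other shifts the poll result away from the true figure.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    population_share: float  # fraction of all voters in this group
    has_landline: float      # fraction of the group reachable by landline
    supports_a: float        # fraction of the group supporting candidate A


def true_support(groups: list[Group]) -> float:
    """Support for A across the whole population."""
    return sum(g.population_share * g.supports_a for g in groups)


def landline_poll_support(groups: list[Group]) -> float:
    """Support for A among only those voters a landline call list can reach."""
    reachable = sum(g.population_share * g.has_landline for g in groups)
    supporters = sum(g.population_share * g.has_landline * g.supports_a for g in groups)
    return supporters / reachable


# Hypothetical groups; all figures are assumptions, not survey results.
groups = [
    Group("mostly landline", population_share=0.5, has_landline=0.8, supports_a=0.55),
    Group("mostly mobile only", population_share=0.5, has_landline=0.2, supports_a=0.40),
]

actual = true_support(groups)
polled = landline_poll_support(groups)
print(f"True support for A: {actual:.1%}")
print(f"Landline poll says: {polled:.1%}")
```

With these made-up inputs the landline-only poll puts candidate A above 50% even though true support is below it. Reweighting respondents to match known population shares is one common way to reduce this kind of gap, but it only works when you know which groups you are under-sampling.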

Weather forecasting is something we are all familiar with since it’s been around for a long time. However, our ability to accurately predict beyond the next week or ten days is actually not that high. Little events can change the weather big time. If we extend that thought to how the sex tapes and FBI actions all came back to back – we can probably have some sympathy for the statisticians who had to deal with the data.

Even if all the models worked well, late-happening events – like the FBI director’s two notes to Congress – don’t leave a lot of room to actually act on what the model tells you. We were recently working on a predictive maintenance solution at a client. The maintenance VP was very clear that if all I can give him is a 2-day window with a failure prediction, there isn’t a whole lot he can do to avoid downtime. While I don’t know for sure – I wouldn’t mind making a small bet that the analytics used by the Clinton campaign probably highlighted the issues in Michigan and Wisconsin, just that it was too late to do anything about it.

I am sure I am missing several other aspects – and some technical aspects are probably too boring for most of my usual readers – but I think I have given a fair idea of the thoughts I have on this topic. I am sure you will add more, or correct me in the comments.

Some changes in the polling and predictions industry are needed, but we just need to try NOT to throw the baby out with the bath water. And while I am the biggest fan of Watson, I don’t really know for sure if it would have done better. Knowing what went wrong this time – I am sure this industry will use it to its advantage and reclaim its position quite quickly.

Parting thought – for all my pals who think AI will take over the world soon, it might be worth noting that for the foreseeable future these models will need significant human help to be useful. It’s man AND machine, and we should stop obsessing about man VS machine.