There are really no data scientists in the wild!

There are statisticians, there are mathematicians, there are engineers, there are machine learning programmers, and there are many other types of experts out there – but there are really no data scientists out there in the wild! What exists are data science teams, and many are generally awesome. That is my conclusion after trying really hard to become a data scientist myself over the last few months. I am not giving up quite yet – but I am at a stage where I need to express my opinion on the matter, for what it's worth 🙂


I thought I had good odds of becoming a decent data scientist, at least on paper. I think I am a good programmer, and while I still think in C when I am coding, I can work in R and Python without sweating it too much. I am an engineer and have a degree in business – and I was convinced that I have more than an elementary capability in math and stats. I can do most data engineering work to get bad data into a shape that a machine can crunch. I spent a lot of time in BI – which made me believe I can visualize data really well. And so on. Yet I didn't become a data scientist despite my honest efforts to become one, and I think I now know why.

Between my engineering/business background and a couple of decades in consulting, three things come naturally to me when I am faced with solving problems:

1. The classic MECE approach

2. Thinking about it from the client's view and working back to what I can do

3. Trying to get to a solution from first principles so that I trust the output

On the flip side, when I cannot do a good job on any of these three things, I get extremely frustrated. And in this effort to become a data scientist, I stumbled on all three. I am also close to questioning the idea of calling this domain data science. It has more of an art feel to it – it's like a halfway point between an architect and an engineer, a bit weird. This could be an emotional response, so I am not going to make a fuss about it in this post.

As I played with it for a while, I understood that a few things need to come together for data science to work effectively for my clients – not necessarily in the linear fashion I call them out below.

  1. Define a problem in a way that it can be solved – some kind of designer/consultant type skill which I am generally good at, I thought. Turns out you just keep redefining the problem as you learn more.
  2. Create an abstraction – what programmers call "logic" or "algorithm", and what math geeks call "model". This needs a lot of "rinse and repeat", as I figured. I could have saved a lot of trouble if I had started plotting data in simple dimensions first – a lesson I won't ever forget.
  3. Find, clean and assemble data to feed into the model – the data engineering skill, and it becomes a challenge when data is really big. Analyzing data makes you wonder about your sampling strategy throughout. There are always gaps, and they will make you say "maybe" or "it depends" as the answer to most questions.
  4. Figure out your model is crap, and explore alternatives endlessly. I realized I had forgotten how common substitutions worked in integral calculus, and it took a lot of googling to get me comfortable on a first-principles basis that what I was doing was sensible math. On the bright side, my linear algebra skills are still sharp – but clearly that is not enough.
  5. Figure out what is worse – false negatives or false positives – and have a strategy to keep them rare, plus a way to explain the few you will always get. This needs extremely good domain/industry knowledge, and the kind of assumptions you make can be comical when you run them by a real expert.
  6. Finally, you figure out a half-decent solution, fully knowing you can't be really sure. At this point you need to figure out a way to tell the story, usually with visualization. Voila – your entire perspective on how to tell a story with data will change quickly. I always loved D3, but now we are soul mates.
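Steps 4 and 5 above lend themselves to a tiny illustration. Here is a sketch (with entirely made-up numbers) of a one-dimensional threshold "model", where sliding the threshold trades false positives against false negatives – exactly the tension step 5 describes:

```python
# Toy illustration: tuning a 1-D threshold classifier and watching the
# false-positive / false-negative trade-off shift as the threshold moves.
# All data below is synthetic -- the point is the shape of the trade-off.

def confusion(scores, labels, threshold):
    """Count false positives and false negatives at a given threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn

# Synthetic scores: positives cluster high, negatives low, with some overlap.
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0,   0,   0,    1,   0,    1,   1,   1,   1,   1]

# A low threshold flags almost everything (false positives pile up);
# a high one flags almost nothing (false negatives pile up).
for t in (0.3, 0.5, 0.75):
    fp, fn = confusion(scores, labels, t)
    print(f"threshold={t}: false positives={fp}, false negatives={fn}")
```

Which end of that trade-off hurts more is the domain-knowledge question – no amount of math picks the threshold for you.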

It is nearly impossible for one human being to be great at all these things – the best case is that you get to be really good at one or two and have a solid appreciation of the rest. In other words, a bunch of such experts can be brought together to form a great data science team. But it is just about impossible for one person to have all these skills and hence be called a data scientist.

I also feel I should express my “amusement” about machine learning on two aspects before I end this rant.

  1. Depending on whose book you read, or who you talk to, you will think machine learning has two distinct flavors: a math flavor and a programming flavor. I have more developer friends than math geek friends – so I mostly got an answer that treated the math as a "black box" every time I had that conversation. But the books I studied were mostly written by stats majors.
  2. The fact that a model is the right one does not mean it performs well in production. You can sample (I am staying away from my endless fights with bias, even for "simple" cases) and take smaller data sets to make your model work. But then you get the idea of running your logic against big hairy data – and suddenly you realize that your "black box" algorithms don't all scale to work in parallel mode. I am now stuck in a debate with myself on whether a code rewrite or a different math approach is better to crunch all the data I want.
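The "does it scale out" question usually comes down to whether the computation decomposes into independent chunks. A mean does: each worker can produce a partial sum independently, and the partials merge at the end. A sketch (the thread pool here stands in for real workers on a cluster; an algorithm whose step t depends on step t-1, like plain gradient descent, has no such decomposition):

```python
# A mean decomposes into per-chunk partial (sum, count) pairs that can be
# computed independently and merged -- the classic map-reduce shape.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Independent per-chunk work: safe to run anywhere, in any order."""
    return (sum(chunk), len(chunk))

data = list(range(1, 101))                      # 1..100, true mean is 50.5
chunks = [data[i:i + 25] for i in range(0, 100, 25)]

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum, chunks))

total = sum(s for s, _ in partials)
count = sum(n for _, n in partials)
mean = total / count                            # same answer as sequential
print(mean)
```

When your model's math doesn't factor like this, you are left with exactly the rewrite-vs-different-math debate above.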

It's clear that stats majors and CS majors should really talk more and not let me be the one worrying about these kinds of problems. I am happy to buy the pizza and beer for you folks 🙂


When machines think on our behalf!

I don't really think machines will displace humans in significant numbers for a long time – but I do think we have an interesting time ahead of us where we let machines think on our behalf quite a bit.

Every company out there has rebranded itself as an AI company. The first generation of this falls broadly into two categories:

1. Telling an AI system what we want done – order a coffee, close the curtain – pretty much calling an available API to do something

2. Using AI to learn and do something better – like replacing carpet-bombing marketing campaigns with better targeting

But this is just a temporary phase. Why do you want to ask your AI wizards to order coffee for your home – isn't it better to let the machine reorder coffee when the supply gets to some level? Should it even ask you to confirm these kinds of routine activities, or just do them without asking? For about half the things I need routinely, I am totally cool with having a machine do it without asking me anything – especially coffee. I get mad at myself when I forget to pick up coffee and don't have much left in the kitchen when I need it. I am sure I am not the only one who is ready to offload some routine activity to machines.

So this poses some interesting challenges – like, if my AI system is the one ordering groceries for me without my input, how do other coffee vendors ever get my business?

My wife already thinks I spend way too much on coffee. So she may be able to tell the AI system to limit my purchases to, say, $50 a month. So now my AI thingy needs to be coupon shopping and such to stay within budget – but that is easy; machines can do this math stuff better than us anyway.

This makes me wonder: what is the future of marketing itself?

Simple – brands stop marketing to me, and instead they (as in their AI systems) will market to AI systems! And brands will do whatever they can to convince my AI system to feed me their coffee first, to increase the chance that my business becomes a reorder situation!

Well, guess what – this means we are in the "my AI is smarter than your AI" world at that point. The bright side: email spam reduces significantly for me as a human, and I have some more time on my hands.

But this is not without its share of dilemmas either – for example, what if the AI provider for me and the coffee company are one and the same, or if they are two companies that share my data? Am I going to be put in a situation where I am negotiating against myself? So we do need some clear guidelines established on ethics, legality, security and even morality before we get to dealing with this problem.

We have a good grip on what happens when AI does smart stuff with humans on the other side – like customer service, sales and so on. But the thing that excites me the most is when both sides of a transaction are AI systems. I am betting it won't take even 5 years for us to see this go mainstream. Are you ready?

Future of Software Development 

There are so many angles to this topic – and this is my third attempt in three days to organize my thoughts around it. The trouble is not that I don't have enough ideas – it is that most ideas seem to contradict each other when I read them again. Let's see if the third time truly is the charm 🙂

1. Everyone will NOT be a (meaningful) programmer in the future

I know that is not a popular position to take today – but that is where my mind is at now. We would need to water down the definition of coding significantly to make "everyone is a coder" a true statement. If I can change the tires and oil of a car, and program the infotainment system – should I be called an automotive engineer? That is "roughly" how "everyone will be a coder" sounds to me now.

Don't get me wrong – I do think that understanding how code works will be important for everyone in the future. That doesn't translate to everyone coding, at least in the traditional sense. Very complex applications exist today in MS Excel – created by business analysts who are not traditional programmers. If we change the definition of coding to include that kind of development, I can buy into "everyone will be a coder". The closer statement – though less sexy – would be "everyone will be a designer or modeler"!

2. More code will be destroyed than created

The world around us is changing extremely fast, and that means we need lots of newer kinds of applications. But the pace of change is such that no application can satisfy a customer for any length of time. Building better components and better orchestration mechanisms is the only way to deal with it. Neither concept is new – but the execution will kick into a higher gear. API designs will need a lot more flexibility than we are used to.

3. Performance will trump simplicity 

By simplicity, I mean what "humans" think of as "simple", not machines. Code for tomorrow will be used more for machine-to-machine communication than for machine-to-human – by orders of magnitude. Creation of code itself might not need a lot of human help, for that matter. And while maintainability and human readability are important today, they might get trumped by the need for extreme performance tomorrow. For example – if both ends of an interface are machines, why would they need to communicate in text and pay for the overhead of things like XML/JSON tags that need to be converted to binary and back again to text?
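You can see that overhead with a few lines of Python: pack the same three fields once as JSON text and once as a fixed binary layout and compare sizes. The field names and layout here are invented for illustration.

```python
# How much do text tags cost? Same payload, two encodings.
import json
import struct

reading = {"sensor_id": 42, "temperature": 21.5, "ok": True}

as_json = json.dumps(reading).encode("utf-8")
# "<If?" = little-endian unsigned int (4) + float (4) + bool (1) = 9 bytes,
# because both ends already agree on the field order and types.
as_binary = struct.pack("<If?", reading["sensor_id"],
                        reading["temperature"], reading["ok"])

print(len(as_json), "bytes as JSON vs", len(as_binary), "bytes as binary")
```

The JSON version spends most of its bytes repeating field names that two machines could simply agree on up front.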

4. You won't need as much code in the future

A lot of code is written today because a human does all the thinking and tells computers what to do in very prescriptive ways, with conditions and loops and all that. When computers get to "general AI", they will learn to think and adapt like humans – and won't need step-by-step instructions to do what they do today. Less code will do a lot more than a lot of code does for us today. We may be decades away at most – we are not centuries away from that stage. Software is eating the world now; AI will eat software tomorrow 🙂

5. Software offshoring/outsourcing will not be for development or maintenance – it will be for training

It's already possible for machines to learn from vast amounts of data. Sometime in the far future, machines will self-train too. Till then – and that's at least a decade or more – humans will need to train machines on data. And that will need to make use of local knowledge, labor arbitrage etc., and hence will be an ideal candidate for offshoring and outsourcing!

6. The community of developers will be the only thing that matters

Well – that is already true, isn't it? I have forgotten the last time I checked official documentation or real books to learn anything. I google or search on Stack Overflow to get most of what I need. I am not a full-time developer – but looking at the communities that help me, I am sure full-time developers do what I do, a lot more than I do 🙂. A better way of mining this treasure trove of information is the need of the hour to get significantly more engineering productivity.

7. More and more use of biological sensors 

Human bodies and brains are the ultimate computers, and we are some ways away from mimicking human thought. In the near future I expect simple sensors for all kinds of chemical and biological stuff (how cool would smell be as an input, like touch is today?) that provide input to, and also dictate how, code should work. Text is fast becoming the most boring part of data anyway 🙂

8. We haven’t even scratched the surface of parallelism 

What we call massively parallel today will in all probability be amusing and funny to tomorrow's programmers. The overhead of making parallelization work today is pretty high – and that will go away soon. A part of the problem is also that the majority of developers don't think of parallelism when they design code. I guess the near-term solution will be for more primitives in popular programming languages (like for data access) to have built-in parallelism. Note to self: get better at functional programming in 2017.
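The functional programming note isn't random: pure, side-effect-free functions are exactly what lets a language's built-in primitives parallelize safely. A small sketch of the idea – swapping the standard `map` for an executor's `map` changes the schedule, not the result:

```python
# Pure per-record transforms can be mapped sequentially or in parallel
# with identical results, because output depends only on input.
from concurrent.futures import ThreadPoolExecutor

def normalize(record):
    """Pure function: no shared state, no side effects."""
    name, value = record
    return (name.strip().lower(), round(value, 2))

records = [(" Alice ", 3.14159), ("BOB", 2.71828), ("Carol ", 1.41421)]

sequential = list(map(normalize, records))
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(normalize, records))

assert sequential == parallel   # execution order doesn't matter
```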

9. Ethics and Privacy become core to all development 

A few things are happening together now:

a) data is exploding and we are leaving our digital fingerprints everywhere

b) applications won't stay around long enough to have ethics and privacy as a "future release" issue to be fixed

c) more and more software affects humans, but is controlled by machines with little human input

d) access to information is (and will be) universal – which means bad guys and good guys can both use it for what they want

e) legal systems won't ever catch up with the pace of tech innovation

And that means ethics, privacy etc. need to be core principles of tomorrow's software. It cannot happen "in pockets" as it does today. And education on this topic needs to be pushed down to even the earliest levels of school.

Nine is really not a conventional number of bullets for a list – but given there won't be anything conventional about the future of software development, I think now would be a good time for me to stop. Feel free to add, edit and challenge in the comments – I look forward to it.

Happy 2017 everyone!

CES 2017 – Random Thoughts On Future of APIs In An AI world

I spent half this week at CES 2017 in Las Vegas !


To say the least, it puts the "enterprise" side shows to shame in the number of people it attracts, the variety of solutions it offers and how boldly the future is thought about. It did not take any time to see that the future is all about AI – and how expansive the definition of AI has become.

There were bots of all flavors there – but voice was the major interaction medium, and it was hard to walk the floor without hearing "hey Alexa" type conversations. I also noticed a lot of VR and AR. I walked away thinking voice will rule the consumer world for a while, and between VR and AR, I will bet on AR having more widespread use. While VR-based video games are indeed cool, putting something on your head to use technology makes me wonder how many will actually use it. Like 3D televisions, where you need special glasses – and hardly anyone I know uses them.

The generation of products using AI that I saw (admittedly I only saw a small fraction of the HUGE show) barely scratched the surface of what is possible. If I think of what I saw with my engineering hat on, it is something like this:

  1. Human voice or text waking up the AI service ("hey Jane")
  2. A natural language based request ("When is my next meeting")
  3. Voice-to-text translation as needed
  4. Intent and entity extraction (me, my calendar, current time, read entry)
  5. Passing it to a structured API and getting a response
  6. Converting the output to a string ("your next meeting is in 2 hours with Joe")
  7. Text-to-voice translation
  8. Keeping the context for the next question ("is there a bridge number or should I call Joe's cell?")
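The steps above can be sketched as a tiny pipeline. Every function body here is a stand-in – real systems use speech-to-text services, trained NLP models and calendar APIs, and all the names are invented for illustration:

```python
# Minimal sketch of the wake -> parse -> API -> render -> context loop.

def extract_intent(text):
    """Step 4: map a natural-language request to an intent and entities."""
    if "next meeting" in text.lower():
        return {"intent": "read_calendar", "entity": "next_meeting"}
    return {"intent": "unknown"}

def call_api(request):
    """Step 5: the structured API behind the assistant (stubbed out)."""
    if request["intent"] == "read_calendar":
        return {"hours_away": 2, "with": "Joe"}
    return {}

def render(response):
    """Step 6: turn the structured response back into a string."""
    return f"Your next meeting is in {response['hours_away']} hours with {response['with']}"

context = {}                                   # step 8: kept for follow-ups
request = extract_intent("When is my next meeting")
response = call_api(request)
context["last_response"] = response
print(render(response))
```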

This is easy stuff in general – there are plenty of APIs that do stuff, and many are RESTful. You can pass parameters and make them do things – like read a calendar, switch a light on, or pay off a credit card. If you are a developer, all you need is imagination to make cool stuff happen. How fun is that!

Well – there are also some issues to take care of. Here are five things that I could think of in the hour I spent in the middle seat (also in the last row, next to the toilet) flying from Vegas back home.

Take security – you might not want guests to voice-control all the devices in your house, for example (which might not be the worst they could do, but you know…). Most of the gadgets I saw had very limited security features. It was also not clear in many cases what happens to data security and privacy. A consistent privacy/security layer becomes all the more important for all APIs in an AI-driven world.

Then there is natural language itself. NLP will get commoditized very quickly. Entity and intent extraction are not exactly trivial – but it's largely a solvable problem and will continue to get better. The trouble is, APIs don't take natural language as input – we still need to pass unstructured > structured > unstructured back and forth to make this work. That is not elegant, and it is not efficient even when compute becomes negligibly cheap. Not sure how quickly it will happen, but I am betting that commonly used APIs will all have two ways of functioning in future – an NLP input for human interaction, and a binary input for machine-to-machine interaction (to avoid any need to translate when two machines talk to each other). Perhaps this might even be how the elusive API standardization finally happens 🙂
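The "two ways in" idea might look something like this sketch: one underlying function exposed both as a structured call (for machines) and through a thin NLP front end (for humans). The parsing here is deliberately naive and every name is hypothetical:

```python
# One API, two interfaces: structured for machines, NLP for humans.

def pay_card(account: str, amount: float) -> str:
    """The structured interface another machine would call directly."""
    return f"paid ${amount:.2f} on {account}"

def nlp_front_end(utterance: str) -> str:
    """The human interface: crude intent/entity extraction, then delegate."""
    words = utterance.lower().split()
    if "pay" in words and "$" in utterance:
        amount = float(utterance.split("$")[1].split()[0])
        return pay_card(account="visa", amount=amount)
    return "sorry, I didn't understand that"

# A human and a machine reach the same underlying function:
print(nlp_front_end("please pay $25 on my visa"))   # via natural language
print(pay_card("visa", 25.0))                        # direct structured call
```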

If all – or most – APIs have an easy NLP interface, it also becomes easy to interoperate. For example, if I scream "I am hungry" at my fridge, it should be able to find all the APIs behind the scenes, give me some options, place an order and pay for it. And my car or microwave should be able to do the same, without my having to hand-code every possible combination. In future, APIs should be able to use each other as needed, and my entry point should not matter as much in getting the result I need.

Human assistants get better with time. If an executive always flies American, then when she tells her assistant to book a flight, the assistant does not ask every time "which airline do you prefer" or "should I book a car service also to take you to the meeting when you land". None of the virtual assistants – or pretty much any conversational widget – I saw this week demonstrated any significant "learning" capability. While I might enjoy having a smart device today, since it is a big improvement over my normal devices, I will absolutely tire of it if it does not get smarter over time. My fridge should not just be able to order milk – it should learn from all the other smart fridges and take cues from other data, like weather. In future, "learning" should be a standard capability for all APIs – ideally unsupervised.
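Even the crudest version of that learning is just bookkeeping. A sketch of an assistant that counts past choices and stops asking once one option clearly dominates – the thresholds and data are invented for illustration:

```python
# A toy "learning" assistant: auto-select once a preference is established.
from collections import Counter

class BookingAssistant:
    def __init__(self, min_history=3, dominance=0.8):
        self.history = Counter()
        self.min_history = min_history   # don't guess off tiny samples
        self.dominance = dominance       # how lopsided choices must be

    def record_choice(self, airline):
        self.history[airline] += 1

    def suggest(self):
        """Return the learned preference, or None (meaning: ask the user)."""
        total = sum(self.history.values())
        if total < self.min_history:
            return None
        airline, count = self.history.most_common(1)[0]
        return airline if count / total >= self.dominance else None

assistant = BookingAssistant()
for _ in range(4):
    assistant.record_choice("American")
assistant.record_choice("Delta")
print(assistant.suggest())   # "American": 4 of the last 5 bookings
```

A real assistant would obviously condition on context (trip, weather, budget) rather than a single counter – but the contrast with "ask every single time" is the point.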

The general trend I saw at CES was about "ordering" a machine to do something. No doubt that is cool. What I did not see – and where I think AI could really help – is machines "servicing" humans and other machines. For example, let's say I scream "I am hungry" at my fridge. The fridge has some food in it that I like, and all I need is to put it in the oven. So the fridge tells the oven to start preheating – and gets no response in return! Telling me "the oven is dead" is a good start – but the intelligent fridge should be able to place a service order for the oven, as well as offer me the option to order a pizza to keep me alive for now. APIs should be able to diagnose (and ideally self-heal) themselves and other APIs in future – as well as change orchestration when a standard workflow is disrupted.
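That fridge-and-oven scenario is really a re-orchestration problem: when a step fails, reroute the workflow instead of giving up. A sketch with hypothetical device stubs:

```python
# "Servicing" sketch: on failure, file a service request and offer a fallback.

class OvenOffline(Exception):
    pass

def preheat_oven(online):
    """Stub for the oven API; raises when the oven doesn't respond."""
    if not online:
        raise OvenOffline("oven did not respond")
    return "preheating"

def handle_hunger(oven_online=True):
    actions = []
    try:
        preheat_oven(oven_online)
        actions.append("preheating oven for the food in the fridge")
    except OvenOffline:
        # Don't just report the failure -- change the orchestration.
        actions.append("filed a service order for the oven")
        actions.append("offered to order a pizza instead")
    return actions

print(handle_hunger(oven_online=False))
```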



Future of Project Management 

Next to programming, project management is the role that gave me the most satisfaction in my career. So after Rethinking IOT and AI for future, and Future Of Technology Consulting, I spent some time organizing my thoughts on where project management is today and where it is headed.

This picture is an old one – from when I was leading a consulting team as the PM at my client, and we were co-developing a product with SAP. There was no way to distinguish who worked for which company on this team. It was a highly stressful time – but also the most fun and productive time of my life.


In general, I think project management as a profession has lost its stature, and for all the wrong reasons. I also think that it will regain its lost glory, and then some, starting almost immediately!

Utterly stupid is how I would describe the move to commoditize project management over the last few years. The PC version would be: penny smart, pound foolish!

Several factors played a part – and I think the wrong use of PMP certification is one big reason. I am personally not a big fan of certifications in general. I (and others) have successfully managed hundreds of millions of dollars worth of projects without a PMP. When I was a full-time PM (also when I was a developer), none of my clients ever asked me if I was certified. In my view, PMP and tech certifications are a definite plus for the job – but should not be a mandatory requirement.

PMP gives a false sense of security and accelerates the path to "if everyone has a PMP, they must be roughly equal in skills – so let's choose the cheapest one for the job". When I convinced my old boss many years ago that I didn't need a PMP, my defense was that we both knew at least ten people in their early twenties – who had never even been a team lead – who passed the PMP exam with flying colors, and neither of us was confident enough to let them run a team!

To be perfectly clear: PMP itself is not to blame. I have studied the "body of knowledge" closely, and it's pretty good. I encourage all PMs and aspiring PMs to study it. I am just strongly opposed to treating it as a way to falsely equate everyone who has it as being of the same project management ability.

Becoming a PM is best done in an apprenticeship model. Project plans, documentation, chasing down tasks etc. are good things, and you can learn them from books – but successful projects are mostly about making people successful, not about tasks successfully completed! There is a big difference, and a full appreciation of that only comes from watching and learning from folks who do it consistently well. However smart you are, you can't learn it by studying a book or taking a multiple-choice exam.

Sadly – and probably due to the mandate to commoditize all parts of IT projects – task management, which was a means to an end in the past, seems to have become all of project management today!

Consistency, repeatability and scalability are all good for efficiency. So the dumbing down of some project management aspects has that going for it. But what is missed today is effectiveness – efficiency without effectiveness leads to failed projects. And effectiveness is all about people!

People have only so much intellectual and emotional capacity, and not all of it is spent on work. For example, the best programmer in my team in Bangalore spent 4 hours every day on his commute. Even then he was twice as good as the next programmer. I let him work Mondays and Fridays from home, and he became three times as good at what he did. I knew about that issue because I went to Bangalore and lived there for a month to see the team, work with them and become one of them. I couldn't have gotten the same result by asking him to document more or sit in more status calls. I also remember a situation where we had an unreasonable client who made constant demands on our time to meet timelines that were not realistic. After two back-to-back weekends at work, my team had no energy left. My solution was to stop working weekends; instead we all went out bowling for a whole day on Monday, followed by a potluck on Tuesday. Even the client could not believe we hit the deadline with room to spare!

Motivating and getting the best out of your team is one aspect – equally important is making your client successful. By that I don't mean the client company – I mean the human beings from the client team who work with you and sponsor the project. This means you need to get to know them, what makes them tick and what success means to them. No certification teaches you empathy!

To make clients successful, you need to know their business and their industry cold, or know others whom you can tap for that knowledge. You also need the ability to make short-term vs long-term trade-offs. I once had a finance director of a company as my client, and she was stressed that there wasn't enough time left to build the 150 reports that were scoped for the project. I told her that similar projects in the past had only needed 50 or so reports for comparable functionality, and the two of us spent a day looking through the specs and quickly brought it down to 40 reports. My employer took a short-term revenue loss because of the reduction in scope – but this lady was publicly recognized by the CFO of the company for getting the project done on time and under budget. She got a larger portfolio, and I got a lot more business from her, which in turn helped my own career progression.

Project managers need the respect of their team to succeed. PMs who manage a project where they don’t know any aspect of what is being done generally find it harder to get the team’s respect. It can be done – but it is an uphill task and you need superior skills and patience. This is another reason why commoditizing PM skills is a terrible idea – people who grew into PM after being developers, consultants, team leads etc can empathize and add quality to their team’s work much better than someone who can only manage tasks.

Why do I think this will change quickly, and for the better? It's because the complexity of projects and client expectations have both risen to a level where commodity skills and elementary automation cannot keep up. Fear of failure is very high today, thanks to a lot of failed projects in the past – and at the speed at which technology is progressing, there are very few "apples to apples" references to say "this will work". Good, solid project management is the need of the day to help realize the value of the technology innovation happening around us. I think employers and clients are both ready – or very close to being ready – to treat the PM again as a critical role in making projects successful.

Those of you who manage development teams as PMs might enjoy this post I wrote in 2010 🙂

PS: Might as well add a shameless plug – if you have experience as a PM in big data, analytics, IOT etc., I am hiring in North America. Ping me!


Future Of Technology Consulting

It's the last week of the year – and that gives me the luxury of time to spend thinking about some big-picture topics. Last week I was Rethinking IOT and AI for future. And that led me to think about the future of my own profession of technology consulting. It is especially important to me since my 11-year-old daughter wants to be in this profession when she grows up. She wrote this in her first grade journal – so it is official 🙂


As always – these are just my personal points of view and do not represent the views of my employer.

Tech consulting is bound to get disrupted at least twice in my remaining professional life – with the pendulum swinging first in the direction of flexibility, and then in the direction of convenience. That means the big and small companies that play in this ecosystem, and the assorted consultants who work in these companies, are in for some crazy times. I would venture a guess that the first wave will come within 3 to 5 years, and the second one probably 7-10 years out from now.

When I joined consulting, the career path was pretty straightforward. If you were good at what you did, you could make partner in about 10 to 15 years, reap the benefits of that till retirement, and then retire on a comfortable pension. Billing rates of $500 to $800 an hour did not raise many eyebrows in those days. Well – that has changed for sure over my professional life!

When I hire new college grads these days, I see only a minority who have a career plan of sticking it out at a consulting firm for 10-15 years to make partner. Most of them plan to keep their options open to explore other careers along the way. When I hire for experienced roles, I increasingly see candidates from non-consulting backgrounds wanting to try consulting for the first time. I also saw the reverse of this when I was in the software business for a few years – many consultants (like me) wanted to gain exposure to the software business. I am not a career channel-management executive – but I had a great time establishing a channels business at MongoDB. In short, traditional career paths are dying, and more and more people at all levels of their profession are vying for flexibility.

Interestingly – while employees have in large part made the change to this "flexibility first" mode, most employers are still in "traditional" mode. I believe the inherent difficulty for larger companies is how the financial market looks at them – risk-taking is encouraged at small companies, and punished at larger companies. And changing the org model is fraught with short-term risk by definition – so employers resist change in many ways. The more progressive ones encourage flexibility in hybrid models – take one day a week to do your own projects, put a consulting guy in charge of the channels team, take a line sales leader out of the business and put her in charge of HR, and so on. They try to "force fit" employees with "new" ideas into "traditional" career paths. It does not seem to scale very well from my (admittedly limited) perspective.

At the moment, the number of employees with this career attitude is not large enough – but in 3 to 5 years, I expect it to overwhelm and overpower organizations to the point that a new paradigm will need to be built. And when overwhelming force is applied to organizations with a lot of inertia, the pendulum swings to an extreme. My bet is that technology consulting firms will become master orchestrators that bring a tailored collection of skills to a client – even though the majority won't work for them directly. But this model has one inherent problem – elasticity is not your friend in a labor-based business.

So that means tech consulting companies will need to shift their business model to be more of an IP-based one. That needs new skills that historically were not important to these firms – like engineering, product management and product marketing at scale. A lot of existing roles will probably move into a "freelancer" system. Another way of saying it: there will not be much difference between what is a product and what is a service. These lines will all get very fuzzy. A natural side effect will be acquisitions of product companies by tech consulting companies – at scale, unlike the handful that happen today!

The disruption this causes will be tremendous. The procurement function will need new ways of evaluating suppliers. Analysts and VCs/PEs will need new ways of assessing the value of businesses. HR will need new ways of sourcing and developing talent. And so on. I won't name names – but I have a list of tech consulting companies in mind that probably cannot deal with this, and they will end up in utter chaos at a minimum, or go out of business at worst. Yeah, I think it will be that dramatic (and I hate drama at work).

What happens next? The way we deal with such chaos is usually to swing the pendulum in the opposite direction. A couple of years into the chaos caused by this disruption, I fully expect leading companies to realize that for scale, some centralization is a must. It’s like mainframes to client-server to cloud to edge computing – centralization and decentralization alternate at intervals to keep the universe in balance. I don’t think the asset-based nature of the business itself will go away – but I do think employees and employers will realize the pragmatic limits of autonomy and flexibility and make compromises.

But that still leaves the wild card – the power of automation to disrupt the disrupters. Over the 10 years or so that I painted above, there is no saying how quickly automation can fundamentally change tech consulting’s business models. The incremental changes are already well known – but as with every long-term disruptive force, I would bet that we have underestimated its effect on our future.

Rethinking IOT and AI for the future – I don’t want to take cloud to a drone fight!

IOT has always fascinated me. As a young developer in the 90s, I made a living doing mostly integration work – making two computers talk to each other somehow. And then I moved into the world of data management and analytics. IOT is one of those things that combines all the good and bad of what I know from my data and integration background. It is of particular interest to me of late, given it’s part of the portfolio I lead at IBM – so I have a practical need to understand it better and think about where it is headed.
When I think of IOT – like many others – the first thing I think of is mobile phones, to get the idea clarified in my head.

I started with “How does an app really work?”. You do something with the app as a user, leading to some minimal processing on the phone using its RAM and CPU. Then some information gets sent to a server somewhere (as in the cloud), which does the heavy lifting and sends simple instructions back to the phone. Rinse and repeat for every action.
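A minimal sketch of that round trip – every function name here is invented for illustration; a real app would go through platform SDKs and HTTP APIs:

```python
# Toy sketch of the phone-to-cloud round trip: light work on the phone,
# heavy lifting "in the cloud", simple instructions coming back.

def preprocess_on_phone(action: str) -> dict:
    # Minimal local processing: clean up and package the user's action.
    return {"action": action.strip().lower()}

def send_to_cloud(request: dict) -> dict:
    # Stand-in for a network call; the server does the real computation.
    return {"instruction": f"display result for '{request['action']}'"}

def apply_instruction(response: dict) -> str:
    # The phone just renders whatever the cloud told it to.
    return response["instruction"]

def handle_user_action(action: str) -> str:
    request = preprocess_on_phone(action)
    response = send_to_cloud(request)   # rinse and repeat for every action
    return apply_instruction(response)

print(handle_user_action("  Search Sushi "))
```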

The computers in the data center and the processing power of the phone are all improving exponentially. What is not keeping pace is network bandwidth. The network is the biggest bottleneck today, and with the explosion of data we live with, it is going to choke even more tomorrow. This is already true for the non-IOT world – for example, I know SAP HANA projects that could not go forward because of insufficient network capacity.

That is just phones. What about all the other “smart” devices that have processing power? Everything from toothbrushes to cars already has, or soon will have, significant processing power. And that is only going to increase over time.

I don’t know for sure – but between our two cars, I am guessing there are a couple of hundred processors sitting in my garage as I am typing this. And my cars are not Teslas, which probably have many more. As “things” stop needing humans to do their “thing” (see what I cleverly did there?) – they will have more and more processing power and local memory. They will generate even more data that needs to be analyzed quickly. And “network” as we know it will be toast dealing with all the traffic.

Machine-generated data is already many times bigger than human-generated data. These “things” won’t rely on “small data” to work without humans – this is the “real” big data challenge we need to solve. It won’t be simple text files that count as data – it will be things like sound and video.

The volume, variety and velocity of data will become quite unmanageable if we wait for these things to ping the cloud, transfer data, and get a result back. You don’t want a drone to crash into something because it could not reach its cloud source to exchange data and find out how to land. You get the general idea – things that need a quick request/response generally cannot work the way the average phone app works today. Even the next generation of phone apps probably can’t work the way they do today!
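Some back-of-the-envelope numbers make the point – every figure below is an illustrative assumption I made up for the argument, not a measurement:

```python
# Rough, illustrative latency budget for a drone's control loop.
# The numbers are assumptions for the sake of the argument.

CONTROL_LOOP_BUDGET_MS = 20        # e.g. a 50 Hz "sense, decide, act" loop

local_processing_ms = 2            # on-board inference, no network involved
cloud_round_trip_ms = 40 + 10      # network round trip + server processing

# Local processing fits comfortably inside the loop; a cloud round trip
# blows the budget before the server has even done any work.
print("local fits budget:", local_processing_ms <= CONTROL_LOOP_BUDGET_MS)
print("cloud fits budget:", cloud_round_trip_ms <= CONTROL_LOOP_BUDGET_MS)
```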

So what is the big deal? Well – the current thinking on systems architecture needs some refresh. Maybe a significant refresh!

I don’t think cloud will go away per se. Just that it will become more “hybrid” in nature. My purist cloud friends who hate the term hybrid cloud will probably yell and scream at me – but for now, I am going to take the heat on that 🙂

What I mean is – for a lot of processing where latency is a critical factor, the “things” will become mini clouds themselves. I am sure there is (or will be) a better term for it – but till then, “mini cloud” is the official technical term for me.

These mini clouds might be very powerful – many CPUs and cores, plenty of storage and RAM, etc. – and should be able to do pretty much all real-time processing, and also be capable of some point-to-point interfacing.

For example, if I ordered sushi and my wife ordered a burger – and two drones are both trying to land in our front yard to deliver our orders – they should be able to talk to each other about who lands first, and not wait for both to go back and forth with their respective clouds. If there are only two drones, perhaps we can tweak existing infrastructure and design patterns to make this work. But what if there are two million drones over my city doing different things? I don’t want to take cloud to a drone fight!
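As a toy illustration of what such point-to-point negotiation could look like – the tie-break rule here (lower battery lands first, then by ID) is entirely made up:

```python
# Toy peer-to-peer landing protocol: drones over the same landing zone
# exchange (id, battery) pairs and each computes the same landing order
# locally - no cloud round trip, no central coordinator.

from dataclasses import dataclass

@dataclass
class Drone:
    drone_id: str
    battery_pct: int

def landing_order(drones: list[Drone]) -> list[str]:
    # Deterministic rule every peer can apply independently:
    # lowest battery lands first; ties broken by ID.
    ranked = sorted(drones, key=lambda d: (d.battery_pct, d.drone_id))
    return [d.drone_id for d in ranked]

sushi = Drone("sushi-drone", battery_pct=35)
burger = Drone("burger-drone", battery_pct=60)
print(landing_order([burger, sushi]))   # ['sushi-drone', 'burger-drone']
```

Because the rule is deterministic, both drones reach the same answer from the same exchanged data – which is the whole point of keeping the decision local.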

My other passion is AI and analytics. How do they play in the IOT world? Well – they need to play in at least two places. A “micro” version needs to play in the “thing” (the mini cloud), and a “macro” version needs to play in the traditional cloud. And there might even be a third version that brokers information flow in the middle.

The reason is straightforward – things need to make autonomous decisions, like a drone figuring out where to land given what it can sense. There might not be enough time to ping the cloud and get an answer, and hence the mini cloud needs algorithms that help with the task at hand. But this way, the drone never gets much better than what is already coded. For the drone to get better at its job, it needs to learn from every other drone, as well as from external data like weather, etc. In some smaller scenarios, I guess point-to-point interfaces can solve this problem. But at scale, this would mean the big cloud runs advanced algorithms (probably AI types) and what it figures out gets shared with all the drones that it serves.
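That micro/macro loop can be sketched in a few lines. This is a federated-learning-style toy under my own assumptions – each drone holds a single model “weight”, nudges it from local experience, and the cloud averages everyone’s learning and pushes the result back; real systems are vastly more involved:

```python
# Micro/macro split in miniature: local updates on each "thing",
# aggregation and redistribution in the big cloud.

def local_update(weight: float, local_error: float, lr: float = 0.1) -> float:
    # "Micro" version: the drone adjusts its own weight from what it sensed.
    return weight - lr * local_error

def cloud_aggregate(weights: list[float]) -> float:
    # "Macro" version: the cloud combines what every drone learned.
    return sum(weights) / len(weights)

fleet_weights = [1.0, 1.0, 1.0]        # three drones, same starting model
local_errors = [0.5, -0.3, 0.1]        # each saw something different

updated = [local_update(w, e) for w, e in zip(fleet_weights, local_errors)]
shared = cloud_aggregate(updated)      # broadcast back to every drone
print(round(shared, 3))
```

The key property is that each drone keeps flying on its local weight even when the cloud is unreachable – the aggregation just makes the whole fleet smarter over time.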

This is a non-trivial problem. Just reliably syncing regular databases in a distributed system makes life complicated for IT shops around the world today, despite such techniques being available for a long time. Getting even one set of machine learning models going is a pain for many teams. IOT makes it a much harder problem to solve – especially considering different vendors might own what goes into the brains of a drone and what goes into the centralized cloud.

Not to make it more complicated – but there are all kinds of “quality of service” questions to be considered too, like performance, security, disaster recovery, standards, and so on. And of course there are the unknowns – like what happens if a brand-new communication protocol comes along that eliminates the network bandwidth problem?

Not all scenarios need all the bells and whistles of the drone example I used to make my case. A smart toothbrush probably doesn’t need as much sophistication as a drone to reach its peak potential. What that means is that whatever future architecture we come up with needs to be “sliding scale” friendly. Otherwise, at a minimum, the economics might preclude the scale and viability of the solution.
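One hypothetical way to express that “sliding scale” idea – the same architecture, parameterized per device class. The tiers, field names and numbers below are all invented for illustration:

```python
# Imaginary device-tier table: the architecture is the same everywhere,
# but each class of "thing" gets only the capabilities it can pay for.

DEVICE_TIERS = {
    "toothbrush": {"local_model": "tiny",  "sync_interval_s": 86400, "p2p": False},
    "thermostat": {"local_model": "small", "sync_interval_s": 3600,  "p2p": False},
    "drone":      {"local_model": "full",  "sync_interval_s": 10,    "p2p": True},
}

def capabilities(device_class: str) -> dict:
    # Look up what this class of device is allowed/equipped to do.
    return DEVICE_TIERS[device_class]

print(capabilities("toothbrush")["p2p"])   # False
print(capabilities("drone")["p2p"])        # True
```

A toothbrush syncing once a day with a tiny model and no peer-to-peer talk costs almost nothing; the drone pays for the full stack. Same design, very different economics.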

Interesting world, eh? If this kind of work fascinates you, let me know – I am always hiring 🙂