Big Data Deployment – Planning is everything

When data becomes big – and yields pretty cool, high-value insights – the big hurdle facing customers will be their own deployment challenges. Where should this magic solution live? And how exactly will we find out?

Several factors play into this – and I am just mentioning a few that came to mind first.

1. Hardware is cheap-ish

Cheap is relative. When you have a million dollars to spend in a hard economy, would you buy hardware, or would you do something else, like hire more salespeople or spend more on marketing?

Would you buy or would you rent? Or will you start small by renting and then buy once you reach a scale that makes renting uneconomical?
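To make the rent-versus-buy question concrete, here is a rough sketch of a break-even calculation. The function name, the parameters and the dollar figures are all made up for illustration, and it assumes flat monthly costs with no discounting, which real pricing rarely offers.

```python
import math


def break_even_months(purchase_cost, monthly_ownership_cost, monthly_rent):
    """First whole month at which cumulative rent catches up with the
    cumulative cost of buying and running your own hardware.

    Returns None when renting never becomes the more expensive option.
    All inputs are hypothetical flat figures in the same currency.
    """
    monthly_saving = monthly_rent - monthly_ownership_cost
    if monthly_saving <= 0:
        # Owning costs at least as much per month as renting,
        # so the purchase price is never recovered.
        return None
    return math.ceil(purchase_cost / monthly_saving)


# Illustrative numbers only: a 1,000,000 purchase, 10,000 a month to
# run it, versus 50,000 a month to rent equivalent capacity.
months = break_even_months(1_000_000, 10_000, 50_000)
print(months)
```

If the break-even point is further out than you expect the hardware (or the project) to last, renting is probably the safer start.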

If you buy, are you going to buy cheap servers and live with extra redundancy? Or would you rather invest in fewer industrial-strength servers with great high availability (HA) and disaster recovery (DR)?

2. Skills, or lack thereof

Even if you have cheap hardware lying around, do you have the skills and manpower to install and patch software on all the machines? Is it cheaper to hire or train internally, or should you hire a consulting company to do your big data technical work?

What about business users? If big data tells them something new, are they empowered to act on it? Or will a real-time insight need a batch-mode committee of people before anyone acts on it?

Does the business user have enough training to understand the context of what the big data solution tells them?

What is the minimum usability requirement? (Not everyone is a data scientist – and the majority of use cases will need stupid-simple usability, ideally with little to no training.)

3. Ever-improving technology

Big data technology is benefiting from rapid innovation from the open source world and commercial vendors. How much appetite do you have for keeping up with the fast evolution of technology?

Tactically, when will you replicate data into one place, and when will you federate queries across the systems where it already lives?

4. Quantifying the value

Investments are worthwhile only when value is greater than cost over a reasonable period of time. Cost is a straightforward calculation, and so is OPEX vs CAPEX. But do you have the ability to quantify value and benchmark it against the best in the industry?
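The "value greater than cost over a reasonable period" test can be sketched in a few lines. This is only an illustration: the function names and figures are hypothetical, it treats cost and value as flat yearly amounts, and it sidesteps the hard part, which is coming up with a believable number for yearly value in the first place.

```python
def net_value(upfront_cost, yearly_cost, yearly_value, years):
    """Cumulative value minus cumulative cost after a number of years.

    upfront_cost stands in for CAPEX and yearly_cost for OPEX; all
    figures are hypothetical and assumed flat from year to year.
    """
    return yearly_value * years - (upfront_cost + yearly_cost * years)


def payback_year(upfront_cost, yearly_cost, yearly_value, max_years=10):
    """First year in which cumulative value covers cumulative cost,
    or None if that does not happen within max_years."""
    for year in range(1, max_years + 1):
        if net_value(upfront_cost, yearly_cost, yearly_value, year) >= 0:
            return year
    return None


# Illustrative numbers only: 500,000 upfront, 100,000 a year to run,
# and an estimated 300,000 a year in value.
print(payback_year(500_000, 100_000, 300_000))
```

Whether the resulting payback period counts as "reasonable" is a business judgment, not something the arithmetic can settle.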

How does this play with your existing strategies on BYOD, security and everything else that you have a strategy for? Can they all work together?

5. Platform and Applications

What will you buy and what will you build? Do you have guidelines for deciding when to ask a vendor to create an app for you (and others) as opposed to building it yourself? Do you have criteria for evaluating all the platform options? Do you expect ERP-like security for big data, or will you relax it?

6. Legal, ethical and privacy stuff

Are you aware of what the government thinks of your data? Do you have ideas on how best to keep your big data solution legal and ethical? Have you considered opt-in and opt-out scenarios for users?

In short – there are a large number of deployment considerations for big data. The options available are increasing and improving almost every day. So a good first step is definitely to spend some time deciding on your big data strategy, while staying pragmatic about the fact that your strategy will evolve over time, and probably faster than the BI strategies of the past did.

In my opinion, accelerated value from big data is possible only if all or part of the solution is cloud based. A customer should not have to worry about the deep mechanics of big data – they should be focused on the quality of insights. The mechanics should be offloaded to a vendor you TRUST to partner with. Big data comes with big responsibilities – so choose the partner wisely and for the long term.

Such a vendor should be able to shield customers from a lot of the flux, and at a cost lower than if you tried to do it yourself. Of course, a 100% cloud deployment is not practical for many reasons, as the discussion above makes obvious – but the vast majority of big data landscapes will need to be cloud based if value realization is to happen at a big scale. So like it or not, plenty of hybrid solutions will crop up to support big data.

So what is the end game? Wish I knew – but I do have a dream. A network of big data is my vision of an end game: a network where data is shared across a huge ecosystem, and people collaborate securely on it without everyone having to keep a redundant copy and build custom solutions on top. Of course not all data can be shared – but in almost all industries I am familiar with, not even 5% of the data of common interest is shared freely. Let's see how long it takes before such a network shows up in our lives – or maybe it never will, and I will have to find a new dream 🙂


Published by Vijay Vijayasankar

Son/Husband/Dad/Dog Lover/Engineer. Follow me on twitter @vijayasankarv. These blogs are all my personal views and are not in any way related to my employer or past employers.
