Why I don’t worry about AGI … yet


The recent OpenAI drama triggered two debates on social media – one on corporate governance and the other on AGI. I was quite surprised – and amused – by the number of people who have jumped to the conclusion that AGI is already here or very close to being here.

I don’t think AGI is a near term thing at all. Also to be clear – I am a big fan of AI, but I don’t think AI needs to work exactly like a human (or better than a human) to be of massive value to our everyday life. Similarly, I don’t think we should sit around waiting for AGI before putting safeguards in place – less sophisticated AI still has a massive capacity to cause harm, because software is so easy to distribute globally.

There are a few reasons why I don’t think we will get to AGI by doing more of what we already do – bigger foundation models, even more compute, even more training data and so on.

To begin with – the basic idea of building an AI solution is to feed it a lot of data. For language models, for example, training on all of Wikipedia is a common first step. And that’s not nearly enough – on top of it, these models are fed billions more tokens. Compare that to how a well educated human learns – no one reads all of Wikipedia to get a PhD. Humans learn from a small amount of data. A high school English teacher teaches critical thinking and analytical writing often based on just one book. We can then extend that learning to every other source of information we encounter later, without needing explicit lessons. When we read a new book, we don’t need to think through every book we have read before to form concepts. We are far more efficient learners than machines are – and the way machines are taught doesn’t mimic how humans learn.

One counter argument is that a machine has a cold start, while a human has the advantage of a long evolutionary history – some information is already present in our genes/brains. But even if that’s true, humans still never had access to as much information as the machine readily has. Basically – we assimilate and store information differently from machines, and access it differently when we need it.

Humans can get started quickly with very little information. When my daughter was three years old, she could recognize animals at the zoo based on the cartoons she had watched. She never mistook a bear for something else just because the live bear was missing Winnie the Pooh’s red shirt 🙂 . She knew dogs and cats are animals – and naturally figured out that elephants and lions are animals too.

Also, humans can abstract information across modes of information without special training. Whether I see a sketch of a car, an actual car parked on the street or a car moving in a high speed chase in a movie – I know it’s a car and how it generally works. When I throw a ball up and it comes down, I can relate it to the concept of gravity from my middle school lesson, even though the example used was an apple falling on Newton’s head. GenAI has started becoming multi-modal – but not in the way humans are. This is of course a simplistic way of looking at how a human thinks and acts – we have not yet quite figured out the details of how human brains work.

How do we find answers when we are faced with a question? Let’s say you ask me what 121 squared is. I don’t know it off the top of my head – but I know how to calculate it, and I also know how to approximate it without a precise calculation. But if you ask me what 12 squared is, I already know it off the top of my head. AI only seems to know the latter way, as far as I can tell. An orchestration of several computing techniques could potentially solve these kinds of problems – but learning from a sequence of tokens alone probably won’t get us there.
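To make the contrast concrete, here is a toy sketch of the two modes – pure recall versus computing or approximating on demand. This is just my illustration, not a claim about how any actual model is built:

```python
# Toy contrast between "recall" and "reasoning" (illustrative only).

memorized = {n: n * n for n in range(1, 13)}  # "facts" seen during training

def recall(n):
    """Pure recall: fails on anything outside the memorized set."""
    return memorized.get(n)  # returns None for 121

def calculate(n):
    """Procedure: works for any n, e.g. 121^2 = (120 + 1)^2 = 14400 + 240 + 1."""
    return n * n

def approximate(n):
    """Rough estimate: round to the nearest ten and square that."""
    base = round(n, -1)
    return base * base

print(recall(12))        # 144 - memorized
print(recall(121))       # None - never seen
print(calculate(121))    # 14641 - derived on demand
print(approximate(121))  # 14400 - close enough for a sanity check
```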

One last point on what “general” means in the context of intelligence. There are some things that a computer can do faster and more efficiently than a human can. If we can draw a boundary around the problem – as in a game like chess or Go – a computer has a better chance than we do of figuring out optimal answers.

Where humans excel is in generalizing as context changes. As AI research makes breakthroughs in how machines plan, set goals and think about objectives – I am sure we will see massive progress. And at that point, perhaps AGI might become more of a reality. I am not an AI researcher – I am just a curious observer. I will happily change my mind as I get more information. But for now – I am not worried about AGI becoming a thing in the near future.

GenAI will need a whole new look at Data Governance!


There are two areas that I think will be the “make or break” criteria for Generative AI:

1. MLOps and

2. Data governance

And between the two – I think Data governance will be the one that will get enterprise attention first, and real quick. This is because I think the first hurdle will be to make sure enterprise users trust GenAI – and that’s a high bar in itself. I will park my thoughts on MLOps for now.

The size of the model is probably less important for enterprise uses – most tasks that AI can help with in an enterprise context are narrow in scope. This is generally a good thing. Big models are expensive to train, and at inference time most of what a big model was built to do will probably never get used.

Even if we look at a complex end to end process in an enterprise context – it probably makes more sense to have a series of specific models that can work together, instead of one big model that covers everything. We don’t need the model that answers questions on purchase orders to also write an essay on the meaning of life 🙂
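To make that concrete, here is a minimal sketch of the orchestration idea – a thin router dispatching each request to a narrow model. The model names and the keyword rule are made up for illustration; a real router would more likely be a trained classifier:

```python
# Illustrative router: send each request to a narrow, task-specific model.
# Routes and keywords are hypothetical; real routing would use a classifier.

ROUTES = {
    "purchase-order-model": ["purchase order", "po number", "invoice"],
    "hr-policy-model": ["vacation", "leave", "benefits"],
}

def route(question: str) -> str:
    """Pick the narrow model whose keywords match; fall back to a default."""
    q = question.lower()
    for model, keywords in ROUTES.items():
        if any(k in q for k in keywords):
            return model
    return "general-fallback-model"

print(route("What is the status of purchase order 4711?"))  # purchase-order-model
print(route("How many vacation days do I carry over?"))     # hr-policy-model
```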

I am well aware that talking about the cost of a new technology instead of its innovation goodness is uncool – but having lived my whole career in large Enterprise land, I am quite sure that if GenAI is to scale in adoption, it has to have a low cost base. Enterprises might even live with lower quality responses if the cost is right. I am only half kidding here 🙂

To make smaller models (which are cheaper) really useful – enterprises will need very high quality data to fine-tune them with. For a narrow scope, enterprises generally have data with enough tokens to make it useful (product manuals, customer complaints, procedures, laws, invoices etc). The only question is whether such data is governed in some systematic way, so that the information can be trusted to be of high quality.

Data quality is largely an unsolved problem even in the much simpler world of data warehouses, which have been around for decades now. It has almost never attracted enough budget and time in most companies. A big reason why data lakes didn’t yield the planned business value is also that people didn’t trust the data to be of high quality. We will see what fate awaits lakehouse approaches – but I am always optimistic. These things generally improve over time.

The size of the data available for training and fine-tuning might actually be less of a problem than its quality. More data that looks the same doesn’t make the models trained on it any better. After reading the Chinchilla paper, I am sure we will keep massively improving the ratio of training data to model size. DeepMind’s approach is radically more efficient than the original GPT-3 paper, and it only took a couple of years to get there.
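As a rough illustration of what changed (the 20-tokens-per-parameter figure below is the commonly cited Chinchilla rule of thumb, not an exact law):

```python
# Back-of-envelope: Chinchilla suggests roughly 20 training tokens per
# parameter for compute-optimal training. GPT-3 used far fewer.

def chinchilla_optimal_tokens(params: float, tokens_per_param: float = 20.0) -> float:
    return params * tokens_per_param

gpt3_params = 175e9  # GPT-3: ~175B parameters ...
gpt3_tokens = 300e9  # ... trained on ~300B tokens (~1.7 tokens/param)

print(f"GPT-3 ratio: {gpt3_tokens / gpt3_params:.1f} tokens/param")
print(f"Chinchilla-optimal tokens for 175B params: "
      f"{chinchilla_optimal_tokens(gpt3_params) / 1e12:.1f}T")
# Chinchilla itself went the other way: a smaller ~70B model trained on
# ~1.4T tokens outperformed much larger models at the same compute budget.
```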

There are two complementary approaches I can think of for how an enterprise will source data for fine-tuning (assuming they start from a model that someone else spent money on training):

1. Establish a consistent data governance process and tooling, and use high quality trusted data to fine-tune the models, and/or

2. Depend on the LLM itself to create high quality data – self-instruct, using one LLM to create data for another, having human users curate LLM-generated data and so on. Think of a chatbot use case where a human expert can correct the AI and let it learn from that; a sketch follows below.
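Here is a minimal sketch of that second approach, assuming a generate-then-curate loop. The `ask_llm` function is a hypothetical stand-in for whatever model API is actually used – the point is the loop, not a specific product:

```python
# Sketch: LLM generates candidate fine-tuning data, a human (or stronger
# model) curates it. ask_llm() is a hypothetical stand-in for a real API.

def ask_llm(prompt: str) -> str:
    # A real implementation would call a model API here.
    return f"[model output for: {prompt[:40]}...]"

SEED_TASKS = [
    "Summarize the key terms of a purchase order.",
    "Explain a late-payment clause in plain language.",
]

def generate_candidates(seed: str, n: int = 3) -> list[dict]:
    """Ask the model for new instruction/answer pairs derived from a seed task."""
    pairs = []
    for _ in range(n):
        instruction = ask_llm(f"Write a new task similar to: {seed}")
        pairs.append({"instruction": instruction, "answer": ask_llm(instruction)})
    return pairs

def curate(pairs: list[dict], approve) -> list[dict]:
    """Keep only the pairs an expert reviewer approves."""
    return [p for p in pairs if approve(p)]

candidates = generate_candidates(SEED_TASKS[0])
dataset = curate(candidates, approve=lambda p: len(p["answer"]) > 0)
print(f"{len(dataset)} curated pairs ready for fine-tuning")
```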

Fine-tuning is only one part of why I think data governance will get a lot of attention. There is an “everyday” need that will come up frequently when the model is used – people (users, auditors, regulators …) will all ask for proof of where the data behind GenAI’s answers came from.
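One simple way to be ready for that question is to carry provenance metadata with every answer. A minimal sketch – the field names here are mine, not any standard:

```python
# Sketch: attach provenance to each generated answer so "where did this
# come from?" has a concrete response. Field names are illustrative.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SourceRecord:
    document_id: str   # e.g. an internal document management ID
    snippet: str       # the passage the answer was grounded on
    retrieved_at: str  # when it was fetched

@dataclass
class GovernedAnswer:
    answer: str
    sources: list[SourceRecord] = field(default_factory=list)

answer = GovernedAnswer(
    answer="Invoices over $10,000 require two approvals.",
    sources=[SourceRecord(
        document_id="policy/procurement-v12",  # hypothetical ID
        snippet="Purchases above $10,000 must be approved by two officers.",
        retrieved_at=datetime.now(timezone.utc).isoformat(),
    )],
)
print(answer.sources[0].document_id)
```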

GenAI has an additional headache beyond the data used for training and fine-tuning – users might feed it inappropriate data! That’s another thing that needs to be governed – probably more heavily in regulated industries, and wherever IP, privacy etc need to be kept in mind at every step.
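As one small example of what governing user input could look like – real deployments would use dedicated PII/DLP tooling rather than two regexes, so treat this purely as a sketch of the idea:

```python
# Sketch: scrub obviously sensitive tokens from user input before it
# reaches the model. Real systems would use proper PII detection tooling.

import re

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text

print(scrub("Customer john.doe@example.com, SSN 123-45-6789, disputed a fee."))
# Customer [REDACTED_EMAIL], SSN [REDACTED_US_SSN], disputed a fee.
```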

There are two things to think about carefully here – the process of data governance itself, and the tooling and automation of it. I am less worried about the tooling part in relative terms – I am just not sure yet whether enterprises have thought through all these “fringe” aspects of GenAI compared to all the cool applications they are excited about. If they don’t find the time and budget to get it done right, there will be a lot of grief to deal with.

GenAI in the enterprise – nine themes that I have seen so far


Ever since ChatGPT became a thing – I haven’t had a week pass by without having GenAI conversations with clients. It’s truly been a fascinating time to be a technologist.

There have been three times in the past 25 years when I have seen this kind of massive interest in being a first mover:

1. When ERP helped consolidate applications

2. When Data warehousing became mainstream

3. When mobile and cloud converged

I work in Financial Services – which adds its own layer of flavour to make all opportunities and challenges a bit more spicy 🙂

Here are nine broad themes that I have noticed so far from the conversations I have had with FS companies.

1. Risk mitigation vs First mover

FS companies pride themselves on being the best risk managers (which is a very good thing for consumers). So “what can go wrong” has been front and center in GenAI plans. FS companies also know that their primary competitive advantage is in data, and they want to be the first to capitalize on it. This push/pull tension is common in how they operate even for mainstream innovation – but GenAI has taken over as the lead theme for now, with public cloud adoption perhaps a close second.

2. Privacy

All FS companies handle highly sensitive and personal data. There are tight restrictions on what can and cannot be done – and thankfully this industry thinks through this carefully. Between the legal and ethical issues at play, the risk of getting this wrong is apparent to everyone, and hence a lot of thought goes into mitigating it. How they solve it is not consistent across the industry – and a unified approach that is both efficient and effective is much needed. Otherwise a lot of GenAI innovation just won’t happen at scale.

3. Buy vs Build

The larger banks (all kinds of banks) have hired great tech talent, including in AI. While it is obviously great to have such people, it also means a lot of time and money is spent on building everything in-house. This is less common in insurance – but banking and capital markets companies generally love to build more and buy less. I know companies that have tried and failed to build their own equivalents of commercial CRM systems. Open source software has made building systems much more feasible, and many times that’s a good thing for the companies. But again – these debates take away a lot of time from innovating at scale. You can’t extrapolate time and budget from POC projects to full enterprise implementations.

Buy is not an easy option either, given the tech is so new. Every large tech vendor has a platform offering, and evaluating them takes time and money. The usual checklists for build/rent/buy are not enough for emerging tech and need to be extended. But that extension needs a level of knowledge that companies don’t have today.

4. Skills

To begin with – most companies don’t have enough people with solid knowledge of AI. GenAI has an even smaller talent pool. Upskilling is totally possible – but takes a lot of time. I have lost count of how many hours I have spent in the last three months reading papers to get the basics right. I am grateful that my employer has a lot of experts in the field who can clarify concepts for me when I run into confusion, but that’s not a luxury every company has. It’s not just great AI talent that you need – you need all the usual things that go with it (architecture, engineering, UX ….), which means you have to deprioritise other projects. That disruption is not pretty.

5. Intellectual property

One of the offshoots of GenAI is its use for developer productivity – code generation type use cases. Everyone – me especially – got very excited when we saw the possibilities for the first time. But that doesn’t naturally translate to the enterprise world – IP problems come into play very quickly. GenAI is only as good as the training set that was used in its creation. Have the solution providers done the work to make sure copyleft and copyright issues are addressed before a client generates code? Otherwise it’s a massive risk that the companies carry. I just used code generation as an example – it applies across the board for GenAI (well, for all AI really).

6. Environmental impact

Greenhouse gas emissions are something to think about upfront. GenAI is compute intensive to train given the size of the models – and while inferencing is not similarly intensive on a unit basis, wide deployment makes sure the units add up. Also remember that GPUs consume more energy than CPUs. Between primary and secondary factors, the environmental impact is something to think through before large scale work happens. Only a subset of companies seem to have made it a tier one criterion though, in my limited view.
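A back-of-envelope sketch of the kind of upfront math I mean – every number below is an illustrative placeholder, not a measurement:

```python
# Rough emissions estimate: energy = GPUs * power * hours * PUE, then
# emissions = energy * grid carbon intensity. All inputs are illustrative.

def training_emissions_kg(gpus: int, gpu_kw: float, hours: float,
                          pue: float = 1.2, kg_co2_per_kwh: float = 0.4) -> float:
    energy_kwh = gpus * gpu_kw * hours * pue  # facility energy, incl. overhead
    return energy_kwh * kg_co2_per_kwh

# Hypothetical fine-tuning run: 64 GPUs drawing 0.7 kW each for 48 hours.
print(f"{training_emissions_kg(64, 0.7, 48):,.0f} kg CO2e")  # ~1,032 kg CO2e
```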

7. MLOps

While most of the attraction of GenAI is in the actual “generative” aspects, the enterprise attention is quite high on operations. There are big problems to tackle – how do you detect and prevent models from drifting? How do you prevent degradation via AI learning from synthetic data created by AI itself? What are the most trustworthy watermarking approaches? And so on. I think GenAI will be the shining moment for all the research going on in MLOps, which will help across the board.
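To give one concrete flavour of the drift question – population stability index (PSI) is just one common monitoring metric, and the threshold below is a conventional rule of thumb, not a mandate:

```python
# Sketch: population stability index (PSI) to flag drift between a
# reference window and live traffic. Thresholds are rules of thumb.

import math

def psi(expected: list[float], actual: list[float]) -> float:
    """PSI over pre-binned proportions; > 0.25 is often read as major drift."""
    eps = 1e-6
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

reference = [0.25, 0.25, 0.25, 0.25]  # score-bin proportions at deployment
live      = [0.05, 0.15, 0.30, 0.50]  # proportions observed this week

score = psi(reference, live)
print(f"PSI = {score:.3f} -> {'investigate' if score > 0.25 else 'ok'}")
# PSI = 0.555 -> investigate
```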

An excellent side effect of this attention to ops is that it has highlighted the need for investment in foundational data management, which often gets ignored in the enterprise world.

8. Quality control

Similar to the point on MLOps – companies will have to rethink how QA is done. Software is built in layers, and LLMs can affect the quality of every layer above them that uses them. There is a lot of work going on in academia and at all the big tech companies on improving the accuracy, consistency, performance etc of LLMs. I have a strong feeling that these studies will result in fundamentally different approaches to GenAI. I will write another blog later to expand on my thinking – I am still organizing my thoughts on the matter.
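A tiny sketch of what rethinking QA might look like in practice. The exact-match check is deliberately naive – real evaluation would use semantic similarity or a grader model – and every name below is made up:

```python
# Sketch: golden-set regression test for an LLM-backed feature. The
# containment check is naive on purpose; real QA would grade semantically.

GOLDEN_SET = [
    ("What approvals does a $15,000 purchase need?", "two approvals"),
    ("What is the payment term on standard invoices?", "30 days"),
]

def model_answer(question: str) -> str:
    # Hypothetical stand-in for the deployed model.
    return "Purchases over $10,000 need two approvals."

def regression_pass_rate() -> float:
    hits = sum(expected in model_answer(q).lower()
               for q, expected in GOLDEN_SET)
    return hits / len(GOLDEN_SET)

assert regression_pass_rate() >= 0.5, "LLM regression below threshold"
print(f"pass rate: {regression_pass_rate():.0%}")  # pass rate: 50%
```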

9. Trust

GenAI has rekindled this important topic and put some urgency around scaling it. It’s invariably the first question I hear in every meeting – “can we trust this thing?”. The question is simple – but answering it requires a complex set of capabilities. We need to know how the AI arrived at a decision, what data was used to train it, what has changed over time in both the data and the model, and so on.
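As a closing sketch of the kind of record those capabilities imply – the fields are illustrative, loosely in the spirit of model cards, and every value is hypothetical:

```python
# Sketch: the minimum lineage a "can we trust this?" conversation needs.
# Field names and values are illustrative, in the spirit of model cards.

from dataclasses import dataclass

@dataclass
class ModelLineage:
    model_version: str           # which model produced the answers
    base_model: str              # what it was fine-tuned from
    training_data_snapshot: str  # content hash of the fine-tuning dataset
    last_evaluated: str          # when quality was last measured
    eval_pass_rate: float        # headline quality metric at that time

lineage = ModelLineage(
    model_version="po-assistant-2024.03",    # hypothetical
    base_model="open-7b-base",               # hypothetical
    training_data_snapshot="sha256:<hash>",  # placeholder
    last_evaluated="2024-03-01",
    eval_pass_rate=0.94,
)
print(lineage.model_version)
```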