Decoding Tokenomics: From Brute-Force Reasoning to Architectural Minimalism


The enterprise AI landscape is hitting a quiet but definitive turning point. Over the last two years, organizations rushed to move their generative AI proofs-of-concept into production, driven by the sheer awe of what frontier LLMs could accomplish. We built multi-agent frameworks, dense RAG pipelines, and autonomous workflows capable of orchestrating complex enterprise tasks.

But as these systems scaled to production instances, a cold, hard reality set in. The issue isn’t that the models aren’t smart enough; it’s that they are incredibly expensive to feed!

Enterprise technology leaders are waking up to a profound realization: building context-aware, deterministic applications with non-deterministic models is an economic battlefield. The era of “token-maxing” – throwing boundless token budgets and massive test-time compute loops at every problem – is hitting a financial and operational wall. Winning the next phase of enterprise AI requires an aggressive shift toward Architectural Minimalism.

So, how did we get here?

In the race for absolute accuracy, frontier model labs introduced a paradigm shift: test-time compute. Instead of generating a knee-jerk next token, modern reasoning models use internal monologues, multi-turn self-correction loops, and extensive chain-of-thought processing before outputting a final answer.

This is “token-maxing” in its purest form. For complex coding, scientific discovery, or deep strategic evaluations, this approach is revolutionary.

But when applied carelessly to enterprise workflows, it creates what can only be described as structural bloat!

Your “simple” question might only be 50 cheap input tokens – and your answer might be 100 more expensive output tokens. What you don’t see is the additional 1,000 expensive tokens the AI spent talking to itself before answering. Think about that overhead across millions of transactions – you have turned an efficient 2-cent transaction into a 50-cent unit-economics liability!
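To make the hidden overhead concrete, here is a minimal sketch of the arithmetic. The prices are placeholders I made up for illustration (not any vendor's actual rates), and the sketch assumes hidden reasoning tokens are billed at the output rate. The token counts are the ones from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical USD prices per million tokens (placeholders, not real rates)."""
    input_per_million: float
    output_per_million: float


def request_cost(pricing, input_tokens, output_tokens, reasoning_tokens=0):
    """Cost of one request in USD.

    Assumption: hidden reasoning tokens are billed at the output rate.
    """
    billed_output = output_tokens + reasoning_tokens
    total = (input_tokens * pricing.input_per_million
             + billed_output * pricing.output_per_million)
    return total / 1_000_000


# Placeholder prices: output tokens cost 10x input tokens.
pricing = Pricing(input_per_million=1.0, output_per_million=10.0)

# 50 input tokens, 100 visible output tokens, 1,000 hidden reasoning tokens.
visible = request_cost(pricing, 50, 100)
actual = request_cost(pricing, 50, 100, reasoning_tokens=1000)

print(f"visible cost: ${visible:.5f}")
print(f"actual cost:  ${actual:.5f}")
print(f"multiplier:   {actual / visible:.1f}x")
```

The absolute numbers depend entirely on the prices you plug in. The point is the multiplier: the tokens you never see can dominate the bill.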

In multi-agent architectures, teams frequently pass the entire chat and execution history back and forth between specialized agents to maintain context. If Agent A, Agent B, and Agent C all receive the full payload at every turn, the cumulative input tokens grow quadratically with the number of turns, not linearly. You quickly end up paying a massive “historical baggage tax” on a turn that only required a simple validation.
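A rough sketch of why the growth is quadratic follows. It assumes, for simplicity, that every turn adds the same number of tokens to the history. The function name and the numbers are illustrative, not taken from any framework.

```python
def cumulative_input_tokens(turns, tokens_per_turn, full_history=True, summary_tokens=0):
    """Total input tokens billed across `turns` agent turns.

    full_history=True:  each turn re-reads everything produced so far.
    full_history=False: each turn reads only its own payload plus a
                        fixed-size summary of what came before.
    Simplifying assumption: every turn adds `tokens_per_turn` tokens.
    """
    total = 0
    for turn in range(1, turns + 1):
        if full_history:
            # Turn k carries k payloads' worth of tokens.
            total += turn * tokens_per_turn
        else:
            # The first turn has nothing to summarize yet.
            total += tokens_per_turn + (summary_tokens if turn > 1 else 0)
    return total


for n in (10, 20):
    full = cumulative_input_tokens(n, 500)
    compact = cumulative_input_tokens(n, 500, full_history=False, summary_tokens=100)
    print(f"{n} turns: full history = {full:,} tokens, compact summary = {compact:,} tokens")
```

With full history the total is tokens_per_turn × n(n+1)/2, so doubling the number of turns roughly quadruples the bill. With a fixed-size summary it only doubles.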

High token costs rarely stem from rank incompetence. Instead, they happen because teams are trying to force non-deterministic models to behave reliably within rigid enterprise constraints. Without mature guardrails, models naturally wander, hallucinate, or demand massive context injections to maintain accuracy.

High token spend is a sign of an architectural mismatch. It happens when a team treats a top-tier, frontier LLM like a universal database, a basic keyword router, and a heavy-duty processor all at the same time. Using a frontier model to parse a date string or extract an account number is the enterprise equivalent of using a Ferrari to haul gravel. It works, but the cost per mile will ruin you.

So, what does Architectural Minimalism mean in this narrow context?

It is about answering this one question: what is the absolute minimum compute required to execute this step with 99.9% accuracy?

Transitioning to a minimalist architecture requires decoupling your systems into a tiered, intent-driven framework.

  1. Have a “cheap” gatekeeper: Route each incoming question to the component best suited to answer it. “What is my account balance?” doesn’t even need an LLM – it can be answered by an API call or a DB lookup. Only route complex reasoning tasks to frontier models. Another elegant solution that is often missed is semantic caching – where a recently answered similar question can reduce the cost of answering the new question to nearly zero.
  2. Surgical context management: Don’t let your RAG system feed in multiple PDF pages when five well-crafted sentences will do the job. Another underutilized hack is prompt caching – you can save 80% or more on costs while also returning results faster, which helps UX. Why only please the CFO when you can also keep your users happy with sub-two-second responses?
  3. State Truncation in Multi-Agent Loops: Stop passing the entire historical baggage of a conversation. Instead, compress past agent actions into concise, structured metadata packets so that agents only receive the immediate payload required for their specific micro-task.
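The gatekeeper and semantic-cache ideas from the list above can be sketched in a few lines. Everything here is hypothetical: the handler functions are stubs, the word-count rule is a crude stand-in for a real intent classifier, and the cache matches on word overlap rather than the embedding similarity a real semantic cache would use.

```python
import re


def _words(text):
    """Lowercased set of alphanumeric words in the text."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


class ToySemanticCache:
    """Stand-in for a semantic cache: Jaccard word overlap, NOT embeddings."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self._entries = []  # list of (word_set, answer) pairs

    def get(self, question):
        words = _words(question)
        for cached_words, answer in self._entries:
            union = words | cached_words
            if union and len(words & cached_words) / len(union) >= self.threshold:
                return answer
        return None

    def put(self, question, answer):
        self._entries.append((_words(question), answer))


# Hypothetical handlers; in practice these would call a database or a model API.
def lookup_balance(question):
    return "balance: <value from API or DB lookup>"


def call_small_model(question):
    return f"[small model] {question}"


def call_frontier_model(question):
    return f"[frontier model] {question}"


def route(question, cache):
    """Return (tier, answer), trying the cheapest option first."""
    cached = cache.get(question)
    if cached is not None:
        return "cache", cached

    words = _words(question)
    if {"account", "balance"} <= words:
        # Deterministic lookup: no LLM needed, and not cached because balances change.
        return "api", lookup_balance(question)

    # Crude complexity heuristic (an assumption): long questions go to the big model.
    if len(words) > 20:
        tier, handler = "frontier", call_frontier_model
    else:
        tier, handler = "small", call_small_model

    answer = handler(question)
    cache.put(question, answer)
    return tier, answer


cache = ToySemanticCache()
for q in ("What is my account balance?",
          "Summarize the refund policy",
          "summarize the refund policy!"):
    tier, _ = route(q, cache)
    print(f"{tier:>8}: {q}")
```

The design choice worth noting is the ordering: cache first, deterministic lookup second, small model third, frontier model last. Each step only runs when everything cheaper has declined.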

The winning architectures of the coming years will not be the ones that burn the most tokens; they will be the ones that exhibit the highest intelligence efficiency. By embracing architectural minimalism, optimizing context, and deploying specialized, tiered models, the enterprise can finally bridge the chasm between raw technical capability and genuine, sustainable business value.

The trap of functional excellence in executive succession


Time and time again, I have seen this story play out – the best sellers and best product people get promoted to run a P&L and many of them fail. The best P&L leaders get promoted to run a business and many of them fail. Not all of them fail – but enough do to make this a trend worth looking at.

Who doesn’t fail and why? Let’s get that out of the way first. I have seen two sets of people who succeed – people with an excellent learning mindset who quickly figure out what it takes to succeed at the new, bigger job, and people who happened to do a lot of lateral roles before getting the bigger job. A few organizations do this well – many don’t.

Circa 2010, Bill Smiley told me something that I have told myself over and over: what got you here won’t get you there! There was one other lesson I learned from Dave Lubowe around the same time that has helped me a ton since then – we don’t have customers, we have clients!

There are many ways to look at this problem – one simple lens is to map it to the basic accounting and finance that the bigger role demands.

When you are great at sales – and get rewarded for it – the skill you optimize for is revenue growth. You learn how to qualify deals, how to hire and manage great sellers, how to develop great client relationships and so on. What you don’t learn is the cost of making those sales – that’s not your concern. You don’t appreciate the concept of profit.

Scaling revenue at the expense of profit is not a sustainable strategy. There are times when you might do it – like the early part of your product cycle. But even then – you need to understand the unit economics and plot out when you will turn profitable. Without that skill – you can’t run a business unit.

It’s no fun being a business unit leader and getting yelled at routinely when your sales numbers are good – you used to be praised for that. I have seen this happen right in front of my eyes multiple times. The best solution I have seen comes from spending some time in a lateral role leading a cost center to understand how to handle operational expenses optimally. Those roles generally don’t have a lot of glory in many companies but they teach you valuable skills that help at the next level up the chain.

Knowing the full P&L of course helps you be great as a business unit leader – but it doesn’t help you succeed at the next level up running a division.

A P&L is just a report card of what happened over a specific window of time. It doesn’t tell you how efficiently you used your assets to get there. Without understanding the balance sheet, you don’t understand capital allocation. Your entire job running a full business is capital allocation!

To push the narrative further: the person who becomes a great business leader and then a CEO is the one who realizes that cash is the actual oxygen of the enterprise. Relatively early in your career, you get beaten up about improving collections and the like – and you do it because you are a good corporate citizen. You have very little understanding of the importance of cash. You don’t even usually know the controller or treasurer of the company you work for – let alone what they do. And then one day you need to know all of it and then some!

When you run a business, there are very few no-brainer decisions. Life just becomes a series of trade-offs. Your success is based on having good frameworks to make those decisions. If you don’t understand ROIC, liquidity and the like, it will be hard to make sound judgment calls on whether you want to do M&A, conserve cash or build something organically.

In my mind, all this boils down to three things:

  1. The organization needs to be comfortable moving its best functional leaders into lateral roles to expand their understanding of the business. This is not just the CHRO’s mandate – it is the CEO’s mandate. It’s quite hard – we all come up with “there is no one else who can do that role this well.”
  2. Mandate financial fluency appropriate for each level. It cannot be just a “nice to know”. Finance cannot just be the finance person’s job.
  3. Give a lot of weight to resource allocation competence and framework-based decision making when promoting people to senior executive roles.

This is just one lens – of course you need people to have high integrity, collaboration skills and so on. I just wanted to point out some things I have noticed in my own career and those of the people around me these past several years. I am not an HR expert by any stretch – so I am sure there are a million other things that are super important too.

What I look for when hiring engineers, engineering managers, architects and product managers


As I am building out my agentic AI team, about half my time is being spent on hiring great technical talent. I thought it would be a good idea to explain what I look for in the four roles that I am hiring for the most.

Hopefully this gives you an idea of what I look for at a high level in each role. It’s not an exhaustive list. I am not listing the basic competence needed in each role at all – someone else will usually evaluate all that before you and I have a chat.

I hope this helps you qualify if you are a good fit for these roles. If you do think you are a fit – we would love to talk to you. It’s a great team and we are fun to hang with and we do a lot of cool stuff that the biggest companies in the world benefit from.

Great engineers

They are great craftsmen – not just creating code that works, but code that lasts. They understand that their work doesn’t exist in isolation – others need to work with it. While they are curious about technology, they are pragmatists. For example – they are not religious about always designing for horizontal scaling and will happily start with vertical scaling that is simpler and faster if that’s all the problem needs for a good solution.

They write good documentation that lets others understand their rationale. They don’t wait for QA to find issues – they take pride in testing and spend the time thinking about corner cases.

Great engineering managers

There was a point in time when I was a half-decent engineer (maybe with the exception of writing great documentation). I struggled as a leader of an engineering team though. Later in my career I observed that this was quite common – and it’s no different in sales. Great salespeople don’t automatically become great sales leaders, and great sales leaders don’t automatically turn into great P&L leaders.

It’s a hard transition when you shift focus from managing code to managing people who often are way smarter than you.

If any role in a technical team needs to be good at politics, it is the engineering manager. They need to be able to find resources for their team to succeed and shield their team from organizational craziness. There were days in my life where I truly just wanted to go back to being a technical hero and not deal with product managers and salespeople and CFOs.

An EM needs more EQ than IQ. Great engineers are a diverse lot – some need to be left alone, some need active coaching, and some will burn out if you don’t keep checking in to course correct. It’s a tiring job and you need to thrive in that environment.

They need to be communication wizards – negotiating with product managers to translate the roadmap into things that can be coded, while also explaining to finance why the team needs budget to tackle technical debt.

I have always felt that it’s easier for an EM to learn basic business than to teach tech to other stakeholders. I have often felt my MBA was wasted effort but grudgingly I will accept that it helped me explain tech better to others in a language they can follow.

My boss BK keeps telling us “candor over comfort” is how we should operate. This is critical for an EM. There are very few days without hard conversations in this role.

Great architects

A few technologists need to step back and use a wide-angle lens to look at problem statements and solutions – that’s the architecture profession.

There is no such thing as perfection in architecture – it’s a series of tradeoffs. You have to balance what is needed today vs long term sustainability. Architects are not philosophers – they have to weigh pros and cons and come to a decision. Poor architects are quite easy to spot – they throw around a lot of jargon and never take a hard decision. Great architects are harder to spot – their work looks boring because everything just works.

No one can predict the exact future – architects are not prophets. But what they need to do is to make sure that software is written in such a way that the next requirement won’t force a complete rewrite.

Unlike EMs, architects don’t usually have big teams and often are individual contributors. They have very limited direct authority. So they need to influence decisions via their clarity, with prototypes and with solid technical rationale. Architects who preach from an ivory tower lose the respect of engineers really quickly.

Architects need a detailed understanding of the business problem they are solving – the job is to identify capabilities and design solutions for them that accelerate the delivery of those to the business.

Great product managers

Some of my PM friends don’t like me saying this – but I like PMs to be the CE“No” of the team. They have to say NO to a lot of things and ruthlessly prioritize.

While engineers and architects are tasked to build the solution the right way, product managers are the ones making sure that it’s the right solution that is being built. They are the connective tissue that makes it all work. Engineers take care of the HOW – PM takes care of the WHAT and WHY.

Saying NO to powerful people is not easy. You need clear frameworks to make decisions – and those frameworks need to be data-driven. That in turn means you need to understand what adds value to your clients and what moves the needle for your business.

The viability of a product is not just determined by the coolness of the technology used to build it. Yes, agentic AI is super cool and you need to know how it works – but even more important is understanding how UX, regulations, GTM, pricing and the like intersect.

Much like I mentioned about architects – this is a role that is low on authority because they don’t manage big teams. They influence with clear thinking and market insights.