Forward Deployed Engineers Aren’t the Moat. The Learning Loop Is.


Usual caveat: These are strictly my personal opinions and have nothing to do with my past or present employers.

Almost every discussion about the slow adoption of enterprise GenAI eventually becomes a discussion about deployment. The narrative is familiar: today’s models are remarkably capable, but they start struggling when they collide with fragmented enterprise data, decades-old ERP systems, and the way large organizations actually operate.

The industry’s response has become equally familiar. Build larger customer-facing engineering organizations, embed technical talent inside customer environments, and call them Forward Deployed Engineers (FDEs).

It’s an appealing story. I also think it’s missing the bigger lesson.

Part of the reason this conversation is happening now is that enterprise AI has changed the nature of implementation. Moving workloads to the cloud was largely an infrastructure challenge. Deploying AI agents is increasingly a workflow challenge. The models aren’t failing because they can’t generate text or write code. They struggle because they have to operate inside decades of accumulated enterprise complexity.

That’s a very different problem.

The tech industry is discovering that the era of “touchless” enterprise SaaS—the plug-and-play playbook that defined the last decade—is hitting a structural wall. Deep data heterogeneity means AI cannot be delivered via a standard browser login alone. It requires a return to high-touch, localized execution.

If you’ve spent time building enterprise software or running a professional services P&L, you learn pretty quickly that changing a job title rarely changes the underlying unit economics of the business. The success of the FDE model has less to do with the brilliance of the engineer on-site than with what happens to the code after they leave.

That’s the distinction I think much of the industry is missing.

What Palantir Actually Built

Most conversations about FDEs focus on the engineers themselves. I think the more interesting question is why the overall system works.

What Palantir built wasn’t really an implementation organization. It was a product organization that happens to deploy engineers into customer environments.

The company appears to have recognized early that mission-critical enterprise data is messy, political, and full of unwritten business rules. No software platform can anticipate every edge case before it is deployed in the real world.

Its answer was to pair two complementary roles that function less like standard consultants and more like a tightly coupled product-and-implementation team embedded inside the client environment.

Forward Deployed Engineers, internally known as Deltas, solve the immediate technical problems. They write production code, connect fragmented systems, and build what has been described as the initial “gravel road.”

Deployment Strategists, internally known as Echos, operate as domain heretics. Rather than acting as passive change-management advisors, they challenge the client’s existing assumptions, surface institutional constraints, and identify the real operational metrics that justify the deployment in the first place.

But the vital product mechanics happen back at headquarters.

If an FDE team simply copied every customer workflow into the core platform, the software would quickly degrade into an unmaintainable system. To be clear, some enterprise environments are so shaped by legacy decisions that clean abstraction is impossible anyway.

Instead, the central engineering team acts as a ruthless generalization engine. They don’t look to copy bespoke logic; they look across dozens of deployments to isolate the underlying structural friction and abstract those patterns into reusable primitives. What started as a customer-specific gravel road becomes a foundational data abstraction—a paved highway that future customers can use without repeating the same work.

That changes the economics entirely, though getting there takes a long, margin-compressing R&D phase and years of organizational patience before it bears fruit.

The deployment team isn’t just implementing software. It is teaching the product how to improve itself.

If a customer deployment uncovers a reusable primitive, the next implementation should require fewer integrations, fewer workarounds, and fewer engineering hours. Over time, the platform becomes more capable precisely because it has been exposed to real-world complexity the core team could never fully anticipate in advance.

The engineers are not the moat. The learning loop is.

Why This Is Hard for the Hyperscalers

It’s easy to understand why Microsoft, AWS, and Google are expanding customer-facing engineering as enterprise AI deployments become more complicated.

The question isn’t whether those engineers create value. They almost certainly do.

But hyperscalers are structurally and financially optimized for a very different economic engine. Public markets value them on predictable, high-margin infrastructure consumption. The true FDE model requires a deliberate financial trade-off: accepting heavy human operational costs in exchange for long-term operational dependency. This is fundamentally different from adding raw cloud infrastructure.

Furthermore, their feedback loop is designed to look for different things. When a hyperscaler’s customer engineering unit or partner motion uncovers an integration pattern, the goal isn’t to build a unified operating system; it is to harden infrastructure and developer tooling primitives (like security layers, data pipelines, or vector databases) so *any* developer can build faster.

Even so, the organizational coordination loop remains the hardest part.

Palantir operates around a relatively unified software platform. When deployment teams discover something new, there is a direct path for that insight to become part of the platform.

Hyperscalers operate across sprawling portfolios that include infrastructure, developer platforms, security, databases, AI models, analytics, and enterprise applications. A field engineering team may discover an important pattern around enterprise data orchestration, but where does that insight belong? Which separate product team owns it? Which distinct roadmap gets changed?

That is not primarily a technical problem. It is an organizational coordination problem.

None of this means the model cannot work for them. It simply means the value created by these engineers may end up strengthening localized customer relationships and developer tools more than systematically evolving a unified enterprise platform. Those are two very different feedback loops.

The System Integrator Problem

The challenge for traditional system integrators is different.

It is mostly about incentives.

The economic engine of a traditional consulting business rewards utilization. Success is measured by keeping talented people billable for as long as possible.

A successful FDE model tries to do almost the opposite.

Imagine an engineer builds an abstraction that reduces future data-mapping work by 80%. That is a strong product outcome because every future deployment becomes faster, cheaper, and more repeatable.

It is a much more complicated outcome for a services business whose revenue depends on billable implementation effort.

This is not primarily a technology problem. It is an incentive problem.

Many Global System Integrators (GSIs) will argue they are already navigating this by investing in “asset-based consulting” and proprietary accelerators. But there is a difference between using reusable artifacts to speed up delivery and building a system that continuously abstracts deployment work into a shared product core. If the firm’s P&L is still fundamentally driven by headcount, the incentive to expand labor almost always wins.

If system integrators genuinely want to build an FDE model, they need to change two things.

First, shift revenue toward outcomes rather than hours. The firm must benefit when automation and productization reduce implementation effort instead of losing revenue because fewer consultants are required.

Second, enforce real product discipline. Deployments need to strengthen a shared software platform. If every engagement produces another collection of client-specific, bespoke integrations, the organization gets better at delivering isolated projects but not better at building a scaling product.

Those are two very different businesses.

The Hardest Part May Be the People

There is another constraint that does not get discussed enough. True FDEs are unusually difficult to hire.

You need engineers who can write production systems under pressure while also earning the trust of executives, operations teams, and domain experts.

That is a rare combination.

More importantly, many of these people want to build products, not just complete projects. They want the solution they built for one customer to become part of something used by thousands of customers.

The operating model matters because the best engineers usually care about the structural trajectory of what they are building, not just the immediate engagement.

The Takeaway

Enterprise AI absolutely has a deployment problem. The models are advancing faster than most enterprises can absorb them, and embedding technical talent inside customer environments will often be necessary.

But I do not think the lesson is simply to hire more Forward Deployed Engineers.

The lesson is to build organizations that learn from every deployment. That is what made the model powerful in the first place.

If your deployment teams are helping the platform become smarter with every customer, you are building a compounding moat. If they are primarily helping close deals, retain customers, or deliver custom implementations, they may still create enormous value. But you are running a fundamentally different operating model.

That is why I think the conversation around FDEs is slightly misplaced. Everyone is debating how many engineers to hire. The harder question is whether the organization is designed to learn from those engineers.

Without that feedback loop, an FDE is just another implementation consultant with a more modern title.

Many companies are copying the surface mechanics of the FDE model.

I am not convinced they are copying the system that makes it compounding.

It Was Never Jensen vs. the Hyperscalers. It Was a Balance Sheet Problem – And Power Is Next.


Every discussion about AI infrastructure eventually turns into a story about Jensen Huang outsmarting the hyperscalers.

The narrative goes something like this: NVIDIA deliberately routed scarce GPUs to upstarts like CoreWeave and Crusoe, creating a new class of AI cloud providers that would prevent Microsoft, Amazon, and Google from monopolizing AI infrastructure.

It’s a compelling story. It’s also an incomplete one.

If you’ve spent time building or financing large-scale enterprise infrastructure, this doesn’t look like a battle. It looks like a capital allocation strategy that happened to benefit everyone involved — for a while.

The rise of the neoclouds wasn’t about defeating the hyperscalers. It was about solving a problem they all had.

The Balance Sheet Problem

Building infrastructure for an AI hardware cycle is fundamentally different from building traditional cloud infrastructure.

The hardware is expensive, demand is unpredictable, and the depreciation curve is unlike anything we’ve seen before. Buy a fleet of GPUs today, and there’s a good chance something significantly faster arrives before you’ve fully recovered your investment.

Public companies don’t like carrying that kind of uncertainty on their balance sheets.

This is where the neoclouds fit into the picture. Take Microsoft’s relationship with CoreWeave: Microsoft has historically accounted for well over half of CoreWeave’s revenue (62% in 2024, 71% in Q2 2025) – not because Microsoft couldn’t build the capacity itself, but because it didn’t want that capacity, and its depreciation schedule, sitting on its own books. A specialized provider takes on the risk of rapidly depreciating hardware, while the hyperscaler keeps investing in assets with much longer economic lives – land, fiber, power infrastructure, enterprise cloud platforms.

That structure has since been used well beyond Microsoft. OpenAI has stacked up roughly $22 billion in commitments to CoreWeave across three separate expansions, and Meta has committed to roughly $35 billion combined across two agreements of its own, most recently a $21 billion expansion. These are different customers solving the same problem in the same way: none of them wanted rapidly depreciating GPU fleets sitting on their own balance sheets, so they paid someone else to hold that risk.

Not every neocloud got there through the same door, either. Crusoe and CoreWeave both count as “neoclouds,” but they didn’t start from the same assets. CoreWeave built its position through GPU-collateralized debt. Crusoe, like several other entrants, grew out of the Bitcoin mining industry and repurposed cheap, already-built power infrastructure toward AI workloads once crypto economics soured. Different starting points, same underlying trade: someone besides the hyperscaler absorbs the capital risk of the hardware cycle.

The hyperscalers didn’t lose the first phase of the AI infrastructure race. They found a way to participate without absorbing all of its financial volatility.

Why This Worked

This model only works if the underlying assets retain value.

GPUs are unusual because they aren’t purpose-built appliances with a single use case. NVIDIA spent more than a decade building CUDA into the default software platform for AI development, which is why demand for GPU compute extends across research labs, startups, enterprises, and cloud providers rather than sitting locked to one buyer.

Developers weren’t choosing neoclouds because Jensen allocated them chips. They were choosing them because their existing software stacks already ran there. That’s what gave NVIDIA hardware a liquidity that few infrastructure assets enjoy, and that liquidity is what made the financing model possible in the first place. (Frameworks like PyTorch and Triton will keep chipping away at that advantage, but not this cycle.)

NVIDIA’s incentive here is simpler than the “multipolar strategy” framing suggests. NVIDIA doesn’t need Microsoft, Amazon, or Google to lose share; it just needs GPUs sold, wherever the check comes from. CoreWeave is the only major cloud provider buying from NVIDIA that isn’t also building its own competing silicon, which makes it a cleaner customer than the hyperscalers, not a rebel one. Backing the neoclouds wasn’t a play against Microsoft and Google. It was a way to keep selling chips to buyers with no reason to ever build their own.

The Case Against It

None of this means the model is free of risk, and it’s worth stating the bear case plainly, especially since two of its central assumptions cracked within the same few months.

The financing math is straining even as the backlog grows. CoreWeave’s Q1 2026 results, reported in May, showed revenue backlog reaching $99.4 billion — up nearly 50% in a single quarter. But that growth came with a GAAP net loss of $740 million and $536 million in net interest expense for the quarter, against total debt approaching $25 billion. Backlog is a measure of future revenue promised, not cash in hand, and the cost of carrying the debt that built the capacity is rising faster than the revenue that capacity has generated so far.

Customer concentration remains a structural overhang. Meta and OpenAI together represent close to two-thirds of CoreWeave’s guaranteed revenue backlog, and contract cancellation clauses tied to delivery schedules create real execution risk on both sides. If any one of those customers delays a buildout or scales back, the neocloud carrying that exposure feels it immediately – the risk didn’t disappear when it moved off the hyperscaler’s balance sheet, it just concentrated somewhere else.

And the assumption that hyperscalers would stay on the sidelines just failed, publicly, in real time. On July 1, 2026, Bloomberg reported that Meta is building a cloud business called Meta Compute to sell its own excess AI infrastructure – both raw GPU capacity and hosted model access – to outside customers, directly competing with the neoclouds it currently pays. The market didn’t treat this as a distant hypothetical: CoreWeave fell 14% and Nebius fell 17% the same day, even though Meta is a paying customer of both. Investors read it correctly – a customer with enough spare capacity to become a competitor changes the pricing power in the relationship, immediately.

That doesn’t make the underlying financing thesis wrong. Neoclouds still solve a real balance-sheet problem hyperscalers don’t want to carry. But it does mean two risks that were previously theoretical, debt service outpacing revenue and hyperscaler self-competition, are now showing up in earnings reports and single-day stock moves, not just bear-case footnotes.

The Second Life of AI Infrastructure

One criticism of the neocloud model is that it depends on borrowing billions against hardware that inevitably becomes obsolete.

That assumes GPUs lose most of their value once the next generation arrives.

Reality is more nuanced. Training frontier models requires the newest hardware connected in enormous, tightly synchronized clusters. Inference is a different business. As models become more efficient and enterprises deploy them at scale, older GPU generations remain perfectly capable of serving inference workloads, fine-tuning models, and running enterprise AI applications. CoreWeave is already leaning on this: much of its long-term business model depends on running older Hopper-generation chips for years after the newest Blackwell and Rubin systems come online, selling that capacity to enterprises and startups who don’t need the bleeding edge.

The question isn’t whether older GPUs continue to have value. They almost certainly will. The question is whether that value exceeds the combined cost of financing, power, cooling, and operations.
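Here is one rough way to frame that comparison, as a minimal Python sketch. Every number in it is made up for illustration, the `gpu_hourly_margin` function is hypothetical, and it folds cooling into a PUE-style multiplier on power draw, which is a simplifying assumption of mine rather than how any particular provider does its accounting.

```python
def gpu_hourly_margin(
    rental_rate: float,         # $ per GPU-hour charged to customers
    utilization: float,         # fraction of hours actually rented (0..1)
    financing_per_hour: float,  # $ of debt service allocated per GPU-hour
    power_kw: float,            # electrical draw per GPU, in kW
    pue: float,                 # facility overhead multiplier (covers cooling)
    power_price_kwh: float,     # $ per kWh
    ops_per_hour: float,        # $ of staff and maintenance per GPU-hour
) -> float:
    """Return the margin per GPU-hour: rental revenue minus carrying costs.

    Assumes power is paid for at full draw whether or not the GPU is rented.
    """
    revenue = rental_rate * utilization
    power_and_cooling = power_kw * pue * power_price_kwh
    costs = financing_per_hour + power_and_cooling + ops_per_hour
    return revenue - costs


# Hypothetical older-generation GPU: identical costs, two rental prices.
costs = dict(
    utilization=0.70,
    financing_per_hour=0.50,
    power_kw=1.0,
    pue=1.3,
    power_price_kwh=0.08,
    ops_per_hour=0.30,
)
margin_at_2_dollars = gpu_hourly_margin(rental_rate=2.00, **costs)
margin_at_1_dollar = gpu_hourly_margin(rental_rate=1.00, **costs)
print(f"{margin_at_2_dollars:+.3f} vs {margin_at_1_dollar:+.3f} per GPU-hour")
```

With these invented inputs the same chip is profitable at one rental price and underwater at the other. The GPU never stopped being useful; what changed was whether its rental price still covered the cost of carrying it.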

That calculation will determine which neoclouds become durable infrastructure companies and which were simply vehicles for financing the first wave of AI demand.

The Next Bottleneck Isn’t GPUs

Much of the discussion around AI infrastructure still assumes GPUs are the scarce resource.

That was true. It’s becoming less true.

The next constraint is power. A company can buy thousands of the latest GPUs and still need hundreds of megawatts of reliable capacity, years of utility planning, transmission infrastructure, substations, cooling, and land that can support all of it. CoreWeave itself surpassed 1 gigawatt of active power in its Q1 2026 results, with total contracted power near 3.5 gigawatts — and getting there took years of buildout, not a single GPU purchase order.

Those assets aren’t built overnight.

This is where the hyperscalers have a structural advantage that’s much harder to replicate than access to silicon. Decades of investment in land, utility relationships, permitting, and physical infrastructure are becoming increasingly valuable as AI data centers grow larger.

The industry’s bottleneck is shifting from semiconductors to energy infrastructure. When bottlenecks move, so does leverage.

Net-Net

The AI cloud isn’t becoming multipolar because NVIDIA wanted to weaken the hyperscalers.

It became multipolar because a new class of providers solved a financing problem during the most capital-intensive phase of AI infrastructure buildout. The neoclouds absorbed risk that public cloud providers had good reasons not to carry on their own balance sheets. NVIDIA sold more GPUs. The hyperscalers gained flexibility. AI companies got access to compute when they needed it most.

That arrangement is now being tested from two directions at once, and they’re not the same threat. Power is a slow-moving constraint: it rewards whoever spent the last decade on utility relationships and permitting, which mostly still favors the hyperscalers over the neoclouds they fund. Hyperscaler self-competition is a fast-moving one – it doesn’t require new infrastructure at all, just a company like Meta deciding its existing excess capacity is worth more sold than idle. A neocloud can plan around the first. The second can reprice a company in a single trading session, as CoreWeave and Nebius just learned.

For the first phase of the AI boom, GPUs were the scarce resource, and the neoclouds won by absorbing financing risk the hyperscalers didn’t want. The next phase will be decided by two separate questions: who controls power, and whether the hyperscalers’ own spare capacity ends up competing with the companies that were built to serve them. Right now, neither question is settled — and the neoclouds don’t control the answer to either one.

Systems Over Scale: What Bridgewater Teaches Us About the Enterprise AI Plateau


I have lost count of how many client conversations this year have gone the same way. Someone tells me the model isn’t accurate enough yet for what they want to do, and the plan is to just wait for the next release. GPT whatever. Claude whatever. Gemini whatever. Something bigger and smarter is always around the corner, so why do the hard work now?

Bridgewater just published a paper that quietly pokes a hole in that thinking, and I think it deserves more attention than it’s getting outside finance circles.

They took an open-weight model, Qwen3-235B, and ran it through a serious reinforcement learning and distillation pipeline built with Thinking Machines Lab. The result was 84.7% accuracy on their internal financial evaluation suite at a fraction of the inference cost of the big commercial models. Those are impressive numbers.

But the numbers aren’t really the story.

The story is how they got there.

Everyone’s first assumption will be that Bridgewater won because they have proprietary data nobody else has. Sure, that helps. But I think the more interesting thing they built is the feedback loop around the data, not the data itself.

They didn’t have their best investment people label every single example. That would be a waste of very expensive time. Instead they trained a baseline model on cheaper vendor-labeled data first. Only when the model disagreed with the vendor label did it get routed to an experienced investment professional for a second opinion.

So the expensive human judgment gets spent exactly where it matters most, on the cases that are genuinely ambiguous, not on the easy 90 percent that any reasonable process would get right anyway.
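To make the routing idea concrete, here is a minimal Python sketch. The `Example` type, the `route_for_review` function, and the keyword-based `toy_baseline` are hypothetical stand-ins I made up for illustration; this is not the paper's pipeline, just the shape of the idea.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Example:
    text: str
    vendor_label: str  # cheap label supplied by an outside vendor


def route_for_review(
    examples: Iterable[Example],
    predict: Callable[[str], str],
) -> Tuple[List[Example], List[Example]]:
    """Split examples into (accepted, needs_expert).

    An example is accepted when the baseline model agrees with the vendor
    label; disagreements are queued for an expert's second opinion.
    """
    accepted: List[Example] = []
    needs_expert: List[Example] = []
    for ex in examples:
        if predict(ex.text) == ex.vendor_label:
            accepted.append(ex)
        else:
            needs_expert.append(ex)
    return accepted, needs_expert


def toy_baseline(text: str) -> str:
    """Hypothetical stand-in for a trained baseline classifier."""
    return "relevant" if "rate" in text.lower() else "irrelevant"


batch = [
    Example("Central bank holds rates steady", "relevant"),
    Example("Company picnic scheduled for Friday", "irrelevant"),
    Example("Mortgage rate chatter on social media", "irrelevant"),
]
accepted, needs_expert = route_for_review(batch, toy_baseline)
print(len(accepted), "accepted;", len(needs_expert), "sent to an expert")
```

In this toy batch only the third example reaches a human, because it is the only one where the baseline and the vendor disagree.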

Then on the training side, they didn’t just keep distilling from the same fixed teacher model forever. The student gets promoted to teacher only once it proves it’s actually better on validation. That’s a small design choice that I think matters a lot. It keeps the whole system improving instead of plateauing around whatever the original teacher model was capable of.
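The promotion rule is simple enough to sketch in a few lines of Python. The `iterate_distillation` function and its `train_student` and `evaluate` callables are hypothetical names of mine, and the toy run below treats a "model" as nothing more than its validation score, so read it as an illustration of the gating logic rather than a training recipe.

```python
from typing import Callable, List, Tuple, TypeVar

Model = TypeVar("Model")


def iterate_distillation(
    teacher: Model,
    train_student: Callable[[Model], Model],
    evaluate: Callable[[Model], float],
    rounds: int = 3,
) -> Tuple[Model, float, List[Tuple[float, bool]]]:
    """Distill repeatedly, promoting a student only if it beats the teacher.

    Returns the final teacher, its validation score, and a per-round history
    of (student_score, promoted) pairs.
    """
    teacher_score = evaluate(teacher)
    history: List[Tuple[float, bool]] = []
    for _ in range(rounds):
        student = train_student(teacher)
        student_score = evaluate(student)
        promoted = student_score > teacher_score
        if promoted:
            # The student becomes the teacher for the next round.
            teacher, teacher_score = student, student_score
        history.append((student_score, promoted))
    return teacher, teacher_score, history


# Toy stand-ins: a "model" is just its validation accuracy, and each training
# round shifts it by a fixed, made-up amount.
deltas = iter([0.03, -0.02, 0.01])
final_teacher, final_score, history = iterate_distillation(
    teacher=0.70,
    train_student=lambda t: t + next(deltas),
    evaluate=lambda m: m,
)
print([promoted for _, promoted in history])
```

The second round produces a worse student, so the teacher stays put. That validation gate is what keeps a bad round from dragging the whole system backwards.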

If I had to boil the lesson down to one sentence, it’s this.

Good enterprise AI usually comes from a better feedback loop, not from more data or a bigger model.

I also think that’s why so many enterprise AI projects seem to plateau around the same accuracy range. Once you’ve exhausted prompt engineering and upgraded to the latest foundation model, the next gains usually don’t come from a smarter model. They come from better supervision, better routing, better feedback, and better systems.

Now, a few things I’d push back on if I were reviewing this paper with a client.

The cost savings headline needs a footnote. A 235 billion parameter model doesn’t run itself. You still need GPUs, batching, latency tuning, and people who know how to keep the thing running. If you’re processing enormous volumes every day, owning that infrastructure can absolutely pay off. If your workload is lumpy or unpredictable, a commercial API that turns fixed infrastructure cost into a variable line item might still be the smarter bet.

This isn’t a universal answer. It depends entirely on how much you actually use the thing.
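A back-of-the-envelope version of that trade-off, sketched in Python. All prices here are invented for illustration, and the two function names are hypothetical; the point is only that a fixed monthly cost has to be earned back through per-token savings.

```python
def break_even_tokens_millions(
    fixed_monthly_cost: float,
    api_price_per_million: float,
    self_hosted_price_per_million: float,
) -> float:
    """Monthly volume (in millions of tokens) where both options cost the same."""
    saving = api_price_per_million - self_hosted_price_per_million
    if saving <= 0:
        raise ValueError("self-hosting never breaks even at these prices")
    return fixed_monthly_cost / saving


def cheaper_option(
    monthly_tokens_millions: float,
    fixed_monthly_cost: float,
    api_price_per_million: float,
    self_hosted_price_per_million: float,
) -> str:
    """Return 'api' or 'self_hosted' for a given monthly volume."""
    api_cost = monthly_tokens_millions * api_price_per_million
    self_cost = (
        fixed_monthly_cost
        + monthly_tokens_millions * self_hosted_price_per_million
    )
    return "api" if api_cost <= self_cost else "self_hosted"


# Made-up inputs: $60,000/month of fixed infrastructure and staff,
# $10 per million tokens via API, $2 per million marginal cost self-hosted.
prices = dict(
    fixed_monthly_cost=60_000.0,
    api_price_per_million=10.0,
    self_hosted_price_per_million=2.0,
)
break_even = break_even_tokens_millions(**prices)
low_volume = cheaper_option(1_000, **prices)
high_volume = cheaper_option(20_000, **prices)
print(break_even, low_volume, high_volume)
```

Under these assumed prices the crossover sits at 7,500 million tokens a month. Below that, the API wins. Above it, owning the stack does.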

I’d also gently push back on the framing of “replicating expert judgment.” Many of the evaluated tasks focus on document segmentation, filtering, classification, and finding the needle in a haystack of financial text. That’s genuinely useful work and it saves analysts a ton of time. But it is not the same as a model independently coming up with a macro thesis or an investment idea nobody has had yet.

Parsing information well and synthesizing new insight are two different skills. I’d want any vendor or internal team to be honest about which one they’re actually selling me.

And specialization has a cost that doesn’t show up in the benchmark table. A model tuned tightly to today’s financial reporting formats and today’s regulatory language will need care and feeding when those things change, and they always change.

That’s not a knock on the approach. It’s just the maintenance bill nobody talks about until the invoice shows up.

A lot of IT organizations aren’t set up yet to treat retraining and re-distillation as an ongoing operational cost the same way they’d treat patching a production system.

Here’s where I land on all this.

The Bridgewater paper isn’t proof that the big frontier models are becoming irrelevant. It’s evidence that enterprise AI is becoming an architectural discipline.

The organizations that win won’t necessarily be the ones with access to the biggest models. They’ll be the ones that build the best systems around them.

Use specialized models for the high-volume, close-to-the-data work. Save expensive frontier reasoning for the small slice of problems that are genuinely hard and ambiguous.
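In code, that split can be as simple as a confidence gate. This is a minimal Python sketch under my own assumptions: `tiered_answer`, the 0.8 threshold, and both toy models are hypothetical, and a real system would need a better escalation signal than a single confidence number.

```python
from typing import Callable, Tuple


def tiered_answer(
    query: str,
    specialist: Callable[[str], Tuple[str, float]],
    frontier: Callable[[str], str],
    min_confidence: float = 0.8,
) -> Tuple[str, str]:
    """Answer with the cheap specialist when it is confident, else escalate.

    Returns (answer, tier) where tier is 'specialist' or 'frontier'.
    """
    answer, confidence = specialist(query)
    if confidence >= min_confidence:
        return answer, "specialist"
    return frontier(query), "frontier"


def toy_specialist(query: str) -> Tuple[str, float]:
    """Hypothetical fine-tuned classifier: confident only on filing snippets."""
    if "10-K" in query:
        return "risk_factors", 0.95
    return "unknown", 0.40


def toy_frontier(query: str) -> str:
    """Hypothetical stand-in for an expensive frontier-model call."""
    return f"escalated: {query}"


routine = tiered_answer("Classify this 10-K paragraph", toy_specialist, toy_frontier)
hard = tiered_answer("What breaks if rates stay high?", toy_specialist, toy_frontier)
print(routine, hard)
```

The routine classification never touches the expensive model, and the open-ended question goes straight to it.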

That’s a tiered architecture. It’s a lot more work than pointing everything at one API. But it’s also a lot harder for a competitor to copy, and that’s usually the kind of advantage worth building.