CNBC frames the AI trade as shifting from “use the best model for everything” to model routing: send hard tasks to premium models and routine work to cheaper ones. The two guests, Cognition CEO Scott Wu and Cisco’s Jeetu Patel, argue that rising token bills, enterprise budget pressure, and the proliferation of capable models are making routing inevitable. They also warn that this could compress premium-model economics and force AI vendors to prove ROI much more directly.
This CNBC segment argues that the AI business is entering a new phase where buyers are no longer willing to route every task through the most expensive frontier model. The core thesis is that corporate customers are starting to treat AI like a cost-optimized workflow: premium models for the hardest, highest-stakes work, and smaller or cheaper models for boilerplate tasks. The segment opens with the claim that the old “one best model for everything” assumption is breaking as bills arrive and enterprise spending becomes visible. It highlights OpenRouter’s reported five-fold volume increase over six months and frames that as evidence that model routing is moving from niche optimization to mainstream behavior. The first interview, with Cognition CEO Scott Wu, centers on Cognition’s “AI productivity guarantee,” which CNBC repeatedly describes as an ROI guarantee. …
Near term, the actionable setup is enterprise AI cost pressure: buyers are likely to test routing layers, cheaper models, and ROI guarantees before expanding frontier-model usage further. That favors middleware and infrastructure tools more than pure premium-model pricing power.
Over the next few months, the base case is broader model-mixing across enterprise workflows, with frontier models reserved for the hardest tasks and smaller or local models handling routine work. The key confirmation signal is whether routing becomes a default procurement and architecture choice rather than a pilot feature.
Structurally, AI looks less like a single-model monopoly and more like a distributed stack of specialized models, orchestration layers, and infrastructure. If that regime holds, the durable winners may be the control planes and infrastructure vendors, while frontier labs face slower monetization per task.
The AI trade is shifting from “use the best model for everything” to task-based model routing.
The segment’s main thesis is that companies now split hard tasks and easy tasks across different models to optimize cost and quality.
Model routing is taking off because there are now many capable models with different strengths.
Wu says the market has gone from one or two usable models for agents to dozens, making price-performance optimization practical.
Cognition’s productivity guarantee is designed to align AI spend with measurable engineering output.
Wu says the firm measures value by capacity and shipped work, not by token counts or agent activity.
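To make the task-based routing idea concrete, here is a minimal sketch of a router that sends each task to the cheapest model trusted at that task's difficulty tier. Everything in it is an illustrative assumption rather than something described in the segment: the model names, the per-1k-token prices, the three difficulty tiers, and the helper names `route` and `batch_cost` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical model label
    cost_per_1k_tokens: float  # hypothetical price, arbitrary currency units
    max_difficulty: int        # hardest task tier this model is trusted with


# Hypothetical catalog: cheaper models are trusted with easier tiers only.
MODELS = [
    Model("small-local", 0.10, 1),
    Model("mid-tier", 1.00, 2),
    Model("frontier", 10.00, 3),
]


def route(difficulty: int, models=MODELS) -> Model:
    """Return the cheapest model whose trusted tier covers the task."""
    capable = [m for m in models if m.max_difficulty >= difficulty]
    if not capable:
        raise ValueError(f"no model can handle difficulty {difficulty}")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


def batch_cost(tasks, models=MODELS) -> float:
    """Total cost of a batch of (difficulty, tokens) tasks after routing."""
    return sum(
        route(difficulty, models).cost_per_1k_tokens * tokens / 1000
        for difficulty, tokens in tasks
    )


if __name__ == "__main__":
    # Assumed workload: eight routine tasks and two hard ones, 1,000 tokens each.
    tasks = [(1, 1000)] * 8 + [(3, 1000)] * 2
    frontier_only = [m for m in MODELS if m.name == "frontier"]
    print("routed cost:       ", batch_cost(tasks))
    print("frontier-only cost:", batch_cost(tasks, frontier_only))
```

Under these made-up prices, the routed batch costs far less than sending every task to the most expensive model, which is the cost pressure the segment describes. A real router would also need some way to estimate task difficulty and to check output quality, which this sketch leaves out.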
Why will enterprises adopt model routing and cheaper models now?
Wu says there are now many strong models with different strengths and weaknesses, so companies no longer need to force everything through one expensive model. He also points to mass automation: when tasks are triggered at high volume, firms need models that can handle many routine jobs efficiently.
Did customer pressure drive the decision to offer the guarantee?
Wu says it was something they wanted to get ahead of with customers. He describes both cautious buyers who worry about blowing budgets and newer users who need help understanding the productivity value they are getting.
How do you prevent customers from gaming the guarantee?
Wu says they measure productivity by output rather than activity, similar to evaluating a human engineer by what gets shipped. The company uses its own model and systems, trained on many customer tasks, to verify whether real engineering capacity and shipped work increased.