Theo argues that 2025 was the year AI code tooling crossed from novelty to daily infrastructure: reasoning models improved, agents became practical, coding CLIs like Claude Code became central, and developers increasingly spent more time reviewing than typing code. He frames this as a durable shift in how software is built, while also warning about security, cost, and overreliance on unsafe automation.
Theo’s core thesis is that 2025 was the year programming changed in a lasting way because AI models, tool use, and harnesses all improved at once. He says the biggest shift was not simply smarter models, but the arrival of reasoning, agentic workflows, and coding CLIs that let models write, run, inspect, and iterate on code in a loop. In his view, this made it feel normal to spend less time inside an editor and more time reviewing, prompting, and steering code generation. He starts with the reasoning-model wave, pointing to DeepSeek R1 as the catalyst that made reasoning visible and widespread. He says OpenAI, Anthropic, and others followed with reasoning variants and control dials, and that reasoning became especially useful once combined with tools. …
Tactically, the near-term trade is continued momentum in AI coding tools and agent workflows, with the market still rewarding products that make models more autonomous and easier to use. The main immediate risk is a high-profile safety failure or a slowdown in model improvement that dents the current enthusiasm.
Over the next few months, the base case is that coding agents keep taking share from manual workflows as teams trust them on longer and more complex tasks. If success rates and reviewability keep improving, the category should consolidate around the most useful harnesses rather than the flashiest demos.
Structurally, this points to a regime where software is increasingly produced by model-assisted automation and humans shift toward supervision, review, and systems design. The durable winners are likely to be the model platforms and workflow layers that control context, tools, and verification.
The improvements in AI coding models and tooling have shifted expectations so that tasks that were impossible 3 months ago are now routine.
The speaker describes how rapidly model capabilities have advanced, citing personal experience with early GPT-5 in Cursor and noting that most labs now have models that can handle previously impossible tasks.
Reasoning models (like GPT-5 thinking) have solved the gullibility problem that previously prevented agents from working meaningfully.
The speaker notes that Simon's "gullibility problem" (LLMs believing anything you tell them) was the roadblock, and that reasoning models address it.
Coding agents (CLI-based AI coding tools) are the most impactful category of AI agents in 2025, larger than deep research or AI search patterns.
Speaker contrasts coding agents with deep research/search patterns and asserts coding agents are the more impactful event of 2025.
How does Simon define the gullibility problem with LLMs?
Simon defines the gullibility problem as LLMs believing anything you tell them, so any system that attempts to make meaningful decisions on your behalf runs into the roadblock of not being able to distinguish truth from fiction. The speaker agrees reasoning would help with that.
Why did Anthropic jump from Claude 3.5 to 3.7?
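Anthropic shipped an update to Claude 3.5 in October but kept the same name, and everyone unofficially called that update 3.6. Because the 3.6 label was already in informal use, the next release was named 3.7, so Anthropic effectively burned a version number through its naming choices.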
What was Boris's insight about building Claude Code relative to model improvements?
Boris understood from the scaling laws seen internally how quickly models improve. He pushed the team to build not for the model of today but for the model six months from now. For a long time Claude Code wasn't a great product and was used to write only about 10% of code. When Sonnet 4 and Opus 4 were released in May, the product suddenly worked and usage soared; roughly 80-90% of Claude Code is now written by Claude Code itself.