Theo argues that most people underperform with AI coding tools because they use them at the wrong stage, with too much context, and with too much configuration. The core message is that AI works best when you give it a well-scoped problem, enough but not excessive context, a sane environment, and small feedback loops that correct mistakes in the codebase and in the instructions.
Theo’s thesis is that the disappointing results many developers get from AI coding tools usually come from user error, not tool uselessness. He says the biggest mistakes are picking problems the model is poorly suited for, handing it too much codebase context, relying on outdated experiences with older models, and overloading the agent with MCPs, skills, plugins, and other scaffolding. In his framing, AI coding becomes much more useful when treated like a very capable new engineer: give it a problem you already understand, give it the minimum relevant context, and maintain the environment and instructions so it can work cleanly.
He spends most of the video on “selecting the right problem.” His point is that people often ask AI to solve hard, poorly understood bugs only after they have already exhausted everything else. Theo argues this is backwards. …
In the near term, the actionable read is to use current AI coding tools only on tightly scoped tasks with strong local validation. The main tactical risk is overloading the agent with noise, history, or extra tooling and then blaming the model for a setup problem.
Over the next several weeks or months, the likely path is better results from a cleaner workflow rather than from piling on MCPs or orchestration. The setup improves if teams build a small library of real repros, update their agent notes, and keep the environment reproducible.
The structural thesis is that AI coding value compounds when teams learn to manage context, documentation, and validation as part of the development system. In that regime, the edge comes from workflow discipline and codebase hygiene more than from maximal tooling.
Giving AI models too much context (for example, the entire codebase) causes performance to degrade significantly.
The speaker argues that because LLMs work via next-token prediction from context, irrelevant context dilutes the signal and lowers the probability of a correct completion. He points to benchmark data showing success rates falling from ~100% to under 60% as context grows.
Creating reproducible benchmark tests from real-world problems that AI tools cannot solve is extremely valuable for evaluating new models.
The speaker explains that if you freeze the code state containing a problem an agent couldn't fix, together with the information needed to solve it and a way to validate the correct solution, you have a real-world reproduction test that is far more useful than standard benchmarks.
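One way to picture such a reproduction test is as a small record that pins the three things the speaker lists: the frozen code state, the information needed, and a validation step. The sketch below is only an illustration under stated assumptions. The names `ReproCase`, `is_solved`, and `pass_rate` are hypothetical and do not come from the video, and the validation step is assumed to be a command that exits with status 0 only when the problem is fixed.

```python
import subprocess
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReproCase:
    """One frozen real-world problem that an agent previously failed to solve.

    Hypothetical structure for illustration; not taken from the video.
    """
    name: str
    commit: str   # code state frozen at the point of failure
    prompt: str   # the task as it was given to the agent
    context_notes: list[str] = field(default_factory=list)  # info needed to solve it
    validate_cmd: list[str] = field(default_factory=list)   # assumed to exit 0 only when fixed


def is_solved(case: ReproCase, workdir: str) -> bool:
    """Run the validation command in a checkout the agent has already edited."""
    result = subprocess.run(case.validate_cmd, cwd=workdir, capture_output=True)
    return result.returncode == 0


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of repro cases that a given model solved (0.0 if there are none)."""
    return sum(results.values()) / len(results) if results else 0.0
```

Re-running the same set of cases against each new model gives a pass rate grounded in problems the team actually hit, which is the comparison the speaker has in mind.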
AI tools can reliably solve coding problems when given the same context and information a human engineer would need to solve them.
The speaker asserts that when you give an AI agent the same info that led you to a solution (logs, line references, blog posts), it will likely solve it.
What is good context to provide to AI coding models when asking them to solve a problem?
The speaker explains that less is generally best: describe what is wrong and what needs to be done, and trust the model to find what it needs. He contrasts Codex (which reads all potentially relevant files and is slower, but makes precise changes) with Opus (which is more eager to edit immediately), and recommends spending more time upfront with Opus specifying where the problem is and what not to touch. He also describes how CLAUDE.md files can steer the model in the right direction, giving the example of updating the file to stop the model from running dev servers and to have it regenerate types after schema changes.
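The "less is best" advice can be sketched as a prompt that carries only the problem, the goal, and, for a model that is eager to edit, an optional location hint and a list of files to leave alone. The function name `build_task_prompt` and its field labels are hypothetical and serve only to illustrate the shape of such a prompt.

```python
from typing import Iterable, Optional


def build_task_prompt(
    problem: str,
    goal: str,
    where: Optional[str] = None,
    do_not_touch: Iterable[str] = (),
) -> str:
    """Assemble a short task prompt: what is wrong, what to do, and optional guard rails.

    Hypothetical helper for illustration; the labels are not a format any tool requires.
    """
    lines = [f"Problem: {problem}", f"Goal: {goal}"]
    if where:
        # Location hint, most useful for models that start editing right away.
        lines.append(f"Likely location: {where}")
    for path in do_not_touch:
        lines.append(f"Do not modify: {path}")
    return "\n".join(lines)
```

Leaving `where` and `do_not_touch` empty matches the case where the model is trusted to find the relevant files on its own.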
How do you handle the issue of AI agents running dev servers when you already have one running?
The speaker updated the configuration file's description of the pnpm scripts to say "don't use this unless otherwise told to," which stopped the model from running dev commands unprompted. They also added a "pnpm generate" command to regenerate Convex types after schema changes, so the model wouldn't get confused by stale type errors.
Does constantly updating the config file feel like babysitting the AI?
The speaker disagrees, saying it took only 5 seconds and has already saved hours. They compare it to an experience at Twitch where a senior engineer fixed docs after seeing a newcomer's mistake, noting that AI doesn't remember mistakes so you have to build memory for it by documenting so the next run doesn't repeat the same errors.
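The idea of building memory for the agent can be sketched as a tiny helper that appends a lesson to the agent's notes file, skipping lessons that are already recorded. This is an illustration under assumptions: the helper name `remember` is hypothetical, and the notes file is assumed to be a plain list of "- " bullet lines, which is not a format the speaker specifies.

```python
from pathlib import Path


def remember(notes_file: Path, lesson: str) -> bool:
    """Append a lesson to the agent notes file unless it is already recorded.

    Returns True if the file was changed. Hypothetical helper for illustration.
    """
    existing = notes_file.read_text(encoding="utf-8") if notes_file.exists() else ""
    entry = f"- {lesson.strip()}"
    if entry in existing.splitlines():
        return False  # already documented, nothing to do
    # Make sure the new entry starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with notes_file.open("a", encoding="utf-8") as handle:
        handle.write(f"{separator}{entry}\n")
    return True
```

For example, `remember(Path("CLAUDE.md"), "Run pnpm generate after schema changes.")` would record the second fix described above, so the next run starts with that instruction already in place.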