Theo argues that front-end design quality from current frontier models is highly prompt- and harness-dependent, and that Anthropic’s Opus 4.5 becomes the best design model once paired with the front-end design skill. He compares default vs skill-enabled runs across Opus, GPT-5.2, Gemini 3 Pro, and Kimi K2.5, concluding Opus + skill is the strongest overall because it is more steerable and improves more from iteration.
The core thesis is straightforward: for building polished front-end marketing pages, model choice matters less than the combination of model plus the right skill/harness, and Opus 4.5 paired with the front-end design skill is the best overall option in his testing. He starts by saying all of the major frontier models are “really, really good” in general, but that they differ in important ways, especially in tool use and design sensibility. His rough ranking for design without the skill puts Opus 4.5 at the bottom, then GPT-5.2, then Gemini 3 Pro. The surprise is that Opus 4.5 becomes dramatically better once the front-end design skill is enabled, producing the most usable and aesthetically coherent outputs of the set. A lot of the video is a side-by-side live benchmark. …
In the immediate setup, the actionable edge is to use the best harness/skill combo rather than trusting base-model reputation alone; Opus 4.5 plus the front-end design skill looks strongest for near-term design work. Gemini may still produce the flashiest first draft, but it is a higher-risk choice if you need reliable iteration right away.
Over the next several weeks or months, the winner should be the system that can preserve design intent through multiple revisions and respond to feedback, not just produce one pretty screenshot. If Gemini or GPT improve their follow-through, the ranking could change quickly, but for now Opus appears to have the better feedback loop.
Structurally, the video argues that front-end generation is evolving toward reusable skills and harnesses that unlock latent model capability. The durable advantage will belong to the stack that best combines taste, controllability, and revision quality, rather than to the model with the single strongest default aesthetic.
Current frontier models are all very good overall, but differ in tool behavior and design sensibility.
He opens by saying the models are strong but have distinct strengths and weaknesses, especially in tool use.
Gemini 3 Pro is notably poor at tool behavior in this context.
He explicitly says Gemini does not behave well when using tools.
Without the design skill, Opus 4.5 is the weakest of the three major models he compares.
His default ranking puts Opus at the bottom for design.
What does 'never use generic AI generated aesthetics' mean in practice?
The speaker shows examples of what generic AI aesthetics look like: purple gradients on white backgrounds, predictable layouts, noise textures, the same general shapes across designs, editorial/newsy directions, and cookie-cutter template styles. He contrasts this with the more varied and intentional designs that emerge when a model follows the skill's guidance. He also points to Gemini 3 Pro, which even without the skill produces a completely different aesthetic that 'looks good' and is 'really cool and nice.'
What's the point of the front-end design skill document? How does it affect outputs?
The speaker explains that the document is 'built to steer the model towards better design' and contains rules like avoiding generic AI aesthetics, interpreting creatively, making unexpected choices, and never converging on common choices. He was initially skeptical since it's 'literally just markdown' but shows that it does substantially change model outputs. He notes GPT 5 may have used the skill despite being told not to, and compares outputs with and without it across models.
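The with/without-skill comparison described above can be sketched as a small run matrix. This is only an illustration under stated assumptions: the skill text below is a paraphrase of the rules the summary mentions, not the actual skill document, and `build_prompt`, `build_runs`, and `Run` are hypothetical names rather than anything the speaker's harness is known to use. No model is called; the sketch only shows that the skill is plain markdown placed ahead of the task.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional

# Illustrative paraphrase of rules mentioned in the summary.
# This is NOT the real skill document.
SKILL_MD = """\
# Front-end design skill (illustrative paraphrase)
- Avoid generic AI-generated aesthetics.
- Interpret the brief creatively.
- Make unexpected choices.
- Never converge on common choices.
"""

# Models named in the summary's comparison.
MODELS = ["Opus 4.5", "GPT-5.2", "Gemini 3 Pro", "Kimi K2.5"]


@dataclass(frozen=True)
class Run:
    """One cell of the comparison: a model, skill on or off, and the prompt."""
    model: str
    use_skill: bool
    prompt: str


def build_prompt(task: str, skill_md: Optional[str]) -> str:
    """Return the task alone, or the skill markdown followed by the task."""
    if skill_md is None:
        return task
    return f"{skill_md.rstrip()}\n\n---\n\n{task}"


def build_runs(task: str) -> List[Run]:
    """Build every (model, skill on/off) combination for one task."""
    return [
        Run(model, use_skill, build_prompt(task, SKILL_MD if use_skill else None))
        for model, use_skill in product(MODELS, [False, True])
    ]


if __name__ == "__main__":
    for run in build_runs("Build a marketing landing page."):
        label = "skill" if run.use_skill else "default"
        print(f"{run.model:<14} {label:<8} {len(run.prompt)} chars")
```

Holding the task fixed and varying only the model and the presence of the skill keeps any difference in output attributable to those two factors, which is the shape of the comparison the summary describes.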
What did the default Opus 4.5 designs look like? Were any of them decent?
The speaker shows several Opus 4.5 designs and considers them all awful, not even good starting points. He points out issues like a box sitting behind other elements, barely visible text, a terrible purple/blue gradient, a bad title treatment, and outputs that all share very similar layouts. One had a 'noise texture' background, which he especially dislikes.