“Engineers are becoming sorcerers” | The future of software development with OpenAI’s Sherwin Wu
Key Takeaways
- At OpenAI, 95% of engineers use Codex daily, with 100% of PRs reviewed by it, leading to a widening productivity gap between AI power users and others.
- The role of the software engineer is shifting from writing code to managing fleets of AI agents, akin to wizards casting spells, a trend predicted in programming literature decades ago.
- Companies often see negative ROI on AI deployments because bottom-up adoption is missing; the fix is to empower excited internal evangelists rather than rely solely on top-down mandates.
- OpenAI views itself fundamentally as an ecosystem platform company, committed to fostering its ecosystem via the API rather than squashing startup ideas, driven by its mission to spread AGI benefits to all of humanity.
- Startups should focus on building products that customers genuinely love, as failure is overwhelmingly due to lack of customer resonance, not being crushed by large labs like OpenAI.
- The next two to three years represent an exceptionally exciting and energizing period in tech and the startup world, and individuals are encouraged to actively engage with and learn the new AI technologies rather than letting the wave pass them by.
Segments
AI Adoption Metrics at OpenAI
(00:00:00)
- Key Takeaway: Engineers using Codex open 70% more PRs, and this gap widens as they become more proficient.
- Summary: 95% of OpenAI engineers use Codex daily, and 100% of PRs are reviewed by it. Engineers who use Codex frequently open 70% more pull requests than those who do not. This indicates a significant productivity multiplier for proficient AI users.
Future Engineer Role Metaphor
(00:06:54)
- Key Takeaway: The modern software engineer role is evolving into that of a ‘wizard’ managing 10-20 parallel AI agents, drawing parallels to the Sorcerer’s Apprentice.
- Summary: The job of a software engineer is changing rapidly, shifting toward managing AI agents rather than writing boilerplate code. This high-leverage activity requires skill to prevent the agents from ‘going off the rails,’ similar to the Sorcerer’s Apprentice scenario. Engineers have a rare 12-24 month window to define these new standards.
Agent Stress and Contextual Failures
(00:12:29)
- Key Takeaway: Failures in AI agents often stem from underspecified context, requiring engineers to encode tribal knowledge into the codebase to guide the model.
- Summary: Engineers feel stress when agents fail, especially when they lack an ‘escape hatch’ to intervene manually. A team experimenting with a 100% Codex-written codebase found that agent failure is usually due to insufficient context in comments or repository structure. Solving these issues requires encoding tribal knowledge into the repository's resources.
Code Review Automation Impact
(00:15:15)
- Key Takeaway: Codex reviewing 100% of PRs reduces review time from 10-15 minutes to 2-3 minutes, making human review less burdensome.
- Summary: Codex is highly adept at code review, proactively suggesting improvements and changes. For small PRs, human reviewers can pay far less attention and trust Codex as a smart second pair of eyes. This automation, combined with automated linting and CI fixes, collapses the friction between writing code and deploying it.
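As a rough illustration of what the review-time figures above imply, the sketch below computes the percentage reduction at the two extremes of the quoted ranges. Pairing 10 minutes with 3 and 15 minutes with 2 is an assumption made here to bound the range; the episode summary gives only the two ranges, and the function name is hypothetical.

```python
def percent_reduction(before_min: float, after_min: float) -> float:
    """Percentage of review time saved when going from before_min to after_min."""
    return 100 * (before_min - after_min) / before_min

# Smallest saving: the shortest "before" paired with the longest "after".
smallest = percent_reduction(10, 3)
# Largest saving: the longest "before" paired with the shortest "after".
largest = percent_reduction(15, 2)

print(f"Review time drops by roughly {round(smallest)}% to {round(largest)}%")
```

Under these assumptions the saving per reviewed PR falls somewhere between about 70% and 87%.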
Managerial Role Evolution and Top Performers
(00:19:35)
- Key Takeaway: Engineering managers will likely manage larger teams by leveraging AI for research and context gathering, while focusing disproportionately on empowering top performers.
- Summary: Managers are using tools like ChatGPT for high-leverage tasks such as performance review research, synthesizing GitHub and Notion data. The productivity gap between top performers using AI and everyone else is widening, so managers need to spend the majority of their time unblocking and empowering these high-agency individuals. AI may allow managers to oversee teams larger than the current standard of six to eight engineers.
Second-Order Effects of One-Person Startups
(00:24:16)
- Key Takeaway: The possibility of one-person billion-dollar startups will likely trigger a golden age of B2B SaaS by enabling hundreds of smaller startups building bespoke support software.
- Summary: The high leverage provided by AI enables one-person billion-dollar startups, but the second-order effect is an explosion in smaller startups building specialized software. This could lead to a boom in B2B SaaS as micro-companies outsource needs to other highly leveraged, small firms. Venture-scale returns might shrink as more individuals build $10M-$50M businesses.
AI Deployment Challenges and ROI
(00:37:54)
- Key Takeaway: Negative AI ROI often results from top-down mandates without bottom-up adoption; companies need to form internal ‘tiger teams’ of excited technical evangelists.
- Summary: Many AI deployments fail because they are divorced from the actual work and lack buy-in from employees who don’t understand the technology. Successful adoption requires both executive support and grassroots excitement from employees, often technical-adjacent staff, who can evangelize best practices. These internal champions should be empowered to spread knowledge across the organization.
Building for Future Model Capabilities
(00:47:58)
- Key Takeaway: Builders must design products for where models are going (e.g., multi-hour coherence) rather than their current state to avoid optimizing for local maxima.
- Summary: The rapid evolution of models means current scaffolding, like vector stores or specific agent frameworks, can be quickly superseded. Companies should build products anticipating future capabilities, such as models handling multi-hour tasks coherently within 12-18 months. Improvements in native multimodal audio processing are also expected to unlock significant business applications.
Startup Strategy vs. OpenAI
(00:57:24)
- Key Takeaway: Startups should prioritize building products customers love over worrying about OpenAI preempting their idea, as market success hinges on resonance, not competitive avoidance.
- Summary: The primary reason startups fail is a lack of customer resonance, not being squashed by large AI labs like OpenAI. The market opportunity created by AI is massive enough that VCs are investing in competitive companies, validating the potential for many successful ventures. Building something people truly love guarantees a space in the market.
OpenAI’s Platform Philosophy
(00:59:16)
- Key Takeaway: OpenAI is committed to being an ecosystem platform company, ensuring all models released in their products are also available via the API to support external builders.
- Summary: OpenAI views itself as an ecosystem platform, reinforcing this by releasing every model into the API without holding back features. This approach is rooted in their mission to spread AGI benefits to all of humanity, as they cannot reach everyone alone. The philosophy is that a rising tide lifts all boats, which has significantly grown the API business.
ChatGPT App Store Context
(01:02:12)
- Key Takeaway: The ChatGPT App Store initiative, though managed by a separate team, aligns with OpenAI’s platform strategy to allow external builders to leverage the massive ChatGPT user base.
- Summary: The ChatGPT App Store is another expression of OpenAI’s mission to empower others to build for their massive user base. ChatGPT currently boasts 800 million weekly active users, an unprecedented asset. Allowing external companies to build for this audience is seen as mutually beneficial, ultimately helping expand that user group.
Democratization of AI Access
(01:04:12)
- Key Takeaway: OpenAI actively works to raise the floor for AI capability globally by ensuring the free version of ChatGPT continuously improves and remains accessible to everyone.
- Summary: The free version of ChatGPT is significantly more powerful now than it was in 2022, effectively raising the capability floor across the world. This democratization is a core part of OpenAI’s mission, ensuring that even those not paying premium prices receive access to powerful AI. For a relatively low monthly subscription, users can access models comparable to what billionaires use.
OpenAI API Offerings
(01:05:30)
- Key Takeaway: The OpenAI API provides layered abstractions, from the low-level, unopinionated Responses API for long-running agents to the higher-level Agents SDK and AgentKit UI components.
- Summary: The fundamental API endpoint allows sampling from models, with the Responses API being optimized for building long-running agents by handling extended processing time. The Agents SDK provides scaffolding to build traditional infinite-loop agents with sub-agent delegation capabilities. AgentKit and widgets offer UI components to easily deploy beautiful interfaces on top of these agent workflows.
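To make the "infinite-loop agent with sub-agent delegation" pattern mentioned above more concrete, here is a minimal sketch in plain Python. It does not use the OpenAI Agents SDK or any real API: the `Agent` and `Step` classes, the policy functions, and the turn cap are all hypothetical stand-ins, and the model call is replaced by a hard-coded stub so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One decision returned by the (stubbed) model."""
    action: str        # "delegate" or "finish"
    target: str = ""   # sub-agent name when action == "delegate"
    payload: str = ""  # task text for the sub-agent, or the final answer


# Hypothetical policy type: reads the transcript so far, returns the next step.
Policy = Callable[[List[str]], Step]


@dataclass
class Agent:
    name: str
    policy: Policy
    sub_agents: Dict[str, "Agent"] = field(default_factory=dict)
    max_turns: int = 10  # safety cap so the loop always terminates

    def run(self, task: str) -> str:
        transcript = [f"task: {task}"]
        for _ in range(self.max_turns):
            step = self.policy(transcript)
            if step.action == "finish":
                return step.payload
            if step.action == "delegate":
                # Hand the sub-task to a sub-agent and record its result.
                result = self.sub_agents[step.target].run(step.payload)
                transcript.append(f"{step.target}: {result}")
        raise RuntimeError(f"{self.name} exceeded {self.max_turns} turns")


# Stub policies standing in for model calls (assumptions for illustration).
def researcher_policy(transcript: List[str]) -> Step:
    task = transcript[0][len("task: "):]
    return Step("finish", payload=f"notes on {task}")


def lead_policy(transcript: List[str]) -> Step:
    if len(transcript) == 1:  # nothing delegated yet
        return Step("delegate", target="researcher", payload="flaky CI test")
    return Step("finish", payload=f"summary of [{transcript[-1]}]")


researcher = Agent("researcher", researcher_policy)
lead = Agent("lead", lead_policy, sub_agents={"researcher": researcher})
result = lead.run("fix the build")
print(result)
```

The turn cap is a design choice of this sketch, a simple guard against an agent that never decides to finish; a real framework would offer its own controls.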
Advice for the Next Few Years
(01:08:26)
- Key Takeaway: The next two to three years in tech will be uniquely exciting, and individuals should actively engage with AI tools rather than passively observing the rapid industry changes.
- Summary: The current period is one of the most exciting in tech, and people should not take this wave of innovation for granted before it becomes more incremental. Engagement means leaning in, building tools, and understanding the current limitations of AI to better track its improvements. Simply using the tools and getting familiar with the technology is more important than trying to absorb every piece of breaking news.
Lightning Round: Book Recommendations
(01:11:46)
- Key Takeaway: Recommended books include the sci-fi novel ‘There Is No Antimemetics Division’ and nonfiction works ‘Breakneck’ and ‘Apple in China’ for insights on US-China dynamics.
- Summary: The fiction recommendation, ‘There Is No Antimemetics Division,’ is a smart, creative, and unintentionally hilarious science fiction book about an agency fighting memory-erasing phenomena. Nonfiction recommendations focus on US-China relations: ‘Breakneck’ contrasts the US’s lawyerly society with China’s engineering society, and Patrick McGee’s ‘Apple in China’ details Apple’s relationship with China.
Lightning Round: Unexpected House Price Factors
(01:16:23)
- Key Takeaway: When modeling house prices, high-voltage power lines, complex floor plans, and overall curb appeal/front door quality were surprisingly significant variables.
- Summary: Proximity to high-voltage power lines significantly impacts a house’s price due to negative externalities such as buzzing and families’ concerns about living close to them. Quantifying the quality of a floor plan proved extremely difficult but was a major factor in sales success or failure. Curb appeal and the front door are underrated: they heavily influence a buyer’s initial perception.