Key Takeaways
- Building AI products fundamentally differs from traditional software due to non-determinism (unpredictable user input and LLM output) and the agency-control trade-off inherent in delegating decision-making to AI systems.
- Successful AI product development requires an iterative, step-by-step approach, starting with high human control and low agency (like suggestions rather than autonomous actions) to calibrate behavior before increasing autonomy.
- Companies succeeding with AI products prioritize a 'success triangle' involving vulnerable leaders who stay hands-on, an empowering culture that augments employees, and technical prowess focused on understanding workflows over just the technology itself.
- The concept of multi-agent systems is often misunderstood: supervisor-agent patterns succeed more often than peer-to-peer gossip protocols between agents, which are difficult to control and guardrail.
- Obsessing over the core business problem and design is more valuable than rote building, as implementation is becoming cheap, making judgment and taste the key differentiators.
- Pain endured through iterative development, testing trade-offs, and understanding non-negotiable customer needs creates a competitive moat in the rapidly evolving AI product space.
Segments
AI vs Traditional Software Differences
(00:00:00)
- Key Takeaway: AI products introduce non-determinism in both user input and LLM output, fundamentally altering product development.
- Summary: Building AI products differs from traditional software because developers must account for non-determinism in how users phrase intentions and how the LLM responds probabilistically. This non-deterministic input/output process, coupled with the agency-control trade-off, necessitates a different approach to building systems. Relinquishing decision-making to agents requires building trust and reliability first.
Starting Small with Agency
(00:00:28)
- Key Takeaway: AI products should be built step-by-step, starting with minimal agency to focus on the core problem and build confidence.
- Summary: Starting small forces teams to focus on the problem being solved rather than just the complexity of the AI solution. This incremental approach allows teams to gradually increase agency while maintaining human control, building a flywheel of improvement over time. It prevents being overwhelmed by the complexity of fully autonomous agents from day one.
Successful AI Company Patterns
(00:00:50)
- Key Takeaway: Successful AI adoption requires leaders to be hands-on in rebuilding their intuitions and fostering an empowering culture.
- Summary: Leaders must actively engage with AI, dedicating time to rebuild intuitions that may no longer apply in the AI era, accepting they might be the ‘dumbest person in the room.’ Companies must cultivate an empowering culture where subject matter experts collaborate rather than fear replacement by AI augmentation. Successful teams obsess over understanding workflows to select the right mix of ML models, deterministic code, and AI agents.
Evals vs Production Monitoring
(00:33:41)
- Key Takeaway: Evals and production monitoring are complementary, not mutually exclusive; evals catch known issues while monitoring reveals unpredictable emerging patterns.
- Summary: Evals represent product knowledge encoded into test datasets to ensure core functionality is preserved, while production monitoring captures explicit and implicit customer feedback signals in real-time. Evals cannot catch emergent failure modes that users discover post-deployment, necessitating continuous monitoring to identify new error patterns. The process involves using production signals to identify new issues, building evals for those issues, and then returning to monitoring.
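The loop described above (evals catch known issues, monitoring surfaces emergent ones, and production signals become new eval cases) can be sketched as follows. This is a conceptual illustration only; all names (`run_model`, `KNOWN_CASES`, `monitor`) are hypothetical stand-ins, and `run_model` is a deterministic stub in place of a real LLM call.

```python
# Sketch of the complementary evals + production-monitoring loop.
# All names here are hypothetical; run_model stands in for an LLM call.

KNOWN_CASES = [
    # Product knowledge encoded as test cases: (input, expected signal).
    {"input": "refund my order", "must_contain": "refund"},
    {"input": "cancel my subscription", "must_contain": "cancel"},
]

def run_model(prompt: str) -> str:
    # Deterministic stub for illustration; a real system calls an LLM here.
    return f"Sure, I can help you {prompt}."

def run_evals() -> list:
    """Pre-deployment check: ensure core functionality is preserved."""
    failures = []
    for case in KNOWN_CASES:
        output = run_model(case["input"])
        if case["must_contain"] not in output.lower():
            failures.append(case["input"])
    return failures

def monitor(production_logs: list) -> list:
    """Post-deployment: promote emergent failures (e.g. negative feedback)
    into new eval cases, closing the monitoring -> evals loop."""
    new_cases = [
        {"input": log["input"], "must_contain": log["expected"]}
        for log in production_logs
        if log.get("feedback") == "negative"
    ]
    KNOWN_CASES.extend(new_cases)
    return new_cases
```

Each monitoring pass grows the eval set, so issues discovered in production are caught automatically on the next development cycle.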
The CC/CD Framework
(00:46:00)
- Key Takeaway: The Continuous Calibration, Continuous Development (CC/CD) framework structures iterative AI product improvement by balancing development with risk mitigation.
- Summary: The CC/CD framework cycles between continuous development (scoping capability and curating data) and continuous calibration (analyzing unexpected behavior observed in production). Teams should start with low-agency iterations, like suggestion-based co-pilots, to gather data and build trust before moving to higher-agency, end-to-end assistants. This iterative process builds a flywheel of behavior understanding, minimizing risk and preventing catastrophic failures seen in premature, fully autonomous deployments.
Evolving User Behavior and Calibration
(00:58:18)
- Key Takeaway: AI systems require continuous recalibration because user behavior evolves, and external factors like model updates introduce new data distributions.
- Summary: Teams know they can advance to the next stage of autonomy when they stop seeing new data distribution patterns during calibration cycles, indicating minimal surprise. External events, such as model deprecations (e.g., GPT-4o), force immediate recalibration because the underlying API properties change. Furthermore, user excitement often leads them to test systems on unanticipated, complex tasks, requiring product builders to adapt the system’s capabilities accordingly.
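The "advance only at minimal surprise" rule pairs naturally with the CC/CD agency ladder described earlier. A minimal sketch, assuming a three-level progression (suggestion co-pilot, drafting assistant, end-to-end agent) and a hypothetical `new_patterns_seen` count from the latest calibration cycle:

```python
# Hypothetical gating rule for the CC/CD agency progression.
# Levels and the new_patterns_seen signal are illustrative assumptions.
from enum import Enum

class AgencyLevel(Enum):
    SUGGEST = 1      # co-pilot: human approves every action
    DRAFT = 2        # agent drafts, human edits and ships
    AUTONOMOUS = 3   # end-to-end assistant

def next_level(current: AgencyLevel, new_patterns_seen: int) -> AgencyLevel:
    """Advance agency only when calibration stops surfacing new
    data-distribution patterns (the 'minimal surprise' signal)."""
    if new_patterns_seen == 0 and current != AgencyLevel.AUTONOMOUS:
        return AgencyLevel(current.value + 1)
    return current  # keep calibrating at the current level
```

A model deprecation or a shift in user behavior would reset `new_patterns_seen` above zero, holding (or in practice, even demoting) the system at its current agency level until calibration settles again.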
Overhyped and Underhyped Concepts
(01:01:24)
- Key Takeaway: Complex, functionality-based multi-agent systems relying on peer-to-peer communication are currently misunderstood and overrated, while coding agents are underrated due to low penetration outside major tech hubs.
- Summary: Complex multi-agent systems that divide responsibilities based on functionality and rely on gossip protocols are extremely hard to control and are not yet feasible with current model capabilities. Supervisor agent patterns, where a main agent manages sub-agents, are a more successful pattern. Coding agents are underrated because their impact potential is high, but their adoption rate is low outside of leading AI companies.
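The contrast between the two topologies can be made concrete with a toy supervisor: one central agent routes tasks to specialized sub-agents and validates their output, rather than agents coordinating peer-to-peer. Everything here is a stub for illustration; the sub-agents, routing rule, and guardrail check are all hypothetical.

```python
# Toy illustration of the supervisor-agent pattern: a single supervisor
# routes to sub-agents and applies guardrails in one place, instead of
# agents gossiping peer-to-peer. All agents are stubs.

def research_agent(task: str) -> str:
    return f"notes on: {task}"

def writing_agent(task: str) -> str:
    return f"draft for: {task}"

SUB_AGENTS = {"research": research_agent, "write": writing_agent}

def supervisor(task: str) -> str:
    """Central routing keeps control and guardrails in one place."""
    kind = "research" if "find" in task else "write"
    result = SUB_AGENTS[kind](task)
    # The supervisor can validate every result before it reaches the user,
    # which is much harder when agents hand work directly to each other.
    if not result:
        raise ValueError("sub-agent returned nothing")
    return result
```

The design advantage is that trust and guardrails live in a single choke point; in a peer-to-peer topology, every agent-to-agent edge would need its own controls.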
Evals and Problem Obsession
(01:04:12)
- Key Takeaway: Rote building and over-obsession with tools like evals are overrated; obsessing over the business problem and design is the most valuable activity today.
- Summary: Evals are important but misunderstood; builders should remain old-school and focus on the business problem they are solving, as AI is merely a tool. Building is cheap today, but designing a solution that truly solves a pain point is far more valuable. This focus on problem and design will only become more critical in the near future.
2026 Product Vision: Proactive Agents
(01:05:19)
- Key Takeaway: The next major value creation in AI products by 2026 will come from proactive, background agents that deeply understand user context and workflow to anticipate needs.
- Summary: Proactive agents will gain value by plugging into the places where work actually happens, understanding the user’s context, and optimizing metrics. This allows agents to move beyond simple reactive responses to prompting the user back with solutions, such as delivering fixed code patches at the start of the day. This shift moves AI from being a tool to an anticipatory partner in complex workflows.
2026 Product Vision: Multimodal Experiences
(01:07:04)
- Key Takeaway: Multimodal experiences, combining language, vision, and world models, will advance human-like conversational richness and unlock massive amounts of unstructured data.
- Summary: Multimodal understanding is key because humans are inherently multimodal creatures, constantly processing signals beyond just language. Improved multimodal understanding will allow AI to process messy, handwritten documents and PDFs that current models struggle with, tapping into vast untapped data sources. Major players are combining image models, LLMs, and world models (like Genie) to achieve this.
Essential Skills for AI Builders
(01:08:41)
- Key Takeaway: Future career success hinges on developing taste, judgment, and agency, as execution mechanics will be largely automated, and persistence through pain creates moats.
- Summary: As implementation becomes cheap, builders must nail down design, judgment, and taste, which represent the uniquely human aspects of product building. Having agency, the willingness to rethink experiences and build solutions (even with low-code/no-code), sets people apart, ending the era of busy work. Persistence through the pain of iteration and learning is the new moat, as knowledge gained from failed approaches becomes proprietary.
Recommended Books and Media
(01:15:27)
- Key Takeaway: Recommended literature spans philosophical reflection on living versus examining life, grand-scale science fiction exploring human progress, and AGI concepts.
- Summary: When Breath Becomes Air prompts reflection on whether too much examination prevents living, contrasting Socrates’s view. The Three-Body Problem series illustrates the devastating impact of neglecting abstract science on human progress, relevant to AI’s role. A Fire Upon the Deep is an epic sci-fi recommendation dealing directly with AGI and superintelligence.
Favorite Products and Mottos
(01:18:23)
- Key Takeaway: Productivity tools like Whisperflow (voice transcription) and Raycast (CLI efficiency) are highly valued, while the motto ‘be foolish enough to do what they say can’t be done’ encourages necessary risk-taking.
- Summary: Whisperflow is praised for seamlessly translating spoken instructions into code actions (e.g., inserting an actual exclamation mark rather than transcribing the spoken phrase). Raycast and Caffeinate are used for CLI efficiency and for keeping a Mac awake during long local model runs, respectively. The motto about being a fool who succeeds because they didn’t know it couldn’t be done encourages ignoring data that suggests failure is likely.