Introducing Chain of Thought, the podcast for software engineers and leaders that demystifies artificial intelligence.
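Join us each week as we tell the stories of the people building the AI revolution, unravel actionable strategies and share practical techniques for building effective generative AI applications.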
AI in 2025: Agents & The Rise of Evaluation Driven Development
"In the next three to five years, every piece of software that is built on this planet will have some sort of AI baked into it." - Atin Sanyal
Chain of Thought is back for its second season, and this episode dives headfirst into the possibilities AI holds for 2025 and beyond. Join Conor Bronsdon as he chats with Galileo co-founders Yash Sheth (COO) and Atindriyo Sanyal (CTO) about major trends to look for this year. These include AI finding its product "tool stack" fit, decreasing generation latency, AI agents and their potential to revolutionize code generation and other industries, and the crucial role of robust evaluation tools in ensuring the responsible and effective deployment of these agents.
Yash and Atin also highlight Galileo's focus on building trust and security in AI applications through scalable evaluation intelligence. They emphasize the importance of quantifying application behavior, enforcing metrics in production, and adapting to the evolving needs of AI development.
Finally, they discuss Galileo's vision for the future and their active pursuit of partnerships in 2025 to contribute to a more reliable and trustworthy AI ecosystem.
Show Notes:
Check out Galileo
Follow Yash
Follow Atin
Follow Conor
Chapters:
00:00 AI Trends and Predictions for 2025
02:55 Advancements in LLMs and Code Generation
05:16 Challenges and Opportunities in AI Development
10:40 Evaluating AI Agents and Applications
16:07 Building Evaluation Intelligence
23:41 Research Opportunities
29:50 Advice for Leveraging AI in 2025
32:00 Closing Remarks
--------
33:13
Now is the Time to Build | Weaviate’s Bob van Luijt
"This is the time. This is the time to start building... I can't say that often enough. This is the time." - Bob van Luijt
Join Bob van Luijt, CEO and co-founder of Weaviate, as he sits down with our host Conor Bronsdon for the Season 2 premiere of Chain of Thought. Together, they explore the ever-evolving world of AI infrastructure and the evolution of Retrieval-Augmented Generation (RAG) architecture.
Bob's journey with Weaviate offers a compelling example of how to adapt to rapid changes in the AI landscape. He discusses the importance of understanding developer needs and building AI-native solutions, emphasizing the potential of generative feedback loops and agent architectures to revolutionize data management.
Chapters:
00:00 Welcome to Season 2
01:43 The Evolution of AI Infrastructure
04:13 Navigating Rapid Changes in AI
07:39 Generative Feedback Loops and AI Native Databases
13:26 Challenges and Opportunities in AI Production
19:03 The Importance of Documentation and Developer Experience
27:13 Future Predictions and Paradigm Shifts in AI
31:17 Final Thoughts and Encouragement to Build
--------
35:22
How AI Assistants Can Enhance Human Connection | Twilio’s Vinnie Giarrusso
Can AI assistants actually enhance human connection?
As Season 1 of Chain of Thought comes to a close, Conor Bronsdon and Vinnie Giarrusso (Twilio) explore the transformative potential of AI assistants in the workplace.
Discover how these assistants function as "async junior digital employees," taking on specific tasks and contributing to the organizational structure. But will AI assistants ultimately replace human connection? Vinnie argues the opposite is true, suggesting that AI can liberate employees from mundane tasks, allowing them to focus on building meaningful relationships and providing personalized experiences.
This thought-provoking conversation takes a philosophical turn as Vinnie explores how AI could revolutionize education while potentially disrupting traditional mentorship roles. He shares his vision for a future where AI democratizes information and empowers individuals to personalize their learning journey. Finally, learn how Twilio and Galileo are partnering to shape the future of AI and what this collaboration means for both companies.
Chain of Thought will be taking a break for the holidays, but we'll see you back here on January 8th for the start of Season 2!
Show Notes:
Watch Productionize 2.0
Check out Galileo
Twilio Alpha: twilioalpha.com
OWASP GenAI: genai.owasp.org
Read: Dominik Kundel on Junior/Senior relationship with AI
Follow Conor Bronsdon
Follow Vinnie Giarrusso
Chapters:
00:00 Twilio's AI Agent Platform
06:34 Ensuring Accuracy and Trustworthiness
09:49 Challenges and Failure Modes
17:39 Future of Fully Autonomous Agents
22:18 Human-AI Collaboration and Mentorship
31:24 Education and Democratization of Information
32:58 Partnership with Galileo
39:54 Conclusion and Season Wrap-Up
--------
42:19
Lessons from Deploying AI at Enterprise Scale | ServiceTitan, Indeed & Twilio
This week, a panel of experts (Mehmet Murat Ezbiderli, ServiceTitan; Grant Ledford, Indeed; and Vinnie Giarrusso, Twilio) joins Atin Sanyal (CTO, Galileo) and Conor Bronsdon (Developer Awareness, Galileo) to explore the challenges and opportunities of deploying GenAI at enterprise scale. The conversation is a wake-up call for any business leader looking to harness the power of AI.
Together, Atin & Conor break down key considerations like performance, cost, and model selection, emphasizing the need for robust evaluation frameworks and a shift in developer mindset.
Atin then sits down with our panel of AI engineering experts to discuss their firsthand experiences with enterprise AI, including the trade-offs of building AI systems, the evolving tools and frameworks available, and the impact these technologies are having on their organizations.
Show Notes:
Watch Productionize 2.0
Check out Galileo
Follow Atin Sanyal
Follow Mehmet Murat Ezbiderli
Follow Grant Ledford
Follow Vinnie Giarrusso
Chapters:
00:00 Enterprise Scale Deployment
05:17 Cost, Performance, and Model Selection
08:59 Building and Integrating GenAI Systems
15:26 Emerging Enterprise Use Cases
18:12 Predictions for AI in 2025
27:28 Panel Discussion: Deploying AI at Enterprise Scale
31:19 GenAI Solutions and Challenges
33:12 Building & Deploying Traditional Infrastructure vs GenAI Infrastructure
34:36 How to Assemble Your GenAI Stack
40:39 Today's Best GenAI Use Cases
48:15 Enterprise AI Trends for 2025
50:36 Closing Remarks and Future Outlook
As AI agents and multimodal models become more prevalent, understanding how to evaluate GenAI is no longer optional – it's essential.
Generative AI introduces new complexities in assessment compared to traditional software, and this week on Chain of Thought we’re joined by Chip Huyen (Storyteller, Tép Studio) and Vivienne Zhang (Senior Product Manager, Generative AI Software, Nvidia) for a discussion on AI evaluation best practices.
Before we hear from our guests, Vikram Chatterji (CEO, Galileo) and Conor Bronsdon (Developer Awareness, Galileo) give their takes on the complexities of AI evals and how to overcome them. They cover the use of objective criteria in evaluating open-ended tasks, the role of hallucinations in AI models, and the importance of human-in-the-loop systems.
Afterwards, Chip and Vivienne sit down with Atin Sanyal (Co-Founder & CTO, Galileo) to explore common evaluation approaches, best practices for building frameworks, and implementation lessons. They also discuss the nuances of evaluating AI coding assistants and agentic systems.
Show Notes:
Watch Productionize 2.0
Check out Galileo
Follow Vikram Chatterji
Follow Chip Huyen
Follow Vivienne Zhang
Chapters:
00:00 Challenges in Evaluating Generative AI
05:45 Evaluating AI Agents
13:08 Are Hallucinations Bad?
17:12 Human in the Loop Systems
20:49 Panel Discussion Begins
22:57 Challenges in Evaluating Intelligent Systems
24:37 User Feedback and Iterative Improvement
26:47 Post-Deployment Evaluations and Common Mistakes
28:52 Hallucinations in AI: Definitions and Challenges
34:17 Evaluating AI Coding Assistants
38:15 Agentic Systems: Use Cases and Evaluations
43:00 Trends in AI Models and Hardware
45:42 Future of AI in Enterprises
47:16 Conclusion and Final Thoughts