Sep 2024
51m 25s

Episode 205 - Gemini + LangGraph Agents ...

Mark and Allen
About this episode

Join us as we explore Vodo Drive, an innovative project that leverages Google's Gemini AI to revolutionize how we interact with spreadsheets. Creator Allen Firstenberg takes us behind the scenes, revealing the architecture, challenges, and breakthroughs of building an agentic system that understands and manipulates data like never before.


Discover how Vodo Drive:

* Empowers natural language interaction: Say goodbye to rigid formulas and hello to conversational commands.

* Integrates image recognition: Effortlessly input data by simply taking pictures.

* Provides real-time feedback: Experience transparent processing with live updates on your requests.

* Prioritizes security and user control: Maintain data privacy and manage permissions seamlessly.


More Info:

* Vodo Drive: https://vodo-drive.com/

* Gemini API on Vertex AI: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models

* LangChain: https://www.langchain.com/langchain

* LangGraph: https://www.langchain.com/langgraph

* Google Sheets: https://workspace.google.com/products/sheets/

* Firebase: https://firebase.google.com/


Timestamps:

* (0:00:00) Introduction and Project Overview: Discover the inspiration and goals behind Vodo Drive's participation in the Gemini API competition.

* (0:03:30) Reimagining Spreadsheet Control: Explore the evolution of Vodo Drive from voice-controlled spreadsheets to an AI-powered agentic system.

* (0:07:45) The Power of Visual Input: Learn how Vodo Drive seamlessly integrates image recognition to extract and input data from pictures.

* (0:11:55) Contextual Awareness and Conversational Flow: Delve into the importance of contextual awareness and how Vodo Drive maintains the flow of information.

* (0:14:30) Optimizing Tasks with the Right Tools: Understand the strategic use of spreadsheets as the computational backbone for Vodo Drive's data processing.

* (0:15:30) System Design and Architecture Breakdown: Get a detailed look at the core components of Vodo Drive, including Firebase Cloud Functions, Firestore, and Authentication.

* (0:22:55) Addressing Security Concerns: Explore the safety measures implemented to protect user data and prevent unauthorized actions.

* (0:26:35) Real-Time Updates and User Experience: Discover how Vodo Drive leverages Firestore to provide real-time feedback and enhance user experience.

* (0:32:30) Behind the Scenes: The AI's Internal Dialogue: Uncover the hidden conversations happening between the agent and the LLM during data processing.

* (0:38:05) Firebase Authentication and Authorization: Learn how Vodo Drive ensures secure access to user spreadsheets and leverages Google's authorization system.

* (0:40:45) Firebase Cloud Storage and Media Handling: Explore the role of cloud storage in managing user-uploaded photos and audio files.

* (0:43:35) Gemini's Role in Image Processing and Agentic Logic: Discover how Gemini powers both image recognition and the decision-making process of the agentic system.


Don't miss this insightful discussion on the future of AI-powered data management and how Vodo Drive is paving the way for a more intuitive and efficient user experience.


#GeminiAPI #LLM #AgenticSystems #VoiceControl #Spreadsheets #Firebase #WebDevelopment #AndroidDevelopment #AI #Innovation
