Sep 2024
51m 25s

Episode 205 - Gemini + LangGraph Agents ...

Mark and Allen
About this episode

Join us as we explore Vodo Drive, an innovative project that leverages Google's Gemini AI to revolutionize how we interact with spreadsheets. Creator Allen Firstenberg takes us behind the scenes, revealing the architecture, challenges, and breakthroughs of building an agentic system that understands and manipulates data like never before.


Discover how Vodo Drive:

* Empowers natural language interaction: Say goodbye to rigid formulas and hello to conversational commands.

* Integrates image recognition: Effortlessly input data by simply taking pictures.

* Provides real-time feedback: Experience transparent processing with live updates on your requests.

* Prioritizes security and user control: Maintain data privacy and manage permissions seamlessly.


More Info:

* Vodo Drive: https://vodo-drive.com/

* Gemini API on Vertex AI: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models

* LangChain: https://www.langchain.com/langchain

* LangGraph: https://www.langchain.com/langgraph

* Google Sheets: https://workspace.google.com/products/sheets/

* Firebase: https://firebase.google.com/


Timestamps:

* (0:00:00) Introduction and Project Overview: Discover the inspiration and goals behind Vodo Drive's participation in the Gemini API competition.

* (0:03:30) Reimagining Spreadsheet Control: Explore the evolution of Vodo Drive from voice-controlled spreadsheets to an AI-powered agentic system.

* (0:07:45) The Power of Visual Input: Learn how Vodo Drive seamlessly integrates image recognition to extract and input data from pictures.

* (0:11:55) Contextual Awareness and Conversational Flow: Delve into the importance of contextual awareness and how Vodo Drive maintains the flow of information.

* (0:14:30) Optimizing Tasks with the Right Tools: Understand the strategic use of spreadsheets as the computational backbone for Vodo Drive's data processing.

* (0:15:30) System Design and Architecture Breakdown: Get a detailed look at the core components of Vodo Drive, including Firebase Cloud Functions, Firestore, and Authentication.

* (0:22:55) Addressing Security Concerns: Explore the safety measures implemented to protect user data and prevent unauthorized actions.

* (0:26:35) Real-Time Updates and User Experience: Discover how Vodo Drive leverages Firestore to provide real-time feedback and enhance user experience.

* (0:32:30) Behind the Scenes: The AI's Internal Dialogue: Uncover the hidden conversations happening between the agent and the LLM during data processing.

* (0:38:05) Firebase Authentication and Authorization: Learn how Vodo Drive ensures secure access to user spreadsheets and leverages Google's authorization system.

* (0:40:45) Firebase Cloud Storage and Media Handling: Explore the role of cloud storage in managing user-uploaded photos and audio files.

* (0:43:35) Gemini's Role in Image Processing and Agentic Logic: Discover how Gemini powers both image recognition and the decision-making process of the agentic system.


Don't miss this insightful discussion on the future of AI-powered data management and how Vodo Drive is paving the way for a more intuitive and efficient user experience.


#GeminiAPI #LLM #AgenticSystems #VoiceControl #Spreadsheets #Firebase #WebDevelopment #AndroidDevelopment #AI #Innovation
