Logan Kilpatrick

What’s New with Google Gemini?

Move beyond chatbots. Build AI assistants that can see, hear, and interact with a user's screen using the new Gemini Live API.

#1 · about 2 minutes

The history and merger of Google's AI teams

Google merged its previously independent research groups, DeepMind and Google Brain, into Google DeepMind to focus on the horizontal value of large language models like Gemini.

#2 · about 2 minutes

Overcoming model fatigue for developers

The rapid release of new AI models can overwhelm developers; product features like personalized benchmarks help them choose the right tool for the job.

#3 · about 5 minutes

Choosing between general and domain-specific models

While domain-specific models have their place, powerful general-purpose models often provide a better balance of world knowledge and capability with less development effort.

#4 · about 3 minutes

The challenge of moving AI from demo to production

It's easy to create a simple AI demo, but the "last mile" to a reliable production application is difficult due to unpredictable model behavior and a lack of mature infrastructure.

#5 · about 3 minutes

Managing the economic cost of building with AI

High API costs are a major barrier for developers; Gemini addresses this by optimizing for cost-per-intelligence and by offering a generous free tier for experimentation.

#6 · about 3 minutes

The importance of model-agnostic developer tooling

Developers prefer model-agnostic infrastructure to avoid lock-in, so platforms like Google AI Studio are designed as starting points where developers grab an API key and then build elsewhere.
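As a rough illustration of that model-agnostic pattern, here is a minimal Python sketch. The TextModel protocol, GeminiBackend adapter, and placeholder reply are hypothetical, not part of any Google SDK; only the idea (one small interface, swappable backends) comes from the talk.

```python
# A hypothetical sketch of the model-agnostic pattern: app code depends
# on a small interface, so the backing provider can be swapped freely.
from typing import Protocol


class TextModel(Protocol):
    """Anything that can turn a prompt into text."""

    def generate(self, prompt: str) -> str: ...


class GeminiBackend:
    """Hypothetical adapter; a real one would call the Gemini API
    with a key obtained from Google AI Studio."""

    def __init__(self, api_key: str) -> None:
        self._api_key = api_key

    def generate(self, prompt: str) -> str:
        return f"[placeholder Gemini reply to: {prompt}]"


def summarize(model: TextModel, text: str) -> str:
    # Call sites only know about TextModel, never a concrete provider.
    return model.generate(f"Summarize in one sentence: {text}")


print(summarize(GeminiBackend(api_key="YOUR_KEY"), "a long document..."))
```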

#7 · about 5 minutes

Navigating regional availability and data ethics

AI model availability is often limited by regional politics and legal frameworks, while developers must also consider the ethical implications of web scraping for data.

#8 · about 3 minutes

Unlocking insights with multimodal video analysis

Multimodal models like Gemini 1.5 Pro excel at video understanding, enabling developers to unlock and analyze vast amounts of knowledge previously trapped in video files.
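A minimal sketch of what video analysis looks like with the google-generativeai Python SDK's File API. The file name and prompt are placeholders, and SDK details may have shifted since the talk.

```python
# Upload a video, wait for processing, then ask Gemini 1.5 Pro about it
# (pip install google-generativeai).
import time

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # free key from Google AI Studio

video = genai.upload_file(path="lecture.mp4")
while video.state.name == "PROCESSING":  # wait until the upload is ready
    time.sleep(5)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "Summarize the key points of this video, with timestamps."]
)
print(response.text)
```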

#9 · about 3 minutes

Integrating AI seamlessly into user experiences

The most effective AI integrations are invisible to the user and work in the background to provide value, rather than being a flashy, explicit feature.

#10 · about 4 minutes

Using open-source Gemma for local AI processing

Google's open-source Gemma models allow developers to run AI workloads locally, addressing privacy concerns and the practical limitations of uploading large datasets.
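One common way to run Gemma locally is through Hugging Face transformers (Ollama and llama.cpp are alternatives). This sketch assumes the google/gemma-2b-it weights, which require accepting Gemma's license on Hugging Face; the prompt is illustrative.

```python
# Run Gemma locally via transformers; nothing is sent to a remote API.
from transformers import pipeline

generator = pipeline("text-generation", model="google/gemma-2b-it")
result = generator(
    "Classify the sentiment of this review: 'Setup was painless.'",
    max_new_tokens=50,
)
print(result[0]["generated_text"])  # the data never leaves the machine
```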

#11 · about 8 minutes

Building interactive agents with the Gemini Live API

The new Live API allows developers to build AI agents that can see and hear user context through screen sharing, enabling more powerful and context-aware interactions.
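A minimal, text-only sketch of a Live API session, assuming the newer google-genai Python SDK (pip install google-genai). The model name, config keys, and send/receive shapes follow early documentation and may since have changed; screen sharing and audio stream media into this same kind of session.

```python
# Open a bidirectional Live API session and stream a text reply.
import asyncio

from google import genai

client = genai.Client(api_key="YOUR_API_KEY")


async def main() -> None:
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp", config=config
    ) as session:
        await session.send(
            input="Describe what you can help me with.", end_of_turn=True
        )
        async for message in session.receive():  # stream the reply
            if message.text:
                print(message.text, end="")


asyncio.run(main())
```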

#12 · about 6 minutes

Using AI to create guided product experiences

Instead of simple chatbots, AI can act as a virtual coworker that guides users through complex software like Photoshop or an IDE, improving onboarding and usability.

#13 · about 12 minutes

Getting started with the Gemini API and SDKs

Developers can start building with Gemini for free at ai.dev, using SDKs for popular languages like Python and TypeScript to accelerate learning and build more ambitious projects.
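A minimal "hello world" with the Python SDK, using a free API key from ai.dev; the prompt is a placeholder.

```python
# First call to the Gemini API (pip install google-generativeai).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("In one sentence, what is the Gemini API?")
print(response.text)
```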
