The code for generative AI is 'scary easy.' The real skill lies in mastering prompt engineering to get reliable, structured output.
#1 about 1 minute
Generative AI code is simple but prompting is complex
The core challenge in generative AI development isn't writing code but mastering prompt engineering to get the results you want, much as writing SQL is easy while writing performant SQL is hard.
#2 about 3 minutes
Understanding Google Gemini models and capabilities
Google Gemini offers different models like Pro and Flash for varying needs, supporting a large context window for inputs like video, audio, and code.
#3 about 3 minutes
Getting your API key and making your first call
Obtain a free-tier API key easily through AI Studio without needing the full Google Cloud Platform, and test it immediately with a provided curl command.
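As a rough illustration of what that first call involves, here is a minimal sketch that assembles the same request in Node.js instead of curl. The model name `gemini-1.5-flash`, the helper name `buildGenerateRequest`, and the `GEMINI_API_KEY` environment variable are assumptions for illustration; use whatever model and key AI Studio gives you.

```javascript
// Sketch: build the pieces of a generateContent REST request.
// The default model name is an assumption; pick one that AI Studio lists.
const BASE_URL = "https://generativelanguage.googleapis.com/v1beta";

function buildGenerateRequest(apiKey, prompt, model = "gemini-1.5-flash") {
  return {
    url: `${BASE_URL}/models/${model}:generateContent`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The API key travels in a request header.
        "x-goog-api-key": apiKey,
      },
      // A prompt is a list of "contents", each made of one or more "parts".
      body: JSON.stringify({
        contents: [{ parts: [{ text: prompt }] }],
      }),
    },
  };
}

// Usage (needs Node.js 18+ for the global fetch, plus a real key):
// const { url, options } = buildGenerateRequest(
//   process.env.GEMINI_API_KEY, // hypothetical variable name
//   "Explain curl in one sentence."
// );
// const data = await (await fetch(url, options)).json();
// console.log(data.candidates[0].content.parts[0].text);
```

Keeping request construction separate from the network call makes the shape of the payload easy to inspect before spending any quota.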
#4 about 4 minutes
Prototyping prompts and writing code with Node.js
Use AI Studio as a playground to test prompts and generate starter code, then implement it using the Node.js SDK for simple question-and-answer interactions.
#5 about 5 minutes
Processing images and files with multimodal input
Leverage Gemini's multimodal capabilities by uploading images via the Files API to analyze their content and automate tasks like generating descriptive filenames.
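The filename task has a small non-AI half: the model returns a free-text description, and something has to turn it into a safe filename. The sketch below covers only that step. The helper `toFilename`, its 60-character cap, and the `image` fallback are hypothetical choices, not something shown in the talk.

```javascript
// Sketch: turn a model-generated description into a filesystem-safe name.
// toFilename, the length cap, and the fallback name are illustrative assumptions.
function toFilename(description, extension = "jpg") {
  const slug = description
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one dash
    .replace(/^-+|-+$/g, "")     // trim leading and trailing dashes
    .slice(0, 60)                // keep names reasonably short
    .replace(/-+$/g, "");        // re-trim in case the cut landed on a dash
  return `${slug || "image"}.${extension}`;
}

console.log(toFilename("A golden retriever, on a beach!"));
// prints "a-golden-retriever-on-a-beach.jpg"
```

Model output is not guaranteed to be short or free of punctuation, so sanitizing it before touching the filesystem is a cheap safeguard.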
#6 about 3 minutes
Building conversational context with chat history
Create stateful chat interactions by sending the entire conversation history with each new message, a process the Gemini SDK manages automatically.
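To make the "send everything each time" idea concrete, here is a minimal sketch of the bookkeeping the SDK does for you. The `ChatHistory` class and its method names are hypothetical. The `user` and `model` role names and the `contents`/`parts` layout follow the Gemini REST request format.

```javascript
// Sketch: the model is stateless, so every request carries the full transcript.
// ChatHistory is a hypothetical helper, not part of the Gemini SDK.
class ChatHistory {
  constructor() {
    this.turns = [];
  }

  addUser(text) {
    this.turns.push({ role: "user", parts: [{ text }] });
    return this;
  }

  addModel(text) {
    this.turns.push({ role: "model", parts: [{ text }] });
    return this;
  }

  // Request body for the next call: all earlier turns plus the new message.
  nextRequest(userText) {
    this.addUser(userText);
    return { contents: this.turns.map((turn) => ({ ...turn })) };
  }
}

const chat = new ChatHistory();
chat.nextRequest("My name is Ada.");
chat.addModel("Nice to meet you, Ada!");
const body = chat.nextRequest("What is my name?");
console.log(body.contents.length); // prints 3
```

The second request contains all three turns, which is why the model can answer the follow-up question and also why long conversations consume more of the context window.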
#7 about 3 minutes
Defining model persona and style with system instructions
Use system instructions to formally define a model's persona, tone, and subject matter constraints, ensuring consistent and tailored responses for specific use cases.
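A system instruction travels alongside the user's prompt rather than inside it. The sketch below shows one way the request body might look over REST. The helper name `withSystemInstruction` and the example persona are assumptions, and the field spelling `system_instruction` follows the REST examples as best I know them, so check it against the current API reference.

```javascript
// Sketch: attach a persona to a request without mixing it into the user prompt.
// withSystemInstruction is a hypothetical helper; the persona is made up.
function withSystemInstruction(instruction, userPrompt) {
  return {
    // Applies to the whole conversation, separate from the user's turns.
    system_instruction: { parts: [{ text: instruction }] },
    contents: [{ role: "user", parts: [{ text: userPrompt }] }],
  };
}

const request = withSystemInstruction(
  "You are a friendly librarian. Only answer questions about books.",
  "What should I read after Dune?"
);
console.log(JSON.stringify(request, null, 2));
```

Because the instruction is a separate field, the same persona can be reused across many prompts without string concatenation.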
#8 about 4 minutes
Enforcing structured output with JSON Schema
Ensure reliable and structured data from the model by specifying the desired output format as JSON and defining its precise structure using a JSON Schema.
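As an illustration, the following sketch requests JSON output against a schema and then checks the reply before trusting it. The `movieSchema` fields and both helper names are hypothetical. The `responseMimeType` and `responseSchema` settings and the upper-case type names follow the Gemini REST format as I understand it, so confirm them against the current documentation.

```javascript
// Sketch: ask for JSON that matches a schema, then verify what comes back.
// movieSchema and the helper names are illustrative assumptions.
const movieSchema = {
  type: "OBJECT",
  properties: {
    title: { type: "STRING" },
    year: { type: "INTEGER" },
    genres: { type: "ARRAY", items: { type: "STRING" } },
  },
  required: ["title", "year"],
};

function buildStructuredRequest(prompt, schema) {
  return {
    contents: [{ parts: [{ text: prompt }] }],
    generationConfig: {
      responseMimeType: "application/json",
      responseSchema: schema,
    },
  };
}

// Parse the model's text reply and confirm the required fields are present.
function parseStructuredReply(text, schema) {
  const data = JSON.parse(text);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  for (const key of schema.required ?? []) {
    if (!(key in data)) throw new Error(`Missing required field: ${key}`);
  }
  return data;
}

const movie = parseStructuredReply('{"title":"Alien","year":1979}', movieSchema);
console.log(movie.title); // prints "Alien"
```

Even with a schema in place, validating the reply on your side is a sensible habit, since downstream code will break on the first malformed response.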
#9 about 3 minutes
Exploring practical use cases and model limitations
Real-world applications of Gemini include a movie recommendation system and a Dungeons & Dragons tool, but the model can fail at tasks that require strategic reasoning, such as playing blackjack.
#10 about 3 minutes
Running on-device AI in the browser with Gemini Nano
Gemini Nano brings generative AI directly into the Chrome browser, enabling on-device processing for tasks like summarization and translation without API calls.
#11 about 4 minutes
Implementing summarization and translation with web APIs
Use the experimental `window.ai` object in Chrome to implement features like text summarization and translation that run entirely on the user's device.