
Today's AI Briefing: Google Shows Reachy Mini Powered by Local Gemma 4 — $299 Open-Source Robot Brings Embodied AI Within Reach

Google and Hugging Face team up to show Reachy Mini running Gemma 4 locally for real-time voice and vision interaction. Open-source, privacy-first physical AI assistants that run entirely on local hardware are here, starting at $299.

Foreword

If someone asked:

"What will a physical AI assistant actually look like?"

Google and Hugging Face recently gave a pretty exciting answer.

In the latest Gemma demo, a desktop robot called Reachy Mini looks at a chessboard through its camera, talks to a user naturally, and understands what's happening around it well enough to respond in real time.

The interesting part? None of this depends on a cloud-hosted giant model.

The video's on-screen caption states:

"Reachy Mini connected to Gemma 4 on a laptop locally"

That means Gemma 4 is running on the laptop itself, with no network round trip to a remote server, which keeps response latency low. It is a solid showcase of what on-device multimodal embodied AI can do today.

Image: Reachy Mini interacting via voice and vision, powered by a locally running Gemma 4 model.


What Is Reachy Mini?

Reachy Mini is an open-source desktop robot co-launched by Hugging Face and Pollen Robotics, with Seeed Studio handling hardware optimization and manufacturing.

Unlike industrial or humanoid robots that cost tens of thousands of dollars, Reachy Mini is modestly priced — it's designed as an AI experimentation and physical interaction platform for developers and researchers.

Key features:

  • Hardware: Dual cameras, microphone, speaker, plus a head and antenna with 6-axis motion — enough to pull off expressive, human-like movements.
  • Open source & flexible: Ships with a full Python SDK and simulation environment, so you have complete control over the hardware and can script custom behaviors.
  • Two editions:
    • Reachy Mini Lite ($299): Connects via cable to an external computer (Mac, PC, or Linux). A good fit if you're on a tight budget and focused on local model development.
    • Reachy Mini Full / Wireless ($449): Comes with a built-in Raspberry Pi 4, battery, and Wi-Fi — runs fully untethered.
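To give a feel for what "scripting custom behaviors" can mean in practice, here is a minimal sketch of a keyframed head motion. It deliberately does not use the actual Reachy Mini SDK: `HeadPose`, `lerp`, `play`, and `NOD` are hypothetical names invented for illustration, the sketch covers head orientation only, and the sign convention for pitch is an assumption.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class HeadPose:
    """Head orientation in degrees (hypothetical type, not from the Reachy Mini SDK)."""
    roll: float = 0.0
    pitch: float = 0.0
    yaw: float = 0.0


def lerp(a: HeadPose, b: HeadPose, t: float) -> HeadPose:
    """Linearly blend two poses; t=0 gives a, t=1 gives b."""
    return HeadPose(
        roll=a.roll + (b.roll - a.roll) * t,
        pitch=a.pitch + (b.pitch - a.pitch) * t,
        yaw=a.yaw + (b.yaw - a.yaw) * t,
    )


def play(
    keyframes: List[Tuple[float, HeadPose]], rate_hz: int = 10
) -> Iterator[Tuple[float, HeadPose]]:
    """Yield (time, pose) samples along a non-empty keyframed motion."""
    for (t0, p0), (t1, p1) in zip(keyframes, keyframes[1:]):
        # Number of samples in this segment at the requested rate.
        steps = max(1, round((t1 - t0) * rate_hz))
        for i in range(steps):
            frac = i / steps
            yield t0 + (t1 - t0) * frac, lerp(p0, p1, frac)
    yield keyframes[-1]  # finish exactly on the last keyframe


# A "nod": tilt the head 15 degrees over half a second, then return to neutral.
NOD = [(0.0, HeadPose()), (0.5, HeadPose(pitch=15.0)), (1.0, HeadPose())]

if __name__ == "__main__":
    for t, pose in play(NOD):
        print(f"{t:.2f}s  pitch={pose.pitch:.1f}")
```

In a real setup, each sample would be forwarded to the robot or the simulator through whatever motion call the SDK provides; that call is left out here rather than guessed at.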

Gemma 4 Running Locally: No More Cloud Latency or Privacy Worries

The real breakthrough in this demo is that the entire voice-and-vision interaction pipeline runs locally.

Until recently, most smart robots had to stream video and audio to a cloud API — which meant recurring subscription costs, bandwidth bills, network latency, and privacy exposure. The Reachy Mini + Gemma 4 combo pulls off a fully offline workflow on a local laptop:

  1. Voice Activity Detection (VAD): Uses Silero VAD to detect when someone starts and stops speaking.
  2. Speech-to-Text (STT): Transcribes speech locally via faster-whisper.
  3. Understanding & reasoning (LLM): Runs Google's latest consumer-grade open model, Gemma 4 12B, directly on the laptop, handling both text and image input.
  4. Text-to-Speech (TTS): Synthesizes speech in real time using a local TTS engine and plays it through Reachy Mini's speaker.

This integrated setup not only delivers solid real-time responsiveness but also means nothing from your home or office — no video, no audio — ever hits the cloud.
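The four stages above can be sketched as a simple chain of callables. This is an illustrative skeleton, not the code used in the demo: `LocalVoicePipeline`, `Turn`, and the `stub_*` functions are hypothetical names, and the stubs only stand in for the real local engines (Silero VAD, faster-whisper, Gemma 4, and a TTS engine), whose actual APIs are not shown here.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Turn:
    """One completed exchange: what was heard, what was said back."""
    transcript: str
    reply: str
    audio_out: bytes


@dataclass
class LocalVoicePipeline:
    # Each stage is a plain callable so a real local engine can be swapped in.
    detect_speech: Callable[[bytes], bool]           # VAD: does this chunk contain speech?
    transcribe: Callable[[bytes], str]               # STT: utterance audio -> text
    generate: Callable[[str, Optional[bytes]], str]  # LLM: text + optional camera frame -> reply
    synthesize: Callable[[str], bytes]               # TTS: reply text -> audio
    _buffer: bytearray = field(default_factory=bytearray, init=False)

    def feed(self, chunk: bytes, frame: Optional[bytes] = None) -> Optional[Turn]:
        """Feed one audio chunk; return a Turn when an utterance has just ended."""
        if self.detect_speech(chunk):
            self._buffer.extend(chunk)  # speaker is still talking: keep accumulating
            return None
        if not self._buffer:
            return None  # silence with nothing buffered
        utterance = bytes(self._buffer)
        self._buffer.clear()
        transcript = self.transcribe(utterance)
        reply = self.generate(transcript, frame)
        return Turn(transcript, reply, self.synthesize(reply))


# Placeholder stages (assumptions for illustration only).
def stub_vad(chunk: bytes) -> bool:
    return any(chunk)  # treats all-zero bytes as silence


def stub_stt(audio: bytes) -> str:
    return f"<{len(audio)} bytes of speech>"


def stub_llm(text: str, frame: Optional[bytes]) -> str:
    return f"heard {text}" + (" and saw a frame" if frame else "")


def stub_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    pipeline = LocalVoicePipeline(stub_vad, stub_stt, stub_llm, stub_tts)
    for chunk in (b"\x01\x02", b"\x03", b"\x00\x00"):
        turn = pipeline.feed(chunk, frame=b"img")
        if turn:
            print(turn.reply)
```

Because every stage is a function running in the same process, no audio or image data needs to leave the machine, which is the privacy property the demo emphasizes.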


Physical AI Is Making Its Way Into Daily Life

Over the past year, the focus in AI has been shifting from web-based chatbots toward autonomous agents. The next big wave is giving AI a physical body — what people call Physical AI or Embodied AI.

Not long ago, embodied AI was something most of us thought was distant and expensive. Reachy Mini shows that the equation:

stereo camera + microphone + local open-source LLM + $299 open-source hardware

has already brought the cost of entry down to a level any developer or hobbyist can reach. It means a desktop AI companion that sits on your desk, respects your privacy, and interacts with eye contact and gestures won't be science fiction for much longer.