
AMD Launches Ryzen AI Halo Developer Platform: 128GB Unified Memory Challenges Local AI Status Quo

AMD opened pre-orders in mid-June for its first Ryzen AI Halo developer platform ($3,999), packing a 16-core Zen 5 CPU, a 40-CU Radeon 8060S GPU, and 128GB of LPDDR5x unified memory — a new option for local agentic workflows.

Preface

With the explosive growth of local AI models and agentic workflows, developers' demand for local VRAM capacity has reached unprecedented levels.

In the past, if you wanted to run open-source LLMs larger than 70B parameters smoothly on local hardware, Apple's Mac Studio or a high-end MacBook Pro with plenty of unified memory was essentially your only option.

AMD has now announced the "Ryzen AI Halo Developer Platform" and opened pre-orders at $3,999. This mini PC packs 128GB of LPDDR5x-8000 unified memory and goes head-to-head with Apple and NVIDIA in the local AI inference market.

Demo video of LLM and agentic workflows running locally on the AMD Ryzen AI Halo Developer Platform


Core Specs: The Ryzen AI Max+ 395 Is a Monster

The developer kit, branded "Ryzen AI Halo," is built around AMD's latest flagship APU (codenamed Strix Halo), officially named the Ryzen AI Max+ 395.

The hardware configuration is impressive:

  • CPU: 16 Zen 5 cores.
  • GPU: Up to 40 Compute Units (CU) on the Radeon 8060S iGPU (RDNA 3.5 architecture). GPU performance is on par with mid-to-high-end discrete graphics cards.
  • Unified Memory: 128GB of LPDDR5x-8000 unified memory. Unlike a traditional PC with separate system RAM and VRAM, this 128GB pool is shared between the CPU and GPU, so the GPU can directly address a large share of it when loading very large AI models.
  • NPU: 50 TOPS of local AI compute.
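
For local inference, memory speed matters as much as capacity. The sketch below estimates peak memory bandwidth from the LPDDR5x-8000 rating, then a rough upper bound on decoding speed. It assumes a 256-bit memory bus, which is not stated above. The function names, the 4-bit weight width, and the 3B active-parameter example are illustrative choices, not AMD figures. Real throughput will be lower, since the bound ignores KV-cache reads, compute limits, and runtime overhead.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


def decode_ceiling_tok_s(bandwidth_gb_s: float, active_params_billions: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/s if every active weight is read once per token."""
    bytes_per_token = active_params_billions * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# ASSUMPTION: 256-bit bus; the article only gives the LPDDR5x-8000 rating.
bandwidth = peak_bandwidth_gb_s(8000, 256)
# ASSUMPTION: a mixture-of-experts model with 3B active parameters at 4-bit weights.
ceiling = decode_ceiling_tok_s(bandwidth, active_params_billions=3, bits_per_weight=4)
print(f"peak bandwidth: {bandwidth:.0f} GB/s")
print(f"decode ceiling: {ceiling:.0f} tok/s")
```

The ceiling scales inversely with the number of active parameters, which is one reason sparse "A3B"-style models suit bandwidth-limited unified-memory machines.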

Integrated Software & Hardware: From Setup to Workflow in Minutes

To address AMD's long-standing software compatibility pain points in AI, this developer platform ships with tight software-hardware integration out of the box:

  1. ROCm Software Stack Pre-installed: The system comes with AMD's ROCm open-source software stack pre-configured, offering day-0 support and competitive generative AI performance.
  2. AI Playbooks & Developer Center: The "Ryzen AI Developer Center" hub app comes pre-installed with pre-configured AI Playbooks, letting developers load LLMs, start coding assistants, run image generation, or automate workflows within minutes of booting up.
  3. Multi-OS Support, No Cloud Costs: Choose between Windows 11 Pro and Linux. Developers can run AI workloads of up to 200B parameters locally, avoiding cloud API fees and per-token subscriptions while keeping inference latency low.
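
The 200B-parameter figure is easier to interpret with a back-of-the-envelope footprint estimate. The sketch below computes the size of the weights alone at a few common precisions and checks them against 128GB. The helper names are hypothetical. It also assumes the whole 128GB could be handed to the model, which is optimistic because the OS, the KV cache, and runtime buffers need room too.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, budget_gb: float) -> bool:
    """True if the weights alone fit in the budget (KV cache and overhead ignored)."""
    return weights_gb(params_billions, bits_per_weight) <= budget_gb


# ASSUMPTION: the full 128 GB is available to the model, which overstates the real budget.
for bits in (16, 8, 4):
    size = weights_gb(200, bits)
    print(f"200B at {bits}-bit: {size:.0f} GB, fits in 128 GB: {fits(200, bits, 128)}")
```

Under these assumptions, a 200B model fits only at roughly 4-bit quantization, so the headline figure most likely refers to quantized weights.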

AMD Isn't Actually Targeting the RTX 5090

When people see "128GB Unified Memory," the first instinct is to compare it to an RTX 5090. But AMD's own positioning tells a different story.

The AI Halo's main competitors are:

  • Apple M4 Pro (representing high-end AI PCs)
  • NVIDIA DGX Spark (NVIDIA's desktop AI development platform)

According to AMD's published benchmarks, the AI Halo leads across several generative AI workloads. This suggests AMD's strategy isn't about building a new gaming GPU — it's about going straight for the local AI development market.

Apple M4 Pro vs. AMD Ryzen AI Halo

Generative AI performance comparison: Apple M4 Pro vs. AMD Ryzen AI Halo

In benchmarks against the Apple M4 Pro, the Ryzen AI Halo showed significant speedups across multimodal tasks, image generation, and language models:

  • Ace Step 1.5: up to 3.3x faster
  • Ace Step 1.5 XL: up to 7.3x faster
  • Flux Schnell: up to 3.8x faster
  • Flux 2 Klein: up to 4.0x faster
  • Hunyuan 3D 2.1: up to 4.9x faster
  • Qwen Image: up to 3.7x faster
  • Qwen Image Edit: up to 3.9x faster
  • Stable Diffusion XL: up to 4.5x faster
  • Z Image Turbo: up to 4.4x faster

NVIDIA DGX Spark vs. AMD Ryzen AI Halo

LLM inference performance comparison: NVIDIA DGX Spark vs. AMD Ryzen AI Halo

Against the NVIDIA DGX Spark, NVIDIA's personal AI supercomputer for researchers and developers, the Ryzen AI Halo also posted higher tokens-per-second numbers on popular open-source models:

  • GLM 4.7 Flash-30B-A3B: 14% faster
  • GPT-OSS-120B: 7% faster
  • Qwen 3.5-122B-A10B: 12% faster
  • Qwen 3.6-35B-A3B: 4% faster

The spec-level differences between the two platforms are also significant:

Feature                             NVIDIA DGX Spark    AMD Ryzen AI Halo
OS Support                          Linux only          Windows and Linux
Price-Performance (Tok/Sec per $)   Lower               Leading LLM Tok/Sec per $
Built-in NPU                        No NPU              50 TOPS NPU
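
The price-performance row can be sanity-checked using only numbers from this article. No DGX Spark price is given here, so the sketch below does not compare actual prices. Instead it computes the break-even price: the DGX Spark price at which tokens per second per dollar would be equal, given the $3,999 Halo price and each reported speedup. The names `parity_price` and `SPEEDUPS` are hypothetical.

```python
HALO_PRICE_USD = 3999  # pre-order price quoted in this article

# Speedups over the DGX Spark from the list above (1.14 means 14% faster).
SPEEDUPS = {
    "GLM 4.7 Flash-30B-A3B": 1.14,
    "GPT-OSS-120B": 1.07,
    "Qwen 3.5-122B-A10B": 1.12,
    "Qwen 3.6-35B-A3B": 1.04,
}


def parity_price(own_price: float, speedup: float) -> float:
    """Competitor price at which tokens/s per dollar would be equal."""
    return own_price / speedup


for model, speedup in SPEEDUPS.items():
    print(f"{model}: parity at ${parity_price(HALO_PRICE_USD, speedup):,.0f}")
```

If the DGX Spark costs more than the parity price for a given model, the Halo leads on tokens per second per dollar for that model. If it costs less, the Halo trails.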

Looking Ahead: Ryzen AI Max PRO 400 Series

Beyond the flagship model already available for pre-order, AMD has also laid out its next-gen roadmap.

In Q3 2026, AMD plans to ship next-generation workstations with partners like HP and Lenovo, powered by the Ryzen AI Max PRO 400 series processors. These new platforms will support up to 192GB of unified memory, with up to 160GB allocable to AI models. That means developers will eventually be able to run LLMs with 300B+ parameters directly on a local workstation.
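
As a rough check on the 300B+ claim, the sketch below works out the largest model whose weights fit in a given memory budget at a given precision. The 160GB budget comes from the roadmap figure above. The `reserve_gb` allowance for KV cache and runtime buffers is a hypothetical value chosen for illustration, and the function name is likewise an assumption.

```python
def max_params_billions(budget_gb: float, bits_per_weight: float,
                        reserve_gb: float = 0.0) -> float:
    """Largest model (in billions of parameters) whose weights fit in the budget."""
    usable_gb = budget_gb - reserve_gb
    if usable_gb <= 0:
        return 0.0
    return usable_gb * 8 / bits_per_weight


# 160 GB allocable to models (roadmap figure); reserve_gb=10 is a hypothetical
# allowance for KV cache and runtime buffers, not a measured value.
for bits in (8, 4):
    limit = max_params_billions(160, bits, reserve_gb=10)
    print(f"{bits}-bit weights: up to about {limit:.0f}B parameters")
```

With these assumptions, 300B-class models are within reach only at around 4-bit precision. At 8-bit the same budget tops out near half that size.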