Why Positron AI is Choosing LPDDR over HBM for Next-Gen LLM | Researcher Conversations at GTC

Watch on YouTube ↗  |  April 16, 2026 at 19:58  |  10:47  |  SemiAnalysis
Summary

Thomas Sohmers discusses Positron AI's hardware strategy for AI inference, focusing on reducing token costs while maintaining speed. He explains their architecture's high memory bandwidth utilization and choice of LPDDR memory over HBM to avoid supply constraints. The Titan server aims to support massive models with high context lengths using commodity technologies like organic substrates.

  • Positron AI is transitioning from FPGA to custom silicon with the Osmo chip.
  • Architecture optimizes matrix-vector performance for AI inference efficiency.
  • Uses LPDDR memory instead of HBM for higher capacity and supply chain advantages.
  • Avoids advanced packaging, using organic substrates for cost and scalability.
  • Claims 1:1 matrix-matrix to matrix-vector ratio, unlike NVIDIA's worsening ratio.
  • Titan server targets 16-trillion parameter models with million-token contexts.
  • Emphasizes commodity supply chains to compete with large players.
  • Partnership with Oracle and rapid development highlighted.
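The capacity argument behind the LPDDR choice can be checked with back-of-envelope arithmetic. A minimal sketch of the memory math for the stated 16-trillion-parameter target; the per-stack and per-package capacities are rough assumptions for illustration, not figures from the talk:

```python
# Back-of-envelope memory math for the 16-trillion-parameter Titan target.
# Per-device capacities below are assumed, illustrative figures, not vendor specs.

PARAMS = 16e12  # 16-trillion parameters (Titan's stated target)

# Weight memory required at common inference precisions.
for bits, label in [(16, "FP16"), (8, "FP8"), (4, "INT4")]:
    tb = PARAMS * bits / 8 / 1e12
    print(f"{label}: {tb:.0f} TB of weights")  # FP16: 32 TB, FP8: 16 TB, INT4: 8 TB

# Device counts needed just to hold FP8 weights (assumed capacities):
HBM3E_STACK_GB = 36   # ~36 GB per HBM3E stack (assumption)
LPDDR5X_PKG_GB = 16   # ~16 GB per LPDDR5X package (assumption)
fp8_bytes = PARAMS    # 1 byte per parameter at 8 bits
hbm_stacks = fp8_bytes / (HBM3E_STACK_GB * 1e9)
lpddr_pkgs = fp8_bytes / (LPDDR5X_PKG_GB * 1e9)
print(f"~{hbm_stacks:.0f} HBM3E stacks vs ~{lpddr_pkgs:.0f} LPDDR5X packages")
```

At these scales, attaching hundreds of HBM stacks means large amounts of supply-constrained advanced packaging, whereas LPDDR packages on organic substrates can be sourced from commodity supply chains, which is the trade-off the bullets above describe.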
Trade Ideas
NVIDIA has poor matrix-vector performance for inference.
NVIDIA's GPUs show a worsening ratio of matrix-matrix to matrix-vector performance from Hopper to Blackwell, making them less efficient for AI inference workloads, which lean heavily on matrix-vector operations. Positron's architecture targets a 1:1 ratio instead.
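Why matrix-vector work dominates inference can be seen from arithmetic intensity (FLOPs per byte moved): autoregressive decode multiplies each weight matrix by a single token vector, so the weights are read once per token with almost no reuse. A hypothetical sketch, assuming an 8192-wide layer and FP16 operands (the dimensions are illustrative, not from the talk):

```python
# Arithmetic intensity (FLOPs per byte) of matrix-vector (GEMV) vs
# matrix-matrix (GEMM) work, with illustrative assumed dimensions.

def arithmetic_intensity(m, n, batch):
    """FLOPs per byte for an (m x n) weight matrix times an (n x batch) input,
    assuming FP16 (2-byte) operands and each operand read/written once."""
    flops = 2 * m * n * batch                           # multiply-accumulates
    bytes_moved = 2 * (m * n + n * batch + m * batch)   # weights + input + output
    return flops / bytes_moved

gemv = arithmetic_intensity(8192, 8192, 1)     # decode: one token at a time
gemm = arithmetic_intensity(8192, 8192, 8192)  # training-style matrix-matrix
print(f"GEMV intensity: {gemv:.2f} FLOPs/byte")   # ~1: memory-bandwidth bound
print(f"GEMM intensity: {gemm:.0f} FLOPs/byte")   # thousands: compute bound
```

At roughly 1 FLOP per byte, GEMV throughput is set by memory bandwidth rather than compute, which is why high bandwidth utilization, not peak matrix-matrix FLOPS, is the figure of merit Positron emphasizes for inference.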
Commodity LPDDR and substrates over HBM and advanced packaging.
Positron AI uses LPDDR memory and organic substrates instead of HBM and advanced packaging, avoiding supply chain constraints and achieving higher memory capacity for scalable AI inference with commodity technologies.
This SemiAnalysis video, published April 16, 2026, features Thomas Sohmers discussing NVDA, SOXX, and HBM. Trade ideas were extracted by AI with direction and confidence scoring.

Speakers: Thomas Sohmers  · Tickers: NVDA, SOXX, HBM