Thomas Sohmers discusses Positron AI's hardware strategy for AI inference, which focuses on reducing cost per token while maintaining speed. He explains the architecture's high memory-bandwidth utilization and the choice of LPDDR memory over HBM to avoid supply constraints. The Titan server aims to support massive models with long context lengths using commodity technologies such as organic substrates.
This SemiAnalysis video, published April 16, 2026, features Thomas Sohmers discussing NVDA, SOXX, and HBM. Three trade ideas were extracted by AI, each with direction and confidence scoring.
Speakers: Thomas Sohmers · Tickers: NVDA, SOXX, HBM