OpenAI unveils first AI model running on Cerebras chips
February 13, 2026 at 19:31 UTC  |  1:41  |  CNBC
Speakers
Deirdre Bosa — Tech Check Anchor

Summary

  • OpenAI has launched "GPT-5.3 Codex Spark," a stripped-down coding model running entirely on Cerebras chips, marking a significant diversification away from Nvidia for inference workloads.
  • The industry is bifurcating between training (the one-time model build, still dominated by Nvidia) and inference (everyday usage, increasingly moving to cheaper or custom silicon).
  • Major hyperscalers (Google, Microsoft, Meta) and Chinese labs are aggressively deploying custom chips (TPUs, etc.) to reduce reliance on Nvidia for high-volume tasks.
Trade Ideas
WATCH — Deirdre Bosa (Anchor/Reporter, CNBC Tech Check) — 0:17
  Quote: "If the high volume, everyday workloads, if they're moving off of Nvidia hardware, that is important. It changes the investment thesis."
  Thesis: Nvidia remains the "gold standard" for training (building models), but the real long-term volume lies in inference (running models). If OpenAI and others successfully shift inference to cheaper competitors like Cerebras or to in-house chips, Nvidia loses the largest segment of future AI compute demand.
  Trade: Watch for signs of eroding market share in the inference segment, which could compress margins or slow growth despite training dominance.
  Caveat: Nvidia's CUDA moat remains strong, and the company is still "foundational" to OpenAI's business.

LONG — Deirdre Bosa (Anchor/Reporter, CNBC Tech Check) — 0:59
  Quote: "Google's serving Gemini on its own custom AI chips, TPUs. Microsoft just launched its own, and Meta is rolling out custom chips across its data centers."
  Thesis: Shifting inference to custom silicon lets these hyperscalers decouple their cost structure from Nvidia's pricing power. Vertical integration improves gross margins and operational control as AI scales to "hundreds of millions of people."
  Trade: Long the hyperscalers as they execute on hardware independence, extracting more compute output per dollar of CAPEX.
  Caveat: Custom chip development is capital intensive and may lag Nvidia's performance improvements.