AWS Trainium: How Amazon Built Their Own AI Chips | Researcher Conversations at GTC

Watch on YouTube  |  April 30, 2026 at 19:30  |  7:09  |  SemiAnalysis

Summary

In this interview at GTC 2026, AWS engineers Rachel Zheng and Karthik Venna discuss the evolution of AWS's custom AI chips, Trainium and Inferentia, the company's deep partnership with Nvidia, and plans to scale GPU infrastructure by 1 million additional GPUs. They highlight Trainium 3's 30-40% better price performance versus alternatives and a new Cerebras partnership for disaggregated inference. The conversation centers on AWS's strategy of offering broad hardware selection and reducing costs for customers deploying AI in production.

  • AWS has a 15-year partnership with Nvidia and offers 2 million GPUs in the cloud.
  • AWS plans to add another 1 million GPUs in 2026.
  • Trainium 3 delivers 2-3x the performance of Trainium 2 and 30-40% better price performance than alternatives.
  • AWS announced a partnership with Cerebras for disaggregated inference to lower cost per token.
  • Anthropic is a key customer using Trainium for large-scale training and inference.
  • AWS emphasizes hardware-software co-design and scale as competitive advantages.
  • Customers face challenges moving from demo to production, which AWS aims to solve.