Google's TPU (Tensor Processing Unit), a chip purpose-built for AI computing, has reached its seventh generation, "Ironwood," which will become generally available to customers in the coming weeks.
With Ironwood, Google can scale a single Pod to 9,216 chips linked by a breakthrough Inter-Chip Interconnect (ICI) network running at 9.6 Tb/s. This lets thousands of chips communicate quickly and access up to 1.77 PB of shared high-bandwidth memory (HBM), overcoming data bottlenecks for even the most demanding models. At this scale, an Ironwood Pod delivers 118x more FP8 ExaFLOPS than its closest competitor.
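As a back-of-the-envelope check, the 1.77 PB pod-level figure implies roughly 192 GB of HBM per chip (an inferred per-chip capacity, not stated in this article). A minimal sketch of that arithmetic:

```python
# Back-of-the-envelope check of Ironwood pod-scale HBM capacity.
# Assumption: ~192 GB of HBM per chip, inferred from the 1.77 PB pod figure.
chips_per_pod = 9_216
hbm_per_chip_gb = 192  # assumed per-chip HBM capacity

total_gb = chips_per_pod * hbm_per_chip_gb
total_pb = total_gb / 1_000_000  # decimal units: 1 PB = 1,000,000 GB

print(f"{total_gb:,} GB ≈ {total_pb:.2f} PB")  # 1,769,472 GB ≈ 1.77 PB
```

This only confirms the numbers are self-consistent; the actual per-chip capacity is a documented hardware spec, not something derivable from the pod total alone.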
At this scale, services require uninterrupted availability. Google's Optical Circuit Switching (OCS) technology therefore acts as a dynamic, reconfigurable network fabric that instantly routes around interruptions so workloads can resume and services keep running. When even more computing power is needed, Ironwood can scale beyond a single Pod into clusters of hundreds of thousands of TPUs.
Drawing on years of experience, Google has built an integrated AI Hypercomputer architecture that seamlessly combines the hardware and software AI workloads require. TPUs are already a key component of the AI Hypercomputer, and Ironwood now takes on that role.
It is worth mentioning that Anthropic recently announced it will expand its use of Google Cloud technology, including access to up to one million TPUs, accelerating everything from training the Claude models to serving millions of users. Anthropic is already testing Ironwood. "Ironwood's improvements in inference performance and training scalability will help us scale efficiently while maintaining the speed and reliability our customers expect," said James Bradbury, a distinguished engineer on Anthropic's compute team.
▲ Using Jupiter data center network technology, multiple Ironwood Superpods can be connected into clusters of hundreds of thousands of TPUs.
▲ The third-generation cooling distribution unit (CDU) provides liquid cooling for the Ironwood Superpod.
Expanding the Axion product portfolio

In addition to Ironwood, Google has further expanded its Axion product portfolio, a line of Arm-based CPUs designed for data centers.
Google's second general-purpose Axion virtual machine, N4A, is now in preview and suits workloads such as microservices, containerized applications, open-source databases, batch processing, data analytics, development environments, experimentation, data preparation, and the web serving that makes AI applications viable.
Google's first Arm-based bare-metal instance, C4A metal, is also in preview, providing dedicated physical servers for specialized workloads such as Android development, automotive in-vehicle systems, software with strict licensing requirements, large-scale test platforms, and complex simulations.
The Axion series thus offers three options: N4A, C4A, and C4A metal. The C and N series can be combined, letting customers reduce overall operating costs without sacrificing performance or workload-specific requirements.
▲ Axion is a customized, Arm Neoverse-based CPU.
While dedicated accelerators like Ironwood handle demanding tasks such as model training and serving, Axion excels at the operational backbone: high-volume data preprocessing, data retrieval, and the servers hosting intelligent applications. This is where Axion begins to translate into tangible benefits for customers.
Further reading:
Anthropic expands use of one million Google TPUs, looking to reach 1 GW of computing power next year
AI demand continues, analysts: Google TPU business is expected to bring $900 billion in business opportunities
Google's seventh-generation Ironwood TPU is unveiled, with advanced computing power and a maximum Pod configuration of 9,216 chips