On April 9, at the Google Cloud Next 25 conference in Las Vegas, Google launched its seventh-generation TPU, Ironwood. The chip is designed specifically to power large-scale "thinking," inference-focused AI models, and Google describes it as the most powerful TPU it has ever built.

A TPU (Tensor Processing Unit) is an artificial intelligence chip designed specifically to accelerate deep learning workloads. Google first developed the TPU in 2015 and officially released the first generation in 2016.

According to Google, the release of Ironwood marks a transition from responsive AI models, which supply real-time information for humans to interpret, to models that proactively generate insights and interpretations on their own.

In this era of inference, agents will proactively retrieve and generate content, working together to deliver insights and answers rather than raw data alone. Achieving this requires chips that meet massive computation and communication demands simultaneously, along with co-designed hardware and software.

A maximally configured Ironwood cluster links 9,216 liquid-cooled chips for a peak of 42.5 ExaFLOPS of compute, that is, 4.25 × 10^19 operations per second.

According to The Next Platform, this is the first Google TPU whose tensor cores and matrix math units support FP8 calculations.
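For readers unfamiliar with FP8, here is a minimal sketch in JAX (my illustration, not Google's code) of what computing in this 8-bit floating-point format looks like: the operands are cast to the e4m3 FP8 dtype while the matrix product accumulates in float32. It assumes a recent JAX release with float8 dtypes; whether the multiply actually runs on FP8 hardware depends on the backend.

```python
import jax
import jax.numpy as jnp

key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.normal(key_a, (128, 256), dtype=jnp.bfloat16)
b = jax.random.normal(key_b, (256, 64), dtype=jnp.bfloat16)

# Cast the operands to the 8-bit e4m3 floating-point format.
a8 = a.astype(jnp.float8_e4m3fn)
b8 = b.astype(jnp.float8_e4m3fn)

# Multiply in FP8 but accumulate in float32 to limit precision loss.
c = jnp.matmul(a8, b8, preferred_element_type=jnp.float32)
print(c.dtype, c.shape)  # float32 (128, 64)
```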

Ironwood's FP8 throughput is 4,614 TFLOPS per chip, slightly above the NVIDIA B200's nominal 4,500 TFLOPS; its memory bandwidth is 7.2 TB/s, slightly below the B200's 8 TB/s.
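As a rough sanity check (assuming, as seems likely, that the pod-level figure is simply the per-chip FP8 number aggregated across all chips), the two claims are consistent: 9,216 chips × 4,614 TFLOPS/chip ≈ 42,522,624 TFLOPS ≈ 42.5 ExaFLOPS.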

In addition, Ironwood's third-generation SparseCore accelerator encodes a wider range of algorithms, extending its acceleration beyond AI workloads to financial and scientific computations.

The SparseCore accelerator made its debut in TPU v5p and was enhanced in last year's Trillium chip. It was originally designed to accelerate recommendation models, which rely on embeddings to suggest items to users across categories.
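To make that concrete, below is a minimal, hypothetical JAX sketch (my illustration, not Google's implementation) of the sparse embedding lookup at the heart of such recommendation models: a handful of feature IDs select rows from a large embedding table, which are then pooled into a dense feature vector. This irregular row-gather pattern is exactly the kind of sparse memory access SparseCore is built to speed up.

```python
import jax
import jax.numpy as jnp

vocab_size, embed_dim = 100_000, 64
key = jax.random.PRNGKey(0)
# Large embedding table: one row per item/user-feature ID.
table = jax.random.normal(key, (vocab_size, embed_dim))

ids = jnp.array([17, 4_832, 99_001])  # sparse feature IDs for one example
vectors = table[ids]                  # gather rows: shape (3, 64)
pooled = vectors.mean(axis=0)         # pool into one dense feature vector
print(pooled.shape)                   # (64,)
```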

Official figures put Ironwood's energy efficiency at twice that of Trillium, the sixth-generation TPU released last year. Each chip carries 192 GB of memory, six times Trillium's 32 GB, enabling it to handle larger models and datasets, reduce frequent data transfers, and improve performance.

Google plans to integrate TPU v7 into its Google Cloud AI Hypercomputer in the near future, supporting workloads that include recommendation algorithms, Gemini models, and AlphaFold.

Reportedly, Safe Superintelligence, the AI startup founded by OpenAI co-founder and former chief scientist Ilya Sutskever, is using Google Cloud's TPU chips to support its AI research.

This article is an exclusive contribution from Observer Network and cannot be reproduced without permission.

Original source: https://www.toutiao.com/article/7491555071684051491/
