Huawei's AI Super Node Challenges NVIDIA
CloudMatrix 384 Shocks the World
In the fierce competition of the AI field, Huawei has emerged as a dark horse with the launch of the CloudMatrix 384 super node, which not only electrified the floor of the World Artificial Intelligence Conference in Shanghai (WAIC 2025) but also earned rare praise from NVIDIA CEO Jensen Huang. Imagine a massive computing cluster of 384 Ascend 910C chips delivering a staggering 300 petaflops of computing power, surpassing NVIDIA's NVL72 system. This is not a science fiction movie but a bold bet by Huawei to break through bottlenecks and challenge the giants with engineering ingenuity and system-level innovation. It marks a shift in AI infrastructure from single-chip competition to a cluster revolution and signals China's strong rise on the global AI map.
The Engineering Magic of the Super Node
The core of CloudMatrix 384 lies in its "super node" architecture, which can be thought of as a super engine for AI computing. The system integrates 384 Ascend 910C chips distributed across 12 computing cabinets and 4 bus cabinets, interconnected via high-speed buses and fiber-optic links that cut chip-to-chip latency to 200 nanoseconds, ten times faster than traditional Ethernet. Imagine the scene: data zips between chips like a light-speed train, while 48 TB of high-bandwidth memory acts as a giant warehouse, ready to serve the demands of massive AI tasks at any moment. This design moves beyond the constraints of the traditional von Neumann architecture, adopting a fully peer-to-peer interconnect optimized for complex mixture-of-experts (MoE) models and making large-model training run as smoothly as silk.
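For a rough sense of scale, here is a minimal back-of-the-envelope sketch in Python using only the figures quoted above; the per-chip memory split is derived and the Ethernet latency baseline is assumed for illustration, so neither should be read as an official Huawei specification.

```python
# Back-of-the-envelope breakdown of the CloudMatrix 384 figures quoted above.
# The per-chip memory split is derived, and the Ethernet baseline is an
# assumption for illustration -- neither is an official Huawei specification.

NUM_CHIPS = 384                 # Ascend 910C chips per super node
TOTAL_HBM_TB = 48               # aggregate high-bandwidth memory (TB)
NODE_LATENCY_NS = 200           # quoted chip-to-chip latency over optical links
ETHERNET_LATENCY_NS = 2_000     # assumed baseline implied by the "ten times" claim

hbm_per_chip_gb = TOTAL_HBM_TB * 1024 / NUM_CHIPS
print(f"HBM per chip: {hbm_per_chip_gb:.0f} GB")                         # ~128 GB
print(f"Latency vs Ethernet: {ETHERNET_LATENCY_NS / NODE_LATENCY_NS:.0f}x faster")
```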
Huawei's breakthrough is not limited to hardware. Industry experts point out that the software ecosystem is the key driving force: CloudMatrix optimizes scheduling algorithms to balance load, reduce power consumption, and support dynamic scaling while avoiding single points of failure. Compared with traditional server stacking, the super node acts like a "super brain," integrating CPU, NPU, storage, and network into one system and improving efficiency by 2.5 times. For example, in tests on Meta's LLaMA 3 model it generates 132 tokens per second per card, while communication-intensive workloads such as the Qwen models reach 600-750 tokens per second per card. This performance leap frees data centers from congestion and comfortably meets the training requirements of trillion-parameter models.
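To illustrate what the per-card number implies at cluster scale, the sketch below computes a naive upper bound; it assumes all 384 cards decode in parallel at the quoted LLaMA 3 rate and ignores batching, expert routing, and communication overhead.

```python
# Naive upper bound on cluster-wide decode throughput, assuming every card
# sustains the quoted per-card rate simultaneously. Real deployments lose
# throughput to batching, expert routing, and communication overhead.

NUM_CARDS = 384
LLAMA3_TOKENS_PER_SEC_PER_CARD = 132   # per-card figure quoted above for LLaMA 3

cluster_tokens_per_sec = NUM_CARDS * LLAMA3_TOKENS_PER_SEC_PER_CARD
print(f"Naive cluster-wide decode rate: {cluster_tokens_per_sec:,} tokens/s")
# -> roughly 50,000+ tokens per second across the full super node
```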
The Behind-the-Scenes Innovators of Technology
In its engineering details, CloudMatrix 384 demonstrates Huawei's system-level strength. Although the Ascend 910C lags behind NVIDIA's H200 in single-chip performance, clustering compensates for the gap. The fiber-optic interconnect is the highlight, reportedly delivering 15 times more bandwidth than conventional links, comparable to NVIDIA's NVLink but at lower cost. Heat dissipation and energy efficiency have also drawn attention: despite power consumption around four times that of NVIDIA's system, Huawei maintains high overall efficiency through intelligent scheduling and optimized liquid cooling. The architecture has already been deployed in data centers in Anhui, Inner Mongolia, and Guizhou, supporting AI applications ranging from finance to healthcare.
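To put the "system over single chip" trade-off into rough numbers, the sketch below derives per-chip compute and compute per watt from the figures above; the NVL72 baseline (about 180 dense BF16 petaflops at roughly 145 kW) is an outside assumption rather than a number from this article, so the output is indicative only.

```python
# Rough per-chip compute and compute-per-watt comparison. The NVL72 baseline
# numbers below are assumptions for illustration, not figures from this article.

CLOUDMATRIX_PFLOPS = 300            # quoted CloudMatrix 384 compute
CLOUDMATRIX_CHIPS = 384
NVL72_PFLOPS = 180                  # assumed NVL72 dense BF16 compute
NVL72_POWER_KW = 145                # assumed NVL72 system power
CLOUDMATRIX_POWER_KW = 4 * NVL72_POWER_KW   # "four times that of NVIDIA"

per_chip = CLOUDMATRIX_PFLOPS / CLOUDMATRIX_CHIPS
print(f"Implied per-chip compute: {per_chip * 1000:.0f} TFLOPS")        # ~780 TFLOPS
print(f"CloudMatrix: {CLOUDMATRIX_PFLOPS / CLOUDMATRIX_POWER_KW:.2f} PFLOPS/kW")
print(f"NVL72:       {NVL72_PFLOPS / NVL72_POWER_KW:.2f} PFLOPS/kW")
# The super node wins on total compute by scaling out while giving up energy
# efficiency -- hence the emphasis on scheduling and liquid cooling above.
```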
A deeper breakthrough lies in supply-chain innovation. Cut off from external manufacturing, Huawei has worked with local partners to optimize 7 nm process technology and to explore more advanced alternatives, such as laser-induced plasma technology. This self-reliant path not only increases chip density but also reduces dependence on foreign equipment. Industry experts believe this "system over single chip" strategy is reshaping how AI hardware is designed: future competition will shift from individual chips to full-stack integration, with software, connectivity, and ecosystem collaboration becoming the deciding factors.
The Turbulent Dynamics of the Market
Globally, the AI chip market is red-hot. In 2025, investment in AI infrastructure is expected to exceed $200 billion, with China accounting for more than a third. Huawei's CloudMatrix 384 arrives at the right moment, filling the gap in domestic high-performance computing just as policy pressure pushes companies to accelerate their shift to the domestic ecosystem. In contrast to NVIDIA's CUDA moat, Huawei is open-sourcing its Ascend software to attract developers, much as Alibaba and Baidu have already begun training models on their own chips, pointing to a trend of market diversification.
But challenges remain. NVIDIA's ecosystem is deeply entrenched, and limited supply of HBM memory may constrain Huawei's expansion. Market forecasts suggest demand for AI computing power will keep surging, with the market reaching $500 billion by 2030 and cluster-based architectures becoming mainstream. Huawei's super node serves not only large models but also edge computing and autonomous driving, broadening its commercial scenarios. Price competitiveness matters too: CloudMatrix is reportedly cheaper to deploy than comparable NVIDIA systems, drawing small and medium-sized enterprises into the market and helping to popularize AI.
The Infinite Journey of Future Computing Power
Looking ahead, Huawei's ambitions go further. The Ascend 950 and Atlas 960 are planned for launch in 2026 and 2027 respectively, doubling computing power and targeting the global market. Commercial super nodes built jointly with telecom operators have already begun trial operation in China, marking an acceleration in the commercialization of AI infrastructure.
In summary, the release of CloudMatrix 384 is not only a technological breakthrough but also a strategic declaration. It proves that Chinese companies can break through in adversity through innovation, challenge global AI leaders, and help the industry move towards a more efficient and open computing power era. This storm of super nodes is lighting up infinite possibilities for the future of AI.
Original article: https://www.toutiao.com/article/7552331354151600676/
Statement: This article represents the views of the author.