Why is Jensen Huang in such a hurry to officially announce Nvidia's next-generation AI chip? | New Science Popularization
The rise of large artificial intelligence models has made Nvidia one of the most closely watched companies in the world. Recently, Nvidia CEO Jensen Huang announced that Rubin will be the successor platform to the Blackwell chips. Yet the Blackwell platform, which this Silicon Valley company released on March 18 this year, has only just entered production. Why is Nvidia in such a hurry to officially announce a next-generation AI chip that is still under development, barely more than two months later? And what are the technological and industrial trends in AI chips? A reporter from Jiefang Daily and Shangguan News interviewed Dr. Xu Bulu, general manager of the Shanghai Silicon Intellectual Property Trading Center and chairman of the board of supervisors of the Shanghai Integrated Circuit Industry Association.
According to Xu, AI chips are built on GP-GPU general-purpose computing chips. GP stands for "general purpose", and GPU originally meant "graphics processing unit", so GP-GPU stands for "general-purpose graphics processing unit", often shortened to GPU. It is precisely by ranking first in the world in this chip category that Nvidia surpassed Intel to become the hottest chip company in the world.
GPUs are found in personal computers, mobile phones, and tablets, where they handle image-related operations. This type of chip has parallel computing capability: it can perform a large number of relatively simple operations simultaneously, which makes it very useful for large-scale AI computation. Foreseeing this application prospect, Jensen Huang led his team to upgrade the GPU into the GP-GPU, making the chip better suited to high-performance parallel computing and programmable in higher-level languages.
On March 18 this year, Jensen Huang gave a speech at the NVIDIA developer conference.
With the rise of the AI industry, the application value of high-performance GP-GPUs keeps growing. These chips are mainly used in two fields: AI model training and inference, and high-performance computing. In large-model training, the GPUs are usually deployed in the cloud, that is, installed in servers, where absolute computing power is the priority. GPUs used for inference are judged on comprehensive metrics: whether in the cloud or on the device side, factors such as computing power per unit of energy consumption, latency, and cost must all be fully weighed.
In high-performance computing, GPUs also play an important role, performing parallel computation on massive amounts of data in application scenarios such as data centers, scientific computing, and autonomous driving.
"Many people misunderstand AI chips, thinking that once you have a GPU you no longer need a CPU," Xu Bulu said. "In fact, GPUs and CPUs each have their strengths. Simply put, the former excels at parallel computing, while the latter handles complex logic control." Take the GB200 chip Nvidia released in March this year as an example: it combines two B200 Blackwell GPUs with a Grace CPU based on the Arm architecture, is built on TSMC's 4-nanometer process, contains 208 billion transistors, and delivers performance of up to 20 petaflops.
Integrating different types of die, such as GPU, CPU, and memory, into a single chip is known as chip-level "heterogeneity". As storage and computing become more tightly integrated, rationally distributing tasks and workloads among heterogeneous processors (GPUs, CPUs, and accelerator units), memory, and interconnects is a major trend in AI chip technology. Jensen Huang's recent announcement that the Rubin chip under development will pair a new-generation GPU with an Arm-based Vera CPU and HBM4 high-bandwidth memory reflects exactly this trend.
"Essentially, what Nvidia makes is not just chips but chip-scale servers, providing customers with intelligent-computing solutions such as 'computing systems on a chip'," Xu Bulu told the reporter.
In the server field, Nvidia supplies cloud giants such as Google, Microsoft, and Amazon while simultaneously challenging these traditional players; it also faces competition from chip companies such as AMD and Intel. In such a fierce market, it is easy to understand why Jensen Huang was eager to officially announce the next generation of AI chips within three months of releasing the Blackwell platform. A timely roadmap and timetable for AI hardware gives upstream and downstream enterprises in the industry and supply chains a relatively clear expectation, allowing them to plan their own products as early as possible around it.
Announcing the next generation of AI chips early also draws the attention of the capital market, helping Nvidia sustain its historic market value of US$3 trillion as a technology company and raise more funding for research and development. Xu Bulu pointed out that in the integrated circuit industry, with the support of global capital, the design of processors, memories, optical transmission modules, and other chips is becoming increasingly synchronized; chip design is forming efficient collaboration with cutting-edge wafer manufacturing and three-dimensional packaging; and the interoperability among compute chips, transmission protocols, and development environments keeps improving.
On March 18 this year, Jensen Huang unveiled the B200 artificial intelligence chip.
On the collaboration front, the optimized synergy between AI chips and a general-purpose parallel computing ecosystem deserves special mention. CUDA, developed by Nvidia, is such an ecosystem: it comprises the CUDA instruction set architecture and the parallel computing engines inside the GPU. Software engineers can write programs for the CUDA architecture in the C language, and these programs then run with high performance on processors that support CUDA.
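To make this concrete, here is a minimal, textbook-style sketch of what "writing a C program for the CUDA architecture" looks like: a hypothetical vector-addition kernel in which each GPU thread performs one simple operation, so that a million additions run in parallel. This is an illustrative example written for this article, not code from Nvidia; it assumes a machine with an Nvidia GPU and the `nvcc` compiler.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: runs on the GPU. Each thread adds one pair of elements —
// a large number of simple operations performed simultaneously.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's global index
    if (i < n)
        c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                  // one million elements
    size_t bytes = n * sizeof(float);

    // Unified (managed) memory is visible to both the CPU and the GPU.
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough threads to cover all n elements.
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vecAdd<<<blocks, threadsPerBlock>>>(a, b, c, n);
    cudaDeviceSynchronize();                // wait for the GPU to finish

    printf("c[0] = %.1f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Compiled with `nvcc`, the host code (the CPU side) handles control flow and memory management, while the `__global__` kernel expresses the data-parallel work — a small illustration of the CPU/GPU division of labor described above.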
"This is a programming platform, application framework, and infrastructure," Xu Bulu explained. "Through the coordination of software and hardware products, Nvidia has built a powerful GPU innovation ecosystem that attracts users worldwide to develop AI products and refine development tools, operator libraries, and more within it, so that the ecosystem improves the more it is used, creating user dependence."
Turning to China, leading AI chip companies such as Huawei are also building their own ecosystems. Although the domestic ecosystems currently have far fewer users than CUDA, the domestic integrated circuit industry has reached a consensus: only by accelerating the construction of an open, new-generation software-hardware integrated ecosystem for AI chips can China's integrated circuit and AI industries achieve independence and self-reliance, pursue sustainable development on a global scale, and better serve all industries in cultivating new productive forces.