27th Liberation Book List | The Era of Big Models: The Past, Present, and Technical Mysteries of Large Language Models
In the history of human development, new products and services born of technological progress not only meet existing needs but also create new ones. With the widespread application of digital technology, a fourth industrial revolution characterized by intelligence has quietly emerged around us, and ChatGPT is one of its representative achievements. Its fluent, logically coherent dialogue and interaction ability have drawn market attention since its debut. This achievement shows what deep learning can accomplish when models combine highly complex structures with vast numbers of parameters.
"The Era of Big Models: ChatGPT Starts the Wave of General Artificial Intelligence" is the first work in China to offer a panoramic account of the past, present, and technical mysteries of large language models.
Unlike traditional language models, large language models learn the statistical regularities of language by training on large-scale corpora. Training typically uses vast amounts of text for self-supervised learning, so the model automatically acquires multi-level linguistic rules covering grammar, syntax, and semantics. Artificial intelligence models, like models in general, are grounded in mathematics and statistics and can be used to describe a system or a dataset. A large language model is usually taken to have at least 100 million parameters, and that threshold keeps rising. GPT-3 has reached 175 billion parameters, and models with over one trillion parameters are believed to be running today. Beyond these, there are artificial neural network models even larger and more complex than current large language models, with trillions of parameters or more.
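To make "self-supervised learning from raw text" concrete, here is a toy sketch (not from the book, and vastly simpler than a real neural language model): a bigram counter that learns next-word statistics from an unlabeled corpus. The supervision signal is the text itself, since each next word serves as its own training target. The corpus and function names are invented for illustration.

```python
from collections import defaultdict

def train_bigram(corpus):
    """Count next-word frequencies for each word. No labels are needed:
    the text provides its own targets (self-supervision)."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Predict the continuation seen most often during training."""
    if word not in counts:
        return None
    return max(counts[word], key=counts[word].get)

# A tiny invented corpus of raw, unlabeled text.
corpus = [
    "large models learn language rules",
    "large models learn statistical rules",
    "small models learn less",
]
model = train_bigram(corpus)
print(predict_next(model, "models"))  # -> learn
```

A real large language model replaces the frequency table with billions of neural-network parameters and predicts over whole contexts rather than single words, but the training signal is the same: predict the text from the text.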
The more parameters a model has, the more complex and rich the information it can handle, and the more accurate its predictions can be. In theory, a model with enough parameters to match reality can reproduce what has already happened or simulate what will happen. Ultra-large models are therefore used for the most complex tasks, such as question answering and machine translation in natural language processing, and object detection and image generation in computer vision. These tasks require processing extremely complex input data and extracting deeper-level features from it to improve the model's accuracy.
Defined from the perspective of generative artificial intelligence, a large language model can generate code from text prompts, explain code, and in some cases even debug it. Generative models can produce not only text, images, audio, and video, combining them into multimodal output, but also new designs, knowledge, and ideas across a much wider range of fields, even achieving broad artistic and scientific re-creation.
It is worth noting the book's prediction that large language models will increasingly specialize in vertical segments, though a significant gap may remain compared with top human experts in each industry. Artificial intelligence, represented by large language models, can spawn new business models, empower industry digitization, and, through forms such as digital humans, personal assistants, and search engines, provide momentum for new formats and models in the digital economy; it will profoundly change fields such as technology and education. Training and tuning large language models, however, demands enormous computing power, algorithms, data, and engineering, along with large-scale investment and collaboration. The first three, as the three elements of artificial intelligence, play a huge role in the intelligent upgrading of industrial digitization.
The authors, Long Zhiyong and Huang Wen, have worked at Alibaba, Baidu, and Tencent and are senior practitioners in the field of artificial intelligence. In four parts (technology, change, application, and industry), they vividly explain the technical principles behind large models and their impact on knowledge work and social change, detail the three types of large-model applications, and make forward-looking predictions about how the related industries will develop.
![The Era of Big Models: The Past, Present, and Technical Mysteries of Large Language Models](https://a5qu.com/upload/images/1d3ab7609bda7ae86cc362b7a37483cb.jpg)
As a researcher of industry and corporate competitiveness, I came away from the book with much to think about, above all about computing power, the new productive force of the digital-economy era. Data is the means of production, algorithms represent new production relations, and computing power, as the new productive force, supports both: the level of computing power directly determines data-processing capability. How to combine different types of computing power is the key to reducing costs and ultimately winning market recognition.
With the introduction of policies such as the "East Data, West Computing" project and new infrastructure initiatives, China's overall computing-power layout will gradually extend from the eastern region to the central and western regions. Regions with the right conditions should be encouraged to keep exploring new computing-power frontiers suited to local circumstances. While building up computing power as important underlying support for economic development, efforts should also go toward cultivating, and accelerating the establishment of, a complete ecosystem for the development of artificial intelligence in China.
To develop China's large language models, it is necessary to establish a unified large-language-model platform and underlying foundation that connects with domestic computing-power companies and provides interfaces all enterprises can use. Only in this way can we advance the ecosystem of artificial intelligence, especially large language models, and promote its healthy, stable, and rapid development in China.
"The Era of Big Models: ChatGPT Starts the Wave of General Artificial Intelligence" by Long Zhiyong and Huang Wen, published by China Translation Press