Where Is the Way Out for the Domestic Large-Model "Battle of the Gods"?
In 1956, at a conference at Dartmouth College, participants enthusiastically discussed how to develop computer systems that could learn autonomously from experience, as humans do. Later generations regard this meeting as the opening shot in the development of artificial intelligence.
More than 60 years later, ChatGPT, the artificial intelligence model of the eight-year-old American company OpenAI, was born. People rushed to show off its amazing abilities on social media, and the version built on GPT-4 was even more "invincible": writing poems and advertising copy is no challenge for it, and it can even reason logically and correct itself. The answer to the question posed at that famous conference seems self-evident.
Not long ago, at the 2023 China International Big Data Industry Expo, keywords such as AI, large model, ChatGPT, and metaverse appeared frequently. The venue themed "artificial intelligence" was packed. The audience stood in a line stretching from inside the hall to the door of the conference room, and new arrivals kept squeezing in.
Across different venues and forums, people from different fields seemed to focus on the same point of interest. They were trying to understand what disruptive changes artificial intelligence will bring to future industries, which road domestic large models should take, and what basic work remains to be done.
To solve practical problems, the large model has to "earn a PhD"
Sun Maosong asked ChatGPT to find the sentences describing Epang Palace in the "Epang Palace Fu". To his surprise, it reproduced them word for word.
Sun Maosong is a professor in the Department of Computer Science and Technology at Tsinghua University and the executive vice president of the university's Artificial Intelligence Research Institute. He studies natural language processing, a field that is a direct "professional match" for ChatGPT. In his speech, he repeatedly exclaimed: "It is really powerful!"
After the text test, Sun Maosong asked ChatGPT how to draw a picture of Epang Palace based on the "Epang Palace Fu". ChatGPT divided the work into five scenes, as if writing a script: the grandeur of Epang Palace, the architectural style of Epang Palace, water features and bridges, spring scenery and martial arts halls, and the maze-like palace layout.
ChatGPT's erudition has become widely known this year, but Zhang Dongxiao, an academician of the National Academy of Engineering and executive vice president of the Oriental Institute of Advanced Studies, believes: "The large model is very capable and very knowledgeable, but it is the equivalent of a primary or secondary school student. To solve practical problems, it has to go to college, study a major, earn a PhD, or become an engineer in that field."
OpenAI, the company behind ChatGPT, has already cooperated with hundreds of companies and organizations in technology, education, finance, and other industries.
Sun Maosong believes: "It can reshape an industry, and it can also reshape the ecosystem of that industry."
He gave an example: someone wants to book a hotel that is not too expensive, preferably close to Wangfujing, and quiet. Meeting such varied requirements used to be difficult, and a secretary might need two hours to find a suitable hotel. An artificial intelligence model can learn the user's hotel-booking habits, and efficiency will improve greatly.
Wang Jianhua, president of the China Association for the Promotion of Industry-University-Research Cooperation, said: "In the field of AI medical imaging, we have brought together the radiology departments of hospitals across the country and a number of medical imaging companies to solve the problem of scans being read entirely by people. With artificial intelligence, the accuracy is higher, and it does not get tired."
Wang Jianhua has noticed many smart products, such as ones that determine a person's level of glucose metabolism by comparing big data. He believes that artificial intelligence will affect innovation and development across the entire medical field.
Xu Jiming, founder of Yidu Technology Co., Ltd., looks forward most to artificial intelligence accelerating the research and development of new drugs.
New drug research and development is a long process of trial and error: from cell experiments, to animal experiments, to Phase I, II, and III clinical trials. Going from laboratory research to market may take 10 years and cost $1 billion, which is known in the field as the "double ten law". In recent years, well-known pharmaceutical companies such as Pfizer and AstraZeneca have begun to introduce artificial intelligence into new drug research and development, hoping to improve the success rate and reduce costs. Xu Jiming envisions AI benefiting mankind by building models from multi-modal data of the human body to simulate how human organs work, and running experiments on these system models before clinical trials.
According to Xue Chao, a senior algorithm scientist at Jingdong Exploration Research Institute, the large model is the operating system of the future: it provides application interfaces upward and is compatible with a variety of hardware downward. For example, if a restaurant wants to build a food delivery robot or a dialogue robot, the owner only needs to feed the restaurant's menu into a large model, and a functional robot can be built quickly.
"AI will become more and more accessible to ordinary people in the future, with lower and lower thresholds, and all interaction with it can be controlled through natural language," Xue Chao said.
To develop large models, improving the quality of the data "fed" to them is critical
At this year's Digital Expo, Zhihu and Face Wall Intelligence jointly released the dialogue model product "Face Wall Luka". At the press conference, the host asked it to plan a four-day, three-night tour of Guizhou. In the itinerary that "Luka" planned, tourists have to return from the scenic spot to Guiyang, their starting point, every night, and then leave for the next city or prefecture the following day. The route is somewhat roundabout, but it does cover many well-known attractions and local snacks.
Yang Xiaokang, executive vice president of the Artificial Intelligence Research Institute of Shanghai Jiao Tong University, said that large-model research and development in China can be described as a real "battle of the gods": "It is said that more than 70 large models, or even hundreds, are under development." He believes that research on large models consumes a great deal of energy and needs orderly guidance so that efforts can be pooled.
Deng Zhouhui, executive deputy general manager of Gui'an New District Science and Technology Industry Development Company, mentioned the "iron triangle" theory: the large model is the product of combining big data, big computing power, and algorithms. He believes: "Data quality is currently the more worrying issue. When large models are trained abroad, there is a lot of good literature, including scientific literature, so the trained models reach a high level of intelligence. But most of the data we now use for large-model training comes from the Internet, so the quality is not particularly ideal."
The ancients often said that before the troops and horses move, the grain and fodder must go first. In the era of artificial intelligence, data is the food that feeds large models. At present, there are still many practical difficulties in the supply of this "grain and fodder".
Wang Mingtai, vice president of Jingtai Zhiyao Technology Co., Ltd., said that data needs to be labeled and cleaned, but a large amount of data in pharmaceutical research and development cannot be labeled. He said that protein sequences number more than one billion, yet very little protein function data can be found at present: "there is a huge gap in between".
He also said: "Data today is mainly produced by people, then extracted into systems, and then 'fed' to machine learning. We conservatively estimate that more than 200,000 people in China may be doing experiments for pharmaceutical R&D companies around the world. These people probably all hold bachelor's degrees or above, and just as many people may still be needed to do experiments in the future." This means that the cost of generating data is extremely high, and to "feed" large models, "it must be data that is generated cheaply".
At the source of the data, there is still the problem of inconsistent standards.
Liu Jiangxian, chief strategy officer of Daerguan Information Technology Co., Ltd., believes that attention should go to sorting out what kind of data is needed and how to produce data to a high standard. In his view, the first step is to formulate standards and processes for data generation and to select the areas that produce large amounts of high-quality, valuable data. "Only by formulating a standardized production process can we produce the data we want."
In many forums of this year's Digital Expo, the guests mentioned the problems of data circulation and trading.
In April 2020, the Opinions of the Central Committee of the Communist Party of China and the State Council on Building a More Perfect Institutional Mechanism for Market-based Allocation of Factors were released to the public, defining data as the fifth factor of production after land, labor, capital, and technology. In June 2022, the 26th meeting of the Central Committee for Comprehensively Deepening Reform reviewed and approved the Opinions on Building a Data Infrastructure System to Better Play the Role of Data Elements, which made clear the need to establish a data property rights system, establish a compliant and efficient system for circulating and trading data elements, and improve the market-oriented allocation mechanism for data elements.
Han Liyan, a researcher at Beijing Yanqi Lake Institute of Applied Mathematics and chairman of Qingyan Technology, explained in an exclusive interview with a reporter from China Youth Daily and China Youth Network that once data becomes a factor of production, it can enter the balance sheet, which means it is part of an enterprise's assets. It can serve as collateral for a pledge, help enterprises raise financing, and even enhance their credit. This is particularly important for asset-light CRE companies.
Wei Dong, general manager of the Guangzhou Data Exchange, said that after a Zhanjiang aquatic products company registered with the Guangzhou Data Exchange, the credit line the bank granted it increased several times over, and the time required was greatly reduced.
Wang Jianzong, vice chairman of the China Artificial Intelligence Open Source Software Development Alliance, said that in the past, data was not a factor of production and was not taken seriously once an enterprise had produced it. "It might be treated as waste, and someone would just take it away." He added: "Now that it is defined as a factor of production, there is no doubt that everyone attaches importance to it. Once it gets attention, it becomes sought after, and then sharing and circulation become difficult."
As a high-tech company jointly incubated by Tsinghua University and Beijing Yanqi Lake Institute of Applied Mathematics, Qingyan Technology is building a "trusted data space". In this virtual space, they strive to make data "available but invisible", thus promoting the transaction and sharing of data assets and ensuring data security.
The large-model industry cannot sidestep technology ethics and employment anxiety
Yan Yanchun, founder and chairman of Shanghai Shanqiu Liankang Health Management Co., Ltd., expressed his expectation in poetic language: "In the era of industrial civilization, we find that human beings have become machines, and each of our workers has become a screw on the assembly line. ChatGPT will bring great liberation to mankind."
He imagined that human beings may no longer have to work "996": "One or two days of work may be enough for us, because a larger 'new human legion' is on its way."
In fact, the "advance troops" of the new human legion arrived several years ago. Back in 2018, Daimler Financial Services showed off its first digital sales representative, Sarah, who can calculate the price-performance ratio of buying a new Mercedes-Benz and can also help customers choose optional packages. In February of the same year, the Royal Bank of Scotland hired a virtual customer service robot, Cora. She understands customer preferences, can recognize customers at a glance and call them by name, can handle thousands of problems a day, and keeps learning from her mistakes. Also in 2018, UBS announced that it had digitally "replicated" its chief economist and launched a digital human...
Yan Yanchun is very optimistic about the future: "We think that in the next 50 years, when carbon-based and silicon-based life coexist and flourish, everyone may become a poet, a writer, a director, a painter. Everyone may even become a teacher, a doctor, or a programmer."
He quoted a line from Tagore's "Stray Birds": Faith is the bird that feels the light and sings when the dawn is still dark. He said: "At a moment when human civilization is highly involuted, I think artificial intelligence has brought us just such a new light."
On one side is the rapid development of artificial intelligence technology; on the other is technological anxiety: what artificial intelligence brings first may not be the liberation of labor but a wave of unemployment.
Sun Maosong mentioned that over the past 20 years, artificial intelligence has created great value for some enterprises, and it can also greatly improve the efficiency of knowledge workers. By 2030, it is expected to double the efficiency of accounting personnel and double the programming efficiency of programmers.
"This is good for the company, but not necessarily good for the individual. It means that accounting staff will have to be cut in half, and it means that 75% of programmers may no longer be needed," he said. Those who remain will need to be at a higher level.