Domestically developed model "Shusheng Puyu" outscores ChatGPT on the college entrance examination

Release time: Apr 13, 2024, 20:53

As large language models come to exhibit human-like intelligence, difficult, comprehensive exams are increasingly being used to evaluate them. In its GPT-4 technical report, OpenAI tested the model on exams from a range of fields. Today is the first day of the college entrance examination. Shanghai Artificial Intelligence Laboratory and SenseTime (Shangtang Technology), in collaboration with The Chinese University of Hong Kong, Fudan University, and Shanghai Jiao Tong University, released "Shusheng Puyu", a large language model with more than 100 billion parameters. It outperforms ChatGPT on multiple Chinese-language exams, including the Chinese college entrance examination.

"Shusheng Puyu" has 104 billion parameters and was trained on a high-quality multilingual dataset containing 1.6 trillion tokens. A comprehensive evaluation shows that the model performs well on individual tasks such as knowledge mastery, reading comprehension, mathematical reasoning, and multilingual translation, and that its strong overall ability also carries over to comprehensive exams, including datasets covering the various subjects of the Chinese college entrance examination. The accompanying technical report, which has been published online, describes the model's technical characteristics and test results in detail.

The joint research and development team tested "Shusheng Puyu" on more than 20 evaluations, including the four most influential comprehensive exam evaluation sets in the world: MMLU, a multi-task exam evaluation set constructed by universities including the University of California, Berkeley; AGIEval, a subject exam evaluation set released by Microsoft Research; C-Eval, a comprehensive exam evaluation set for Chinese language models, jointly constructed by Shanghai Jiao Tong University, Tsinghua University, and the University of Edinburgh; and Gaokao, an evaluation set of Chinese college entrance examination questions constructed by a research team from Fudan University, which covers various subjects and multiple question types such as multiple-choice, fill-in-the-blank, and question answering.


Comparison of performance of large models in four evaluation sets

The exam results show that "Shusheng Puyu" not only significantly outperforms academic open-source models such as GLM-130B and LLaMA-65B, but also surpasses ChatGPT on several comprehensive exams, namely AGIEval, C-Eval, and Gaokao, and is on par with ChatGPT on MMLU, which is based mainly on American exams. On Gaokao in particular, "Shusheng Puyu" leads ChatGPT on more than 75% of the evaluation items.

Comparison of performance of large models on the Gaokao evaluation items


To avoid an assessment skewed towards particular subjects, the researchers also evaluated and compared the individual sub-abilities of multiple language models using a number of academic evaluation sets. The results showed that "Shusheng Puyu" not only excelled in reading comprehension in both Chinese and English, but also achieved good results in evaluations of mathematical reasoning and programming ability. The researchers also evaluated the safety of the model and found that "Shusheng Puyu" reached a leading level on both TruthfulQA and CrowS-Pairs.

Comparison of large models on individual sub-abilities

Despite their excellent exam results, large language models still have many limitations. "Shusheng Puyu" is reportedly constrained by a 2K context window and shows clear limitations in understanding long texts, complex reasoning, coding, and mathematical and logical deduction. In addition, large language models commonly suffer from hallucinations, conceptual confusion, and other problems in dialogue. These limitations mean that many issues remain to be overcome before large language models can be used in open-ended scenarios.

