SenseTime's large model outperforms ChatGPT in benchmark evaluations, and users can apply for a trial

Release time: Apr 13, 2024, 22:36

SenseTime recently announced the results of its self-developed Chinese large language model "SenseChat 2.0" on three authoritative language model benchmarks: MMLU, AGIEval, and C-Eval. The evaluation shows that SenseChat 2.0 outperformed ChatGPT on all three test sets, marking a breakthrough for language model research in China.

In April this year, SenseTime released the "SenseNova" large model system and the Chinese large language model "SenseChat". SenseChat is already in use across many industries and scenarios. In copywriting-heavy work, for example, it can help process articles, reports, letters, product information, IT information, and other documents by editing, rewriting, summarizing, classifying, extracting information, and producing Q&A, which effectively improves employee productivity. In customer service scenarios, it can take on many different roles, such as a bank customer service agent or a picture-book teacher telling stories to children, and it supports smooth communication and interaction to improve the customer experience.

It is reported that nearly a thousand corporate clients have applied for access to SenseChat and tried its capabilities, including long-text comprehension, logical reasoning, multi-turn dialogue, sentiment analysis, content creation, and code generation. Users who want to apply for a trial of SenseChat 2.0 can log in to the website: https://lm_experience.sensetime.com/document/authentication .

The scores of major language models on the three evaluation benchmarks of MMLU, AGIEval, and C-Eval


MMLU (Massive Multitask Language Understanding) is a large-scale multitask language understanding benchmark jointly developed by the University of California, Berkeley, Columbia University, the University of Chicago, and the University of Illinois at Urbana-Champaign. It covers 57 subjects in fields such as science, technology, engineering, the humanities, and the social sciences, with difficulty ranging from beginner to advanced professional level, and it tests both knowledge and problem-solving ability.

In this evaluation, SenseChat achieved an overall score of 68.6, far exceeding GLM-130B and also surpassing ChatGPT and LLaMA-65B. It ranked second, behind only GPT-4.

The bold font in the figure indicates the best result, while the underline indicates the second best result.

AGIEval was released by Microsoft Research and is specifically designed to evaluate the general abilities of foundation models on tasks related to human cognition and problem-solving, so that model intelligence can be compared with human intelligence. The benchmark draws on 20 exams designed for human candidates, including university entrance exams, law school admission tests, mathematics competitions, lawyer qualification exams, and national civil service exams.

In this evaluation, SenseChat scored 49.91, far ahead of GLM-130B and LLaMA-65B and also surpassing ChatGPT, second only to GPT-4. On an AGIEval evaluation subset, SenseChat ranked second with a score of 58.5, only slightly behind GPT-4.

The bold font in the figure indicates the best result, while the underline indicates the second best result.

C-Eval is a comprehensive exam-based evaluation suite for Chinese language models, jointly constructed by Shanghai Jiao Tong University, Tsinghua University, and the University of Edinburgh. It includes 13,948 multiple-choice questions covering 52 subjects and four difficulty levels.

In this evaluation, SenseChat scored 66.1 points, second only to GPT-4 among the 18 major models evaluated, and ahead of ChatGPT, Claude, Bloom, GLM-130B, and LLaMA-65B. C-Eval also includes a subset of 8 challenging subjects in mathematics, physics, and chemistry that assesses the reasoning ability of large language models, and SenseChat ranked among the top performers on this subset as well.

Latest C-Eval Rankings

During the training phase, the SenseTime R&D team adopted a series of self-developed methods for enhancing complex reasoning, along with a more effective feedback learning mechanism. Together these improve the model's reasoning ability while reducing the hallucination problem of traditional large models.

It is reported that SenseChat also has broad knowledge and can be combined with an enterprise's proprietary industry data to build a high-quality knowledge base that meets the enterprise's needs. It also serves as an AI code assistant that can realize a new "80/20 rule", in which 80% of code is generated by AI and 20% is written manually.
