Domestic large models lack high-quality corpora, and this alliance is taking action: its open-source dataset drew 180,000 downloads in two weeks

Release time: Apr 16, 2024 20:28

In November 2022, ChatGPT emerged, ushering in the era of large models. But training a large model is like raising a child: only high-quality education produces high-quality output. High-quality corpora are therefore a key link in the large-model industry chain. With this in mind, on July 6th this year, at the opening ceremony of the World Artificial Intelligence Conference, the establishment of the China Big Model Corpus Data Alliance, jointly initiated by Shanghai Artificial Intelligence Laboratory and other units, was announced. The alliance has been active since. Following the release of the first publicly available multimodal pre-training corpus, "Scholar · Ten Thousand Volumes", on August 14th, nine new member units joined and new datasets were released on September 8th.

The alliance's first batch of 10 initiators includes Shanghai Artificial Intelligence Laboratory, China Central Radio and Television Station, China Institute of Science and Technology Information, Shanghai Newspaper Group, and Shanghai Data Group, among others, bringing together the main suppliers of corpus data at the national level and in Shanghai. On August 14th, the alliance released the "Scholar · Ten Thousand Volumes" multimodal pre-training corpus, with a total data volume of over 2TB, including over 500 million texts, 22 million interleaved text-and-image documents, and 1,000 program videos. Wang Yanfeng, Assistant Director of Shanghai Artificial Intelligence Laboratory, said that the 2TB of data is the result of strict screening. The laboratory built the OpenDataLab technology platform for this purpose, which offers a large set of professional tools. Through classification, cleaning, identification, and other methods, the platform helps remove low-quality and contaminated data, shifting the corpus's emphasis from quantity to quality. Within two weeks of its release, "Scholar · Ten Thousand Volumes" reached 180,000 downloads, a record for a single publicly released dataset in China since the rise of large models.


The appeal of a high-quality corpus has also drawn more units to join in "feeding" large models. This time, 9 units, including Shanghai Timi Robotics Co., Ltd., Shanghai Urban Construction and Urban Operation Co., Ltd., China Patent Technology Development Corporation, Shanghai Arbitration Commission, and Shanghai Data Exchange, joined the alliance, and the second batch of open-source corpus data, "Honey Nest · Pollen 1.0", was launched. Several alliance member units have also prepared open-source plans for their corpus data, which will enter the release queue in turn.

Honey Nest · Pollen 1.0 comes from Shanghai Mido Information Technology Co., Ltd., one of the nine new members. The company's Chief Technology Officer, Liu Yidong, told reporters that many large models in China are trained on foreign-language data combined with a small amount of Chinese material, which leaves them with a weak understanding of Chinese and insufficient ability to generate content for Chinese scenarios. Honey Nest · Pollen 1.0 is based mainly on Internet media data. After fine-grained processing such as filtering and cleaning, multi-condition deduplication, and a data compliance pre-review by senior lawyers, the open-sourced Chinese data totals more than 70 million items. Mido's own series of large models has also been trained on the Pollen dataset. These models can be used in vertical fields such as government and media, providing services such as knowledge Q&A, content generation, automatic generation of analysis reports, and review and editing of documents.

Alliance members actively contribute corpus data, but they are not expected to work purely out of goodwill. Wang Zhijia, Director of the Artificial Intelligence Development Department of the Municipal Commission of Economy and Information Technology, said that the alliance has designed four operating models, L1 to L4. L1 is open source to the public, L2 is open source only within the alliance, and L3 and L4 involve listing or trading, on or off the exchange. "We have also been exploring incentive mechanisms based on contribution and sustainable operation, such as giving back to contributors through joint research and development with member units, research benefits, commercial licenses, and so on," Wang Yanfeng said.

