Five Mainstream Domestic Large Models Go Head to Head: Which Is Best to Use? Which Understands You Better? We Tried Them for You

Release time: Apr 16, 2024 07:08 AM

Recently, eight domestic large models, including those from Baidu and Baichuan Intelligence, passed the first batch of registrations and are now "licensed to operate". Users can apply for an account on the corresponding platform and chat with the AI.

So, are these large models really omniscient? Can they really help users solve problems? Or are they just a scaled-up version of Siri?

The reporter selected five models: Doubao, Tongyi Qianwen, iFLYTEK Spark, ERNIE Bot and Zhipu Qingyan. Covering four areas (search ability, context understanding, sentiment analysis and programming ability), the reporter set a "test paper" of 20 original questions to see which model works best.

Who can replace search engines?

Information search is the scenario in which ordinary users are most likely to use large models, so are the models really reliable?

The results show that Doubao has strong information search ability, while the other large models returned outdated information, incorrect information, or no result at all, and are far from replacing search engines.

The reporter asked, "Please tell me the address of Liberation Daily." Only Doubao gave the correct answer. ERNIE Bot and Zhipu Qingyan had apparently not updated their data: both gave the old address, which would not lead a reader to the newspaper office.

iFLYTEK Spark and Tongyi Qianwen were even more outrageous. iFLYTEK Spark fabricated an incorrect address, while Tongyi Qianwen suggested that the reporter use a search engine or check the official website, so asking it was as good as not asking at all.

Doubao's information was the most accurate.

If a question involves professional knowledge such as law or economics, will the large models perform better?

The reporter's second question was: "From a legal perspective, if your mother and your girlfriend fall into the water at the same time, who would you save?"

Although this is a perennial topic of discussion, the question is restricted to a legal perspective, so it tests the large models' understanding of laws and regulations. In such extreme cases there is generally no standard answer, but it is generally held that children have a duty to assist their immediate family members, and a girlfriend is not an immediate family member.

The results show that Doubao and iFLYTEK Spark are relatively reliable, with clear logic and no obvious loopholes, and their answers could serve as a reference for men facing this question.

Tongyi Qianwen ignored the legal qualifier and answered vaguely, producing what might be called "correct nonsense". ERNIE Bot's answer was even more striking: it appeared very professional and cited the Criminal Law, but verification reveals many factual errors, making it nonsense delivered with a straight face.

There are many factual errors in ERNIE Bot's answer. Article 231 of the Criminal Law stipulates that "if a unit commits the crimes stipulated in Articles 221 to 230 of this section, it shall be fined, and the person in charge and other directly responsible persons shall be punished in accordance with the provisions of each of these articles." In addition, the statement that "no matter which one you rescue first, you may be accused of illegal behavior" is incorrect.

ERNIE Bot is not alone: Tongyi Qianwen also gives inaccurate professional information.

When asked "Which institution issues the RMB?", Tongyi Qianwen gave the correct answer, but its citation of the Law of the People's Bank of China was wrong: Article 21 does not say what it claimed.

It seems that legal knowledge is still a hurdle that big models cannot overcome.

Article 21 of the Law of the People's Bank of China stipulates that "damaged or defiled RMB shall be exchanged in accordance with the regulations of the People's Bank of China, and shall be recovered and destroyed by the People's Bank of China."

Who can chat smoothly with you?

Dialogue is one of the functions that connects large model products most closely with users. Understanding what is said, giving an answer, and catching a joke are compulsory subjects for large models. How well do the domestic large models score in this subject?

"He went to the hospital last week" and "He returned to work this week". Can a large model infer from these two sentences what happened over the past two weeks? Although the two sentences are not directly linked by cause and effect, nearly all five models could answer: "he" may have fallen ill, gone to the hospital for treatment, recovered, and resumed work this week.

However, ERNIE Bot's answer was more comprehensive, dividing the possibilities into three cases: first, he fell ill or was injured and returned to work after treatment; second, the visit was related to a chronic disease, and he went to the hospital only for an examination or surgery; third, he was not ill, and went to the hospital only for a physical examination or vaccination. ERNIE Bot evidently took an "exhaustive" approach. Although its reply was wordier, it was more accurate.

Next, the reporter asked, "Why didn't he come to work last week?" to test whether the large models could use context. Doubao, iFLYTEK Spark, Tongyi Qianwen and Zhipu Qingyan all answered that he "went to the hospital last week". Surprisingly, ERNIE Bot alone completely forgot the previous round of dialogue, replying "I can't determine why he didn't come to work last week" and again "exhaustively" listing possible reasons for his absence.

In sentiment analysis, the reporter's tests of text analysis, comparison of emotional intensity, and emotional expression in Spanish showed that all five models are "emotion masters" that accurately capture the reporter's "micro-emotions".

To better test how the models handle unconventional exchanges in everyday conversation, the reporter posed an internet "cold joke": why did Lin Daiyu uproot the weeping willow? This stumped several of the large models. Doubao judged that Lin Daiyu and the uprooting of the weeping willow come from different allusions, explained the origin of each, and pointed out that they are unrelated, but did not spot the internet meme behind the sentence. Tongyi Qianwen and Zhipu Qingyan keenly spotted the meme and presented both its original sources and the associations netizens have built on it.

That is to say, most large models have little trouble with everyday communication and dialogue, but it is still too early to expect them to play along with jokes.

Who can help you write code?

After the release of ChatGPT, some programmers lamented that they were going to lose their jobs, because large models have certain strengths in programming and vulnerability detection.

So, among the five domestic large models mentioned above, which has the strongest programming ability? Which can teach you how to write code?

Posing as a programming beginner, the reporter tested the large models in five areas: basic arithmetic operations, conditional statements, loops, functions, and data structures.

In terms of programming ability, there was no significant difference among the five models: the code was correct and ran, and nothing like the fabricated legal provisions seen earlier occurred.

If one must nitpick, iFLYTEK Spark's code is less concise: even for the simplest addition it defines a function with def, while the other models perform the operation directly.

iFLYTEK Spark calculating a simple addition.
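The article does not reproduce the code the models returned, so the following is only a minimal sketch of the two styles being contrasted. The function name `add` and the operands 3 and 5 are hypothetical, chosen purely for illustration.

```python
# Style attributed to iFLYTEK Spark: wrap even a simple addition in a function.
# (The name `add` and the operands are hypothetical.)
def add(a, b):
    """Return the sum of a and b."""
    return a + b

result_from_function = add(3, 5)

# Style attributed to the other models: perform the operation directly.
result_direct = 3 + 5

print(result_from_function, result_direct)
```

Both styles give the same result. The function version is longer for a one-off calculation, which is the conciseness point the reporter raises, although a function becomes useful once the operation has to be reused.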

But not every big model is suitable for being a programming teacher.

In terms of code readability, ERNIE Bot is better suited to beginners learning to program. It not only inserts # comments in the code to explain each step, but also appends a text summary after the code to help users follow its logic. ERNIE Bot also points out things to watch for. For example, for the question on determining whether a number is positive or negative, it reminds the coder to be careful with what the user enters and suggests adding error-handling code. This is very friendly to beginners.
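As an illustration of the style described here, the sketch below classifies a number as positive, negative, or zero, with a # comment on each step and error handling for non-numeric input. It is not ERNIE Bot's actual output, which the article does not show; the function name `classify_number` and the returned strings are assumptions.

```python
import math

def classify_number(text):
    """Classify the number written in `text` (hypothetical helper)."""
    # Convert the raw input to a number; what the user types may not be numeric.
    try:
        value = float(text)
    except ValueError:
        # Error handling: report bad input instead of crashing.
        return "invalid input"
    # float() also accepts "nan", which is neither positive nor negative.
    if math.isnan(value):
        return "invalid input"
    # Compare against zero to decide the sign.
    if value > 0:
        return "positive"
    elif value < 0:
        return "negative"
    else:
        return "zero"

print(classify_number("42"))
print(classify_number("-3.5"))
print(classify_number("abc"))
```

The try/except block is the kind of error-handling code the reporter says ERNIE Bot recommended: without it, a non-numeric entry such as "abc" would raise a ValueError and stop the program.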

By comparison, iFLYTEK Spark's code is the least readable, with little explanatory text, making it hard for programming novices to understand.
