Online medical diagnosis and treatment: is this round of AI up to the task?

Release time: Apr 14, 2024 06:59 AM

Have you ever searched online for "Why does this hurt? Am I ill?" The answers are often unsatisfying. With the rise of large language models (LLMs) such as ChatGPT, people have begun to use them to answer medical questions and look up medical knowledge.

But is it reliable?

As far as the questions themselves go, the answers given by artificial intelligence can be accurate. But James Davenport, a professor at the University of Bath in the UK, points out the difference between answering medical questions and actually practicing medicine. In his words, "practicing medicine is not just answering medical questions. If it were purely about answering medical questions, we would not need teaching hospitals, and doctors would not need years of training after their academic courses."

In response to these doubts, a paper recently published in the journal Nature by leading artificial intelligence researchers presents a benchmark for evaluating how well large language models can answer people's medical questions.

Existing models are not yet perfect

The latest assessment comes from Google Research and DeepMind. The researchers believe that artificial intelligence models have great potential in medicine, including knowledge retrieval and support for clinical decision-making. However, existing models still have shortcomings: they may fabricate convincing but false medical information, or absorb biases that worsen health inequality. It is therefore necessary to evaluate their clinical knowledge.

Such evaluations have been attempted before. However, past automated evaluations often relied on limited benchmarks, such as scores on individual medical exams, which say little about a model's reliability and value in the real world.

Moreover, when people turn to the Internet for medical information, they run into "information overload", tend to pick the worst of 10 possible diagnoses, and end up under a great deal of unnecessary stress.

The research team hopes that language models can provide brief, unbiased expert opinions, cite their sources, and express uncertainty appropriately.

How does an LLM with 540 billion parameters perform?

To evaluate how well LLMs encode clinical knowledge, Google Research's Shekoofeh Azizi and colleagues examined the models' ability to answer medical questions. The team proposed a benchmark called "MultiMedQA". It combines six existing question-answering datasets covering professional medicine, research, and consumer queries with "HealthSearchQA", a new dataset of 3,173 medical questions commonly searched online.

The team then evaluated PaLM and its variant Flan-PaLM. They found that Flan-PaLM achieved state-of-the-art results on several of the datasets. On MedQA, a dataset of questions in the style of the US medical licensing exam, Flan-PaLM surpassed the previous state-of-the-art model by 17%.

However, although Flan-PaLM performed well on multiple-choice questions, further evaluation revealed gaps in its ability to answer consumers' medical questions.

An LLM specializing in medicine shows promise

To address this issue, the researchers used a method called instruction prompt tuning to further adapt Flan-PaLM to the medical domain. The result is Med-PaLM, an LLM specialized for medicine.

Instruction prompt tuning is an efficient way to adapt a general-purpose LLM to a new professional field, and the resulting model, Med-PaLM, performed encouragingly in a pilot evaluation. For example, a panel of physicians judged only 61.9% of Flan-PaLM's long-form answers to be consistent with scientific consensus, compared with 92.6% of Med-PaLM's answers, a level comparable to answers written by physicians. Similarly, 29.7% of Flan-PaLM's answers were rated as potentially harmful, compared with only 5.8% of Med-PaLM's, again comparable to answers written by physicians.

The research team noted that although the results are promising, further evaluation is necessary, especially with regard to safety, fairness, and bias.

In other words, many limitations remain to be overcome before LLMs are viable for clinical use.
