Can ChatGPT pass a medical test? | Turtles AI
DukeRem, 15 February 2023
A recent study published in PLOS Digital Health investigated the performance of ChatGPT on the United States Medical Licensing Exam (USMLE). The research was conducted by a team of scientists at the University of California, San Francisco, and the University of California, Berkeley, who wanted to explore the potential of AI in the medical field.
The USMLE is a three-part test that assesses the knowledge and skills of medical students and physicians. It is used to determine whether a candidate is qualified to practice medicine in the United States. The minimum passing accuracy is about 60%, and the pass rate among human candidates is generally above 90%.
The researchers used ChatGPT, the chatbot developed by OpenAI, to generate answers to USMLE questions.
According to the study, ChatGPT achieved an accuracy close to the passing accuracy in most settings, and within the passing range for some tasks. However, the authors noted that this does not suggest that ChatGPT has comparable knowledge to a human, as the test is designed to predict performance for a pre-selected population of MDs who have completed a residency.
The study authors also took care to ensure that the test questions were not part of the training set and recommended further exploration of the potential for mixed-blind assessments of ChatGPT and human answers.
The researchers emphasized that while ChatGPT's performance is impressive, written tests have limits when it comes to assessing performance in complex, multi-disciplinary professions such as medicine. They called for a more comprehensive approach to technology solutions, one that includes in-person clinical care for patients.
Despite these limitations, the results of the study highlight the potential of AI in the medical field. One of the most promising applications of ChatGPT is the development of tools to help researchers process large amounts of literature. Such tools can summarize information and answer questions, providing a useful resource for medical students and practitioners.
According to Dr. Stuart Armstrong, Co-Founder and Chief Researcher at Aligned AI, we should expect to see more successes like this in the future. He noted that there are still many areas where humans are much more effective than AIs, but cautioned that this human superiority won't last forever.
Prof. Alfonso Valencia, ICREA professor and director of Life Sciences at the Barcelona National Supercomputing Centre (BSC), provided additional insights into ChatGPT's performance. He noted that the system has improved significantly in just a few months, partly due to increased biomedical data. He also observed that the quality of the results was correlated with the quality of the explanations and the system's ability to produce non-trivial explanations.
Overall, the study provides a glimpse into the potential of AI in the medical field. While there are still many challenges to be overcome, including the limitations of written tests, ChatGPT's performance on the USMLE is a promising development. It highlights the need for a comprehensive approach to technology solutions that includes in-person clinical care, and the potential for AI tools to help medical researchers process large amounts of literature.