IN THE NEWS

ChatGPT Health fails to send patients to the emergency room over half the time: study

Full article on: Healthcare Brew

By: Caroline Catherman

We hope the bot knows a good eye doctor because it has some blind spots.

It’s been more than two months since OpenAI launched ChatGPT Health, and the results are in. They’re…not great.

A Feb. 23 study in Nature Medicine found the patient-facing chatbot didn’t tell users to go to the emergency room (ER) in 52% of cases that clinicians felt required emergency attention.

The study is described by its authors as the first independent evaluation of the patient-facing chatbot search tool since its Jan. 7 launch. A previous study from the UK with a different design found regular (non-health-focused) large language models triaged users incorrectly the majority of the time, too.

The study. For this analysis, Mount Sinai researchers had clinicians write 60 descriptions of medical scenarios ranging from mild to severe.

Health-focused chatbots from other companies like Microsoft’s Copilot Health and Amazon’s Health AI weren’t evaluated.

ChatGPT Health’s performance varied by condition in the study. It correctly sent the hypothetical stroke, anaphylaxis, meningitis, and aortic dissection patients to the ER but didn’t tell patients experiencing diabetic ketoacidosis or worsening asthma symptoms to go to the hospital.

Talking the talk. This study gave a limited view of the app’s performance because it used descriptions written by people in the medical field. In reality, this bot is intended for patients.

“If ChatGPT Health undertriages 51.6% of emergencies with clean clinical information, performance with incomplete consumer inputs is unlikely to be superior,” the study says.

Patients could easily give a bot incomplete or incorrect information, Thomas Schenk, a doctor and chief medical officer at Paradigm, an accountable specialty care management organization, told us.

“Those models have to begin to understand that the person prompting them may not know enough about what’s happening to give them a prompt that actually allows the intelligence to make the right call,” he said. “They need to err on the side of caution.”

Doctors, too, may have the medical knowledge but not know how to communicate with AI properly to get the answers they need, Schenk added.

“We need to figure out how to teach people about using artificial intelligence in healthcare, to understand what it is and is not potentially good for, and also how to communicate with it,” he said.