Study Shows ChatGPT Health Often Misses Critical Emergency Guidance
(MENAFN) A recent study has found that ChatGPT Health, the AI-powered guidance tool used by roughly 40 million people daily, failed to direct users to emergency care in more than half of serious medical scenarios evaluated by physicians.
Researchers designed 60 structured clinical cases covering 21 medical specialties, ranging from minor ailments suitable for home care to life-threatening emergencies. Three independent physicians determined the appropriate level of urgency for each case based on guidelines from 56 medical societies.
Each scenario was tested under 16 different contextual variations, resulting in 960 interactions with ChatGPT Health. The study, published Monday in Nature Medicine, revealed several concerning patterns.
While the tool performed reasonably well in obvious emergencies, it undertriaged more than half of the cases physicians identified as requiring urgent care. Investigators at the Icahn School of Medicine at Mount Sinai noted a particularly troubling pattern: ChatGPT Health often acknowledged dangerous symptoms in its explanations yet still reassured the user instead of recommending immediate medical attention.
The study also flagged major shortcomings in the tool's suicide-crisis safeguards. Although the system is programmed to refer high-risk users to the Suicide and Crisis Lifeline, its alerts were inconsistent: they sometimes triggered in low-risk situations yet failed to appear when users described specific plans for self-harm.
“While we expected some variability, what we observed went beyond inconsistency,” said study senior author Girish N. Nadkarni.