Large language models (LLMs) like ChatGPT are becoming ubiquitous in medicine.
Chances are, almost every clinician has already encountered a patient using them to self-diagnose. While AI can sometimes offer patients comfort or be helpful in the right circumstances, its reliability—especially when used for medical advice by untrained populations—remains problematic.
In a recent controlled trial of 1,298 participants, LLMs accurately diagnosed only 34.5 percent of cases and triaged only 44.2 percent appropriately when used by laypeople. Such misdiagnoses, as well as histories confounded by AI use, are quickly becoming a reality we must face as clinicians.
In everyday practice, that often means sitting across from a patient who’s already consulted AI, trying to untangle what’s accurate, what’s misleading, and what truly matters for care.
An ophthalmologist recently told me about a patient who complained of "a curtain cutting across their visual field." It's a classic presentation, but the phrasing sounded oddly packaged coming from a real patient, akin to someone reporting "melena" instead of dark or tarry stools.
"The nice thing about ophthalmology," he went on to explain, "is the exam doesn't lie."
In this case, the patient was ultimately diagnosed with a benign condition. Only after the reassurance of a normal exam did the patient confess to overstating what had actually been transient, less insidious-sounding visual disturbances. They then explained that they had used ChatGPT to self-diagnose and were worried the doctor would miss a retinal detachment (which ChatGPT suggested they had) if the "right" verbiage wasn't used.
Unfortunately for most of us, our exams are not as telling as ophthalmoscopy. We now face a new challenge: parsing not just our patients' medical history, but the AI-shaped narrative layered on top of it.
And for every sensationalized headline about an LLM correctly diagnosing a rare condition missed by doctors, many more misfires exist. To appreciate this dynamic, try the following: ask an LLM what could happen if you drive your car after the engine overheating light comes on, and mention that you chose to forgo a few non-urgent-sounding but recommended repairs at your last oil change.
While the most common culprit may be low coolant, the LLM will offer an exhaustive list of possible problems that could leave you convinced it's time to trade the car in. Now imagine a patient doing the same thing with vague abdominal pain and the fact that they're two months overdue for annual labs.
AI hallucinations, out-of-context complaints, and medical jargon can easily lead to late-night spirals, especially for those with average to poor health literacy. A patient consulting an LLM about a slew of benign lab abnormalities and fatigue may walk away convinced they have leukemia.
This scenario creates real problems. We cannot solve them by adding yet another counseling script to already busy screening protocols, nor should we expect the trend to go away. Instead, we need a practical approach.
In some ways, AI is a reiteration of what's always been true: patients come to us with stories shaped by outside influences, whether family, the internet, or the media. The difference is the confident language, accessibility, and interactive nature of AI tools, which can be far more convincing than a Google search that lands on a webpage about cancer.
It is human nature to fear the unknown, especially if potentially linked to our own mortality, and AI offers an instant answer (right or wrong), with confidence. Our challenge, then, is not to dismiss or roll our eyes at this, but to approach it with some nuance.
The goal is to bridge the gap between patients' research, with LLMs or otherwise, and the true progression of their symptoms. Outcomes are best for everyone in the exam room when we can elicit a patient's true concerns and the most accurate version of their history, including whether AI was involved, for better or worse.