Health practitioners are becoming increasingly uneasy about the medical community making widespread use of error-prone generative AI tools.
One glaring error proved so convincing that it took more than a year to catch. In their May 2024 research paper introducing a healthcare AI model, dubbed Med-Gemini, Google researchers showed off the AI analyzing radiology-lab brain scans for various conditions.
It identified an "old left basilar ganglia infarct," referring to a purported part of the brain — "basilar ganglia" — that simply doesn't exist in the human body. Board-certified neurologist Bryan Moore flagged the issue to The Verge, highlighting that Google fixed its blog post about the AI — but failed to revise the research paper itself.
The AI likely conflated the basal ganglia, an area of the brain that's associated with motor movements and habit formation, and the basilar artery, a major blood vessel at the base of the brainstem. Google blamed the incident on a simple misspelling of "basal ganglia."
It's an embarrassing reveal that underlines the persistent and impactful shortcomings of the tech. Even the latest "reasoning" AIs from the likes of Google and OpenAI spread falsehoods dreamed up by the large language models at their core, which are trained on vast swathes of the internet.
In Google's search results, this can lead to headaches for users during their research and fact-checking efforts.
But in a hospital setting, those kinds of slip-ups could have devastating consequences. And it's not just Med-Gemini. Google's more advanced healthcare model, dubbed MedGemma, also gave varying answers depending on how questions were phrased, sometimes resulting in errors.
"Their nature is that [they] tend to make up things, and it doesn’t say ‘I don’t know,’ which is a big, big problem for high-stakes domains like medicine," Judy Gichoya, Emory University associate professor of radiology and informatics, told The Verge."
What happened with Med-Gemini isn’t the exception, it’s the rule.
It is the perfect metaphor for the state of AI today: confidently wrong, beautifully phrased, and fundamentally hollow.
The LLM architecture cannot be evolved past this. No amount of scale, fine-tuning, or prompt engineering will turn a glorified guesser into a system that truly understands. These models are structurally incapable of grounding themselves in reality: they cannot model the world, cannot self-correct, and cannot admit when they don't know, because they don't 'understand' anything to begin with.
The world deserves better: systems that are grounded in the real world, that learn and understand the way we do, that don't just mimic the surface of thought but replicate the underlying process. Systems that know when they don't know.
This is performance theater passed off as progress. We listen to LLM hypesters spin foundational flaws into features, insisting that the next version will somehow cross the chasm, when deep down even they know it won't.
Build Cognitive AI → Unlock Real Intelligence.