Why ChatGPT Can’t Tell Us About the Future of Healthcare

HAL-9000, version 0.9

Like many, I’ve been reading about, and trying out, the latest AI tool.  ChatGPT, created by OpenAI, is the most advanced chatbot you’ve ever seen.  In its back-and-forth conversational interface you can ask questions about just about anything and get pretty impressive answers.

For example:

Me: What was the role of Benjamin Franklin in the American Revolution?

ChatGPT: Benjamin Franklin played a significant role in the American Revolution. He was a prominent figure in the colonies and was involved in many aspects of the revolutionary movement.

    • He was a member of the Continental Congress, where he represented Pennsylvania and helped draft the Declaration of Independence.

    • He served as a diplomat and helped negotiate the Treaty of Alliance with France, which was a crucial alliance for the Patriots during the Revolution.

    • He also served as a member of the committee that drafted the U.S. Constitution, and as a delegate to the Constitutional Convention of 1787.

Not bad, right?

 

All images courtesy of DALL-E, ChatGPT’s sister program

 

Responses aren’t limited to factual, Wikipedia-type answers: ChatGPT can also produce jokes, songs, or short stories.  The internet is currently full of examples of ChatGPT writing new Seinfeld episodes, country songs on improbable topics, and more.

Want a haiku about traffic on Interstate 95?

Me: Write a haiku about traffic on interstate 95

ChatGPT:

Cars rush by in waves

Interstate 95 hums

Endless motion stream

I find these creative aspects of ChatGPT amazing: songs, jokes, short stories, ad copy, movie scripts.  There’s a serious challenge brewing here for the artists, writers, musicians, and others who have been creating this stuff up to now.

It will take years for the implications to fully play out across all sorts of creative endeavors. Of course, for the creative stuff there is “no wrong answer”: as long as I ask for a haiku and the resulting answer has the required 5-7-5 syllable pattern (and it’s about traffic on 95), it’s “correct”.

Facts are stubborn things

For factual writing, like the Benjamin Franklin response above, things are a bit more problematic, and the problems arise from how ChatGPT comes up with its responses.

From a conversation between NY Times columnist Ezra Klein and Sam Altman, a co-founder and the CEO of OpenAI (note: GPT-3 is an earlier version of ChatGPT’s “engine”):

EZRA KLEIN: And so if I basically understand how GPT-3 works, it’s a system that has read a lot of stuff on the internet.

SAM ALTMAN: Yes.

EZRA KLEIN: And it’s predicting the next word in the sequence.

SAM ALTMAN: Slightly oversimplified but very close. Yes, it is trying to predict what comes next in a sequence.

So ChatGPT tries to make its responses closely match what has already been said in the large, but not all-encompassing, body of text it was trained on (about 600GB of data).  But it’s quite often wrong, and for it to be useful you need to have enough knowledge yourself to know if it’s wrong. For example, I don’t actually know if Ben Franklin helped negotiate the Treaty of Alliance with France. Sounds plausible, but I’m no expert.

I’d have to spend time Googling some more to be sure, which makes ChatGPT a lot less useful for finding facts. And if you’re not already an expert in a topic, it’s easy to be fooled.
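If you’re curious what “predicting what comes next” actually looks like, here’s a toy sketch in Python. It’s my own illustration, not OpenAI’s code, and it’s absurdly simpler than the real thing: it just counts which word most often follows each word in a tiny made-up corpus, then extends a prompt by repeatedly picking the likeliest next word.

```python
# A toy illustration of "predict the next word" (not OpenAI's actual method):
# for each word in a tiny corpus, count how often each other word follows it,
# then extend a prompt by repeatedly picking the most frequent follower.
from collections import Counter, defaultdict

corpus = (
    "benjamin franklin helped negotiate the treaty of alliance with france . "
    "benjamin franklin helped draft the declaration of independence . "
    "benjamin franklin was a delegate to the constitutional convention ."
).split()

# For each word, count which words follow it and how often.
followers = defaultdict(Counter)
for word, next_word in zip(corpus, corpus[1:]):
    followers[word][next_word] += 1

def continue_text(prompt_word, length=8):
    """Greedily extend a prompt, always choosing the most common next word."""
    words = [prompt_word]
    for _ in range(length):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(continue_text("franklin"))
# prints something like: franklin helped negotiate the treaty of alliance with france
```

Because that tiny corpus mentions Franklin and the Treaty of Alliance over and over, the statistics steer the output toward something true. ChatGPT replaces the word-count table with an enormous learned model, but the underlying task (guess the likeliest continuation) is the same.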

Me in the multiverse

Another example of the “stubborn facts” problem showed up when a friend told me to try asking ChatGPT for a bio of . . . myself.  Here’s an excerpt of what it came up with (I’ll point out the incorrect bits below).

Joel Selanikio is a public health technologist and entrepreneur. He is the co-founder and CEO of DataDerm, a medical software company that uses machine learning algorithms to improve the diagnosis and treatment of skin diseases, and Magpi, a mobile data collection platform that enables organizations to collect and analyze data in low-resource settings.

Selanikio is a recognized expert in the field of public health technology, and his work has been recognized with numerous awards and honors, including the Skoll Award for Social Entrepreneurship, the Tech Award for Health, the Lemelson-MIT Award for Sustainable Development, and the Wall Street Journal Technology Innovation Award for Healthcare. He is a member of the National Academy of Medicine and the National Academy of Engineering.

Wow.  It’s like reading a very plausible bio of another version of me, somewhere out in the multiverse.  I did found a company called DataDyne (not DataDerm) but it’s got nothing to do with skin diseases.  I was honored with the Lemelson-MIT Award and some others — but not the Skoll Award.  And the National Academies of Medicine and Engineering haven’t quite gotten around to calling me yet.


Of course, if you didn’t know me exceptionally well, you might just assume that all of that was true.

Why did this happen? Well, very likely there just isn’t as much material about me online as there is about Ben Franklin, so when ChatGPT patches together its statistical magic it fills in the gaps around the few well-documented facts with plausible nonsense, because it really has no idea at all what any of the words it’s using mean.

As Kenan Malik noted recently in the Guardian:

And yet the chatbot that can write grade A essays … can make up facts and reproduce many of the biases of the human world on which it is trained … It is, in the words of the cognitive scientist Gary Marcus, a “mimic that knows not whereof it speaks”.

So when there is (A) widespread agreement on something, and  (B) that something is often written about, the mimic is likely to be correct.  If those two conditions are not true . . . not so much.
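To see the failure mode, here’s the same toy sketch with a different made-up corpus (the other “founders” and their companies are entirely fictional, invented just for this illustration). The corpus contains exactly one true sentence about me, buried among several similar sentences about other people, and greedy next-word prediction happily stitches together a plausible but false bio.

```python
# The same word-counting trick, but now the corpus mentions "selanikio" only
# once, alongside several similar sentences about entirely fictional founders.
from collections import Counter, defaultdict

corpus = (
    "joel selanikio founded a company called datadyne . "
    "jane smith founded a company that diagnoses skin diseases with machine learning . "
    "john jones founded a company that diagnoses skin diseases with machine learning ."
).split()

followers = defaultdict(Counter)
for word, next_word in zip(corpus, corpus[1:]):
    followers[word][next_word] += 1

def continue_text(prompt_word, length=10):
    """Greedily extend a prompt, always choosing the most common next word."""
    words = [prompt_word]
    for _ in range(length):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(continue_text("selanikio"))
# prints something like: selanikio founded a company that diagnoses skin diseases with machine learning
```

The one true sentence about DataDyne gets outvoted by what is most commonly written in the rest of the corpus, and the model has no way of knowing, or caring, that the result is false. That, in miniature, seems to be what happened with my multiverse bio.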

What does this mean for healthcare?

Much of what happens in health is the application of widely discussed, very well agreed-upon wisdom, at scale, in clinics and hospitals around the world.  And sometimes this can be very cookie-cutter indeed.  In my pediatric clinic, for example, the treatment for a three-year-old with an ear infection is pretty uncontroversial, which suggests that ChatGPT will do a good job explaining it. Which it does:

Me: Tell me a good treatment plan for a three-year-old with an ear infection.

ChatGPT: A common treatment plan for a three-year-old with an ear infection is:

  1. Pain relief: Over-the-counter pain relievers such as ibuprofen or acetaminophen can help relieve pain and reduce fever.

  2. Antibiotics: If the ear infection is caused by bacteria, antibiotics may be prescribed by a doctor.

  3. Warm compresses: Applying a warm compress to the affected ear can help relieve pain and discomfort.

  4. Fluids: Encouraging the child to drink plenty of fluids can help clear the infection.

Bravo! If I had asked this question of a medical student and they gave this response, I’d give them full marks (this may be why ChatGPT was able to pass the medical licensing exams).

And the key here, again, is that the treatment plan for an ear infection is very widely written about and very well agreed upon, and therefore easily summarized by ChatGPT (or maybe even better by GlassAI, another new tool specifically aimed at medical diagnosis and treatment plans).

But even if we are limited to the things we know well in medicine, and about which we’ve written much, this is potentially transformative, because that describes so much of healthcare. For topics like this, do we really need to pay a lot of money to a human with 20+ years of education to tell us the same thing?

And combining this automatically generated treatment plan with an automated diagnosis, using the AI tools now available, very likely gives us one more reason not to go to the doctor (contributing to the already accelerating Rise of Selfcare).


I’m sorry, Dave, I’m afraid I can’t do that

Predicting the future of healthcare (or anything else), unfortunately, doesn’t really fit ChatGPT’s requirements: though predictions are common enough, they aren’t very well agreed upon. That means predicting the future, or coming up with novel concepts (as opposed to just summarizing generally agreed-upon concepts, or making novel combinations of words in a song or a poem), is beyond the current capabilities of ChatGPT or any existing AI system. ChatGPT will just summarize the blandest, most well agreed-upon, most widely written-about opinions available.

The VC Paul Graham hit the nail on the head about this on Twitter.

To illustrate, I asked ChatGPT what the future of healthcare was, and this is an abridged version of what it told me:

ChatGPT: There are several trends and advancements that are likely to shape the future of healthcare:

  1. Increased use of technology: The adoption of electronic health records, telemedicine, and wearable devices is expected to continue to grow and revolutionize the healthcare industry.

  2. Personalized medicine: Advances in genomics and other technologies are enabling the development of personalized treatments tailored to individual patients' needs.

  3. Emphasis on preventive care: There will be a shift from reactive to proactive care, with a focus on preventing illness and promoting wellness.

These trends and advancements will likely bring both challenges and opportunities to the healthcare industry in the years to come.


Really? We’ll be using more technology in the future? More telemedicine? Who knew? 🤦🏻‍♂️

The above response is essentially the outline of a great high school essay written by someone who knows nothing about the topic and is just summarizing what they read after a quick Google search. Or the outline of a million blog posts that we’ve all read on LinkedIn, and a million more healthcare system web pages.

For real insight and creative thinking about the future of healthcare — or the future of anything — we’ll need to rely on real humans with real experience of healthcare, government, law, technology, and lots of other fields; keen observers with that (still) uniquely human ability to think across domains and make the kind of connections in brain cells that can’t yet be made on computer chips.

At least for now.