A Turing test for diagnosis: BMJ evaluates online symptom checkers; good Globe article

A new article in the BMJ this week reports on a good, clever evaluation of 23 online symptom checkers, showing that some have a clue and some don’t. I love it; in my view the bottom line is “Some are better than nothing, none is near perfect, and some are junk. Learn, but as with all things online, be careful.”

The method used (described below) is clever: it’s not unlike the famous Turing test (Wikipedia), which examines whether you can tell the difference between a computer and a real person. (Answer: real doctors, while far from perfect, are a lot better than today’s websites.)

Today’s Boston Globe has a good front page article by Felice Freyer about it. SPM co-founder Susannah Fox, currently Chief Technology Officer at HHS, is quoted, as am I – it’s great to see participatory perspectives sought out by the Globe!

Method: evaluate websites the way we evaluate docs

I love the method used by authors Semigran, Linder, Gidengil, and Mehrotra at Harvard: they fed each website a “standardized patient” scenario. “SPs” are how medical students are tested: a trained actor answers the student’s interview questions; the student’s job is to hone down the options to reach a correct diagnosis, and in this test that was the website’s job, too. (That’s the Turing-like aspect: respond to the website like you’d respond to a human.) The researchers answered whatever questions each site asked, and recorded its suggested diagnoses.

Sensibly, the authors didn’t just report on whether the website produced the right diagnosis at the top of its list; they also reported whether the right answer was in its top three. That’s realistic: anyone who searches online for travel options, restaurants, etc. knows that people look at the top few results. Plus, if you’ve ever googled your symptoms, I’m sure you at least scrolled a bit.

They also reported how often the right answer was in the top 20, but, as they noted, hardly anyone ever looks there. (To me the usefulness of this number isn’t who got it right, but who still didn’t have the right dx. Think about it: any site that doesn’t have the right answer in the top 20 is just… well, worse than nothing, eh?)
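For the software-minded, here is a minimal sketch of how a "right answer listed first / in the top three / in the top twenty" score could be tallied. It is my own illustration, not the authors' code: the function name `top_k_hit_rate`, the four cases, and the diagnoses in them are all made up for the example.

```python
from typing import Sequence, Tuple

# One case = (correct diagnosis, the site's ranked list of suggestions).
Case = Tuple[str, Sequence[str]]


def top_k_hit_rate(cases: Sequence[Case], k: int) -> float:
    """Fraction of cases whose correct diagnosis appears in the first k suggestions."""
    if not cases:
        raise ValueError("need at least one case")
    hits = sum(1 for correct, ranked in cases if correct in ranked[:k])
    return hits / len(cases)


# Hypothetical results for one imaginary site (not data from the BMJ study).
example_cases = [
    ("appendicitis", ["appendicitis", "gastroenteritis", "constipation"]),
    ("migraine", ["tension headache", "sinusitis", "migraine"]),
    ("strep throat", ["common cold", "allergies", "laryngitis", "strep throat"]),
    ("asthma", ["bronchitis", "pneumonia", "acid reflux"]),
]

for k in (1, 3, 20):
    print(f"top-{k}: {top_k_hit_rate(example_cases, k):.0%}")
```

With these invented cases the imaginary site scores 25% for listing the right answer first, 50% for the top three, and 75% for the top twenty. The last case never shows the right diagnosis at all, which is the kind of miss the top-20 number exposes.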

Separately, they reported how well the sites did at triage – recommending whether you should see a doctor. Not surprisingly, the sites were generally loath to say “You don’t need a doctor” – who would want to be sued for giving that advice, if it turned out to be a real problem? In our Society we know that patients are responsible for their own decisions: seek information online, be responsible and cautious, and in particular be responsible for your decisions about when to seek help.

How did the sites do?

Happily, the BMJ article is open access so you can go comb through the results tables yourself. Here’s my highlight list. In short, there was a huge range:

  • Correct diagnosis was listed #1: sites’ performance ranged from 50% of cases (DocResponse) to an abysmal 5% (MEDoctor). 50% is impressive for a robot that can’t see you, but still wrong half the time, eh?
    • Runners-up: FamilyDoctor 47%, AskMD 43%
  • Correct diagnosis was in the top three: sites’ performance ranged from 71% of cases (Symcat) down to 29% (Better Medicine).
    • Runners-up: Isabel 69%, AskMD 68%, DocResponse 67%, iTriage 64%
  • Correct diagnosis was not in the top twenty: these sites get the booby prize for not even having the right answer in the top twenty half the time:
    • DoctorDiagnose only mentioned the right answer 46% of the time, Symptify 44%, MEDoctor 43%, BetterMedicine 38%, Symptomate 34%, and EarlyDoc 33%.
    • That’s just epically bad. I mean, what are they thinking inside that robot, if 2/3 of the time the right answer isn’t even in their top 20 guesses??? Was the robot stoned during class?:-)
  • Triage advice (“should I see a doctor?”) I’ll only comment on the ratings of “Is self-care reasonable with these symptoms?”
    • Correct: Drugs.com 62%, Harvard Medical School Family Health Guide 62%, Family Doctor 60%, Steps2Care 57%
      • Note: even the best of them is very often wrong.
    • Booby prize: These sites never said it’s safe to stay home (what??): iTriage, DoctorDiagnose, Isabel, Symcat and Symptomate.

The other half of the Turing test: how well do doctors do?

Diagnosis is difficult. We wrote about it here in 2011 (Action in the face of uncertainty), and the New York Times reported in 2006, “Studies of autopsies have shown that doctors seriously misdiagnose fatal illnesses about 20 percent of the time. So millions of patients are being treated for the wrong disease.”

Note: that doesn’t say “Doctors are wrong 20% of the time”! Autopsies are done on dead people, not people who survive. It just says that it’s not at all rare for docs to miss a diagnosis – because diagnosis is hard.

e-Patient takeaways

Our 2011 post ended with the question, “What do you do in the face of uncertainty? How many sources do you check before you decide to move forward?” So it is with symptom checkers.

In the Society for Participatory Medicine we’ve long known (since the pioneering work in the 1980s and 90s of “Doc Tom” Ferguson) that patients and families contribute more to healthcare when they’re empowered, engaged, informed; more information is a good thing. But we also know that although everyone would like certainty, we are all often operating in an atmosphere of our current best guess based on available facts.

e-Patient takeaways:

  • Diagnosis is hard. Don’t expect perfection anywhere. Keep an eye on whether the treatment seems to be working.
  • There’s a huge range of accuracy in symptom checker sites. (SPM past president Dr. Josh Seidman notes that the best site is ten times better than the worst … but even the best will be wrong a lot.) (See comment below for more on his work.)
  • So, get a second opinion – from another robot (another website).
    • If two agree, that’s a good sign – but they still might be wrong. If three agree, that’s even better – but still, don’t assume you’ve got it.

I’ll add this personal opinion (not an official SPM policy statement):

Do not go in to your clinician and say “I figured out I have condition x.” Go in and say “Before calling you I tried to learn what I could, and here’s what these sites said. What do you think?”

(I also recommend not bringing the famous eye-rolling “big stack of printouts” – a summary is enough. And if you’re really lucky, your clinicians will discuss it in email first!)

And for visionaries, policy people, and software developers, the best advice is not in the BMJ article but in Susannah Fox’s comment to the Globe:

“The reality is, people are using these tools and consulting Dr. Google whether we like it or not,” Fox said. “What this study shows is that we have an opportunity to open up a new avenue to reach people with quality information.”

Caveat: when business interests meet patient needs

A closing note: as a businessman involved in patient empowerment, two things stuck out to me.

  • The famous and much-viewed site WebMD has a symptom checker, and you’ll notice it’s not listed above – because it was perfectly mediocre, placing in the middle of the pack: #10 of 19 in the “Top 3” test. In the Globe article WebMD commented defensively: “Dr. Michael Smith, chief medical editor of WebMD, defended his website’s symptom checker, which gets about 4 million visits a month, and described it as accurate and helpful.” Really? If “We got the right answer in the top three 51% of the time” is your idea of accurate, do I trust your thinking??
  • And I have to ask, what’s going on with iTriage, owned by insurance company Aetna, never endorsing self-care, always saying “we recommend you spend money”??

So it goes when business meets healthcare, I guess: buyer beware; patient be wise.

SPM member Dr. Michael Mascia summed it up on our members-only listserv:

People and Patients still, yet and again need reliable physician partners … or, in the future whomever the designated healer of the moment may be … partner and partners will be needed.

Technology can help us deliver better and best health and health care and each of us (responsible adults) must carry the bulk of the burden to achieve best care and treatment.

But, when it comes to sick care … there comes a time when each of us needs help from other folks.  That’s when we need reliable partners.







8 Responses to “A Turing test for diagnosis: BMJ evaluates online symptom checkers; good Globe article”

  1. On the members-only listserv, SPM Past President Josh Seidman added this:

    This is totally in keeping with what I found in my own dissertation research at Johns Hopkins 13 YEARS AGO: If you want to know whether online info is good, you actually have to measure the accuracy and comprehensiveness of it.

    In case you’re interested, I published two peer-reviewed journal articles, both in JMIR from my dissertation:

    1. Methods paper–how I developed the objective, systematic tool for measuring Internet health info quality (for diabetes): http://www.jmir.org/2003/4/e29/

    2. More about the findings–including the range in quality, from 15% to 95%: http://www.jmir.org/2003/4/e30/

  2. This is a great post about evaluating the validity of today’s options: online symptom checkers sites and, yes, also of physicians.

    As the post alludes, the problem is complex and definitely not black or white. Even when I see a human doctor, I like to get more information and possibilities online before and after the visit. The availability of possibilities (even if some are wrong) enables me to ask better questions and brainstorm the situation with and without the human health provider. As Voltaire said: “Judge a man by his questions rather than his answers”…

  3. I was fascinated by the “rapid responses” (aka comments) on the article on the BMJ site, particularly including the response from the founder of Isabel.com, one of the websites tested. Look what he said – it’s participatory! emphasis added:

    … this study, like most discussions on this topic is framed in the wrong way. These tools are designed primarily to help the patient become better informed and be able to ask their doctor the right questions. They are not intended to encourage the patient to diagnose themselves and avoid a discussion with a clinician.

    This is about the patient and doctor working as partners to get to the right diagnosis and receive appropriate care and treatment as soon as possible. …

    Huzzah! I took for granted that all the evaluated sites market themselves as symptom checkers, but perhaps there’s more to it than that! If we’re instead talking (on some sites at least) about patients “pre-loading” themselves with a somewhat informed clue, then it shifts the time and location of where and when a patient GETS a clue … and then the face to face meeting (or telehealth e-visit) can be spent moving forward, not starting from clue 1.

    This would be a perfect example of what @EricTopol wrote about in The Creative Destruction of Medicine, in which the fundamental assets of something (in this case a diagnosis) get pulled apart and reconfigured, so the job gets done in a new way.

    And that is exactly what our “Doc Tom” Ferguson was talking about in his e-Patient White Paper in 2006 when he chose this for the title of the closing chapter:

    The Autonomous Patient and the Reconfiguration of Medical Knowledge

    Don’t you love a visionary?

  4. As a physician in the field for 30 years, I totally agree with e-Patient Dave that this matter is not a black or white case.

    As the Pew Research Center found, the group of “online diagnosers” includes about 35 percent of all US adults. What is more, eight in 10 of these online diagnoses start with a search engine like Google.

    After all, not everyone is a physician, nor does everyone have a spouse or relative who is a doctor; and in the circumstances, waiting to see a physician for answers, or not having the funds or health insurance to do so, merely delays diagnosis.

    A reliable balance with the human doctor and a digital doctor should be placed in front of the patients.

    In our new site we tried to add some different perspective and approach for the symptom checker.

    Progress list of simple-to-understand Q&As, possible diagnoses and suggestions that can be downloaded or e-mailed in report form to the patient and/or his/her physician (This data isn’t saved on the Caredir® website, to protect the privacy of the user).

    In this way patients would feel the freedom to have “some” reliable answers to their complaints, but not a replacement for their conventional diagnostics and treatments.

    You may check it:

    Nothing can replace a “real human” doctor and a valuable patient/health concerned person face-to-face interaction.

    Kind Regards,
    Mustafa K.Calik,MD

  5. Want to understand how to improve the diagnostic process, the role of the patient in doing so, and how to influence the teaching of diagnostic skills? Our SPM members are also involved with the Soc to Improve Diagnosis in Medicine, whose Sept 26-29 conference is in Washington, DC. The Patient Summit opens the conference, with emphasis on outpatient diagnosis, and how the patient can be effective in getting a proper diagnosis, how to talk with the doctor, and how to influence policy to support that approach. http://www.improvediagnosis.org
    Did I mention that Helen Haskell and I are the organizers of the Patient Summit, and that it is free? Looking for support in general!

  6. Ted Eytan says:

    The desire or belief that a computer system will one day replace the human interaction between a doctor and a patient has always been fascinating to me. Agreed with Dave/all that more information and empowerment for patients is better, but maybe the idea that this will “replace” is unrealistic.

    Sometimes it seems that the root cause is actually lack of access to an expert that can be trusted, who can (a) engage around this information and (b) be in a continuous relationship with a person, because sometimes a diagnosis takes more than 1 visit to figure out.

    Way before this article was written, this book was written, and here’s what the author said, in 1966:

    “It may be that computers will soon diagnose better than doctors. But the facts fed to computers will still have to be the result of intimate, individual recognition of the patient.”

    The first part of his statement is still not true almost 50 (!) years later….

    • e-Patient Dave says:

      Thanks for yet another great comment, Ted. I googled that phrase, and found that on your review of Bob Wachter’s The Digital Doctor you cited the phrase as being from John Berger, 1966. Digging more, I found that in 2008 you reviewed the source book, A Fortunate Man: The Story of a Country Doctor.

      I highly recommend that everyone go have a look at the Fortunate Man review. The book was written a half century ago, and talks about “the deep but unformulated expectation of the sick for a sense of fraternity.” I ask: can we experience care without that? (Discuss!)

      To be sure, evaluating a list of symptoms is one thing doctors do, which computers can too. But I’ll be fascinated if someday we see all of Berger’s view of the caring expert expressed in a computer.

      Meanwhile I want Ted Eytan on call. :)

  7. I totally agree with the comments above. Although a symptom checker can misdiagnose a patient, it can also help them guide their doctor into diagnosing them because they are following the questions asked on the symptom checker.

    I do agree that people may take the diagnosis the wrong way. Especially if they are told to go to the A&E and end up just having indigestion instead of a heart attack. It is a growing problem in the UK that people are having to wait longer and people are going to the wrong facility for what they have. That’s why we have introduced 101 and other services to help patients get the right care. So if a person then is told they should go to the hospital, they can phone 101 who will then advise them further.

    As long as they are used correctly, which I don’t think they are, it can be beneficial. It would be better for websites, like patient.info have, to put disclaimers before the checker and tell people who to phone.
