Open Health Data is Here. Welcome To The Great Data Divide!

The New England Journal of Medicine’s Health Policy and Reform just published an opinion piece about the first public release of online report cards regarding 221 of the 1,100 US cardiac surgery programs. The authors believe that

this event will fuel the debate regarding the risks and benefits of public reporting, including the question of whether it assists patients in discriminating among sites of care.

I hope this blog post can be a modest contribution to this debate by raising awareness of the very real risk of a new and serious data divide, just as Susannah is saying “mobile was the final front in the access revolution. It has erased the digital divide. A mobile device is the internet for many people”, adding “we may be entering a new era where access isn’t the point anymore. It’s what people are doing with the access that matters”. Thinking about access, could there be negative unintended consequences of OpenData? OpenData is Difficult! For example, do we know “how many effective users are there likely to be for such services?”

We all agree that individual health data made available without proper context can be dangerous. But could opening large clinical outcome datasets and offering open online access be just as dangerous, by distorting future results and durably worsening health outcomes for some?

The Health 2.0 world is abuzz lately with the great potential offered by opening up aggregate public health information for outside analysis. The OpenData movement has clearly stormed in. Hackers galore are working to use this information to create the next generation of useful websites that will make health choices easier. The US and UK governments are actively promoting the public availability of large data sets in many domains where the government is involved. I haven’t heard anyone doubting the great impact these governmental initiatives will have. It seems so obvious. As advocates of patients’ direct and constant involvement in their care, we are equally excited at the potential these Open Datasets could have on health in a short period of time, just as we have been advocating for Open Access to full text articles resulting from federally funded scientific research and for full access to our health record, including doctors’ notes. So, please do not take this post as an attack against opening any health data repository!

Let’s look in some more detail at the NEJM article. The opened dataset is composed of clinical outcome results for coronary artery bypass grafting (CABG), and the ratings are calculated from a registry developed by the Society of Thoracic Surgeons (STS) in 1989. This may be the largest dataset of clinical outcomes and associated ratings made available online. It is an important step forward in opening to the public data that was until recently hard to find or housed in closed governmental data silos. While we are still very far from “Gimme my damned data!”, opening this country-wide dataset and associated ratings is obviously a step in the right direction.

Even though some of us would expect all such data sets to be made public in the near future, the reality seems to be vastly different. One particular aspect of this first US-wide ratings release deserves attention: each program that chose to make its data public was assigned a rating of 1, 2 or 3 stars for overall performance. The performance thresholds are designed so that programs given 1 or 3 stars are, with 99% probability, truly below or above average, respectively; these are called the outliers. Over the past 3 years, this method has identified 23 to 27% of the programs as outliers. This first public release of a large clinical results dataset, as important as it is, is imperfect. The voluntary nature of the public release creates a skewed dataset. Also noticeable are the lack of long-term outcome assessment and of individual physician ratings.
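
To make the star thresholds concrete, here is a minimal sketch of how a 99% rule of this kind could be applied. It is not the STS method itself: the normal approximation, the function name star_rating, and all of the scores and standard errors below are assumptions made up for illustration.

```python
from statistics import NormalDist


def star_rating(estimate, std_error, average, certainty=0.99):
    """Return 1, 2 or 3 stars for one program.

    estimate, std_error: hypothetical performance score and its uncertainty.
    average: the overall average score across all programs.
    certainty: how sure we must be before calling a program an outlier.
    """
    # Probability that the program's true score is above the average,
    # under an assumed normal approximation (an illustration only).
    prob_above = 1.0 - NormalDist(mu=estimate, sigma=std_error).cdf(average)
    if prob_above >= certainty:
        return 3  # very likely truly above average
    if prob_above <= 1.0 - certainty:
        return 1  # very likely truly below average
    return 2      # not distinguishable from average


# Hypothetical programs: name -> (estimated score, standard error).
programs = {"A": (0.975, 0.004), "B": (0.962, 0.006), "C": (0.948, 0.005)}
average = 0.962

ratings = {name: star_rating(est, se, average)
           for name, (est, se) in programs.items()}
print(ratings)
```

Note that under such a rule a program can have a score visibly above the average and still receive 2 stars if its uncertainty is large, which is one reason a star rating alone tells a patient less than it seems to.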

Now, you may ask, even with these imperfections, what could be the negative impact of such an innovative program?

A tweet last week alerted me to a ground-shaking study. Mike Gurstein in “Open Data: Empowering the Empowered or Effective Data Use for Everyone?” says

this drive towards increased public transparency and allowing for enhanced data enriched citizen/public engagement in policy and other analysis and assessment is certainly a very positive outcome of public computing and online tools for data management and manipulation.

However, as with the earlier discussion concerning the “digital divide” there would, in this context, appear to be some confusion as between movements to enhance citizen “access” to data and the related issues concerning enhancing citizen “use” of this data as part, for example, of interventions concerning public policies and programs. [...]

In an earlier paper dealing with the digital divide discussion I suggested the use of the concept of “effective use” to distinguish between the opportunity for digitally-enabled activity presented by ICT access, from the actual realization of those opportunities in the form of “effective use”. At that time I introduced a set of layers of requirements, which can be understood as “pre-conditions” for the realization of “effective use” of digital “access”.

Susannah Fox at the Pew Internet & American Life Project relentlessly analyzes the US population’s use of the Internet. Lately she has been writing about the Chronic Divide:

U.S. adults living with chronic disease are significantly less likely than healthy adults to have access to the internet (62% vs. 81%). The internet access gap creates an online health information gap. However, lack of internet access, not lack of interest in the topic, is the primary reason for the difference. [...] Living with chronic disease is also associated, once someone is online, with a greater likelihood to access user-generated health content such as blog posts, hospital reviews, doctor reviews, and podcasts. These resources allow an internet user to dive deeply into a health topic, using the internet as a communications tool, not simply an information vending machine.

The full report adds “statistically speaking, chronic disease is associated with being older, African American, less educated, and living in a lower-income household. By contrast, internet use is statistically associated with being younger, white, college-educated, and living in a higher-income household. Thus, it is not surprising that the chronically ill report lower rates of internet access.”

I cannot separate the above from Mike Gurstein’s comment:

Efforts to extend access to “data” will perhaps inevitably create a “data divide” parallel to the oft-discussed “digital divide” between those who have access to data which could have significance in their daily lives and those who don’t. Associated with this will, one can assume, be many of the same background conditions which have been identified as likely reasons for the digital divide—that is differences in income, education, literacy and so on. However, just as with the “digital divide”, these divisions don’t simply stop or be resolved with the provision of digital (or data) “access”. What is necessary as well, is that those for whom access is being provided are in a position to actually make use of the now available access (to the Internet or to data) in ways that are meaningful and beneficial for them.

The question then becomes, who is in a position to make “effective use” of this newly available data? [...]

Given in fact, that these above mentioned resources are more likely to be found among those who already overall have access to and the resources for making effective use of digitally available information one could suggest that a primary impact of “open data” may be to further empower and enrich the already empowered and the well provided for rather than those most in need of the benefits of such new developments (unless of course, they have means or the luck to find benefactors such as the Cedar Grove Institute or Harvard Law School graduates willing to work pro bono or on a contingency basis).

I have had this conversation with e-patient Dave a few times. Dave is a remarkable writer and can describe his vision of patient empowerment in a masterful way. But I am convinced he doesn’t speak for the poor and downtrodden, for those who have no job, no insurance and no powerful connections. For this segment of the US population the positive impact of the wide availability of eHealth resources is less than clear. In fact I am very worried that we are fast building a nation of health data outliers, with large numbers of both super-empowered and health data virgins, not by choice. In this context, the example of empowering the empowered mentioned by Michael Gurstein resonates strongly. Tim O’Reilly after reading the account said “we need to think deeply about the future” as he was preparing to launch the GOV 2.0 event in DC!

Read on:

newly available access to digitized land ownership and title information in Bangalore was primarily being put to use by middle and upper income people and by corporations to gain ownership of land from the marginalized and the poor. [...] They were able to directly translate their enhanced access to the information along with their already available access to capital and professional skills into unequal contests around land titles, court actions, offers of purchase and so on for self-benefit and to further marginalize those already marginalized. [...] This is not to suggest that processes of computerization inevitably lead to such outcomes but rather to say that in the absence of efforts to equalize the playing field with respect to enabling opportunities for the use of newly available data, the end result may be increased social divides rather than reduced ones particularly with respect to the already poor and marginalized.

Does anyone believe that something different will happen with the data releases by the STS? Based on the Pew studies it is hard to imagine how we will witness anything other than a growing data divide in health. And if that data divide really happens, the empowered will have access to the life-saving dataset and will act upon it, while many of the people suffering from chronic diseases (the same population that would benefit most from access to this information) won’t. Over time it is therefore probable that the 3-star outliers, the centers of excellence, will treat an ever-growing number of the empowered, while the 1-star outliers, the centers with high mortality, will get worse and worse results, simply because they will treat an ever-growing number of digital outliers who have no possibility of obtaining health information/data and applying filters.

I believe the Open Data movement, particularly as it applies to health, is an important, highly transformative and positive development. For some! What do you think? What should be done?

13 Responses to “Open Health Data is Here. Welcome To The Great Data Divide!”

  1. Thank you for such a timely and important discussion. I am one who has been very concerned that the advances in access to and therapeutic benefits from open access to digital data will create a two tiered (or even three tiered) health care system with regard to access to and benefits from new technologies. However, I choose to see the glass as half-full. I am looking at innovators who have marshaled their skills to solve such problems in other venues.

    It’s clear to me that any revolution in e-patient health will require the concerted effort of health care professionals and patients to create effective tools to make access possible for individuals with low income and chronic illnesses.

    If the tools are made easy and attractive (I’m thinking about Facebook right now) then the adoption rate will be enhanced. But I also feel that there need to be “data champions” to evangelize for such tools in particular communities. Some approaches will work, some will not. But hopefully we will learn as we create the tools to assist the communities you have described.

  3. Susannah Fox says:

    Run, don’t walk, to watch the searing speech delivered at Mayo Transform by Alice Tolbert Coombs, MD, a critical care specialist/anesthesiologist and president of the Massachusetts Medical Society.

    Here is a link to the site (you have to scroll and click – no direct link to certain vids):

    She gave 3 case studies to illustrate her point that it is not just about health care access. As she put it, “Sometimes having health care coverage is like having a $40,000 check and not being able to cash it.” She described a resourcefulness/empowerment gap that is not limited to minority populations. She identified a different kind of disparity – people who don’t speak up, who don’t know to ask questions, get a lower level of care.

    Another quote I wrote down: “Forget medical home. How about a health home?” Meaning: intervene early, let people know they *can* quit smoking, lose weight, go to the gym. They *can* make a difference in their lives by taking action.

    • “people know they *can* quit smoking, lose weight, go to the gym. They *can* make a difference in their lives by taking action”

      That’s true but it will only help the medical problems generated by lifestyle issues. It won’t help any of the people who develop cancer due to a series of genetic mutations whose origin is unclear or unknown, or anyone suffering from an orphan disease. For those, having access to systems that collect, aggregate and filter the data they input may become more and more important. These systems will require high literacy of the digital, health and data kinds. And they will require trust in the research enterprise. Considering the highly justified high level of mistrust many African-Americans have for the medical-research systems, I believe that this divide will only get bigger unless a concerted effort is made with leading churches and other trusted entities within the African-American community.

  4. Two issues here: (1) is robust access a good thing? (we all agree it is), and (2) does it make a difference to those on the wrong side of the digital divide?

    When books first became available they were hand written and only acquired by the super rich; when movable type was introduced they were affordable by the ordinary rich and more middle class. When they became dirt cheap and libraries could acquire them, many more people of all social classes enjoyed the benefits of access.

    Making information available digitally has a similar impact, albeit it is happening at warp speed rather than over centuries as the price of Internet distribution has dropped like a rock. I remember getting an email at Medscape in 1996 or 1997 from Indonesia. The writer was a health practitioner who had access to the Internet in a library. Prior to the availability of Medscape he relied on old versions of the Merck Manual, Harrison’s and a 10-year-old PDR, and now he had the same access to current information as clinicians in the most rarified corners of the First World.

    So the simple act of providing inexpensive fast access to trusted information and data IS a huge deal that does promote understanding and better healthcare across the digital divide, for patients and providers. Like “trickle down economics” it is far from perfect. But at least we finally have a cheap technology that gives us a shot at improving information equity and we should continue to press that front.

    The digital divide is real. We all need to work to end the plight of people who live in poverty, ignorance, or those trapped in exploitation, war, or oppressive governments that deprive them of the ability to benefit from the information technologies we find so promising. There’s no reason why we can’t push both fronts.

    A good case history on the impact of access was the massive deployment of public libraries in New York City in the first part of the 20th Century, where millions of impoverished residents took advantage of free and powerful educational resources to acquire skills in language, literacy, and technology. Today the system is probably the largest provider of free Internet access and courses in how to use it in the city (and government support of libraries is among the first items to be cut, which is just stupid). Smart politics should press the case for access to data and education in digital literacy at low or no cost.

  5. bev M.D. says:

    I agree with Peter. I do not think we can yet envision the entire impact of this new information; the book analogy is a good one. I also think, Gilles, that you are assuming the health care system will remain as it is today, and applying the effects of the “digital divide” to that. I think the system will continue to evolve, perhaps explosively, making predictions even more difficult. So I, too, am optimistic that “a rising tide carries all boats” or however that goes.

    Susannah, as for the 3 cases cited by Dr. Coombs, I don’t think they had anything to do with something the patients didn’t do. I have only one word for them: malpractice.

  6. bev M.D. says:

    Correction: I should have said “data divide” instead of “digital divide.” This is getting confusing! (:

  7. Excellent discussion. Peter’s right, I think, to see the rise of provider “report cards” (generally speaking) as a technological innovation with an uncertain but likely positive effect, as with public libraries–or the Web. But there’s a broader point to make.

    Most of us health care consumers don’t rely on report card “grades,” even if we know report cards exist. The best effect–especially with procedures like CABG–is to spur providers, competitive monsters that they are, to work hard to improve their scores. (Doubtless that was the thinking of the Society of Thoracic Surgeons, in developing and publicizing these markers.)

    In that way, everyone benefits, on both sides of the data divide, and roughly equally. This rising tide lifts all boats.

  8. marnie webb says:

    This *is* an excellent discussion. And it begs the question — what are we going to do about it? Is it about getting individuals access and training — maybe through the organizations that are serving them? Is it about providing internet tour guides that help get the needs met (in the same way that some hospitals provide advocates)?

  9. Dr. David Blumenthal published an important letter yesterday, in which he states

    These are historic times. The HITECH Act is bringing the power of electronic health records to our health care system.

    We are writing to solicit your assistance in making sure that we are not creating a new form of “digital divide” and want to make sure that health IT vendors include providers who serve minority communities in their sales and marketing efforts. [...]

    It is absolutely necessary that the leading EHR vendors work together, continuing to provide EHR adoption opportunities for physicians and other healthcare providers working within underserved communities of color. Despite our best efforts, data from the National Ambulatory Medical Care Survey indicates that EHR adoption rates remain lower among providers serving Hispanic or Latino patients who are uninsured or rely upon Medicaid. Moreover, this data also identifies that EHR adoption rates among providers of uninsured non-Hispanic Black patients are lower than for providers of privately insured non-Hispanic White patients.

    Racial and ethnic minorities remain disproportionately affected by chronic illness(es), a contributing factor to intolerably high mortality and morbidity rates. Electronic health records possess the ability to help improve both the quality and efficiency of medical care accessible by minorities, so that perhaps rates of chronic illness, mortality and morbidity decrease within these communities. It is critical that this administration, Regional Extension Centers and EHR vendors work together and focus substantial efforts on these priority populations.

    To discuss outreach opportunities further, please contact Dr. Sachin H. Jain, at ONC and Commander David Dietz, at OMH.

    It is wonderful to see that ONC is now actively engaged in limiting the negative impact of the digital and data divide.

  10. Gustavo Speed says:

    Great post, I agree completely. Obviously those with chronic disease who do not have access to the internet probably have no caregivers, family or friends who are interested in the patient and could do the internet research for them.
    This is why I have long advocated a federally funded “internet advocate”. Someone full time to look up and research the disease and therapies available for each patient. At the same time they could discover the location of best care in the nation (?world) and we could have the patient moved to that facility at no expense to the patient. Family and friends would be able to visit the person at no expense to themselves.
    This person should have at least a high school education and be completely versed in internet usage. Medical familiarity would be beneficial as well, such as watching House. With these full time Internet advocates, patients with chronic disease will have the latest in cutting edge knowledge that only the internet makes available and no longer have to be at the mercy of the local physicians.

