Vaccine Allocation at Stanford Hospital


In a video that was widely shared last Friday, a representative from Stanford Medical Center spoke to residents protesting how the hospital chose to allocate its first shipment of COVID-19 vaccines. The hospital had around 5,000 initial doses to distribute (and expects to have tens of thousands more within the next few weeks), and came up with an allocation scheme in which only 7 of the approximately 1,300 residents were on the list. Many of these residents deal directly with patients who have COVID-19, yet more senior physicians, as well as other front line workers such as nurses and food service employees, were given priority instead. In the video, the spokesperson explains that the algorithm they used to come up with an allocation scheme “clearly didn’t work”, to which protestors respond by shouting “Algorithms suck!” and “Fuck the algorithm!”

This episode is exceptionally revealing of numerous aspects of algorithmic decision making more broadly, the kinds of problems we are likely to see in the future, and how we can do better. To be clear, I have no special knowledge of the processes involved, but want to use this episode as an example to think through various ways things can go wrong, and how we communicate about processes and problems.

First, it is worth quoting from the video at length.

Spokesperson: Our algorithm, that the ethicists, infectious disease experts, worked on for weeks, to use age, high risk work environment, prevalence of positivity within job classes—a very complex algorithm—clearly didn’t work. There’s problems with our algorithm. And with the best of intentions, which I appreciate don’t matter today, that was the hope, that we were going to move very quickly through people who were in those high risk categories. It’s about people at risk for dying, people at risk for transmitting the disease, and what we know about our population.

Crowd: We have families too!

Spokesperson: So we have problems with that algorithm. And as soon as we determined it, which only happened yesterday, because we moved up our vaccinations by a full day …

Crowd: [shouting… ] Don’t lie to us!

Spokesperson: I’m not lying to you.

Crowd: It seems that the intentions have been very good, but there was a mistake, and there was no resident representation. How do we prevent this from happening in the future?

Vaccine distribution is a classic allocation problem, and a difficult one. Giving someone a vaccine has a direct benefit, in terms of massively reducing their personal risk, but also a collective benefit, in terms of mitigating the transmission of the disease. (It’s worth noting that we don’t know for sure yet that the COVID-19 vaccines are effective at slowing transmission, but it seems like a reasonable assumption, given that they are highly effective at protecting individuals.) Different people face different levels of risk, as determined by many factors, including how much they come into contact with infected patients, age, underlying health conditions, etc. They might also face different levels of risk in terms of their likelihood of transmitting the disease to others. And many other factors could be brought into play, including who they are at risk of transmitting the disease to, some measure of “importance” (key decision makers might be prioritized, for example), and so forth. Everyone would benefit from getting vaccinated early, but there are not enough doses to go around, at least initially, and so some method of determining who gets the initial doses must be used.

To unpack this, let’s go through the transcript in more detail.

“Our algorithm”: The spokesperson describes the way they came up with an allocation scheme as an “algorithm”. This is likely accurate, as algorithm is a broad term, really just implying an explicit process. However, in this case it also tends to convey a greater degree of complexity than was likely involved. A likely possibility is that the team came up with some sort of formula that weighted different attributes of people, such as age and role, assigned people a score based on that formula, and then sorted the list. (Edit: subsequent reporting has shared a presentation slide that indicates something very much like this). It is not wrong to call that an algorithm, but it may diverge from people’s expectations of what an algorithm is. More importantly, in part because of the suggestion of complexity, the use of the term “algorithm” plays a rhetorical role here to deflect responsibility. Rather than saying “our team”, “algorithm” suggests that the choice was to some extent out of their hands, or that it was somehow more principled or fair. Mention of algorithms might still suggest a degree of disinterestedness that perhaps cues “reasonable” for some people (though I expect this is quickly evaporating).
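As an illustration, here is a minimal sketch of what such a score-and-sort scheme might look like. The attributes loosely follow the factors the spokesperson names, but the fields, weights, and formula are entirely hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Employee:
    name: str
    age: int
    high_risk_environment: bool   # e.g., works directly with COVID-19 patients
    job_class_positivity: float   # fraction of this job class testing positive

def priority_score(e: Employee) -> float:
    """Hypothetical weighted formula; the actual weights and factors are unknown."""
    return (
        0.5 * (e.age / 100)                               # older -> higher priority
        + 0.3 * (1.0 if e.high_risk_environment else 0.0)
        + 0.2 * e.job_class_positivity
    )

def allocate(employees: list[Employee], n_doses: int) -> list[Employee]:
    # Score everyone, sort from highest to lowest, and fill the available doses.
    ranked = sorted(employees, key=priority_score, reverse=True)
    return ranked[:n_doses]
```

Nothing about this is especially complex; what matters is that every choice in it (which attributes to include, and how to weight them) is a human judgment call.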

“that the ethicists, infectious disease experts”: Here, the spokesperson emphasizes that their process was designed by experts, in this case ethicists and infectious disease specialists. It seems clear what the latter might have contributed (knowledge of what puts people at risk and how the disease spreads), but less so what the ethicists would have contributed, or what perspective they brought. A reasonable possibility would be that they were there to ensure conformity to certain norms, including protecting patient rights, preventing inappropriate use or release of information, etc. One might hope that ethicists here would assume the responsibility of ensuring that there were no obvious failures of the output, though that seems not to have been successful here. As with “algorithm” above, the use of “experts” simultaneously works to suggest “informed” and “reasonable”, and yet undermines its own credibility in the face of failure.

“worked on for weeks”: The spokesperson underlines that this was a devoted, arduous effort that the people involved took seriously. Unfortunately, this also serves to further undermine their credibility. It is unclear exactly what labour was involved in this process, and why it required weeks to come up with a way of deciding the allocation. It seems highly unlikely that there would be any computational roadblocks involved, but perhaps there was a process of consulting the literature or the relevant parties (and possibly dealing with bureaucratic hurdles). In any case, more time clearly should have been reserved for interrogating the result and considering how people would react.

“to use age, high risk work environment, prevalence of positivity within job classes”: Here we get an indication of what sort of information was used in determining people’s priority. The fact that this list is so short and incomplete reveals one of the most obvious problems with the process used, namely its opacity. To the residents it seems incomprehensible that they would have been left off the list, when other people at seemingly lower risk were prioritized. Making the formula used explicit would have removed much of this confusion. Moreover, making it explicit earlier might have provided the opportunity for others to discover potential flaws. One could certainly imagine a case in which a reasonable approach nevertheless produced a seemingly unreasonable result, in part due to the limited number of doses available. In this case, however, it seems much more likely that it was a failure of information or design. We obviously don’t actually know how to calculate any one person’s risk. Even if we did, however, this sort of approach necessarily reduces the complexity of each individual to more coarse representations—hinted at here by the term “job classes”—such that there could easily be flawed outcomes for many individuals, even if the result for the “typical” person is reasonable. Here, it seems likely that the problem was primarily an informational failure. Presumably a fairly basic data point, such as how frequently a person interacts with infected patients, was either not used, or not properly measured for residents, such that the result ended up seeming absurd.

“a very complex algorithm”: Again the spokesperson emphasizes the complexity of the process. It is certainly possible that the algorithm was quite complicated, but that would in many ways be another strike against it. Given that we still know so little about exactly how much risk each person faces or how the disease is transmitted, there is presumably insufficient information to be able to justify a truly complicated model. That being said, it seems more likely that this is simply a rhetorical exaggeration, again serving to partially deflect responsibility.

“clearly didn’t work”: This is perhaps an anticipated way of acknowledging the grievances of the protesters. However, another way of reading this is as a statement characterizing the broader outcome. The process “clearly didn’t work” in so far as it produced protests by the residents. Even if the process had led to a prioritized list that did the most to protect the most vulnerable, it would clearly not have worked if it was not accepted by the people involved. This underscores again that in cases like this, where there is no single correct solution, it really does come down much more to a question of who has the power to make the decisions. In this case, the administrators presumably assumed that they had the power and responsibility to determine who would be vaccinated first, but underestimated the power held by the residents as an organized group.

“There’s problems with our algorithm”: This is both another affirmation of the grievances of the protestors, and a hint at how the administrators conceptualize the situation. Saying there is a problem with the algorithm suggests that they believe their general approach—of using some sort of formula to come up with a prioritized list—is appropriate. It’s just that they ended up with the wrong formula in this case. What this doesn’t do is question the broader processes involved, or whether this sort of algorithmic assignment is the right approach at all. And of course it is unclear how all parties feel about this. Would the residents have opposed this sort of approach if it had given them top priority? It’s hard to know, but if not them, then others likely would have had similar grounds for complaint, not least due to the lack of transparency.

“And with the best of intentions, which I appreciate don’t matter today”: This is another expected statement, affirming that they were trying to do the right thing, and it is not irrelevant. It is certainly better that the administrators tried to do the right thing than if they had maliciously set out to harm people. And yet, as with many other parts of this, it tends to subtly undermine credibility. If they truly had the best of intentions, how could this result have emerged? Again, it is hard to know, but one can speculate it must have been some combination of insincerity, incompetence, inadequate flexibility, and inaccurate estimation of the power or preferences of the people subject to their decisions. It is because good intentions are no guarantee of good outcomes that more is required when attempting to make use of the power to decide for others.

“that was the hope, that we were going to move very quickly through people who were in those high risk categories”: This provides yet more insight into the intentions. The goal was to vaccinate those at high risk, and to do so quickly. It is somewhat unclear what “those” high risk categories refers to here, but presumably it means those who are older, those who work in high risk environments, and those with high “prevalence of positivity within job classes”. More interestingly, it also provides another window into how this was conceptualized. The spokesperson refers to high risk “categories”, which presumes the clean division of people into groups representing either higher or lower amounts of risk. It is completely understandable why people adopt this sort of approach, as it simplifies the problem greatly, but it also tends to mask the underlying complexity—that there could be great variation within a category, and that systematic use of these sorts of categories can lead to a totalizing result that people may perceive as highly unfair.
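To make this concrete, here is a purely hypothetical sketch (not the hospital’s actual scheme) of how binning a continuous exposure measure into categories erases variation within a bin and magnifies small differences at the thresholds:

```python
def risk_category(weekly_covid_patient_contacts: int) -> str:
    """Hypothetical binning of a continuous exposure measure into categories."""
    if weekly_covid_patient_contacts >= 20:
        return "high"
    if weekly_covid_patient_contacts >= 5:
        return "medium"
    return "low"

# 19 contacts per week lands in the same bin as 5, and in a lower bin
# than 20: the variation within a category, and the sharp threshold
# effects, both disappear from the final ranking.
print(risk_category(19))  # "medium"
print(risk_category(20))  # "high"
print(risk_category(5))   # "medium"
```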

“It’s about people at risk for dying, people at risk for transmitting the disease, and what we know about our population”: This reinforces that they were trying to optimize for the dual purpose of vaccination (protecting individuals and slowing transmission). It is also interesting for its emphasis on information—“what we know about our population”. Even if we did know perfectly how to compute someone’s risk (of either type), such calculations would only be as good as the available information. Indeed, this is a classic way in which these sorts of models can fail—missing data is challenging to handle, and can lead to poor decisions, but it can also be easy to ignore. The system will still produce a score for each person, whether or not their information is complete, and without greater transparency, it may not be obvious that this is a problem. In this case, ProPublica reports that residents did not have an assigned “location”, which sounds exactly like a classic informational failure.
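This failure mode is easy to reproduce. In the hypothetical sketch below (the field names and scoring term are assumptions for illustration, not details of Stanford’s system), a score is still computed when a record lacks the relevant field, and missing data becomes indistinguishable from low risk:

```python
HIGH_EXPOSURE_LOCATIONS = {"icu", "emergency", "covid_ward"}

def exposure_bonus(record: dict) -> float:
    """Hypothetical scoring term for working at a high-exposure location.
    If `location` is missing, it silently falls back to 0.0: a score is
    still produced, and nothing flags that the data was absent."""
    location = record.get("location")  # None for residents in this scenario
    return 1.0 if location in HIGH_EXPOSURE_LOCATIONS else 0.0

nurse = {"name": "A", "location": "icu"}
resident = {"name": "B"}  # no assigned location on record
print(exposure_bonus(nurse))     # 1.0
print(exposure_bonus(resident))  # 0.0, quietly deflated, with no warning
```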

“We have families too!”: This comment from the crowd seems to be a classic appeal to equality. There has been no previous mention of families in this video, but it may be a reference to some other prior communication. It is clearly a different kind of rhetorical move—making an appeal to a certain kind of information as being especially important (having a family), and implying that the administrators are being heartless. Although it is perfectly rational to want to come up with an optimal allocation, this is also an emotional issue, and any “solution” which ignores that is unlikely to be accepted. How to balance calculation with sensitivity remains a serious challenge in thinking about any kind of allocation, but can be expected to be especially dramatic when lives are on the line.

“So we have problems with that algorithm. And as soon as we determined it, which only happened yesterday, because we moved up our vaccinations by a full day …”: Here, the timeline works against the decision makers. They spent “weeks” coming up with an approach, and yet only discovered a problem at the last moment. Indeed, it seems that they only realized there was a problem (not least in terms of how their decisions would be received) when they began trying to implement their solution. Again, this highlights the need for some sort of interrogation further in advance of deployment. There is no reason to wait until the last moment to look for potential systemic problems or to consult with the groups who will be affected.

“Don’t lie to us!”: The crowd really does not like the last comment, for unclear reasons, but seemingly in relation to the vaccination being moved up by a day, or the claim that a problem was only noticed at the last minute. Again this turns emotional, and one can’t help but wonder what determined the vaccination timetable. Did it get moved up by a day because the hospital received their vaccines earlier than expected? Or were they waiting on their team of experts to reach a consensus before beginning the distribution? If the latter, it would seem somewhat absurd to spend time perfecting the process when there is an obvious benefit to starting as soon as possible. In many ways, the hospital seemingly missed a great opportunity: given the emotional impact of starting to vaccinate people (which will inevitably involve a certain ceremonial aspect), they underestimated the potential emotional benefits of distribution and overemphasized a rational risk reduction strategy. The fact that the crowd thinks the speaker is lying clearly shows the extent to which credibility has been destroyed by this episode (and presumably other events leading up to it).

“It seems that the intentions have been very good, but there was a mistake, and there was no resident representation. How do we prevent this from happening in the future?”: Finally, an extremely important remark from the crowd. It is all well and good to have a process, but again, good intentions are not enough. This would seem to be a fairly clear example of why having broad representation among a decision making body is important. Even if a resident had nothing substantive to contribute to the design of the algorithm, they would presumably have noticed the lack of priority assigned to residents, and pushed back against this. That does not necessarily mean they would have been in the right (one can imagine the many groups within the hospital wanting to have the chance to make the case for why they should be included), but having some representation at least ensures that someone is going to be in place to look out for their interests, and ideally bring knowledge to the process of why the risk faced by that group is higher than had been estimated. “How do we prevent this from happening in the future?” is perhaps the inevitable question in this sort of scenario, and again can be seen as a kind of negotiating tactic. It argues for a change to the broader process, which will be an ongoing struggle, given the lack of any single “correct” solution, and something we can expect to see replicated in many other contexts.

//

I have intentionally emphasized some of the rhetorical aspects of the transcript above, but, as suggested, I think this episode is highly revealing of a type of scenario we are increasingly likely to encounter in the future. As algorithmic decision making processes see ever more widespread adoption (in domains such as student evaluation, resume screening, criminal sentencing, etc.), we will increasingly run up against the real and perceived problems of such approaches. To reiterate some of the above, decisions in these domains may seem to be calculable, but this tends to underplay the lack of knowledge and understanding involved in many of these settings, giving a false impression of rational justification. Algorithmic solutions are also likely to ignore both the emotional aspects of these decisions and the latent power of affected groups.

The real issue involved here, as I have hinted at, is that we exist in systems in which individuals or small groups have the power (or the apparent power) to make decisions which will have serious impacts on the lives of many others. Turning to an “algorithm”, or any kind of computational solution, is an understandable instinct, as a way of trying to place the decision on more well justified foundations. In many cases, this move is at least partially rhetorical, given the inevitable trade-offs involved in any realistic scenario, and the difficulty of accurately modeling the full set of consequences that will result from any decision. More importantly, however, making things precise (as is inevitable with computational solutions) paradoxically reveals the arbitrariness involved in having a particular group of experts make these decisions, especially in the absence of transparency. Claiming “the algorithm did it” might seem to deflect responsibility, but everyone knows by now that algorithmic solutions are not magical, unbiased, or infallible. They are the product of human decisions and human data, and any group making use of such processes should know that they will be expected to take responsibility for and justify the use of the processes they come up with.

That being said, there are still many more and less obvious ways of making any process better. This episode highlights many of these, including the importance of getting input from those who will be affected by the decision, of interrogating the output for potential problems, and of providing clarity about the process used to arrive at a decision. No matter what process is used, there will be people who will be displeased (often justifiably), and, somewhat counter-intuitively, this is all the more assured when a process is made explicit (as someone will inevitably have specific grounds on which to challenge it).

There are of course many alternatives, though they are not necessarily more appealing. John Cochrane has proposed selling vaccine doses to the highest bidder. UCSF apparently used some sort of lottery. The CDC itself used a voting mechanism to approve the plan to prioritize health care workers and nursing home residents and staff. The United States as a whole is using a combination of opaque components that will surely be rife with inefficiencies and inequalities. To be clear, this is a difficult problem and there are no “correct” solutions. All of these processes have benefits, flaws, and limitations.
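For contrast with the score-and-sort sketch above, a lottery is about as simple as an allocation mechanism gets. A minimal sketch (purely illustrative; I don’t know the details of UCSF’s actual procedure):

```python
import random

def lottery_allocate(people: list[str], n_doses: int, seed: int = 2020) -> list[str]:
    """Uniform lottery: every eligible person has equal odds of a dose.
    Publishing the seed in advance would make the draw auditable."""
    rng = random.Random(seed)
    return rng.sample(people, k=min(n_doses, len(people)))
```

A lottery gives up any attempt at targeting risk, in exchange for transparency and an unambiguous kind of fairness.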

In the end, although the process used at Stanford Hospital seems to have been flawed in multiple ways (not least in the lack of transparency, which seems completely unjustified in this case), it seems probable that such processes will often end up not far from where we currently stand, in which resources are allocated roughly in proportion to the amount of power that groups or individuals have to lay claim to them (if that is not simply tautological). In this case, according to the New York Times reporting, “the system rapidly devolved into a first-come, first-serve situation”, which is of course another kind of market-based allocation.

Regardless, let me express my sincere gratitude and appreciation to all the residents, nurses, doctors, other front line workers, and administrators who are showing incredible courage under impossible conditions as they do their best to keep everyone healthy and safe.