America was unprepared for the magnitude of the pandemic, which overwhelmed many counties and filled some hospitals to capacity. A new paper in PNAS suggests there may have been a mathematical method, of sorts, to the madness of those early COVID days.
The study tests a model that closely matches the patterns of case counts and deaths reported, county by county, across the United States between April 2020 and June 2021. The model suggests that unprecedented COVID spikes could, even now, overwhelm local jurisdictions.
“Our best estimate, based on the data, is that the numbers of cases and deaths per county have infinite variance, which means that a county could get hit with a tremendous number of cases or deaths,” says Rockefeller’s Joel Cohen. “We cannot reasonably anticipate that any county will have the resources to cope with extremely large, rare events, so it is crucial that counties—as well as states and even countries—develop plans, ahead of time, to share resources.”
Ecologists might have guessed that the spread of COVID cases and deaths would at least roughly conform to Taylor’s Law, a formula that relates a population’s mean to its variance (a measure of the scatter around the average). From how crop yields fluctuate, to the frequency of tornado outbreaks, to how cancer cells multiply, Taylor’s Law forms the backbone of many statistical models that experts use to describe thousands of species, including humans.
But when Cohen began looking into whether Taylor’s Law could also describe the grim COVID statistics provided by The New York Times, he ran into a surprise.
Ninety-nine percent of counties’ counts of cases and deaths between April 2020 and June 2021 conformed to a “lognormal” version of Taylor’s Law, which predicts that the variance of cases or deaths in each location will be proportional to the squared mean of cases or deaths. For example, if the average number of cases per county is 50 in Arizona and 100 in California, this version of Taylor’s Law would predict that the variance of case counts in California would be four times larger than the variance of case counts in Arizona. Similarly, if the average case counts per county in those two states were 50 and 150, respectively, the variance would be nine times larger in California.
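The arithmetic in that example can be checked with a short sketch. It assumes the usual power-law form of Taylor's Law, variance = a × mean^b, with the exponent b = 2 described above; the function name `variance_ratio` is hypothetical and the state averages are the illustrative figures from the example, not data from the study.

```python
def variance_ratio(mean_a: float, mean_b: float, exponent: float = 2.0) -> float:
    """Ratio of variances (b over a) if variance = a * mean**exponent.

    The proportionality constant cancels in the ratio, so only the
    two means and the exponent matter.
    """
    return (mean_b / mean_a) ** exponent


# Illustrative averages from the example: 50 vs. 100, then 50 vs. 150.
print(variance_ratio(50, 100))  # 4.0
print(variance_ratio(50, 150))  # 9.0
```

Doubling the mean quadruples the predicted variance, and tripling it multiplies the variance by nine, which is what an exponent of 2 means in practice.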
The top one percent of counts of cases and deaths, however, did not fit the lognormal distribution. Instead, the high counts matched the Pareto distribution—a model more often seen in economics than biology, in which extremely high values are rarely but regularly observed (think: income or wealth distribution). What made this particular Pareto distribution unique was that it also had infinite variance, implying that the scatter would increase beyond any finite limit, the more counts of cases or deaths observed. The challenge was to understand why even the top 1% of counts still conformed to Taylor’s Law with the same exponent as the lower 99%.
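One way to see what infinite variance means is to cap a Pareto distribution at some maximum value and watch what happens as the cap is raised. The sketch below uses the standard Pareto density and a textbook closed form for its capped second moment. The tail index of 1.5 and the minimum value of 1 are assumptions chosen only for illustration, not estimates from the study, and the function name is hypothetical.

```python
def truncated_second_moment(cap: float, alpha: float = 1.5, x_min: float = 1.0) -> float:
    """E[X^2 restricted to X <= cap] for a Pareto(alpha, x_min) variable.

    Obtained by integrating x^2 * alpha * x_min^alpha / x^(alpha + 1)
    from x_min to cap. Valid for alpha != 2.
    """
    if alpha == 2.0:
        raise ValueError("alpha == 2 needs the logarithmic form of the integral")
    power = 2.0 - alpha
    return alpha * x_min**alpha * (cap**power - x_min**power) / power


# alpha = 1.5 (assumed, variance infinite): the value keeps growing with the cap.
# alpha = 3.0 (assumed, variance finite): the value levels off.
for cap in (1e2, 1e4, 1e6):
    heavy = truncated_second_moment(cap, alpha=1.5)
    light = truncated_second_moment(cap, alpha=3.0)
    print(f"cap={cap:>9.0f}  heavy tail: {heavy:10.1f}  light tail: {light:.4f}")
```

With a tail index below 2, each hundredfold increase in the cap multiplies the second moment roughly tenfold, so no finite number bounds the scatter. With a tail index above 2, the same quantity settles toward a fixed limit.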
“It was a puzzle,” Cohen recalls. “And I sat on that puzzle, every so often taking it out, torturing it a bit, and putting it away. Until, one day, I called in the heavy artillery.”
Cohen sent his computer simulations and unproved conjectures to Richard A. Davis of Columbia University and Gennady Samorodnitsky of Cornell University, asking for their input. A few months later, the two sent him some theorems: the missing proof that Taylor’s Law would hold even for the Pareto-distributed top 1% of counties, with the same exponent as the 99% of lognormally distributed counties. “These theorems helped prove that Taylor’s Law accurately describes all of the data,” Cohen says. “The pandemic produced an orderly pattern of counts of cases per county and deaths per county. The unexpected part of that order was that, in the most extreme cases, there was no limit to how bad things could get.”
Why the pandemic follows this hybrid (lognormal-Pareto) version of Taylor’s Law so closely is unclear. One possibility is that Taylor’s Law, which describes the variance of many ecological systems, including infectious diseases like measles and Chagas disease, simply captures the nature of infection. If one patient infects two people (with some probability) and each of those two patients infects another two people (with some probability), we would expect cases to increase exponentially (with some probability), and occasional random events could cause infinite variance.
Cohen hopes that the study will sound an alarm for policymakers. An infinite variance of cases and deaths per county means that there is a very unlikely but possible scenario in which a COVID spike gets every individual in that county sick, or worse. Although the advent of vaccines makes such a scenario increasingly unlikely, areas in the United States and abroad with low vaccination rates still face the possibility of spikes that they cannot handle.
The math, Cohen says, suggests that COVID cases and deaths could far exceed the capacity of local jurisdictions to cope. “Governments had better be prepared to call in their friends,” he says.