A quarter of a million deaths in the next twelve months; tightening rather than relaxing the lock-down restrictions; or trying to explain why our case number data is so useless compared to Italy’s, Spain’s, France’s and Germany’s.
That may be the choice UK government ministers are facing this morning.
On the 12th April, I published the article ‘Lock-down: did it work, and did we do it at the right time?’ That was a bit of a special request, in response to a lot of mainstream- and social-media chatter about whether the UK government had got it right or not. Conclusions: it was too early to say if it worked, but our lock-down timing (based on deaths per million) was broadly in line with other countries (China, Italy, Spain, France and Germany) – albeit perhaps towards the back of this pack.
That article also used the ‘per million of population’ approach to normalize data from different countries. At the time, most visualizations were giving absolute numbers. It was obvious to me that putting the number of deaths or cases amongst the 1.4 billion living in China on the same chart as those for the 66 million in the UK, or 327 million in the US, was largely pointless. I’m glad to see that over the following week this style of reporting has been much more widely adopted.
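To make the normalization concrete, here is a minimal sketch in Python. The populations are the rounded figures quoted above; the absolute count of 1,000 is a made-up number used purely for illustration, and the function name `per_million` is my own.

```python
# Rounded populations as quoted in the text above.
POPULATION = {
    "China": 1_400_000_000,
    "UK": 66_000_000,
    "US": 327_000_000,
}

def per_million(count, population):
    """Scale an absolute count to a rate per million inhabitants."""
    return count / population * 1_000_000

# Hypothetical: the same absolute count reported in each country.
same_count = 1_000
rates = {country: per_million(same_count, pop) for country, pop in POPULATION.items()}

for country, rate in rates.items():
    print(f"{country}: {rate:.2f} per million")
```

The same absolute number translates into very different rates once population is taken into account, which is why the per-million charts are comparable across countries and the absolute ones are not.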
But if we didn’t have enough data back then, do we now? Well, the answer is yes, and no. But the data does tell us something about how the effectiveness of the UK’s response – its lock-down restrictions, compliance by the public, and/or collection of data – compares to other countries.
Spoiler alert: it isn’t pleasant reading.
Let me first just provide the factual updates on the two visualizations I included last time: daily deaths and cases per million since lockdown (1).
In terms of deaths, the UK has yet to reach the half-way mark of the transitional period, during which deaths shift from being caused mostly by infections contracted before lock-down to those contracted after it. Italy and Spain, both well past half-way, are showing downward trends in deaths (2). For France, Germany and the UK, it is still too early to say (3). This data with the corresponding trend lines is shown below.
When it comes to cases, however, things start to get much more interesting.
The UK has 18 days of data since the critical day 14 after lock-down. This day is important because, from this day onwards, all new cases being reported are expected to be the result of infection after lock-down was imposed. So it is the first day on which we see data that is not corrupted by pre-lock-down cases. Other countries have more data because they locked down earlier – Italy, for example, now has 30 days of data after day 14.
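The day-14 cut-off can be sketched as a small date calculation. The dates below are assumptions for illustration only: a UK lock-down date of 23 March 2020 and a latest reporting date of 23 April 2020. The function name `clean_days` is hypothetical, and counting day 14 itself as the first usable day is my reading of the paragraph above.

```python
from datetime import date

CUTOFF_DAY = 14  # the 'critical day' after lock-down used in the text

def clean_days(lockdown, latest, cutoff=CUTOFF_DAY):
    """Count reporting days from day `cutoff` after lock-down up to `latest`, inclusive."""
    days_since_lockdown = (latest - lockdown).days
    return max(0, days_since_lockdown - cutoff + 1)

# Assumed dates (not taken from the article's data source).
uk_clean = clean_days(date(2020, 3, 23), date(2020, 4, 23))
print(uk_clean)
```

With these assumed dates the count comes out at 18, in line with the figure quoted above, although the exact counting convention used for the charts is not stated.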
This – I thought – might be enough to start seeing some kind of trend. So using the same regression techniques applied to the deaths data above, I looked for best-fit trend lines for the daily case data from these same five countries: UK, France, Germany, Italy and Spain.
The results are shown on the chart below, and contained a bit of a shock for me.
Firstly, it is worth considering what shape of curve we might expect to fit data about daily new cases in an environment where we are actively trying to reduce the spread of the disease.
Perhaps a sudden step-down as the effect of the measures kicked in and the spread was immediately and significantly reduced? The first set of charts above shows that this definitely was not the case – no country has seen such a sudden change.
I felt it natural to assume that there would be an initial decline in daily cases, which would gradually level off once it reached the new (lower) level of daily cases that still result from infection despite the measures. Such a fit would be an exponential trend line with a negative x coefficient.
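For readers who want to see what such a trend line looks like in practice, the curve is y = a·exp(b·x) with b below zero. The following is a minimal sketch of one common way to fit it (a straight-line least-squares fit through the logarithm of the values), not necessarily the tool used for the charts. The series is synthetic and invented for illustration, the function names are my own, and computing R-squared on the original scale is an assumption.

```python
import math

def fit_exponential(xs, ys):
    """Least-squares fit of y = a * exp(b * x), via a straight line through (x, ln y)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b

def r_squared(ys, fitted):
    """Share of the variance in ys explained by the fitted values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic series (made up): starts at 80 cases per million, decays by
# roughly 5% a day, with a deterministic wobble standing in for noise.
days = list(range(30))
cases = [80 * math.exp(-0.05 * d) * (1 + 0.1 * math.sin(d)) for d in days]

a, b = fit_exponential(days, cases)
fitted = [a * math.exp(b * d) for d in days]
r2 = r_squared(cases, fitted)
print(f"a = {a:.1f}, b = {b:.3f}, R-squared = {r2:.2f}")
```

A negative value of `b` is the 'negative x coefficient' referred to above: the larger its magnitude, the faster daily cases are falling.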
When I checked to see which trend line had the best fit, Spain, Italy, France and Germany were all modelled most accurately by exponential lines with a negative x coefficient, with Italy, Spain and Germany having the best fit for this trend line (4). These trend lines (shown as dashed lines in the same colour as the country’s case data) can clearly be seen to be downward over the 30 days shown on the chart, and – broadly at least – similar to one another.
But what is happening in the UK? No trend line gave a good fit, with the best fit being a straight line (5). And a line that is not going down – well, at least not fast: a reduction of approximately 1 case per million every week.
This tells us that the daily cases being reported in the UK are not following a discernible trend. They are fluctuating apparently randomly around the peak value they reached around day 14 after lock-down.
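To illustrate what 'no trend line gave a good fit' means, here is a rough sketch that scores the four trend types named in the notes (linear, exponential, logarithmic and power) against a flat, fluctuating series. Each type becomes a straight line after taking logarithms of x and/or y, so one helper covers all four. The series is entirely made up, the names are hypothetical, and scoring R-squared in the transformed space is a simplification of my own; the real UK figures are not reproduced here.

```python
import math

def r_squared_line(us, vs):
    """R-squared of the least-squares line v = m*u + c (the squared correlation)."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    return suv * suv / (suu * svv)

def identity(value):
    return value

# (transform for x, transform for y) that turns each trend type into a line.
TRANSFORMS = {
    "linear": (identity, identity),
    "exponential": (identity, math.log),
    "logarithmic": (math.log, identity),
    "power": (math.log, math.log),
}

def compare_trends(xs, ys):
    """Return the R-squared of each trend type for the given series."""
    return {
        name: r_squared_line([fx(x) for x in xs], [fy(y) for y in ys])
        for name, (fx, fy) in TRANSFORMS.items()
    }

# Made-up series: 18 days fluctuating around 70 cases per million.
days = list(range(1, 19))  # start at 1 so that log(day) is defined
cases = [70 + 10 * math.sin(2.3 * d) for d in days]

scores = compare_trends(days, cases)
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: R-squared = {score:.3f}")
print("best:", best)
```

When a series simply bounces around a level, every trend type scores close to zero, and whichever one happens to come out 'best' explains almost none of the variation.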
So what does this mean?
It means one of two things:
- If the data is correct, then our lock-down measures have not been effective at reducing the rate of infection. They have managed to stabilize it at around 5,000 new cases per day. If we do nothing to reduce the rate of infection (never mind easing lockdown and increasing it), and if it takes a year to find and administer a vaccine, then this rate of infection will result in over 1.8 million people getting Covid-19, and based on the current UK mortality rate (13.5% based on 18,100 deaths and 133,495 cases), that will lead to approximately a quarter of a million deaths.
- If the data is not correct – which is quite possible, for example cases could be falling but the increasing number of tests taking place could be discovering a higher proportion of the cases each day, masking what is really happening – then we have to ask ourselves the extent to which it is reasonable to make any assessment of the number of cases based on this published and widely reported data. We also have to ask why our data is behaving so badly compared to that from other countries (especially Italy, Spain and Germany). The next most obvious data to look at is the number of deaths, but as has been discussed in the previous article, this requires a much longer period of data following each change to lock-down measures before the impact of those measures can be seen, and so significantly delays our responsiveness (to both improvement and deterioration in the effectiveness of the measures).
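The arithmetic behind the quarter-of-a-million figure in the first scenario can be laid out in a few lines. All inputs are the numbers quoted above; the projection assumes, as that scenario does, a constant 5,000 new cases every day for a year and an unchanged ratio of deaths to reported cases.

```python
# Inputs quoted in the text above.
daily_cases = 5_000
days_in_year = 365
deaths_to_date = 18_100
cases_to_date = 133_495

# Constant daily cases for a full year.
projected_cases = daily_cases * days_in_year

# Ratio of reported deaths to reported cases so far.
fatality_rate = deaths_to_date / cases_to_date

# Deaths implied if that ratio stays the same.
projected_deaths = projected_cases * fatality_rate

print(f"Projected cases:  {projected_cases:,}")
print(f"Fatality rate:    {fatality_rate:.1%}")
print(f"Projected deaths: {projected_deaths:,.0f}")
```

This gives a little over 1.8 million cases and just under 250,000 deaths.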
As interesting as this is, I am always keen to find what is actionable.
The government needs to acknowledge the problem, and publish more data about testing – we need to understand who is being tested and the split of positive results from tests in different settings. This would allow us to model the impact of the varying number of tests over this period as a possible explanation for the data.
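As a very rough sketch of why testing volumes matter, consider the share of tests that come back positive. The numbers below are invented purely for illustration (they are not UK figures), and the variable names are my own.

```python
# Hypothetical (tests carried out, positive results) for three successive days.
# Invented numbers: reported cases stay flat while testing volume doubles.
daily = [
    (10_000, 5_000),
    (15_000, 5_000),
    (20_000, 5_000),
]

# Share of tests that were positive on each day.
positivity = [positives / tests for tests, positives in daily]

for (tests, positives), rate in zip(daily, positivity):
    print(f"{tests:>6,} tests, {positives:,} positive: {rate:.0%}")
```

In this made-up example the headline case count does not move at all, yet the share of positive tests halves. Without the testing numbers, and ideally the split by setting, there is no way to tell a flat epidemic apart from a shrinking one that is being searched for more thoroughly.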
But – as unpopular as this might be – the government also has to seriously consider the possibility that the data cannot wholly be explained by our scramble to test. That would mean there are other factors, including the effectiveness of our measures (either their design or the public’s compliance with them). This could lead to amending the measures themselves and the way in which they are enforced.
What we know is that the UK data makes clear we have a problem. More transparency may help us understand it, even if that data highlights shortcomings in the way this pandemic has been handled. No one is perfect and mistakes will surely have been made, and I am not interested in pointing fingers by this statement. But like no other crisis, we all play a part in navigating this, and the last thing we need is to be trying to do that wearing a blindfold.
Notes and references:
(1) All case and death numbers by date and country in these charts are sourced from the data published under a Creative Commons license on the EU Open Data Portal here: https://data.europa.eu/euodp/en/data/dataset/covid-19-coronavirus-data/resource/55e8f966-d5c8-438e-85bc-c7a5a26f4863.
Date of lockdown for each country has been sourced from various news sources.
Country populations are provided by the EU Open Data Portal and are based on 2018 data.
(2) Italy and Spain deaths have R-squared values of 80% and 62% respectively for negative coefficient exponential trend lines. There appears to be little bias when residuals are considered.
(3) Best-fit trend lines using exponential, linear, logarithmic and power curves give optimal R-squared values of less than 8% (and just 3% for the UK).
(4) R-squared values of 69% (Italy), 66% (Spain), 60% (Germany) and 28% (France).
(5) The linear fit for the UK has an R-squared value of 0.2%, which is higher than is achievable using exponential, logarithmic or power series trend lines.