C-19: Predictive modelling to better understand (and overcome) this pandemic

Updated: May 11, 2020

In a previous article we explored how COVID-19 testing may unknowingly provide a misguided and underestimated view of how the virus is spreading within the population of each country. In conducting that analysis, we arrived at a credible method to model the evolution of infections and deaths from a single wave of COVID-19 spread, and found that the growth is appropriately modelled by three distinct phases: initial growth, decline during lock-down, and a residual growth floor. Here we highlight the main growth statistics for a number of countries across these phases, in order to determine the viability of this approach for predictive modelling of the virus' trajectory and the impact of control interventions. In doing so we will also contrast how effective the approaches taken by each country have been thus far.

#COVID #statistics

TL;DR key-takeaways

  • It is possible to model the evolution of the COVID-19 pandemic by utilising a three-phase growth model (initial growth, growth taper, and longer term residual transmission)

  • The model fits well with existing datasets with good prediction accuracy, and is further reinforced by seeking agreement with actual mobility data available for each country. In all, this gives decision makers a useful tool to better plan different scenarios of control interventions to combat the spread of the pandemic

  • Existing data gathered from 20+ countries under study provides an insightful baseline for input assumptions required for modelling purposes

(1) Outline of the general approach

Within our daily COVID-19 reports we track the daily growth of infection and death data for each of the 20 or so countries under study. The datasets and projections are updated daily to reflect the latest statistics that are reported overnight. Those reading our reports should already be familiar with the structure of our reporting.

Within these reports we also conduct data fitting to determine an accurate numerical fit to the empirical data reported from each country. More specifically, we model the transmission factor, Tr, as a function of time in order to best fit the data, where:

New infections on day N = New infections on day N-1 × Tr
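
As a minimal sketch of this recurrence, the snippet below multiplies each day's count by that day's transmission factor. The function name `project_daily_cases` and the example factors are illustrative assumptions, not values taken from our reports.

```python
from typing import Iterable, List


def project_daily_cases(initial_cases: float, factors: Iterable[float]) -> List[float]:
    """Apply New[N] = New[N-1] * Tr[N] day by day.

    `factors` holds one transmission factor Tr for each day after the first.
    """
    cases = [float(initial_cases)]
    for tr in factors:
        cases.append(cases[-1] * tr)
    return cases


# Hypothetical example: three days of growth (Tr > 1), then three of decline (Tr < 1).
series = project_daily_cases(100, [1.3, 1.3, 1.3, 0.9, 0.9, 0.9])
print([round(x, 1) for x in series])
```

Daily counts keep rising for as long as Tr stays above 1 and only start to fall once Tr drops below 1, which is why the fitted Tr curve is the quantity we track.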

The relationship between daily flows and the evolving growth factor is best illustrated in the diagram below. Further below that is an excerpted summary of our daily report for Italy up to 13 April 2020.

Figure: modelled daily fit to actual reported data for Italy up to 13 April 2020

The three phases of growth are clearly visible given the change in transmission factor over time.

Figure: Analysis of the COVID-19 spread in Italy by number of deaths highlights three distinct phases of growth:

(1) initial growth doubling every 2.3 days, (2) a taper of -1.2% per day in transmission during lock-down, and (3) a residual transmission of 0.96, or halving every 17 days. [Updated with version from 18 April 2020]

  • All key important metrics are highlighted (e.g. total infected, testing error and delay, predicted deaths, all major growth statistics across phases)

  • Onset of reduced mobility / lock-down is further supported by seeking agreement with latest mobility data

  • Trailing 15-day trendline helps identify which phase of growth the country is in (linear or declining means Phase 3; exponential growth means still Phase 1 or 2)
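
The doubling and halving times quoted for Italy follow directly from the daily transmission factor. The sketch below shows the conversion, assuming the factor is held constant over the period; the function names are our own for illustration.

```python
import math


def factor_from_doubling_time(days: float) -> float:
    """Daily transmission factor that doubles daily counts every `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return 2.0 ** (1.0 / days)


def doubling_time(tr: float) -> float:
    """Days for daily counts to double at a constant factor tr > 1."""
    if tr <= 1.0:
        raise ValueError("doubling requires tr > 1")
    return math.log(2.0) / math.log(tr)


def halving_time(tr: float) -> float:
    """Days for daily counts to halve at a constant factor 0 < tr < 1."""
    if not 0.0 < tr < 1.0:
        raise ValueError("halving requires 0 < tr < 1")
    return math.log(0.5) / math.log(tr)


print(round(factor_from_doubling_time(2.3), 3))  # factor implied by doubling every 2.3 days
print(round(halving_time(0.96), 1))              # about 17 days, as quoted for Italy
```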

(2) Country comparison during the initial growth phase - Phase 1

We covered this topic in a previous article, so will only recap some brief highlights here. The tabulated results from that study, conducted with data up to 9 April 2020, identified a wide range of initial growth across countries - from 2.2 days to double, through to 5-7 days to double (see below). While it is clear what the difference in these figures means in terms of the scale, speed and magnitude of the COVID-19 calamity for each country, we are yet to form a robust opinion on why these figures vary so much from one country to another.

Tabulated results of fitting data across 3 growth phases (data fitting as at 9 April 2020)

(3) Country comparison during the taper phases - Phases 2 and 3

This is a primary focus of this article. Note: data fitting up until 14 April 2020.

While a lot of coverage is afforded by news media on the day to day growth of COVID-19 infections and deaths, there is a genuine lack of robust information and analysis on the effectiveness of control and lock-down measures conducted from country to country. Having traced the trajectory and conducted multiple iterative projections for some 20+ countries over the last 4-5 weeks, we are now in a position to investigate this.

The figures below summarise our model fits as at 14 April 2020 for the listed countries, where we normalise all models to start at the same day (Day 1), and begin the lock-down/taper at the same time (Day 10), in order to better visualise the effects of the different intrinsic growth characteristics of each. It is clear how countries are demarcated heavily by their initial growth rate, with countries such as Italy, Spain, US and UK showing high initial growth, whereas it is lower for Switzerland, Turkey and Hubei, and lower still for South Korea, Malaysia and Thailand.

Several interesting observations arise from this analysis;

  1. As mentioned previously, the initial transmission factor has a tremendous bearing on the peak of the curves. A rather small change in this factor has exponential implications - the peaks of the infection curves for Malaysia, Hubei and Switzerland are respectively 0.7%, 3.9% and 7.7% of the height of the peak for Italy.

  2. A higher transmission factor does not just mean a taller peak, but also a longer time to track down the taper of the curve. Notice how the decline gradient in the transmission factor does not vary greatly from country to country. In a physical sense this represents collective and public efforts on social distancing, lock-downs and other control measures put in place by each country; and, to put it bluntly, it cannot realistically be accelerated much further than it already has been, given the tremendous efforts by all the countries here. Ignoring South Korea and Thailand as outliers, we note a range of -1.9% to -0.9% decline per day, with median and mean estimates of -1.25% and -1.31% respectively. For countries such as Italy, Spain and the US this means a long, drawn-out battle to keep their economies locked down in order to taper the number of deaths and infections over quite some time.

  3. Relatively speaking, Spain seems to have clamped down harder than Italy and France; and while it has recently decided to gradually relax some of these controls (13 April 2020), it is perhaps doing so from a more stringent starting point. The UK and US lock-down efforts are in the same ballpark as Italy and France; whereas Asian countries like Malaysia and Thailand have not only had the privilege of lower starting growth rates, but their control efforts also seem to be more stringent in comparison (greater than -1.9% decline day on day).

  4. These taper efforts will eventually culminate in a combination of Phase 2 and Phase 3 effects, i.e. both a transmission factor decline and reaching the eventual floor - see diagram below. While Phase 3 estimates are at this point still relatively uncertain, we can already conduct a preliminary exploration of the long term transmission factor for countries like Spain [.945], Italy [.96], Hubei [.91], Belgium [.93] and Switzerland [.93] - the lower the figure, the more effective and 'stringent' the control is. It is clear that none of these European countries have applied the same stringency as was applied in Hubei back in February; which is understandable, yet at the same time concerning, given this would mean a longer, more drawn-out taper.

Figure: modelled transmission factors and resulting daily infection/death curves

Start with n=1 at day 1 and decline/lock-downs commencing on day 10
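
For readers who want to reproduce curves of this shape, here is a rough sketch of the normalised three-phase model. It assumes the taper is multiplicative (Tr shrinks by a fixed percentage of itself each day from Day 10) and that Tr is simply clamped at the floor; both are modelling assumptions on our part for this illustration, and the function name and the second country's parameters are hypothetical.

```python
from typing import List, Tuple


def three_phase_curve(
    tr_initial: float,
    taper_per_day: float,
    tr_floor: float,
    taper_start_day: int = 10,
    n_days: int = 120,
) -> Tuple[List[float], List[float]]:
    """Return (transmission factors, daily counts) for days 1..n_days.

    Phase 1: constant tr_initial before taper_start_day.
    Phase 2: from taper_start_day, Tr is multiplied by (1 - taper_per_day)
             each day (assumed form, e.g. 0.012 for -1.2% per day).
    Phase 3: Tr is held at tr_floor once the taper reaches it.
    """
    factors: List[float] = []
    counts: List[float] = []
    tr = tr_initial
    n = 1.0  # normalised: n = 1 on Day 1
    for day in range(1, n_days + 1):
        if day >= taper_start_day:
            tr = max(tr * (1.0 - taper_per_day), tr_floor)
        if day > 1:
            n *= tr  # New[N] = New[N-1] * Tr[N]
        factors.append(tr)
        counts.append(n)
    return factors, counts


# Italy-like inputs from this article: doubling every 2.3 days, -1.2%/day taper, floor 0.96.
_, high = three_phase_curve(2.0 ** (1 / 2.3), 0.012, 0.96)
# Hypothetical slower-growing country: doubling every 5 days, same taper and floor.
_, low = three_phase_curve(2.0 ** (1 / 5.0), 0.012, 0.96)

print("peak day (fast):", high.index(max(high)) + 1)
print("peak day (slow):", low.index(max(low)) + 1)
print("slow peak as % of fast peak:", round(100 * max(low) / max(high), 2))
```

With an identical taper and floor, the lower initial factor produces both an earlier and a far smaller peak, which is the effect the normalised figure is meant to show.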

Discussion and takeaways

Firstly, it is indeed possible to model the evolution of the COVID-19 spread, particularly under a single wave 3-phase scenario, where control measures are introduced some time after initial growth, and have the effect of tapering the infection and death curves into that of a logistic (S-curve) function. Judging by the results of fitting this to real world data, the model fitting is 'good enough' for predictive modelling purposes, and provides decision makers a reasonably robust method to predict outcomes under different scenarios. What is challenging however, is to determine accurate input assumptions for different growth factors that occur at different stages of the pandemic growth.

We have already covered initial growth (Phase 1) effects in a previous article; our focus here is therefore on the taper phases, i.e. Phases 2 and 3. It is important to reiterate, in case it is not obvious, that this growth taper is a representation of real world controls, for example social distancing and movement control policies put in place by governments in each country, together with any net effect of non-adherence. As such, (1) while it is shown as a discontinuous curve/function, in reality the change is gradual and continuous, and (2) there are practically no 'theoretical' limits from which to derive appropriate outside-in assumptions for these figures. On this latter point, what is most useful instead is comparing figures across different countries to empirically derive what these limits could be.

From the analysis, we see three groups of countries showing distinctly different behaviour across their Phase 2 developments (growth taper);

  1. Hard lock-down measures resulting in rapid reduction of the transmission factor at a rate of -(1.5-2%) or more per day. Countries like Thailand and Malaysia have taken a fairly hard stance on locking down - the mobility data we have gathered illustrates this point very well. So too have hard-hit countries like Spain and Belgium, particularly during the toughest periods where daily deaths peaked. Based on the data, mobility during this time reduced by up to 80-90% from baseline levels for activities such as retail and public transportation.

  2. Medium level lock-down measures registered a reduction in transmission factor of -(0.8-1.4%) per day; e.g. countries such as France, Italy, UK and US. This could be termed 'middle-ground' controls where lock-downs were not as harsh as the previous countries. On average a taper at this rate would require some 20-25 days to reach a factor <1; which by the way is consistent with duration of lock-downs implemented in these countries.

  3. Light or no lock-down measures, with reduction at rates lower than -0.8% per day. A few countries sit in this category, namely South Korea, India, Indonesia and Iran; four countries that have taken quite different approaches to facing the pandemic, with largely different outcomes. Note: see our daily reports for data on countries omitted from the table above. South Korea has taken a light lock-down approach to managing the pandemic; its citizens are still free to roam and show high mobility based on activity tracking data. As a result, the taper effects are less distinct, although no less effective given the heavy focus on testing that started early. The challenge for India and Indonesia is likely down to the high population in those countries, which makes it much harder for authorities to enforce lock-downs. Indeed, mobility data here shows distinctly higher mobility compared to other countries over the past month. As a result, tapering growth is likely much more challenging. Iran on the other hand has already endured a high number of deaths; some 4,700 at time of writing. For Iran it was a perfect storm; the country celebrated the Iranian New Year and went ahead with parliamentary elections during the course of the outbreak. The country refused an official lock-down until early April, and although it showed early signs of a mild taper, without official controls it was not fast enough.
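
To make the link between taper rate and lock-down duration concrete, the sketch below counts how many days of taper are needed before the transmission factor falls below 1. It assumes a multiplicative taper (Tr is multiplied by a fixed percentage each day), and the starting factor of 1.35 is an illustrative input, roughly equivalent to doubling every 2.3 days.

```python
def days_until_below_one(tr_initial: float, taper_per_day: float) -> int:
    """Count taper days needed before the transmission factor falls below 1.

    Assumes Tr is multiplied by (1 - taper_per_day) each day,
    e.g. taper_per_day = 0.012 for a -1.2% per day decline.
    """
    if not 0.0 < taper_per_day < 1.0:
        raise ValueError("taper_per_day must be between 0 and 1")
    tr = tr_initial
    days = 0
    while tr >= 1.0:
        tr *= 1.0 - taper_per_day
        days += 1
    return days


# Illustrative starting factor of 1.35 under a medium and a hard taper.
print(days_until_below_one(1.35, 0.012))  # 25 days at -1.2% per day
print(days_until_below_one(1.35, 0.020))  # 15 days at -2.0% per day
```

Under these assumptions a medium taper lands at the upper end of the 20-25 day range quoted above, while a hard lock-down shortens the wait considerably.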

As for Phase 3 (growth floor), the reported data at present is inconclusive for most countries as each has yet to reach this phase of evolution. Some early results, that can be used with a reasonable level of confidence, pertain to that of Spain [.945], Italy [.96], and Hubei [.91]; as each country is already in Phase 3 and has hit their respective floor levels of COVID-19 transmission. What is notable here;

  1. It is well publicised how stringent the control measures implemented in Hubei were during the process of cracking down on the virus (this is the province where the city of Wuhan is located); see the coverage by Time magazine. This is perhaps the limit of what is achievable by any country.

  2. For most countries, the residual transmission of the virus is likely to remain for some time - for example, at a transmission factor of 0.95, new infections after 10 days are still about 60% of what they were at the start (0.95^10 ≈ 0.60); a sizeable number if daily infections were in the order of 10,000 to begin with.

  3. Continued lock-downs would perhaps enable countries to reach a lower floor level of transmission, between 0.91 and 0.95. However, decision makers would need to carefully weigh up the cost vs benefit of doing so, as their economies take a beating.
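
The practical difference between floor levels can be gauged with a short calculation. The sketch below estimates how long daily infections take to fall from 10,000 to 100 at a constant residual factor; the target of 100 is a hypothetical threshold chosen purely for illustration, and the function name is our own.

```python
import math


def days_to_fall_below(start: float, target: float, tr_floor: float) -> int:
    """Smallest number of days n such that start * tr_floor**n <= target."""
    if not 0.0 < tr_floor < 1.0:
        raise ValueError("tr_floor must be between 0 and 1")
    if start <= 0 or target <= 0:
        raise ValueError("start and target must be positive")
    if target >= start:
        return 0
    return math.ceil(math.log(target / start) / math.log(tr_floor))


# Floor levels discussed in this article, with a hypothetical target of 100 per day.
for floor in (0.91, 0.945, 0.96):
    print(floor, days_to_fall_below(10_000, 100, floor))
```

At a floor of 0.91 this takes about seven weeks; at 0.96 it takes closer to sixteen, which is the trade-off decision makers face when considering how long to hold controls in place.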

Article extension (for those technically inclined): Exploring improvements to transmission factor modelling using a continuous logistic function (added 22 April)

We further explored the possible use of a continuous function to represent a smoother transition in the changes to transmission factor; represented by the following equation with appropriate constants in place.

The results of the regression fits for a number of countries are shown in figures below - note the difference between earlier transmission factor curves and these. It is quite clear from the results that the method indeed yields good results. However, for the time being we intend to maintain use of the discontinuous curves in our daily projection modelling for the following reasons:

  • The onset of the taper is more straightforward to input into the model vs. the variable w in the equation above which leaves some leeway to mimic physical mobility changes

  • Both approaches provide a good fit to real world data; the axiom 'faster and easier is better' therefore holds true

  • Likewise, handling errors in data reporting (delays, gaps), as well as discerning between potential waves of outbreak, is best done by eye (rather than by purely mathematical fitting), and thus a more flexible method is preferred

  • We will continue to monitor fitting with the current method and deploy the continuous curve if deemed necessary in the future
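
For illustration only, the sketch below shows one generic way to express a smooth, S-shaped transition of the transmission factor from its initial level to its floor. It is a textbook logistic form and not necessarily the equation used in our fits; the parameter names `midpoint_day` and `steepness` are assumptions introduced here and should not be read as the constants of that equation.

```python
import math


def logistic_transmission_factor(
    day: float,
    tr_initial: float,
    tr_floor: float,
    midpoint_day: float,
    steepness: float,
) -> float:
    """Smooth transition from tr_initial (early days) to tr_floor (late days).

    At day == midpoint_day the factor sits halfway between the two levels;
    a larger `steepness` gives a sharper, more step-like transition.
    Very large steepness * (day - midpoint_day) values can overflow math.exp.
    """
    return tr_floor + (tr_initial - tr_floor) / (
        1.0 + math.exp(steepness * (day - midpoint_day))
    )


# Hypothetical inputs: Italy-like levels with the transition centred on Day 25.
for day in (1, 10, 25, 40, 60):
    print(day, round(logistic_transmission_factor(day, 1.35, 0.96, 25, 0.2), 3))
```

Unlike the discontinuous three-phase curve, this form has no explicit taper onset day; the onset is implied by the midpoint and steepness together, which is part of why it is less direct to tie to observed mobility changes.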

Read more of our COVID-19 coverage at https://www.agility.asia/covid

