This note has been written to obtain a simple and aware modeling for epidemics outbreak and, in particular, to follow the evolution of COVID-19 spreading in Italy. The methods adopted are briefly summarized. The starting observation is that the spreading of a epidemics is intrinsically a time-varying exponential growth model. This means that it can be realistically modeled with an exponential function with time-varying rate. This is compatible with the assumption that the time-scale for variation of the rate and contagion are separated. More specifically, it is expected that the rate changes on a slower time-scale than the contagion.
The above reasoning justifies the modeling of the number of infected people at given time with the following differential equation:
(1) |
where is the time-varying rate of exponential growth. This equation can be solved in closed form and the solution is expressed as:
(2) |
The other assumption that is compatible with the evolution of an epidemics is that in the long term the number of infected people must approach a saturation value. In this respect, it is very important to be able to forecast the time instant associated to such saturation.
Mathematically speaking, the above observation means:
(3) |
This is compatible with what has been observed for the disease evolution in China[1] starting from end of December 2019, as it is reported in fig.1(a). As one can clearly see, the contagion has an initial exponential growth in the early stage of the epidemics (more or less one month), then the growth rate decreases and the total number of cases approaches saturation around day 50. Analysis of this behavior suggested to use an appropriate logistic function to model . Namely, the following form is assumed:
(4) |
where are three fitting parameters. One can easily recognize that is a scaling factor, controls the initial exponential growth, and defines a delay time before saturation sets in. We observe that the asymptotic number of infected individuals is given by:
(5) |
The result of the fitting procedure shows very good agreement with measured data, as it is visible in fig. 1(a), predicting approach to saturation at day 54.
In this respect, according to (4), the saturation is reached when and, consequently, . A practical criterion is to require that:
(6) |
which can be easily understood by rewriting eq.(4) in terms of as:
(7) |
and observing that for one has . This is similar to what is done to estimate the decay time (, with playing the role of the maximum time constant) of the transient response for linear time-invariant dynamical circuits.
Another quantity of interest in monitoring an epidemic is the maximum growth rate, namely the maximum number of contagions per day and the associated prediction for the peak date . It can be easily shown that the maximum of the derivative is given by:
(8) |
Now, we observe that the function (4) can be expressed as:
(9) |
where obviously . Now, recalling equation (1), and computing the time derivative of the latter expression, one has:
(10) |
The latter term can be recast in terms of the function itself as follows:
(11) |
Thus, it is apparent that the differential equation which models the growth of fig.1 is:
(12) |
where one can clearly recognize the quantity that plays the role of the rate .
This model is amenable of epidemiological interpretation in terms of the so called mechanistic models, such as the SIR model[2] (Susceptible Infected Recovered). In fact, under the assumption that the whole population of individuals is exposed to contagion, one has that the dynamics of the number of infected and susceptible individuals is described by the following coupled equations:
where is the transmission rate (reciprocal of the mean time between contagions ), is the recovery rate (reciprocal of the mean time to recover from the disease or mean hospitalization time ). It is apparent that this model keeps the sum . An important parameter associated with this model is the basic reproduction number , defined as:
(16) |
which represents the average number of infected people by a single infected individual during time (infectious time interval) and plays a very important role in the early stage of an epidemic. As far as time advances, the effective reproduction number reduces proportionally to the susceptible individuals number and is usually denoted as .
If one assumes that separation of time scales is present between infection and recovery, namely , for long enough time the recovered population , which yields . By plugging the latter relation into eq. (14), one ends up with the equation for the sole infected population :
(17) |
where we have factored out in the last term and applied eq.(16). The latter equation, compared with eq. (12), suggests to perform the following identification of parameters of growth model (4),(9):
These equations can be used to estimate epidemiological parameters from the fitted model.
In order to test the predictive capability, this model has been applied to perform extrapolations based on data updated to day 32, 35, 40, 45 of the epidemics. The result is reported in fig. 1(b), which shows that the model is able to predict saturation around day 50 already two weeks before. This suggests that this model can be used to monitor the progress of the COVID-19 outbreak in Italy.
To this end, data collected daily by Department of Civil Protection[3] (DPC) are being used to monitor the behavior of the epidemics in Italy. The same independent fitting is applied for each single Region of Italy and the model parameters are extracted. In addition, by using formula (6), the expected saturation day has been estimated on the basis of contagion data. Figure 2 reports the situation of the 9 Regions with higher number of contagions as of March 12th, 2020.
Remarkably, the prediction for the whole nation (lower left corner of fig.2) reports an expected saturation around the beginning of April, slightly more than three weeks from now, consistently with the statement currently made by the DPC. It is also worth noting that the rate is very similar to that observed for China. Thus, following eq. (18) and considering a mean recovery time days (), one would derive a basic reproduction rate which is compatible with that suggested by Chinese epidemiologists[4] for COVID-19 outbreak in China.
NEW: Automated forecast tool (daily updated) available at
http://wpage.unina.it/mdaquino/covid19/auto/forecast.html
In the following, the latest updates based on daily collected data are presented for Italian regions and for selected world countries. The quantities listed in table 1 are extracted from data and reported in the figures.
quantity | description |
maximum number of infected people at the saturation of contagions , see , eq. (5) | |
expected day of contagion saturation, see eq.(6) | |
expected day of maximum growth rate (maximum number of positive cases per day), see eq.(8) | |
maximum contagions per day at day , see eq.(8) | |
In the figure below, the day-by-day forecast for the dates (red) and (blue) associated with the peak (maximum number of positive cases per day) and saturation (maximum number of total positive), respectively.
The standard deviation over the last 7 days is also computed for each Region. Values of (and smaller) indicate more stable forecast.
For example, the lower right panel shows that the growth rate peak in Italy would be located around end of March and the saturation of contagions is expected in the beginning of May. Regione Campania, where my hometown Napoli is located, after going through some oscillation, is also expected to reach the saturation in the first decade of May.
In the figure below, the day-by-day situation of positive cases in Italy and other countries (Spain, France, Germany, United States of America, United Kingdom) is reported for comparison. The forecasts for peak and saturation as well as the asymptotic number of total positive cases are reported on the figure panels.
In the figure below, the day-by-day forecast for the dates (red) and (blue) associated with the peak (maximum number of positive cases per day) and saturation (maximum number of total positive), respectively.
The standard deviation over the last 7 days is also computed for each country. Values of (and smaller) indicate more stable forecast.
In this section we present data analysis performed with appropriate fitting of the SIRD (Susceptible-Infected-Recovered-Deceased) model, which is an extension of model (13)-(15). By using this model, one can obtain more accurate forecast which overcome the drift of dates reported in figures 4 and 6.
To this end, we report in the following the time evolution of :total number of positive, referred to as , number of currently positive , number of recovered people , number of deaths .
Moreover, the quantities listed in table 2 are extracted from data and reported in the figures.
quantity | description |
maximum number of infected people at the saturation of contagions . | |
expected day of contagion saturation when . | |
expected day of maximum growth rate (maximum number of positive cases per day). | |
maximum number of recovered people at the saturation time . | |
expected day of recovered saturation when . | |
expected day of maximum growth rate (maximum number of recovered cases per day). | |
maximum number of deaths at the saturation time . | |
expected day of deaths saturation when . | |
expected day of maximum growth rate (maximum number of deaths per day). | |
maximum number of positive at day | |
expected day of maximum positive people (excluding recovered and deceased). | |
expected day of positive cases when 99% of recovered is reached. | |
In the figure below, the day-by-day situation of positive cases in United States and other EU countries (Spain, France, Germany, United Kingdom, Austria, Sweden, Netherlands, Belgium) is reported for comparison. The forecasts for quantities in table 2 are reported on the figure panels.
[1] European Center for Disease Control, https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide, March 12th.
[2] Kermack W.O., McKendrick A.G. (August 1, 1927). ”A Contribution to the Mathematical Theory of Epidemics”. Proceedings of the Royal Society A. 115 (772): 700-721. https://doi.org/10.1098/rspa.1927.0118.
[3] Department of Civil Protection, official data repository for COVID-19 outbreak, https://github.com/pcm-dpc/COVID-19/, March 12th.
[4] Li Q., Guan X., Wu P. et al., Early transmission dynamics in Wuhan, China, of novel coronavirus–infected pneumonia. N Engl J Med. 2020; (published online Jan 29.), https://doi.org/10.1056/NEJMoa2001316.
[5] Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis; published online Feb 19, https://doi.org/10.1016/S1473-3099(20)30120-1. Data available at https://github.com/CSSEGISandData/COVID-19.