In my last post I mentioned the new language Julia. It deserves more than a single paragraph, so I thought I’d walk through a problem, and tackle it with the language.
The problem is a stock standard one: investigating the fitting of an SIR model of disease spread to an outbreak of influenza at an English boarding school. Of the 763 pupils, on the first day only one was infected, and the number of infected pupils changed over 14 days as:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 3 6 25 73 222 294 258 237 191 125 69 27 11 4
We will try to find parameters and so that when the equations
are solved, the values of fit the data.
Note that the sum of all the right hand sides is zero; this indicates that is constant, which for this model is the population. (Other models allow for births, deaths and other population changes).
Finding the parameters can be done in several steps:
- Set up a function for the system of differential equations.
- From that function, create a second function which is the sum of squared differences between the data, and the computed values from the equations.
- Minimize that sum.
The first function can be created in Julia (following the example here) as:
function SIR(t,x,p) S = x I = x R = x beta = p gamma = p [-beta*S*I, beta*S*I-gamma*I, gamma*I] end
The default behaviour of functions in Julia is to return the line before the “end” statement; thus in a function such as this one, there is no need for a “return” statement (although one exists).
To solve a system of equations we need access to the ODE package, which I’ll assumed you’ve installed with
This can be loaded into a running Julia session either with
which adds all the ODE functions into the top namespace, or as
which adds all the functions in the ODE namespace. So to call the Dormand-Prince method, you would use
if all methods were in the top namespace, or as
in the second case.
So, for example, we could attempt to solve the equations as:
p = [0.01, 0.1] t, x = ODE.ode45((t,x)->SIR(t,x,p),[0:0.1:14],[762.,1.,0.]);
To plot the functions, we’ll use the plotting library Winston (Winston, Julia from 1984 – geddit?), which is fairly basic, but enough for our simple needs:
import Winston wn = Winston wn.plot(t,x[:,1],"b",t,x[:,2],"g",t,x[:,3],"r",[1:14],data',"k*")
The green curve corresponds to , the number of infected individuals at a given time, and the asterisks to the data. Clearly these parameters do not provide a very good fit.
The next step is to create a sum of squares function to minimize. To do this, we will solve the ODE system with the ode4 method which uses a fixed step size. This means we can ensure that there are computed values for at all the integer points:
function ss(b::Vector) data = [3 6 25 73 222 294 258 237 191 125 69 27 11 4]; t,x = ode.ode4((t,x)->SIR(t,x,b),[0:0.05:14],[762.,1.,0.]); sum((data-x[21:20:281,2]').^2) end
Now this function can be optimized, using a method from the Optim package:
import Optim Optim.optimize(ss,[0.001,1],method=:cg)
This produces quite a bit of information, of which the salient bits are:
* Minimum: .0021806376138654117 .44517197444683443 * Value of Function at Minimum: 4507.078964
So, let’s try these parameters and see what happens:
p = [.0021806376138654117, .44517197444683443]; t, x = ODE.ode45((t,x)->SIR(t,x,p),[0:0.1:14],[762.,1.,0.]); wn.plot(t,x[:,1],"b",t,x[:,2],"g",t,x[:,3],"r",[1:14],data',"k*")
As you can see, the fit is remarkably good.
As far as problems go, this is not particularly hard – conceptually speaking it’s probably at an undergraduate level. On the other hand, it’s not completely trivial, and does seem to me to give the computational environment a bit of a work-out. And Julia solves it with ease.