In my last post I mentioned the new language Julia. It deserves more than a single paragraph, so I thought I’d walk through a problem, and tackle it with the language.

The problem is a stock standard one: investigating the fitting of an SIR model of disease spread to an outbreak of influenza at an English boarding school. Of the 763 pupils, on the first day only one was infected, and the number of infected pupils changed over 14 days as:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 3 6 25 73 222 294 258 237 191 125 69 27 11 4

We will try to find parameters and so that when the equations

are solved, the values of fit the data.

Note that the sum of all the right hand sides is zero; this indicates that is constant, which for this model is the population. (Other models allow for births, deaths and other population changes).

Finding the parameters can be done in several steps:

- Set up a function for the system of differential equations.
- From that function, create a second function which is the sum of squared differences between the data, and the computed values from the equations.
- Minimize that sum.

The first function can be created in Julia (following the example here) as:

function SIR(t,x,p) S = x[1] I = x[2] R = x[3] beta = p[1] gamma = p[2] [-beta*S*I, beta*S*I-gamma*I, gamma*I] end

The default behaviour of functions in Julia is to return the line before the “end” statement; thus in a function such as this one, there is no need for a “return” statement (although one exists).

To solve a system of equations we need access to the `ODE` package, which I’ll assumed you’ve installed with

Pkg.add()

This can be loaded into a running Julia session either with

using ODE

which adds all the ODE functions into the top namespace, or as

import ODE

which adds all the functions in the `ODE` namespace. So to call the Dormand-Prince method, you would use

ode45

if all methods were in the top namespace, or as

ODE.ode45

in the second case.

So, for example, we could attempt to solve the equations as:

p = [0.01, 0.1] t, x = ODE.ode45((t,x)->SIR(t,x,p),[0:0.1:14],[762.,1.,0.]);

To plot the functions, we’ll use the plotting library Winston (Winston, Julia from 1984 – geddit?), which is fairly basic, but enough for our simple needs:

import Winston wn = Winston wn.plot(t,x[:,1],"b",t,x[:,2],"g",t,x[:,3],"r",[1:14],data',"k*")

The green curve corresponds to , the number of infected individuals at a given time, and the asterisks to the data. Clearly these parameters do not provide a very good fit.

The next step is to create a sum of squares function to minimize. To do this, we will solve the ODE system with the `ode4` method which uses a fixed step size. This means we can ensure that there are computed values for at all the integer points:

function ss(b::Vector) data = [3 6 25 73 222 294 258 237 191 125 69 27 11 4]; t,x = ode.ode4((t,x)->SIR(t,x,b),[0:0.05:14],[762.,1.,0.]); sum((data-x[21:20:281,2]').^2) end

Now this function can be optimized, using a method from the `Optim` package:

import Optim Optim.optimize(ss,[0.001,1],method=:cg)

The “method” variable here defaults to the Nelder-Mead method, but for this particular function I found that the conjugate-gradient method gave the best results.

This produces quite a bit of information, of which the salient bits are:

* Minimum: .0021806376138654117 .44517197444683443 * Value of Function at Minimum: 4507.078964

So, let’s try these parameters and see what happens:

p = [.0021806376138654117, .44517197444683443]; t, x = ODE.ode45((t,x)->SIR(t,x,p),[0:0.1:14],[762.,1.,0.]); wn.plot(t,x[:,1],"b",t,x[:,2],"g",t,x[:,3],"r",[1:14],data',"k*")

As you can see, the fit is remarkably good.

As far as problems go, this is not particularly hard – conceptually speaking it’s probably at an undergraduate level. On the other hand, it’s not completely trivial, and does seem to me to give the computational environment a bit of a work-out. And Julia solves it with ease.

Filed under: Computation, Software | 1 Comment »