# Category Archives: Computation

## Number theoretic computations in C

Recently I’ve been experimenting with pseudo random number generators (PRNGs), and I’ve been doing this in C. This is a big step for me, as most of my computational work over the past few decades has been in high level computing environments (Matlab, Sage, Maxima etc). And with a PRNG one wants to generate lots of values, so that statistical tests can been performed. I am particularly interested in using the TestU01 library, which is coded in C. I was idly wondering whether the following recurrence:

$x_i = b^{x_{i-1}}\pmod{p}$, $u_i=x_i/p$

with

$b = 7,\quad p=2^{31}-1,\quad x_0 = 7$

would make a reasonable PRNG. (It doesn’t: it’s too slow, and fails some tests). But this led me into some programming for testing. First, here it is in C, using the multiple precision GMP library:

/*
A program to compute some values x[i] = a^x[i-1] mod p, with a = 7, p = 2^31-1.

Compile with:
gcc modpowers.c -o modpowers -lgmp -std=gnu99 -g
*/

#include <stdio.h>
#include "gmp.h"
#define pp 2147483647

mpz_t y,b,p;
static mpz_t x;

unsigned long power(void)
{
mpz_powm(y,b,x,p);  // Set y = b^x mod p
mpz_set(x,y);       // x = y
return (mpz_get_ui(x));
}

int main()
{
mpz_init_set_ui(b,7);  // Initialize values for use in GMP functions
mpz_init_set_ui(p,pp);
mpz_init_set_ui(x,7);
mpz_init(y);

for (int i = 0; i < 1000000; i++)
{
unsigned long t = power();
if ((i % 100000) == 0)
printf("%10i,%10lu\n",i,t);
}

return 0;
}


And here is its timing, after compiling and running:

$time ./modpowers 0, 823543 100000,1919783493 200000, 198289041 300000,1263498529 400000, 887186099 500000, 944134154 600000,1803260766 700000, 555710010 800000, 469238760 900000,1834709953 real 0m0.904s user 0m0.900s sys 0m0.000s  So: a million values in less than a second. Now, because I’m not very sure of my C skills, I often use Sage to check whether I’ve implemented a function properly in C. So here’s Sage doing the same thing: def printpowers(n): a,p = 7,2^31-1 for i in range(10^n): a = power_mod(7,a,p) if mod(i,10^(n-1))==0: print i,a  We can check that printpowers(6) gives the same results as the output of the C program above. But: sage: timeit('printpowers(6)',repeat=1) 5 loops, best of 1: 73.8 s per loop  As you see, the time is far greater. Now, I know I can write code in Cython, and use that within Sage, but sometimes it’s nice – especially given some highly optimized and specialized libraries for C – to be able to work directly with C. Anyway, I’m enjoying the huge speed of compiled code. ## A piecewise polynomial approximation to the normal CDF The normal cumulative distribution function $\displaystyle{\Phi(z)=\frac{1}{\sqrt{\pi}}\int^z_{-\infty}e^{-x^2}\;dx}$ has the unfortunate property of not being expressible in terms of elementary functions; that is, it can’t be computed with functions available on a standard scientific calculator. Some modern calculators, such as the Casio ClassPad and the TI-Nspire (and some of their predecessors) can indeed evaluate this function, but for students without such calculators, they need to look up tables – surely an old-fashioned method! To overcome this there have been masses of approximations produced over the years; one of the simplest is the logistic approximation: $\displaystyle{\Phi(z)\approx 1-\frac{1}{1+e^{1.702z}}\mbox{ for }z\ge 0.}$ Note that the integrand is an even function, which means that $\Phi(z)=1-\Phi(-z)$ if $z<0$, and so we only have to find an approximation for positive $z$. However, there is a conceptually even easier function, which dates back to a 1968 paper by John Hoyt: $\displaystyle{\Phi(z)\approx\left\{\begin{array}{ll}\displaystyle{z\left(\frac{9-z^2}{24}\right)+\frac{1}{2}}&\mbox{if }0\le z\le 1\\[5mm] \displaystyle{\frac{(z-3)^3}{48}+1}&\mbox{if }z\ge 1\end{array}\right.}$ Here is a plot of the two of them together (created using Maxima and gnuplot): and here is a plot of their difference: In fact the absolute difference is 0.01 or less for $0\le z<3.78$. This piecewise polynomial function provides a nice quick-and-dirty approximation to the normal CDF. ## “Matlab clones” for Android Always on the lookout for portable computing environments, I’ve discovered two for Android: Addi, and Mathmatiz. Both are “matlab-like” in that they act like Matlab, and they both support a subset of Matlab’s standard functions. Addi is based on the JMathLib engine, and so its full list of functions is available. I can’t find a similar set for Mathmatiz; the set given here is in fact only a small sample. Here are pictures of them both, first Addi: and then Mathmatiz: The picture of Addi comes from a good review at Walking Randomly. (This is a blog you should certainly check out, if you don’t already). Of the two, Addi seems to have the more powerful set of commands: just playing around I discovered that Addi implements fft and randn (the latter providing a matrix of normally distributed random elements); neither of which are in Mathmatiz. However, Mathmatiz does seem to have the better interface. It uses a fixed width font, of which you can choose the size, and has a history mechanism: you just touch on a previous input, and a popup menu asks if you want to copy the expression or the result, or delete it. It also has a cursor. Addi uses a variable width font, which makes editing extremely difficult (placing something between a sequence of parentheses can be frustrating), and a very limited set of options: the menu key brings up three items: Create M-file, Open M-file, and Preferences. The Preferences item, when touched, brings up just one item – a tick box to indicate whether or not you wish to use the custom keyboard. Addi however has one very nice interface feature: when you type in a letter, a horizontal menu appears showing all commands beginning with that letter. The more letters you enter, the shorter the menu becomes. This makes entering commands very easy. Editing them (to include values) is however difficult, owing to the font and lack of cursor. There are also some commands which appear in the menu list but which are not yet implemented, such as rref, and ifft (the latter is odd, given fft). Neither Addi nor Mathmatiz currently support matrix powers – even integer powers – and neither seem to support the handy Matlab left division A\b for solving linear systems. It seems to me that neither system yet is perfect, or even fully usable. What I’d like is a hybrid system which uses Addi’s power, the interface of Mathmatiz, and the functions list from Addi. (This idea has already occurred to the developer of Addi; see the comments here.) I’d also like to see some decent documentation. However, as Matlabs for road-warriors, they both show extraordinary promise. ## MISG day 5 I didn’t explain the problem well enough in my last post: AGL, the energy supply company, are trying (among other things) to create several “reference profiles” of energy use, to which customers can be compared. This will give them a better idea about supply and demand, and allow them to more accurately forecast energy requirements. This was a problem more statistical than mathematical, and people were throwing all sorts of clustering analysis techniques (about which I know nothing) at it. However, an averaging plot showed two parallel “clumps” of data, which a single curve of best fit couldn’t describe. Since I know nothing about statistics, but quite a lot about imaging, I had the idea of treating each customer plot as an image: Notice the two clumps at the bottom left – this is an example of a plot to which a curve of best fit could not be usefully applied. The image has been subdivided into unequal quadrants, of which the upper left and right parts are the energy usage during hot and cold temperatures. The vertical dividing line is at 20C = 68F. The horizontal line is one-quarter from the bottom. This seemed (without any analysis) to be a good place for cutting off “standard” energy usage (below the line) to “high” usage (above the line. The customers can then be classified by choosing thresholds for “high hot” and “high cold” usage, and clustering them depending on whether they have used more or less than the threshold value. The nice thing about his approach is that it’s both very easy and very transparent, and also can be adjusted; for example using finer grids on the image, by subdividing the original data into times of day, or into weekdays/weekends (or both), or by using a better approach than “by eye” to determine the cutoff point between standard and high usage. The workshop ended before I could explore this idea any further… so I might play around with this on my own some more. ## MISG day 3 This week I’m at the Mathematics and Statistics Industry Study Group, a yearly event where companies from Australia and New Zealand provide problems for the assembled mathematicians and statisticians to solve. I’m in a crowd of people working on a problem provided by the energy company AGL. They have given us electricity smart meter data for 95 customers (there is a larger data set with 900 customers) of electricity readings taken every 30 minutes over a 199 day period. That’s 9552 observations for each customer! You can read the problem here. So far everybody has (in the words of one participant) “thrown every imaginable package at the data and drawn lots of pretty pictures”. Other folk are using Excel, Minitab, R and other statistics packages; I’m using Octave. Today we have decided to break down the data into times of day: night (0000 – 0600), mornings (0600 – 0900), daytime (0900-1700) and evening (1700 – 2400), and to look at electricity usage versus temperature over those times, and over the course of the 199 days, which go from winter to summer. We can also break down the data into weekdays, and weekends. Here is an example of one data plot: a customers usage by temperature (which is the horizontal axis and given in celsius, so the upper value, 40, is very hot: 104F). This customer’s usage, as you can see, is skewed up to the right (where the temperature is hotter), which would seem to indicate the use of an air-conditioner. Other such plots are skewed to the left (colder temperatures) which would appear to indicate the use of heating in cold weather, but no a/c in hot. Other plots have humps at both ends. A few people have been trying to fit quadratic functions to these plots, without a huge amount of success, as the correlation is so low. If you simple plot the data of a single customer over the entire data period you can see some interesting stuff. Here for example, is one data set: Notice the “break” towards the right (this corresponds to early summer). We can infer that this customer went away at that time, and the residual power usage would have been for refrigerator, maybe security lights etc. I am not myself any closer to solving the problems proposed by AGL, but I’m getting a better handle on the data. I still intend to do some data smoothing, probably with a Gaussian filter, and see if it’s possible to classify customers based on the outputs. Or I might head off to the staffroom in search of tea. ## Easy Lagrangian interpolation As everybody knows, the formula for Lagrangian interpolation though points $(x_i,y_i)$, where all the $x_i$ values are distinct, is given by $\displaystyle{P(x)=\sum_{i=0}^n\left(\prod_{j\ne i}\frac{x-x_j}{x_i-x_j}\right)y_i}$ It is easy to see that this works, as the product is equal to zero if $x=x_k\ne x_i$ and is equal to one if $x=x_i$. This isn’t particularly easy to program, and in fact methods such as Newton’s divided differences or Neville’s algorithm are used to obtain the polynomial. But a method introduced by Gabor Szegö provides a very simple form of Lagrange’s polynomial. Start with defining $\pi(x)=(x-x_0)(x-x_1)\cdots(x-x_n)$. Then $\displaystyle{\frac{d}{dx}\pi(x)=\sum_{i=0}^n\Biggl(\prod_{\substack{0\le j\le n\\j\ne i}}(x-x_j)\Biggr)}.$ This is obtained immediately from the product rule for arbitrary products, and using the fact that the derivative of each individual term of $\pi(x)$ is one. This means that in particular $\displaystyle{\pi'(x_i) = \prod_{\substack{0\le j\le n\\j\ne i}}(x_i-x_j)}$ because all other terms of $\pi'(x)$ contain an $(x-x_i)$ term and therefore produce zero for $x=x_i$. This means that the coefficient of $y_i$ in the Lagrangian polynomial can be written as $\displaystyle{\frac{\pi(x)}{(x-x_i)\pi'(x_i)}}$ and so the polynomial can be written as $\displaystyle{P(x)=\sum_{i=0}^n\frac{\pi(x)}{(x-x_i)\pi'(x_i)}y_i}$. Here is an example, with $x=-2,1,3,4$ and $y=-3,-27,-23,3$. First, Maxima: (%o1) xs:[-2,1,3,4]; (%o2) ys:[-3,-27,-23,3]; (%o3) define(pi(x),product(x-xs[i],i,1,4)); (%o4) define(pid(x),diff(pi(x),x)); (%o5) define(P(x),sum(pi(x)/(x-xs[i])/pid(xs[i])*ys[i],i,1,4)); (%o6) ratsimp(P(x)); $x^3-11x-17$ Now Sage: sage: xs = [-2,1,3,4] sage: ys = [-3, -27, -23, 3] sage: pi(x) = prod(x-i for i in xs); pi(x) (x - 4)*(x - 3)*(x - 1)*(x + 2) sage: pid(x) = diff(pi(x),x); sage: P(x) = sum(pi(x)/(x-i)/pid(i)*j for (i,j) in zip(xs,ys)) sage: P(x).collect(x) x^3 - 11*x - 17  And finally, on the TI-nspire CAS calculator: $n:=4$ $xs:=\{-2,1,3,4\}$ $ys:=\{-3,-27,-23,3\}$ $\displaystyle{{\rm Define}\quad q(x)=\prod_{i=1}^n(x-xs[i])}$ $\displaystyle{qs:={\rm seq}\left(\left.\frac{d}{dx}(q(x))\right|x=xs[i],i,1,n\right)}$ $\displaystyle{{\rm Define }\quad p(x)=\sum_{i=1}^n\left(\frac{q(x)}{(x-xs[i]) \cdot qs[i]}ys[i]\right)}$ $p(x)\hspace{20mm} x^3-11\cdot x-17$ Note that all the mathematical symbols and programming constructs are available through the calculator’s menus. Notice that this approach requires no programming with loops or conditions at all – it’s all done with simple sums and products. ## Computing from the command line Like many die-hard, old-school Linux users I have an inordinate fondness for the command line, and would rather use a command line interface than a snazzy GUI. This post is about how to use the command line to perform simple arithmetic computations, and a few examples of slightly more advanced mathematics. There are many times when I just want an immediate answer to a (usually simple) computation without powering up Octave or Sage, both of which use up sometimes overloaded memory. I’m assuming the use of the Bash shell. Use of expr For simple integer computations, the expr command is sufficient. Here are a few examples: home:$ expr $[3+4] 7 home:$ expr $[5**6] 15625 home:$ expr $[100/17] 5  Because the Bash shell only allows integers as input and output, a division will always be rounded down. Use of bc To obtain fractional output, and to use more mathematical functions, the command bc may be used. This should be available on all Linux distributions, and is a remarkably powerful calculator. See an old article of mine here for an introduction to it. Any expression can be piped to bc for computation, and if bc is invoked with bc -l, this loads a mathematics library which includes the functions a(x), s(x), c(x), e(x), l(x),j(n,x), for inverse tangent, sin, cos, exponential function, natural logarithm and Bessel function of order $n$ respectively. You can read the man page for bc online. So here are some examples of the use of bc: home:$ echo "1/3+4/7" | bc -l
.90476190476190476190
home:$echo "2^100" | bc 1267650600228229401496703205376 home:$ echo "scale=100;4*a(1)" | bc -l
3.14159265358979323846264338327950288419716939937510\
58209749445923078164062862089986280348253421170676


The value scale is an internal variable which gives the number of figures after the decimal point. Alternatively, we can use this syntax:

home:$bc -l <<< "pi=4*a(1);s(pi/5)" .58778525229247312915 home:$ bc -l <<< "e(l(100)/3)"
4.64158883361277889236


This last calculation, of course, uses $exp(x)$ and $ln(x)$ to compute the cube root of 100.

Because bc is a fully featured programming language, you can write programs to perform other functions: a huge number of such functions can be found here and here. Suppose we put a handful of useful functions in a file called stuff.bc and include in our file .bashrc the line

alias b="bc -l /path/to/stuff.bc"

Then we’re able to do things like this:

home:$b <<< "gcd(1001,161)" 7 home:$ b <<< "bin(100,50)"
100891344545564193334812497256
home:$b <<< "fact(100)" 9332621544394415268169923885626670049071596826438\ 1621468592963895217599993229915608941463976156518\ 2862536979208272237582511852109168640000000000000\ 00000000000  (assuming that our file contains commands for the greatest common divisor, binomial coefficients, and factorials.) Use of calc A newer and even more powerful computational language is provided by calc. It has far more built-in commands than bc, as you can see here. It can be used interactively (as can bc) but here we are interested in its use as a command-line calculator. Here’s a couple of examples: home:$ calc '100!'
933262154439441526816992388562667004907159682643816214685
92963895217599993229915608941463976156518286253697920827223758251
185210916864000000000000000000000000
home:$calc 'sin(pi()/5)' 0.58778525229247312917  Just as scale can be used to increase the number of decimal points in bc, so epsilon is used in calc for the real precision, and display changes the number of displayed digits. Here’s an example, to compute the almost integer value $e^{\pi\sqrt{163}}$. home:$ calc 'exp(pi()*sqrt(163))'
262537412640768743.99035456305787674737


This is not quite as close as it should be, so we’ll increase the internal precision, which at the moment is $10^{-20}$:

home:$calc 'log(1/epsilon())' 20 home:$ calc 'epsilon(1e-40);exp(pi()*sqrt(163))'
0.00000000000000000001
~262537412640768743.99999999999925007260




## A note on regression

In Matlab and similar matrix oriented systems such as Octave and Scilab, the backslash operator can be used to solve systems of linear equations. Thus to solve, for example

$x-y+2z=3$

$2x+2y-z=-1$

$x-3y+z=2$

we would enter:

>> A = [1 -1 2;2 2 -1;1 -3 1]
>> b = [3;-1;2]
>> A\b
ans =

0.28571
-0.14286
1.28571


If the matrix $A$ is not square, then the backslash operator solves the associated linear least squares regression problem, using the equation

$(X^TX)A=X^Tz$

where $X$ is the matrix containing the independent variables, $z$ is the matrix containing the dependent variables, and $A$ are the coefficients of the best fit linear function obtained using least squares.

For example, suppose we create random array of $x$ and $y$ and a variable $z$ which is approximately equal to $a_1+a_2x+a_3y$:

>> N = 5
>> [x,y] = meshgrid(1:N);
>> z = randn(N)+3*(randn()*x+randn()*y);
>> X = [ones(N^2,1),x(:),y(:)]
>> Z = z(:);
>> A = X\Z
A =

0.97722
2.99300
0.70215


To show that these values are in fact the same as that obtained with $(X^TX)A=X^Tz$:

>> inv(X'*X)*X'*Z
ans =

0.97722
2.99300
0.70215


Now we can see how close the best fit values are to the $z$ values:

>> > r = A(1)+A(2)*x(:)+A(3)*y(:);
>> [z(:),r,abs(z(:)-r)]
ans =

4.780556    4.672364    0.108192
6.130427    5.374509    0.755918
7.459690    6.076654    1.383036
7.336100    6.778800    0.557300
4.672154    7.480945    2.808791
7.323456    7.665366    0.341910
9.637936    8.367511    1.270425
10.092996    9.069656    1.023340
11.415496    9.771802    1.643694
8.659346   10.473947    1.814601
11.073199   10.658368    0.414831
10.165530   11.360513    1.194983
12.077467   12.062658    0.014809
11.450084   12.764804    1.314720
12.881417   13.466949    0.585532
12.402667   13.651370    1.248703
14.885301   14.353515    0.531786
14.859297   15.055660    0.196363
16.480020   15.757806    0.722215
16.656739   16.459951    0.196788
15.183698   16.644372    1.460674
16.990119   17.346517    0.356398
18.229809   18.048662    0.181147
19.423194   18.750808    0.672387
21.299761   19.452953    1.846808


Here’s a quick proof of the formula $(X^TX)A=X^Tz$. Suppose we have variables $x_i$, $y_i$ and $z_i$ and we wish to express $z$ as a linear function of $x$ and $y$:

$z = a+bx+cy$

in such a way that the sum of squares:

$S=\sum(a+bx_i+cy_i-z_i)^2$

is minimized. To do this we simply differentiate $S$ with respect to $a$, $b$ and $c$, and solve the system in which each derivative is set equal to zero.

Differentiating with respect to $x$, and dividing by two, produces

$\sum(a+bx_i+cy_i-z_i)x_i = 0$

Similarly ($dS/db=0$):

$\sum(a+bx_i+cy_i-z_i)y_i = 0$

and for $dS/dc=0$:

$\sum(a+bx_i+cy_i-z_i) = 0$

This can be written in matrix form as:

$\displaystyle{\left[\begin{array}{ccc}\sum x_i^2&\sum x_iy_i&\sum x_i\\ \sum x_iy_i&\sum y_i^2&\sum y_i\\ \sum x_i&\sum y_i&\sum 1\end{array}\right]\left[\begin{array}{c}a\\b\\c\end{array}\right]=\left[\begin{array}{c}\sum x_iz_i\\ \sum y_iz_i\\ \sum z_i\end{array}\right]}$

Now define the vectors:

$\mathbf{x}=[x_1\; x_2\;\ldots\; x_n]^T$

$\mathbf{y}=[y_1\; y_2\;\ldots\; y_n]^T$

$\mathbf{z}=[z_1\; z_2\;\ldots\; z_n]^T$

$\mathbf{1}=[1\; 1\;\ldots\; 1]^T$

And so the matrix equation above can be written as

$\displaystyle{\left[\begin{array}{ccc}\mathbf{x}^T\mathbf{x}&\mathbf{x}^T\mathbf{y}&\mathbf{x}^T\mathbf{1}\\ \mathbf{x}^T\mathbf{y}&\mathbf{y}^T\mathbf{y}&\mathbf{y}^T\mathbf{1}\\ \mathbf{x}^T\mathbf{1}&\mathbf{y}^T\mathbf{1}&\mathbf{1}^T\mathbf{1}\end{array}\right]\left[\begin{array}{c}a\\b\\c\end{array}\right]=\left[\begin{array}{c}\mathbf{x}^T\mathbf{z}\\ \mathbf{y}^T\mathbf{z}\\ \mathbf{z}^T\mathbf{1}\end{array}\right]}$

This can be written as

$\displaystyle{\left[\begin{array}{c}\mathbf{x}^T\\ \mathbf{y}^T\\ \mathbf{1}^T\end{array}\right]\left[\begin{array}{ccc}\mathbf{x}^T& \mathbf{y}^T& \mathbf{1}^T\end{array}\right]\left[\begin{array}{c}a\\b\\c\end{array}\right]=\left[\begin{array}{c}\mathbf{x}^T\\ \mathbf{y}^T\\ \mathbf{1}^T\end{array}\right]\left[\begin{array}{c}\mathbf{z}\end{array}\right]}$

This is clearly the same as the equation $(X^TX)A=X^Tz$, and is obviously true for any number of independent variables.

## Ruth-Aaron pairs

On May 25, 1935, at Forbes Field in Pittsburgh, Babe Ruth hit his 714th (and last) home run, setting a record for major league baseball which would stand unchallenged for nearly 40 years. The record was broken on April 8, 1974 by Hank Aaron who hit his 715th home run. (He went on to hit a total of 755 home runs. This record has since been broken by Barry Bonds with 762 home runs, and those three: Ruth, Aaron, Bonds, are the only three players ever to hit more than 700 home runs in major league baseball.)

Not long after Aaron’s 715th homer, Carl Pomerance, then at the University of Georgia, and two others, noticed two facts about the numbers 714 and 715:

1. The two numbers between them factored into the first seven primes:

$714 = 2\cdot 3\cdot 7\cdot 17$

and

$715 = 5\cdot 11\cdot 13$.

2. The sums of the prime factors of each were the same:

$2+3+7+17 = 5+11+13 = 29.$

Pomerance and his colleagues wrote a paper called “714 and 715” which he later described as “humorous” discussing these numbers. This paper, had the effect of catching the eye of Paul Erdős, who suggested that he visit Pomerance in Georgia, thus starting a long and fruitful collaboration between the two (at least 40 joint papers).

In honour of the two baseball players, two consecutive integers the sum of whose prime factors (with multiplicity) are equal, are called Ruth-Aaron pairs. It’s not hard to set up a brute force program to list them:

sage: def S(n): return sum(i*j for i,j in factor(n))
....:
sage: for k in range(1,100000):
....:     if S(k)==S(k+1):
....:         print k
....:
5
8
15
77
125
714
948
1330
1520
1862
2491
3248
4185
4191
5405
5560
5959
6867
8280
8463
10647
12351


… and so on. This sequence is A039752 in the OEIS.

It is not yet known whether there are infinite Ruth-Aaron pairs, but it is known, thanks to Pomerance and Erdős, that they are quite rare; in fact Pomerance proved that for any integer $x$, the number of integers $n for which $S(n)=S(n+1)$ was bounded as

$\displaystyle{O\left(\frac{x(\log\log x)^4}{(\log x)^2}\right)}$

and that the sum of the reciprocals was bounded.

In 1995 Emory University awarded honorary degrees to both Aaron and Erdős. Pomerance, who was then at Emory, asked for a baseball – several had been provided for Aaron to sign – and asked both Aaron and Erdős to sign it. “I joke that Aaron should have Erdős-number 1 since, though he does not have a joint paper with Erdős, he does have a joint baseball.”

## The best Matlab alternative

Matlab is probably as close as you’ll find these days to an international standard for numerical computations. It seems to be taught at almost all universities in one form or another, is loved by engineers, and contains many thousands of lines of highly optimised code. And as well as the base package, there are lots of toolboxes: add-ons which provide functionality in particular areas: image processing, data acquisition, curve fitting and lots more.

Matlab’s interface, general ease of use, power, and extensibility have made it deservedly popular, and it has spawned a vast publishing industry.

As a teaching tool, though, it suffers from one major defect: it’s very expensive. And the add-on toolboxes add to its cost.

There’s therefore the need of a low-cost – preferably open-source – alternative, which can be used by students as a sort of drop-in replacement for experimentation at home, or on their own laptops.

Here is my list of requirements for such an alternative:

1. It must install under Windows. This is absolutely necessary for me, at least, since most of my students use Windows. A very few use OSX, and hardly any use Linux.
2. It must, as far as possible, be identical in use with Matlab. The commands should be the same, and simple scripts and functions should transfer between the two with no effort.
3. It should have a reasonably nice interface, as similar as possible to Matlab’s own. A basic command line only interface running in a terminal won’t cut it with my students.

Note that I’m personally satisfied with a lot less. I have, for example, no problems with a terminal text-based interface and in fact do most of my own computing from just such. However, I’m interested in making things as easy as possible for my own students, many of whom are not hugely computer-savvy.

The alternatives

There are three alternatives: Octave, Freemat, and Scilab. A little bit about each one.

Octave (more properly GNU Octave) has been around since about 1998, or 1992, depending on how you measure it, and was conceived and initially developed by John Eaton at the University of Wisconsin–Madison to support a course in chemical reactor design. It’s named after one of Eaton’s professors, Octave Levenspiel, who apparently had a genius at “back of envelope” calculations. Anyway, you can read about it on its wikipedia page. As of now, it is up to version 3.2.4, and is a highly mature product, with an emphasis on Matlab compatibility, and supported by an army of users and developers. There are also many add-on packages.

Freemat has been in development since about 2004, mainly by one person, Samit Basu, with help from some others. It seems to have sprung into life very quickly. I can’t find anything online about its history or provenance, but my guess is that it’s a fork or the continuation of some older project. It exists in forms for windows, Linux and MacOS, and has the most Matlab-like interface of all of them. It has a small wikipedia page.

Scilab is in some ways the worthiest alternative to Matlab, in terms of raw power, development (mostly at INRIA, France), and add-on packages. It also has installers for Windows, MacOS and Linux.

There are of course several good comparisons of these software tools, of which “A Comparative Evaluation of Matlab, Octave, Freemat, and Scilab for Research and Teaching” by Neeraj Sharma and Matthias K. Gobbert from the University of Maryland is probably the most far reaching. You can read it here. They don’t make any conclusions as such, but one or two comments are made, such as

“The syntax of Octave and FreeMat is identical to that of Matlab in our tests. However, we found during our tests that FreeMat lacks a number of functions, such as kron for Kronecker products, pcg for the conjugate gradient method, and mesh for three-dimensional plotting. Otherwise, FreeMat is very much compatible with Matlab. Even though Scilab is designed for Matlab users to smoothly utilize the package and has a m-file translator, it often still requires manual conversions.

The tests in this work lead us to conclude that the packages Octave and FreeMat are most compatible with Matlab, since they use the same syntax and have the native capability of running m-files. Among these two packages, Octave is a significantly more mature software and has significantly more functions available for use.”

For numerical solutions of differential equations, the authors state:

“Matlab, Octave, and Scilab have state-of-the-art variable-order, variable-timestep methods for both non-stiff and stiff ODEs available, with Matlab’s implementation being the richest and its stiff solvers being possibly more efficient. FreeMat is clearly significantly weaker than the other packages in that it does not provide a state-of-the-art ODE solver, particularly not for stiff problems.”

Another nice comparison (with lots of screenshots) is at http://www.dedoimedo.com, and is a 2010 discussion on scientific computing. Here’s the conclusion, to save you the trouble of actually opening a web page yourself:

“All three programs have their merits. For Windows users, the simplest choice is FreeMat, while Octave is the most powerful and best run on Linux. Scilab works better on Windows, but it is not fully compatible with Matlab language and requires more effort to master, while leveraging these disadvantages with the fleet of toolboxes and Scicos.

The best thing is, you can use them all together. But if you want to be picky, then I’d recommend you start with FreeMat and Octave and move on to Scilab when you gain enough expertise. “