Category Archives: Software

Easy Simpson’s rule

Simpson’s rule, as I’m sure all readers of this blog will know, is a simple rule for numerical integration (or quadrature). It can be obtained in two ways, either by determining the interpolating polynomial over three equally spaced points and integrating that polynomial, or by finding the values w_i for which

\displaystyle{\int_a^bf(x)\,dx = w_0f(a)+w_1f(a+h)+w_2f(b)}

(where h=(b-a)/2) is accurate for f(x)=1,x,x^2.  Over the interval [0,2]\, Simpson’s rule is

\displaystyle{\int_0^2f(x)\,dx = \frac{1}{3}\Bigl(f(0)+4f(1)+f(2)\Bigr)}

and over a general interval

\displaystyle{\int_a^bf(x)\,dx = \frac{b-a}{6}\Bigl(f(a)+4f(a+h)+f(b)\Bigr)}.

Simpson’s rule has the added advantage that it is also accurate for cubic polynomials.

On its own the rule is not hugely accurate, so one way of increasing the accuracy is to divide the interval into smaller, equally sized regions and apply Simpson’s rule to each of them. This provides the composite Simpson’s rule:

\displaystyle{\int_a^bf(x)\,dx \approx \frac{b-a}{3n}\Bigl(f(a)+4f(a+h)+2f(a+2h)+4f(a+3h)+\cdots\Bigr.}

\displaystyle{\Bigl.+2f(a+(n-2)h)+4f(a+(n-1)h)+f(b)\Bigr)}

where n is even and h=(b-a)/n. Sometimes this rule is written

\displaystyle{\int_a^bf(x)\,dx \approx \frac{b-a}{6n}\Bigl(f(a)+4f(a+h)+2f(a+2h)+4f(a+3h)+\cdots\Bigr.}

\displaystyle{\vspace{1cm}\Bigl.+2f(a+(2n-2)h)+4f(a+(2n-1)h)+f(b)\Bigr)}

where there is no restriction on n and h=(b-a)/2n. For a spirited criticism of this rule, both from practical and pedagogical perspectives, see here.

However, in a standard generic mathematics course, Simpson’s rule is as far as most students get with quadrature. The trouble is that the order of the weights (the coefficients of the function values):

1,4,2,4,2,4,...,2,4,1

is tricky to program. Either you have to run two loops, one for the 4′s and the other for the 2′s, or you have to do a little test at each stage to see whether you are at a 2 or 4 point, or you might invoke the modulus function. None of those ways are particularly straightforward, especially if, like many beginning students, you have little programming experience.

So here’s a super easy way. We start with end points a and b and an even integer n. We first create a list of all the values at which the function is to be computed:

xs=[a+kh \mbox{\; for\;} k=0,1,2,\ldots n]

where h=(b-a)/n, and then a list of the weights:

w=[3-(-1)^k \mbox{\; for\;} k=0,1,2,\ldots n].

This last list is [2,4,2,4,\ldots,2,4,2].

We then compute

\sum w_if(xs_i)

However! note that our list of weights starts and finishes with 2′s instead of 1′s. We correct for this by simply subtracting f(a) and f(b) from the sum. Our “easy” Simpson’s rule is then:

\displaystyle{\frac{b-a}{2n}\Bigl(\sum w_if(xs_i)-f(a)-f(b)\Bigr)}.

This is almost trivial to compute. For example, on a TI-nspire CAS calculator, we might have:

a:=0

b:=1

\mbox{Define\;}f(x)=e^{-x^2}

n:=16

h:=(b-a)/n

xs:=\mbox{seq}(a+k\cdot h,k,0,n)

w:=\mbox{seq}(3-(-1)^k,k,0,n)

(\mbox{sum}(w\cdot f(xs))-f(a)-f(b))\cdot (b-a)/n

The TI-nspire has the nice attribute that a function applied to a list will produce a list of all function values, and the product of two lists is a list containing all the products of corresponding elements. The Casio Classpad has the same functionality with lists, and the above commands can be used.

This approach of using 3-(-1)^k to obtaining the weights for Simpson’s rule seems not widely used, but it seems to me to be absurdly simple. I’ve only come across it in one other place.

Initial comparison of TI-nspire CAS and Casio Classpad 330 calculators

The TI-nspire CAS and the Classpad 330 are the top of the line CAS calculators, and are used in both secondary and tertiary education worldwide.  And each one has its passionate adherents and equally passionate detractors.  However, it’s hard to find a balanced comparison (would you expect to find the TI calculator favourably reviewed in a Casio forum?) TI have released a color version: the TI-nspire CX CAS; and Casio have announced a color version of the ClassPad, to be available in the second half of this year.

For the purposes of this comparison, I’ll use the current monochrome TI-nspire CAS, and the current ClassPad 330.  Here is what they look like:

Also this post will be entirely based on using the handheld calculators, not the computer software which is also available for each of them. And it will be by its very nature superficial; I hope to cover particular topics in more detail in later posts.

Interface

The TI-nspire has a screen size of 320×240 pixels; the ClassPad 160×240, exactly half the size of the TI-nspire. This means that the TI-nspire display is sharper. The TI-nspire has a menu driven system entirely driven by button presses; the ClassPad uses a touch screen controlled with a stylus. I find the TI-nspire much easier to use; I don’t particularly like jabbing the stylus at a screen with such poor resolution. It’s worth noting as a side issue that the stylus is easily lost; they seem to get looser in their housing with age. And many second hand ClassPads are advertised with “stylus missing”.

The ClassPad also defaults on readability. Here’s a couple of screenshots from Casio:

Note that at the top of the screen there is a menu bar, underneath of which is a “toolbar”. At the low resolution of the screen, I find the toolbar icons to be all but unreadable. Some of them are menu headers, which when selected produce a tools menu. This is a list of tiny icons, which to my eyes could mean anything. (Even with my reading glasses, and my eyesight, though not as sharp as once it was, is not that bad.) It may have been a cost issue: producing a touch screen cheaply, but I find the low resolution of the ClassPad a real hindrance to its use.

Commands on the TI-nspire are usually obtained with at most 4 keystrokes: a menu key, then at most 3 menu choices. Commands on the ClassPad, resolution notwithstanding, are also fairly easy to obtain, either with the menu bar, or from the onscreen keyboard. This image:

shows that Casio have in fact designed a very clever interface which makes good use of the limited space to obtain most commands and functions easily.

For comparison, here is a TI screenshot, showing a linear function and some values:

Note that the screen, though maybe cluttered for some, is still quite readable and sharp.

The TI-nspire uses a “Documents” system, and a document may have any number of different “pages”: calculation, geometry, lists & spreadsheet, data & statistics and so on. Variables may be transferred between pages, so that a variable, say a list created and named in a spreadsheet, can be opened in a calculator page. The ClassPad has “eActivities”; here is a description from a Casio page:

An eActivity is a new, exciting innovation. It is small packet of learning that can be written by anyone, even yourself! It is loaded into the calculator and a student is then free to work through it. It is normally a scripted learning experience that has all the functions of the calculator at its disposal. eActivities can be used to present students with assignments, learning opportunities or in fact to build your own simple software…

Mathematical strength

I think that the calculators are much of a muchness here. True, the ClassPad offers some goodies such as the Laplace and Fourier transforms, which might give it a slight edge. However, there are also a few anomalies. The TI-nspire provides LU factorization of a matrix M in the form of the matrices P,\,L,\,U, where P is a permutation matrix for which PM=LU; the ClassPad just returns L and U. This means that given a matrix M the ClassPad will return L and U matrices for which possibly LU\ne M. An example is the matrix

\displaystyle{M=\begin{bmatrix}0&4&3\\ 1&2&-1\\ 3&1&2\end{bmatrix}}

for which

\displaystyle{P=\begin{bmatrix}0&0&1\\ 1&0&0\\ 0&1&0\end{bmatrix},\,L=\begin{bmatrix}1&0&0\\ 0&1&0\\ \frac{1}{3}&\frac{5}{12}&1\end{bmatrix},\,U=\begin{bmatrix}3&1&2\\ 0&4&3\\ 0&0&-\frac{35}{12}\end{bmatrix}}

This may be a small worry, but surely it’s a simple thing and worth getting right?

I haven’t applied (as yet) a CAS test-suite to either calculator. You have to remember that in terms of raw computing power, these calculators are lightweights, and can only be expected to solve relatively small problems. But here’s one test I did yesterday. On the TI-nspire, I found the factorization of Cole’s number:

2^{67}-1=193707721 \cdot 761838257287

performed almost immediately. Impressive! Just to check this wasn’t a fluke, I obtained the following factorization:

24!-1=625793187653 \cdot 991459181683

in a little under three minutes. I tried the first factorization on the ClassPad and got nothing in a reasonable time (several minutes). So either the TI-nspire uses a better factorizing algorithm, or it has a more powerful processor.

This is an issue I’ll try exploring in more detail in later posts.

Variables and programming

The TI-nspire has three ways of assigning a value to a variable:

  • Define x=3
  • 3 \rightarrow x (the arrow is obtained by using the “sto” key)
  • x:=3 (The “:=” is obtained with two key presses: crtl, :=)

The ClassPad has two ways:

  • 3 \Rightarrow x
  • x:=3

without, however, any single key to obtain “:=”. And only the arrow method can be used in programs.

Both calculators use a version of Basic on the handhelds. Apparently the scripting language lua can be used with both, but I suspect only on the associated computer software, and not directly on the handhelds. And they both provide the full gamut of control statements (loops, tests etc). Both calculators distinguish between functions and programs, although in slightly different ways.

The ClassPad offers very nice sequence handling: you can set up a number of interlinked sequences, and set ‘em off. For example, you can enter

\displaystyle{\begin{array}{rl}a_{n+1}&=(4-b_n+c_n)/4\\ a_0&=1\end{array}}

\displaystyle{\begin{array}{rl}b_{n+1}&=(2-2\times a_n-c_n)/5\\ b_0&=1\end{array}}

\displaystyle{\begin{array}{rl}c_{n+1}&=(11-a_n)/3\\ c_0&=1\end{array}}

(which are the sort of equations which may be encountered in Jacobi’s iteration method for linear equations), in the Sequence page, and just run them. The TI-nspire has a “Lists & Spreadsheet” page, in which you can perform similar calculations, but not as elegantly as this.

As a mini-comparison, here are for-loops on each:

TI-nspire ClassPad
For i,1,10
x:=x+1
EndFor
For 1\Rightarrowi to 10
x+1\Rightarrowx
Next

Graphics

Both calculators support 2D graphing (they would hardly have a market if they didn’t). The ClassPad also supports 3D graphing, which is supposed to be a huge feather in its cap, but in fact I find this to be of limited use. Just as an experiment, I attempted to plot the monkey saddle surface z=x^3-3xy^2 and couldn’t make any sense of the result. Even with zooming, moving the graph around with the stylus, I couldn’t obtain any intuitive feel of the shape of the graph. Partly this is because there doesn’t seem to be any hidden line algorithm available. However, the drag and drop capability of the ClassPad, with the stylus, is one area which it shines. The TI-nspire’s touchpad, although it works, is clumsy in comparison.

Which is better?

I much prefer (for my purposes) the TI-nspire’s sharper screen and interface. However, when the new colour model fx-CP400 is released I will certainly reconsider. I have no loyalty to either brand. In the end, it comes down to what you and everybody else is using. My daughter is using a ClassPad 330 for her year 11 mathematics (its use is mandated by her school) and is very satisfied with it. I would imagine that even with the models discussed in this post (the monochrome TI-nspire already superseded, the ClassPad 330 about to be), either one would happily manage pretty much all school, and most college, mathematics.

Maxima on Android

Such a long long time since I last posted!  Very poor blogging etiquette.  Anyway, to break the drought, here’s news of the best thing to come my way in the last six months – fully featured Maxima running on Android (this screenshot from its website):

maxima-screen-shot1 maxima-screen-shot2

At a conference I attended late last year, sponsored by calculator companies, the Maple reps (both of whom I knew), claimed that calculators were an old-fashioned outmoded technology, especially now that with smartphones, ipods etc, you can have access to the Internet in your pocket (Wolfram Alpha, anybody)?  I still think there’s a lot to be said though for a device which you can use independently of a network, and in fact I believe most calculator users use very little of their functionality.  How many CAS calculator users, I wonder, program on or with their machines?

But now here’s a fully featured, mature CAS running on a portable device – the first such one, as far as I know.  There are in fact other CAS’s available for portable systems: PocketCAS, Mathomatic, Yacas (which I’ve written about before), no doubt others.  None of them, as far as I can tell, work particularly well, or are in any way complete.  It doesn’t take long before you run up against a wall of something they are unable to do.

So I see Maxima as being the first complete portable CAS – and I’m delighted to see it on Android.  It works fine on a smartphone, and is quite wonderful on a tablet.

Its home page is here.

The best Matlab alternative (2)

A year or so ago, I published a blog post called The best Matlab Alternative.  In that post, I discussed the merits of Scilab, Octave and Freemat, and decided that for me, Octave suited me best.  However, Scilab is probably the more full featured and powerful.

I’ve since discovered that there’s another contender, in some ways better than all of the above: Python.  My requirement for a “Matlab alternative” was that:

  1. It must install under Windows. This is absolutely necessary for me, at least, since most of my students use Windows. A very few use OSX, and hardly any use Linux.
  2. It must, as far as possible, be identical in use with Matlab. The commands should be the same, and simple scripts and functions should transfer between the two with no effort.
  3. It should have a reasonably nice interface, as similar as possible to Matlab’s own. A basic command line only interface running in a terminal won’t cut it with my students.

Suppose we drop condition 2, and require software which can do pretty much most of what Matlab can do, in a nice visual environment, we find Python coming out looking very good indeed.  Now, I have been a lover of Matlab, and its open source alternatives, for many years, and I do like the rapid prototyping possible with Matlab, as well as its ease for dealing with large data types.  And vectorization (where a single command is automatically applied to every element of a multi-dimensional array) provides some wonderfully elegant and simple code.  (It can also of course provide for dense obtuseness.)

However, I have also become aware of some criticisms of Matlab.  A major one is vendor lock-in: the code, the language, the IDE, is all owned by The Mathworks.  If they decide to radically change something, you have no alternative but keep paying the license fees to keep up.  Second is that the language has no object-oriented capabilities, at least in a natural way;(see comment below) that the language has no standard as such – it’s just whatever The Mathworks decide to provide.  Thirdly is the namespace: all functions exist as top-level objects, and all functions (“m-files”) in the Matlab path (which usually will include multiple directories) are available when you start up.  Fourth is that an m-file can only contain one function (and maybe some sub-functions), whose name must be the same name as the file.  This means that if you create some new functions – and most users do – you need to be sure that it doesn’t conflict with any other functions.  This problem is exacerbated if you include lots of toolboxes.  The Image Processing Toolbox alone adds 270 new functions to the Matlab namespace  and there are 38 other toolboxes available.  That’s a lot of functions.

One list of criticisms is available here; another (called “Abandon Matlab”) is here.  Note that the second site recommends (as well as Python), R, which the first site also pans.  There’s no accounting for tastes.

So, back to Python.

Python is a deservedly popular language, supported by a huge user base, tons of books, articles, websites, conferences, seminars, and so on.  It is fully open source, and it is managed by a non-profit organization called the Python Software Foundation.

Python has lots of libraries for scientific computing, starting with numpy (“numeric python”) and scipy (“scientific python”).   And there are others – such as scikits-image which extends scipy’s image processing capabilities (the Python Imaging Library does not seem to be currently in active development).  Many more are listed here.

What’s more, Python has lots of IDEs which mean that you can use it pretty much as you would use Matlab, Octave or Scilab.  For a nice visual IDE with all the trimmings, there are Eric IDE, Spyder, or PyDev, which is an Eclipse plugin, plus lots more, both open source and commercial.

For those of us who are satisfied with less, there’s iPython, an enhanced shell (which is what I like to use).  And there are again lots more, including editors and all sorts of other goodies for development, here.

Let’s finish with a tiny example: to create a 1500 x 1500 random matrix, invert it, and check the trace of the product of the matrix and its inverse.  I’ll do all this in the iPython shell, which I invoke from my linux command line with ipython.

Once in the iPython shell, we import the pylab modules; this includes numpy, scipy, and the plotting library matplotlib:

In[1]: from pylab import *

(In fact this can be done from the command line with iPython, but this command should work in any Python environment.)

Now to define the matrix, compute its inverse, and check the trace of their product:

In [2]: m = mat(rand(1500,1500))
In [3]: mi = linalg.inv(m)
In [4]: np.trace(m*mi)
Out[4]: 1500.0

And of course all this is nice and fast (on my linux laptop):

In [5]: %time m = mat(rand(1500,1500)); mi = linalg.inv(m); np.trace(m*mi)
CPU times: user 6.52 s, sys: 0.21 s, total: 6.73 s
Wall time: 3.78 s

Fast modular exponentiation

I’m a “program first, think later” kind of bloke, and I often write programs which are very slow, until I see some way later of vastly improving their speed. This was the case recently, when I was experimenting with the PRNG

x_0=1,x_1=1, x_i=7^{x_{i-1}}+7^{x_{i-2}}\pmod{p}

with p=2^{31}-1.

Modular exponentiation is computationally expensive. If you use the standard square and multiply algorithm, then you need a squaring for every bit, and an additional multiplication for every 1. In a 32 bit number (in which we would expect on average 16 ones), this is 48 square/multiply operations for each exponentiation. This meant that putting my PRNG though the TestU01 “Big Crush” suite took over 188 hours. (I just had it grind away in the background.)

However, since our exponents n are always 32 bits, we can write

n=aT+b

where T=2^{16} and a,b< T. Given n, the values a,b can be found with simple bit operations:

a = n>>16; b = n&0xFFFF;

Then

7^n=7^{aT+b}=(7^T)^a(7)^b.

Suppose we have two arrays H[] and L[] each if size T, and containing all the powers of 7^T modulo p and 7 modulo p. Since

7^{2^{16}}=118315438\pmod{p}

we can precompute H and T using the following pseudo-code

H[0] = 1
L[0] = 1
for i from 1 to T-1:
    H[i] = 118315438 * H[i-1] (mod p)
    L[i] = 7 * l[i-1] (mod p)

Then

7^n = H_aL_b

with a and b computed as above. This reduces the exponentiation from 48 multiplications/squarings to just 1. When I ran Big Crush again on my speedier implementation, it took just 11 hours.

For those of you who like such things, here is a C program to test it. On my laptop it can produce 10^8 random values in just over 9 seconds.


/*
The Fibonacci exponential generator:

x[0]=1, x[1]=1, x[i] = 7^x[i-1]+7^x[i-2] (mod p) for n, with p=2^31-1.

Compile with

gcc testexpfib.c -o testexpfib -lm -std=gnu99 -g
*/

#include <stdlib.h>
#include <stdio.h>
#define p 0x7FFFFFFF
#define T 65536
#define ulong unsigned long

ulong n,a,b,i,xH,xL,yH,yL,t1,t2,t3,temp,x,y;
ulong high[T],low[T];

ulong modp(ulong z) {return (((z)&p)+((z)>>31));}

ulong mod2p(int q, ulong x) // returns (2^q)x mod p
{
  return ((x>>(31-q))+((x<<q)&p));
}

ulong mul_modp(ulong x,ulong y) // returns xy mod p
{
  xH = x>>16;
  xL = x&0xFFFF;
  yH = y>>16;
  yL = y&0xFFFF;
  t1 = mod2p(1,xH*yH);
  t2 = modp(mod2p(16,xH*yL)+mod2p(16,xL*yH));
  t3 = xL*yL;
  return (modp(modp(t1+t2)+t3));
}

ulong expmodp(ulong x) // returns 7^x mod p
{
  a = x>>16;
  b = x & 0xFFFF;
  return (mul_modp(high[a],low[b]));
}

ulong expfibp(void)
{
  temp = modp(expmodp(x)+expmodp(y));
  x = y;
  y = temp;
  return (y);
}
 
int main(int argc, char *argv[])
{
  high[0]=1;
  low[0]=1;
  for (i = 1; i < T; i++)
    {
      high[i] = mul_modp(118315438,high[i-1]);
      low[i] = mul_modp(7,low[i-1]);
    }

  
  if (argc<2)
    n = 1000000;
  else
    n = strtol(argv[1],NULL,10);

  x = 1;
  y = 1;

  for (i = 0; i < n+1; i++)
    {
      ulong s = expfibp();
      if (i%(n/10)==0)
  	printf("%10lu, %lu\n",i,s);
    }

  return 0;
}

A conceptually simple PRNG

I like pseudo-random number generators, and I’ve been experimenting with them for the last few weeks. Now, Stephan Mertens has developed a PRNG which he calls YARN (Yet Another Random Number), defined in general by

x_i=ax_{i-1}\pmod p

z_i=g^{x_{i-1}}\pmod p

u_i=z_i/p

where gcd(a,p)=1 and g is a primitive root modulo p. You can read his material here. The final values u_i will be uniformly randomly distributed over the interval (0,1). The exponential definition of z_i removes the linearity of the x_i values. It is clear, however, that the period can only be at most p: after at most p-1 iterations, the x_i values will start repeating, and hence so will the z_i values.

A better (but slower) PRNG is defined as

x_0=7, x_1=7, x_i=7^{x_{i-1}}+7^{x_{i-2}}\pmod p for i\ge 2

with p=2^{31}-1. This simple PRNG (which I’ve called a “Fibonacci exponential generator”), passes all the Crush tests of the TestU01 suite of randomness tests, currently the most stringent known. Here’s a little C program called fexp.c, which uses the FLINT libraries, to compute and display some of these values:

/*
A program to test the Fibonacci exponential  PRNG 

x[i] = b^x[i-1] + b^x[i-2] mod p, 

with b = 7, p = 2^31-1.

Compile with:
gcc fexp.c -o fexp -lflint -lmpir -ltestu01 -l probdist -lmylib -l -std=gnu99 -g
 */

#include <stdlib.h>
#include <stdio.h> 
#include <flint.h> 
#include <fpmz.h> 
#define ulong unsigned long
#define pp 0x7FFFFFFF

fmpz_t x,y,new,r1,r2,b,p;
ulong i,n,t;

ulong fexp(void)
{
  fmpz_powm(r1,b,x,p); // r1 = b^x (mod p)
  fmpz_powm(r2,b,y,p); // r2 = b^y (mod p)
  fmpz_add(new,r1,r2); // new = r1 + 22
  fmpz_mod(new,new,p); // (mod p)
  fmpz_set(x,y);       // x = y
  fmpz_set(y,new);     // y = new
  return (fmpz_get_ui(y));
}

int main (int argc, char *argv[]) 
{
  if (argc < 2)
    n = 1000000;
  else
    n = strtol(argv[1],NULL,10);

  fmpz_init_set_ui(b,7);  // Initialize values for use in Flint functions
  fmpz_init_set_ui(p,pp);
  fmpz_init_set_ui(x,7);
  fmpz_init_set_ui(y,7);

  for (i = 0;i < n+1;i++)
    {
      t = fexp();
      if (i%(n/10)==0)
  	printf("%10lu,%10lu\n",i,t);
    }

  return 0;
}

It is known that if

x^r+x^s+1

is a primitive polynomial, (with r>s>0), then the “lagged Fibonacci generator”

x_i = x_{i-r}+x_{i-s}\pmod m

where m=2^M will have a maximum period of (2^r-1)2^{M-1}. See this page of Richard Brent and the wikipedia page for more details. So, if for example we set (r,s) =  (6972593,3037958) and M=32, then the resulting generator will have a maximum period of

(2^{6972593}-1)2^{31}\approx 2^{6972624}

which is far far greater than the period of the Mersenne twister, one of the most popular PRNGs in current use. At the cost of keeping a state vector of 6972593 4-byte values in memory (roughly 28 megabytes), then a single PRNG defined by that recurrence would exceed all the world’s current and projected uses for random numbers. However, lagged Fibonacci generators pass some tests of randomness, but not all. And partly this is due to the strictly linear nature of the recurrence.

To remove the linearity and apply the exponential generator, suppose we have a finite field GF(2^{32}). If g is a primitive element of that field, then we may define, say

x_0=1, x_1=x, x_i=g^{x_{i-r}}+g^{x_{i-s}} for i\ge 2

Such a PRNG is both extremely robust (passes very stringent tests of randomness), and assuming that the period is obtained from the base recurrence, has a very long period.

Speeds of C libraries for number theory

Back in 1998, Paul Zimmermann wrote an article “Comparison of three public-domain multiprecision libraries: BigNum, Gmp and Pari” which is available (as a gzipped PostScript file) here. I can’t find anything more recent. Since then, there have been several other libraries produced. The main contenders seem to be:

  • GMP, the “GNU Multi-Precision Arithmetic Library”. This is now at version 5.05, and is a very complete and mature package. The authors take pains to point out that it is not a number theory package, and although it has a few number theoretical routines (prime testing, computing of Jacobi and Legendre symbols), these are not the focus of the library.
  • MPIR, which is a fork of GMP, and which is supposed to be used as a drop-in replacement for it. It has been optimized for different operating systems and C compilers.
  • LibTomMath, which is one of a suite of open source libraries, of which the chief developer is Tom St Denis. They are designed to be easy to use, and to provide a simple and efficient API for further development.
  • PARI, which is a C library for number theory, and the engine for GP, a number theory calculator. Its current version is 2.5.1.

There are several other big integer libraries available, but most seem to be the interests of a single person, or only available for C++ (such as NTL or LiDIA). There are also some other libraries which I have not considered , such as FLINT, which is a ” Fast Library for Number Theory”, but whose focus is on polynomial computations.

The problem I’m considering is the one from my previous post: given

x_0=7,\quad x_n=7^{x_{n-1}}\pmod{p}

with

p=2^{31}-1,

how long does it take to produce one million values of this sequence?

I’ve written four programs, cpowers.c, mpirpowers.c, tompowers.c, paripowers.c. The first one of these just uses standard C with no trimmings, and no optimization, aside from using bit shifts where possible. (These were cleverly devised by Lih-Yuan Deng, and can be found here.) And in the interests of portability, I made sure that all modular multiplications and squarings remained within 32 bits. The next three programs use the libraries MPIR, TomMath and PARI repectively, and are compiled by being linked with them.

I ran each program five times, and averaged the results. And they are:

  • Standard C: 1.54 seconds.
  • Using the MPIR library: 0.78 seconds.
  • Using the PARI library: 0.523 seconds.
  • Using the TomMath library: 12.83 seconds.

I don’t know why the TomMath library is so slow. In the document describing the library and its algorithms, the author says:

While LibTomMath is certainly not the fastest library (GMP often beats LibTomMath by a factor of two) it does feature a set of optimal algorithms for tasks such as modular reduction, exponentiation, multiplication and squaring. GMP and LIP also feature such optimizations while MPI only uses baseline algorithms with no optimizations. GMP lacks a few of the additional modular reduction optimizations that LibTomMath features.

I just used this library “off the shelf”, so to speak, with no more care than I did with the MPIR library (and in my last post, the GMP library). Maybe there’s some optimization of which I’m unaware.

The PARI library, although clearly the fastest, is also the hardest to use. (This point is also made by Zimmermann in his article.) PARI uses a stack for its computations, and at the beginning of a program, PARI must be initialized by giving it a chunk of memory to use. So if the stack grows until it’s bigger than the assigned memory, the program just aborts. This can be overcome in two ways: allocating more memory at the start, or better, by periodically clearing the stack. This is what I have done in my program (with help from the PARI users mailgroup), and it allows my program to run without eating up too much memory.

It would be nice to write some other programs using different problems, and see if these times are consistent, or if some libraries excel at particular tasks.

MISG day 5

I didn’t explain the problem well enough in my last post: AGL, the energy supply company, are trying (among other things) to create several “reference profiles” of energy use, to which customers can be compared.  This will give them a better idea about supply and demand, and allow them to more accurately forecast energy requirements.

This was a problem more statistical than mathematical, and people were throwing all sorts of clustering analysis techniques (about which I know nothing) at it.  However, an averaging plot showed two parallel “clumps” of data, which a single curve of best fit couldn’t describe.  Since I know nothing about statistics, but quite a lot about imaging, I had the idea of treating each customer plot as an image:

Notice the two clumps at the bottom left – this is an example of a plot to which a curve of best fit could not be usefully applied.

The image has been subdivided into unequal quadrants, of which the upper left and right parts are the energy usage during hot and cold temperatures.  The vertical dividing line is at 20C = 68F.  The horizontal line is one-quarter from the bottom.  This seemed (without any analysis) to be a good place for cutting off “standard” energy usage (below the line) to “high” usage (above the line.  The customers can then be classified by choosing thresholds for “high hot” and “high cold” usage, and clustering them depending on whether they have used more or less than the threshold value.

The nice thing about his approach is that it’s both very easy and very transparent, and also can be adjusted; for example using finer grids on the image, by subdividing the original data into times of day, or into weekdays/weekends (or both), or by using a better approach than “by eye” to determine the cutoff point between standard and high usage.

The workshop ended before I could explore this idea any further… so I might play around with this on my own some more.

Computing from the command line

Like many die-hard, old-school Linux users I have an inordinate fondness for the command line, and would rather use a command line interface than a snazzy GUI. This post is about how to use the command line to perform simple arithmetic computations, and a few examples of slightly more advanced mathematics. There are many times when I just want an immediate answer to a (usually simple) computation without powering up Octave or Sage, both of which use up sometimes overloaded memory. I’m assuming the use of the Bash shell.

Use of expr

For simple integer computations, the expr command is sufficient. Here are a few examples:

home:$ expr $[3+4]
7
home:$ expr $[5**6]
15625
home:$ expr $[100/17]
5

Because the Bash shell only allows integers as input and output, a division will always be rounded down.

Use of bc

To obtain fractional output, and to use more mathematical functions, the command bc may be used. This should be available on all Linux distributions, and is a remarkably powerful calculator. See an old article of mine here for an introduction to it.

Any expression can be piped to bc for computation, and if bc is invoked with bc -l, this loads a mathematics library which includes the functions a(x), s(x), c(x), e(x), l(x),j(n,x), for inverse tangent, sin, cos, exponential function, natural logarithm and Bessel function of order n respectively. You can read the man page for bc online.

So here are some examples of the use of bc:

home:$ echo "1/3+4/7" | bc -l
.90476190476190476190
home:$ echo "2^100" | bc
1267650600228229401496703205376
home:$ echo "scale=100;4*a(1)" | bc -l
3.14159265358979323846264338327950288419716939937510\
58209749445923078164062862089986280348253421170676

The value scale is an internal variable which gives the number of figures after the decimal point. Alternatively, we can use this syntax:

home:$ bc -l <<< "pi=4*a(1);s(pi/5)"
.58778525229247312915
home:$ bc -l <<< "e(l(100)/3)"
4.64158883361277889236

This last calculation, of course, uses exp(x) and ln(x) to compute the cube root of 100.

Because bc is a fully featured programming language, you can write programs to perform other functions: a huge number of such functions can be found here and here. Suppose we put a handful of useful functions in a file called stuff.bc and include in our file .bashrc the line

alias b="bc -l /path/to/stuff.bc"

Then we’re able to do things like this:

home:$ b <<< "gcd(1001,161)"
7
home:$ b <<< "bin(100,50)"
100891344545564193334812497256
home:$ b <<< "fact(100)"
9332621544394415268169923885626670049071596826438\
1621468592963895217599993229915608941463976156518\
2862536979208272237582511852109168640000000000000\
00000000000

(assuming that our file contains commands for the greatest common divisor, binomial coefficients, and factorials.)

Use of calc

A newer and even more powerful computational language is provided by calc. It has far more built-in commands than bc, as you can see here. It can be used interactively (as can bc) but here we are interested in its use as a command-line calculator. Here’s a couple of examples:

home:$ calc '100!'
        933262154439441526816992388562667004907159682643816214685
92963895217599993229915608941463976156518286253697920827223758251
185210916864000000000000000000000000
home:$ calc 'sin(pi()/5)'
	0.58778525229247312917

Just as scale can be used to increase the number of decimal points in bc, so epsilon is used in calc for the real precision, and display changes the number of displayed digits. Here’s an example, to compute the almost integer value

e^{\pi\sqrt{163}}.

home:$ calc 'exp(pi()*sqrt(163))'
	262537412640768743.99035456305787674737

This is not quite as close as it should be, so we’ll increase the internal precision, which at the moment is 10^{-20}:

home:$ calc 'log(1/epsilon())'
	20
home:$ calc 'epsilon(1e-40);exp(pi()*sqrt(163))'
	0.00000000000000000001
	~262537412640768743.99999999999925007260

The CAS Yacas on the iPhone

I’m always on the lookout for a good stand-alone symbolic algebra system for the iPhone/iPod Touch; when I saw that Yacas was available I grabbed it with glee – even (can you believe this) actually paid for it.

You can see what Yacas can do at its site, by browsing through its help, or by looking over its tutorial. I am supposing that the iPhone version has the same functionality.

First, the interface is very good; clean, and easy to use:

Yacas interface

The buttons are a good size, and the switching between “123″, “abc” “mno” and their shifted variants gives a complete alphabet, and a few useful mathematical and symbolic commands. The “Cmd” button brings up an alphabetical list of all commands (which is extensive), with an alphabet on the right (similar to the iPhone’s contact list) so that a command can be quickly reached with a minimum of scrolling. After a little time it is quite easy and fast.

The main problem with Yacas is simply its lack of basic functionality. For a system which promises so much, it is frustrating to be always coming up against a brick wall. Yacas seems to handle most basic linear algebra well. But it’s easy to fool. Here’s a little example testing the Cayley-Hamilton theorem:

Yacas linear algebra

See what’s happened? The last final “-8″ is interpreted by Yacas to mean subtract 8 from every element in the matrix, rather than 8 times the identity matrix.

Calculus. Everybody would want this, and for differentiation, at least, Yacas delivers:

Yacas derivatives

But its integration is very poor indeed. It can manage polynomials, simple integrals with trigonometric, logarithmic and exponential functions (the sort of simple integrals you don’t need a symbolic system for), but trips up on anything more complex. For example,

\int\frac{1}{1+x^3}\,dx

which is not really very difficult (if a little fiddly to do by hand) tied up Yacas for over 5 minutes, after which I aborted the computation. There is also no built in method for numerical integration. Its symbolic simplifying is very basic; as you can see above, it can’t simplify \exp(x)\exp(-x).

However, its help system is good:

Yacas help

and organized by type, rather than alphabetically. This can make it hard to find some commands – it took me a good few minutes to discover where “Subst” was kept. (Under “Evaluation of expressions”.)

At this stage I think Yacas is to be considered more as a “proof of concept” than a finished system. A bit more work, stronger integration and simplification, and it might be a useful system. At the moment though, it simply lacks too much basic functionality.