The promise of CAS calculators

I’m always confused about the best computing environment for teaching mathematics, and over the years I’ve used Derive, Maple, Maxima, Axiom, Sage, and Matlab. All these have great qualities, and Matlab in particular is the tool of choice for engineering education, as it is for engineering practice. But there is as yet no similar standard for mathematics teaching. Aside from engineering, most of my mathematics students are studying either education (with a mathematics method, to enable them to teach upper high school mathematics), or science students, mainly chemists and environmental scientists, who need a little maths in their courses.

We have been using Maple reasonably successfully for years. However, there are some significant problems with its use:

  1. It is very expensive, and the yearly license is becoming prohibitive in an environment where serious cost-cutting is necessary. (Let’s face it, if I had to choose between ditching some software, or ditching me, I’d go for the software!)
  2. In our subjects we only use a tiny fraction of Maple’s functionality. This is of course not an issue with open source software.
  3. Students can’t play with Maple outside the laboratories; we aren’t in a position to buy a heap of student licenses and sell them on to the students.
  4. Once they leave the University, the students are never likely to use Maple again in their professional lives.

The last problem could be countered by saying that the experience and skills gained in learning Maple are in themselves valuable. But this is a tack I’ve heard used too many times to justify the learning of mathematics itself. Basically it comes down to “Why do we have to learn this <expletive>?” “Because it’s GOOD FOR YOU!” I’m not fully convinced, and neither are the students.

Somebody said some years ago that the best adjuncts for learning mathematics are what the students can carry with them. I’ve been trying to track down the source of this wisdom, and so far have failed. Laptops/netbooks with, say Maxima, could indeed overcome the cost of Maple (and of course we could ditch Maple for Maxima, or Sage), and provide a learning environment which can go anywhere. But not all of my students own, or can afford, such machines. Many do, but not all. Some students of course now have tablets (iPads, or less often an Android tablet) but as yet there’s no powerful mathematics software native for any tablet environment.

Sage can of course be used online, as can Wolfram Alpha, but I don’t believe the latter is a good teaching tool: it’s just an answer-giving device; you can’t really play with it, extend it, insert your own programs, use the output of one command as an input to another, etc.

So what I’m looking for is a cheap, carry-everywhere mathematical tool.

Enter the new generation of CAS Calculators. There are a few on the market, of which the Texas Instruments TI-nspire, and the Casio ClassPad, are pre-eminent. Here they are side by side:

TI-nspireCasio ClassPad

I’m not going to go into the pros and cons of each one; they both have vocal supporters and detractors, but in many ways they solve my difficulty of finding the right software. In particular:

  1. Many students have used these at school already, so have some exposure to them.
  2. There is no tedious licensing involved.
  3. These calculators are powerful enough to deal with pretty much all of the sort of mathematics we teach (calculus, linear algebra, numerical methods etc).
  4. It is easy to extend the calculators with further libraries or your own functions and programs. And there are lots more out there on the Internet. For example, some TI-nspire programs can be found here.
  5. Once in schools, teachers will be using these in their own classrooms.
  6. A scientist out in the field is more likely to be able to carry a calculator along than a laptop fitted out with mathematical software.

I’m very excited about these nifty new tools, and I intend to be exploring them with my students this year. Stay tuned!

In a tropical garden

I just realized I haven’t posted since early November. That’s no way to win friends and influence people. And in fact I’ve done very little mathematical thinking in that time. I have however, been on holiday, some of the pics of which you can see here. So in lieu of some mathematics, here’s a poem I wrote inspired by the superb tropical gardens of our house. It’s inspired by Andrew Marvell‘s great poem “The Garden” which if you don’t know, you should read.

In a Tropical Garden (after Marvell)

The warm air tells me to relax
Unreachable by mail or fax,
The vibrant sun with brilliant rays
Erases thoughts of care-filled days.
The water splashing in the pool
Appears and sounds both fresh and cool.
Upon my back I close my eyes
And rest with happy heartfelt sighs.

The spider weaves her busy snare,
Small insects quarrel in the air,
Bright parrots flash between the trees
Whose leaves are still for lack of breeze,
Beyond the pool I hear the frogs
Who croak and jump beside their logs:
To all these I’m an honoured guest
And all encourage me to rest.

Above my head the palm fronds sway
And whisper in my ear to stay,
To bide my time in dappled shade
And let my cares and worries fade -
The flowers and the waving ferns
To treat my eyes do take their turns;
Relaxed, at ease, in such surrounds -
No softer bed than these sweet grounds.

How wondrous are the garden’s fruits:
Bananas, lychees, bamboo shoots,
The mango trees above me drop
Into my lap their sumptuous crop;
The durian, and mangosteen,
Of all the fruits the king and queen:
Into my tastebuds they enmesh
The sweetness of their luscious flesh.

Why do I try so hard at work,
Harsh lighting keeping out the murk?
Commuting there by tedious miles
To push round papers, memos, files.
I do no harm, but little good,
And far less than I wish I could,
And always I’ve myself deceived
That some small good I have achieved.

But here I can, by gradual stages
Forget about time-sheets and wages,
Agendas, meetings, all such things
Just drift away as if on wings.
No more reminders or alarms
Just floral scents like magic charms.
No clock here strikes its busy hours;
I simply lie and smell the flowers.

This garden’s made with such success
Out of primeval wilderness,
The gardener and his sweaty team
Have worked for months to make this dream:
To keep the wilderness at bay
They need to battle every day;
Let down their guard but once and then:
The jungle claims its ground again.

This place is artificial, sure
And yet I love it even more,
The air is humid, hot and still,
I lie beneath a leafy frill,
There’s no place better for my brain
To loosen from a year’s strain;
With restful dozing, dawn ’til late,
I let my mind rejuvenate.

(December, 2011)

The sestina

A sestina is a poetic form consisting of six stanzas of six lines each, followed by an “envoi” of three lines. In each stanza, the six words at the end of the line are the same, but in a different order. So if the six lines in the first stanza end 123456, then the orders of the line endings for the next five stanzas are 615243, 364125, 532614, 451362, and 246531. You can read more at the wikipedia page, and see an example by Anthony Hecht here, and one by Ezra Pound here. The form is supposed to have been invented by one Artaud Daniel, a 12th century troubadour, and has been exercising the minds of poets and other verse-makers since.

My interest here is the permutations which are used for the line endings. If you look carefully, you’ll see that the same permutation is used each time. This means that the sestina can be described in terms of powers of a single permutation. In cyclic notation, the first permutation 615243 can be written as

(1,2,4,5,3,6).

That is, the ending of line 1 becomes the ending of line 2 in the next stanza, the ending of line 2 becomes the ending of line 4 in the next stanza, and so on. Here’s Sage to do all the hard work:

sage: S = SymmetricGroup(6)
sage: p = S((1,2,4,5,3,6))
sage: for i in range(6): print i+1,p^i
....: 
1 ()
2 (1,2,4,5,3,6)
3 (1,4,3)(2,5,6)
4 (1,5)(2,3)(4,6)
5 (1,3,4)(2,6,5)
6 (1,6,3,5,4,2)

Look at stanza 5 for example, The permutation is 451362. This means that 1 goes to the third place, 2 to the sixth place, 3 to the fourth place and so on. But this is given by the cycles

(1,3,4)(2,6,5).

If we want to print out the permutations as given above, that is easy:

sage: q = p.inverse()
sage: for i in range(6):
....:     qi = q^i
....:     print i+1,[qi(i) for i in range(1,7)]
....:     
1 [1, 2, 3, 4, 5, 6]
2 [6, 1, 5, 2, 4, 3]
3 [3, 6, 4, 1, 2, 5]
4 [5, 3, 2, 6, 1, 4]
5 [4, 5, 1, 3, 6, 2]
6 [2, 4, 6, 5, 3, 1]

The sestina can now described as follows: in stanza n, the permutations of the line endings are given by p^{n-1} where p = (1,2,4,5,3,6)\in S_6. I wonder how much of the theory of permutation groups was known to Artaud Daniel?

Computing from the command line

Like many die-hard, old-school Linux users I have an inordinate fondness for the command line, and would rather use a command line interface than a snazzy GUI. This post is about how to use the command line to perform simple arithmetic computations, and a few examples of slightly more advanced mathematics. There are many times when I just want an immediate answer to a (usually simple) computation without powering up Octave or Sage, both of which use up sometimes overloaded memory. I’m assuming the use of the Bash shell.

Use of expr

For simple integer computations, the expr command is sufficient. Here are a few examples:

home:$ expr $[3+4]
7
home:$ expr $[5**6]
15625
home:$ expr $[100/17]
5

Because the Bash shell only allows integers as input and output, a division will always be rounded down.

Use of bc

To obtain fractional output, and to use more mathematical functions, the command bc may be used. This should be available on all Linux distributions, and is a remarkably powerful calculator. See an old article of mine here for an introduction to it.

Any expression can be piped to bc for computation, and if bc is invoked with bc -l, this loads a mathematics library which includes the functions a(x), s(x), c(x), e(x), l(x),j(n,x), for inverse tangent, sin, cos, exponential function, natural logarithm and Bessel function of order n respectively. You can read the man page for bc online.

So here are some examples of the use of bc:

home:$ echo "1/3+4/7" | bc -l
.90476190476190476190
home:$ echo "2^100" | bc
1267650600228229401496703205376
home:$ echo "scale=100;4*a(1)" | bc -l
3.14159265358979323846264338327950288419716939937510\
58209749445923078164062862089986280348253421170676

The value scale is an internal variable which gives the number of figures after the decimal point. Alternatively, we can use this syntax:

home:$ bc -l <<< "pi=4*a(1);s(pi/5)"
.58778525229247312915
home:$ bc -l <<< "e(l(100)/3)"
4.64158883361277889236

This last calculation, of course, uses exp(x) and ln(x) to compute the cube root of 100.

Because bc is a fully featured programming language, you can write programs to perform other functions: a huge number of such functions can be found here and here. Suppose we put a handful of useful functions in a file called stuff.bc and include in our file .bashrc the line

alias b="bc -l /path/to/stuff.bc"

Then we’re able to do things like this:

home:$ b <<< "gcd(1001,161)"
7
home:$ b <<< "bin(100,50)"
100891344545564193334812497256
home:$ b <<< "fact(100)"
9332621544394415268169923885626670049071596826438\
1621468592963895217599993229915608941463976156518\
2862536979208272237582511852109168640000000000000\
00000000000

(assuming that our file contains commands for the greatest common divisor, binomial coefficients, and factorials.)

Use of calc

A newer and even more powerful computational language is provided by calc. It has far more built-in commands than bc, as you can see here. It can be used interactively (as can bc) but here we are interested in its use as a command-line calculator. Here’s a couple of examples:

home:$ calc '100!'
        933262154439441526816992388562667004907159682643816214685
92963895217599993229915608941463976156518286253697920827223758251
185210916864000000000000000000000000
home:$ calc 'sin(pi()/5)'
	0.58778525229247312917

Just as scale can be used to increase the number of decimal points in bc, so epsilon is used in calc for the real precision, and display changes the number of displayed digits. Here’s an example, to compute the almost integer value

e^{\pi\sqrt{163}}.

home:$ calc 'exp(pi()*sqrt(163))'
	262537412640768743.99035456305787674737

This is not quite as close as it should be, so we’ll increase the internal precision, which at the moment is 10^{-20}:

home:$ calc 'log(1/epsilon())'
	20
home:$ calc 'epsilon(1e-40);exp(pi()*sqrt(163))'
	0.00000000000000000001
	~262537412640768743.99999999999925007260


A note on regression

In Matlab and similar matrix oriented systems such as Octave and Scilab, the backslash operator can be used to solve systems of linear equations. Thus to solve, for example

x-y+2z=3

2x+2y-z=-1

x-3y+z=2

we would enter:

>> A = [1 -1 2;2 2 -1;1 -3 1]
>> b = [3;-1;2]
>> A\b
ans =

   0.28571
  -0.14286
   1.28571

If the matrix A is not square, then the backslash operator solves the associated linear least squares regression problem, using the equation

(X^TX)A=X^Tz

where X is the matrix containing the independent variables, z is the matrix containing the dependent variables, and A are the coefficients of the best fit linear function obtained using least squares.

For example, suppose we create random array of x and y and a variable z which is approximately equal to a_1+a_2x+a_3y:

>> N = 5
>> [x,y] = meshgrid(1:N);
>> z = randn(N)+3*(randn()*x+randn()*y);
>> X = [ones(N^2,1),x(:),y(:)]
>> Z = z(:);
>> A = X\Z
A =

   0.97722
   2.99300
   0.70215

To show that these values are in fact the same as that obtained with (X^TX)A=X^Tz:

>> inv(X'*X)*X'*Z
ans =

   0.97722
   2.99300
   0.70215

Now we can see how close the best fit values are to the z values:

>> > r = A(1)+A(2)*x(:)+A(3)*y(:);
>> [z(:),r,abs(z(:)-r)]
ans =

    4.780556    4.672364    0.108192
    6.130427    5.374509    0.755918
    7.459690    6.076654    1.383036
    7.336100    6.778800    0.557300
    4.672154    7.480945    2.808791
    7.323456    7.665366    0.341910
    9.637936    8.367511    1.270425
   10.092996    9.069656    1.023340
   11.415496    9.771802    1.643694
    8.659346   10.473947    1.814601
   11.073199   10.658368    0.414831
   10.165530   11.360513    1.194983
   12.077467   12.062658    0.014809
   11.450084   12.764804    1.314720
   12.881417   13.466949    0.585532
   12.402667   13.651370    1.248703
   14.885301   14.353515    0.531786
   14.859297   15.055660    0.196363
   16.480020   15.757806    0.722215
   16.656739   16.459951    0.196788
   15.183698   16.644372    1.460674
   16.990119   17.346517    0.356398
   18.229809   18.048662    0.181147
   19.423194   18.750808    0.672387
   21.299761   19.452953    1.846808

Here’s a quick proof of the formula (X^TX)A=X^Tz. Suppose we have variables x_i, y_i and z_i and we wish to express z as a linear function of x and y:

z = a+bx+cy

in such a way that the sum of squares:

S=\sum(a+bx_i+cy_i-z_i)^2

is minimized. To do this we simply differentiate S with respect to a, b and c, and solve the system in which each derivative is set equal to zero.

Differentiating with respect to x, and dividing by two, produces

\sum(a+bx_i+cy_i-z_i)x_i = 0

Similarly (dS/db=0):

\sum(a+bx_i+cy_i-z_i)y_i = 0

and for dS/dc=0:

\sum(a+bx_i+cy_i-z_i) = 0

This can be written in matrix form as:

\displaystyle{\left[\begin{array}{ccc}\sum x_i^2&\sum x_iy_i&\sum x_i\\ \sum x_iy_i&\sum y_i^2&\sum y_i\\ \sum x_i&\sum y_i&\sum 1\end{array}\right]\left[\begin{array}{c}a\\b\\c\end{array}\right]=\left[\begin{array}{c}\sum x_iz_i\\ \sum y_iz_i\\ \sum z_i\end{array}\right]}

Now define the vectors:

\mathbf{x}=[x_1\; x_2\;\ldots\; x_n]^T

\mathbf{y}=[y_1\; y_2\;\ldots\; y_n]^T

\mathbf{z}=[z_1\; z_2\;\ldots\; z_n]^T

\mathbf{1}=[1\; 1\;\ldots\; 1]^T

And so the matrix equation above can be written as

\displaystyle{\left[\begin{array}{ccc}\mathbf{x}^T\mathbf{x}&\mathbf{x}^T\mathbf{y}&\mathbf{x}^T\mathbf{1}\\ \mathbf{x}^T\mathbf{y}&\mathbf{y}^T\mathbf{y}&\mathbf{y}^T\mathbf{1}\\ \mathbf{x}^T\mathbf{1}&\mathbf{y}^T\mathbf{1}&\mathbf{1}^T\mathbf{1}\end{array}\right]\left[\begin{array}{c}a\\b\\c\end{array}\right]=\left[\begin{array}{c}\mathbf{x}^T\mathbf{z}\\ \mathbf{y}^T\mathbf{z}\\ \mathbf{z}^T\mathbf{1}\end{array}\right]}

This can be written as

\displaystyle{\left[\begin{array}{c}\mathbf{x}^T\\ \mathbf{y}^T\\ \mathbf{1}^T\end{array}\right]\left[\begin{array}{ccc}\mathbf{x}^T& \mathbf{y}^T& \mathbf{1}^T\end{array}\right]\left[\begin{array}{c}a\\b\\c\end{array}\right]=\left[\begin{array}{c}\mathbf{x}^T\\ \mathbf{y}^T\\ \mathbf{1}^T\end{array}\right]\left[\begin{array}{c}\mathbf{z}\end{array}\right]}

This is clearly the same as the equation (X^TX)A=X^Tz, and is obviously true for any number of independent variables.

Ruth-Aaron pairs

On May 25, 1935, at Forbes Field in Pittsburgh, Babe Ruth hit his 714th (and last) home run, setting a record for major league baseball which would stand unchallenged for nearly 40 years. The record was broken on April 8, 1974 by Hank Aaron who hit his 715th home run. (He went on to hit a total of 755 home runs. This record has since been broken by Barry Bonds with 762 home runs, and those three: Ruth, Aaron, Bonds, are the only three players ever to hit more than 700 home runs in major league baseball.)

Not long after Aaron’s 715th homer, Carl Pomerance, then at the University of Georgia, and two others, noticed two facts about the numbers 714 and 715:

  1. The two numbers between them factored into the first seven primes:

    714 = 2\cdot 3\cdot 7\cdot 17

    and

    715 = 5\cdot 11\cdot 13.

  2. The sums of the prime factors of each were the same:

    2+3+7+17 = 5+11+13 = 29.

Pomerance and his colleagues wrote a paper called “714 and 715” which he later described as “humorous” discussing these numbers. This paper, had the effect of catching the eye of Paul Erdős, who suggested that he visit Pomerance in Georgia, thus starting a long and fruitful collaboration between the two (at least 40 joint papers).

In honour of the two baseball players, two consecutive integers the sum of whose prime factors (with multiplicity) are equal, are called Ruth-Aaron pairs. It’s not hard to set up a brute force program to list them:

sage: def S(n): return sum(i*j for i,j in factor(n))
....: 
sage: for k in range(1,100000):
....:     if S(k)==S(k+1):
....:         print k
....:         
5
8
15
77
125
714
948
1330
1520
1862
2491
3248
4185
4191
5405
5560
5959
6867
8280
8463
10647
12351

… and so on. This sequence is A039752 in the OEIS.

It is not yet known whether there are infinite Ruth-Aaron pairs, but it is known, thanks to Pomerance and Erdős, that they are quite rare; in fact Pomerance proved that for any integer x, the number of integers n<x for which S(n)=S(n+1) was bounded as

\displaystyle{O\left(\frac{x(\log\log x)^4}{(\log x)^2}\right)}

and that the sum of the reciprocals was bounded.

In 1995 Emory University awarded honorary degrees to both Aaron and Erdős. Pomerance, who was then at Emory, asked for a baseball – several had been provided for Aaron to sign – and asked both Aaron and Erdős to sign it. “I joke that Aaron should have Erdős-number 1 since, though he does not have a joint paper with Erdős, he does have a joint baseball.”

The best Matlab alternative

Matlab is probably as close as you’ll find these days to an international standard for numerical computations. It seems to be taught at almost all universities in one form or another, is loved by engineers, and contains many thousands of lines of highly optimised code. And as well as the base package, there are lots of toolboxes: add-ons which provide functionality in particular areas: image processing, data acquisition, curve fitting and lots more.

Matlab’s interface, general ease of use, power, and extensibility have made it deservedly popular, and it has spawned a vast publishing industry.

As a teaching tool, though, it suffers from one major defect: it’s very expensive. And the add-on toolboxes add to its cost.

There’s therefore the need of a low-cost – preferably open-source – alternative, which can be used by students as a sort of drop-in replacement for experimentation at home, or on their own laptops.

Here is my list of requirements for such an alternative:

  1. It must install under Windows. This is absolutely necessary for me, at least, since most of my students use Windows. A very few use OSX, and hardly any use Linux.
  2. It must, as far as possible, be identical in use with Matlab. The commands should be the same, and simple scripts and functions should transfer between the two with no effort.
  3. It should have a reasonably nice interface, as similar as possible to Matlab’s own. A basic command line only interface running in a terminal won’t cut it with my students.

Note that I’m personally satisfied with a lot less. I have, for example, no problems with a terminal text-based interface and in fact do most of my own computing from just such. However, I’m interested in making things as easy as possible for my own students, many of whom are not hugely computer-savvy.

The alternatives

There are three alternatives: Octave, Freemat, and Scilab. A little bit about each one.

Octave (more properly GNU Octave) has been around since about 1998, or 1992, depending on how you measure it, and was conceived and initially developed by John Eaton at the University of Wisconsin–Madison to support a course in chemical reactor design. It’s named after one of Eaton’s professors, Octave Levenspiel, who apparently had a genius at “back of envelope” calculations. Anyway, you can read about it on its wikipedia page. As of now, it is up to version 3.2.4, and is a highly mature product, with an emphasis on Matlab compatibility, and supported by an army of users and developers. There are also many add-on packages.

Freemat has been in development since about 2004, mainly by one person, Samit Basu, with help from some others. It seems to have sprung into life very quickly. I can’t find anything online about its history or provenance, but my guess is that it’s a fork or the continuation of some older project. It exists in forms for windows, Linux and MacOS, and has the most Matlab-like interface of all of them. It has a small wikipedia page.

Scilab is in some ways the worthiest alternative to Matlab, in terms of raw power, development (mostly at INRIA, France), and add-on packages. It also has installers for Windows, MacOS and Linux.

There are of course several good comparisons of these software tools, of which “A Comparative Evaluation of Matlab, Octave, Freemat, and Scilab for Research and Teaching” by Neeraj Sharma and Matthias K. Gobbert from the University of Maryland is probably the most far reaching. You can read it here. They don’t make any conclusions as such, but one or two comments are made, such as

“The syntax of Octave and FreeMat is identical to that of Matlab in our tests. However, we found during our tests that FreeMat lacks a number of functions, such as kron for Kronecker products, pcg for the conjugate gradient method, and mesh for three-dimensional plotting. Otherwise, FreeMat is very much compatible with Matlab. Even though Scilab is designed for Matlab users to smoothly utilize the package and has a m-file translator, it often still requires manual conversions.

The tests in this work lead us to conclude that the packages Octave and FreeMat are most compatible with Matlab, since they use the same syntax and have the native capability of running m-files. Among these two packages, Octave is a significantly more mature software and has significantly more functions available for use.”

For numerical solutions of differential equations, the authors state:

“Matlab, Octave, and Scilab have state-of-the-art variable-order, variable-timestep methods for both non-stiff and stiff ODEs available, with Matlab’s implementation being the richest and its stiff solvers being possibly more efficient. FreeMat is clearly significantly weaker than the other packages in that it does not provide a state-of-the-art ODE solver, particularly not for stiff problems.”

Another nice comparison (with lots of screenshots) is at http://www.dedoimedo.com, and is a 2010 discussion on scientific computing. Here’s the conclusion, to save you the trouble of actually opening a web page yourself:

“All three programs have their merits. For Windows users, the simplest choice is FreeMat, while Octave is the most powerful and best run on Linux. Scilab works better on Windows, but it is not fully compatible with Matlab language and requires more effort to master, while leveraging these disadvantages with the fleet of toolboxes and Scicos.

The best thing is, you can use them all together. But if you want to be picky, then I’d recommend you start with FreeMat and Octave and move on to Scilab when you gain enough expertise. “

Some more comments

Freemat seems to be suffering a lack of development, as this chart from ohloh.net shows:

Its current version – 4.0 – while very nice in many ways, has been static now for nearly two years. Interestingly enough, if you enter “help hist” at the Freemat prompt, you are told that this file was “adopted” in Freemat from Octave. This begs the question: how much of Octave has been ported into Freemat? Another problem with Freemat is a seeming low number of users; it simply does not have the large user base of Octave or Scilab.

If compatibility with Matlab was not a concern, then Scilab would be the tool of choice. As well, Scilab comes with Scicos, a dynamic systems modeller similar in style to Matlab’s Simulink. Neither of the other two systems has such functionality. However, Scilab is different enough from Matlab to make conversions between the two (especially of functions and programs) not entirely trivial. There is a conversion program, but like most other conversion programs, it’s a bit hit or miss. Differences between Matlab and Scilab are given here. One thing which always annoyed me in Scilab was that the whos command (which in the other systems gives a list of the user’s variables, with types and sizes), here gives the list of all variables, including built in ones such as %pi and %i.

Octave does not come with any nice interface. However, several third-party interfaces are being produced, of which the most promising for Windows users is GUI Octave. Another fine looking interface is that provided by Xoctave, which at the moment does not seem to be as mature a product as GUI Octave.

Conclusion

Given all the above, my choice is clear: Octave with GUI Octave ticks all the boxes. Almost everything you can do with Matlab can be done in Octave. In fact I use Octave almost exclusively: for various reasons I can’t have Matlab installed on my office computer at work, so all my “Matlab preparation” is in fact done with Octave. This is the system which I’m recommending to my students. Try it yourself! You may be pleasantly surprised.

Interlaced eigenvalues

An engineering colleague asked me recently to investigate the relationship between the eigenvalues of a matrix and the eigenvalues of its principal submatrices. This is not an area I know much about, so I started off doing a web search. And I learned about a remarkable fact which goes back to Cauchy:

Suppose A is an n\times n Hermitian matrix, with eigenvalues

\lambda_1\le\lambda_2\le\lambda_3\le\cdots\le\lambda_n

and suppose that B is obtained from A by removing row and column i for some 1\le i\le n. Then the eigenvalues of B

\mu_1\le\mu_2\le\mu_3\le\cdots\le\mu_{n-1}

satisfy the interlacing

\lambda_1\le \mu_1\le\lambda_2\le \mu_2\le\lambda_3\le\mu_3\le\cdots\le\mu_{n-1}\le\lambda_n.

The shortest proof of this result is due to the late Steve Fisk of Bowdoin College, who is probably best known in the wider mathematical world for his “proof from the book” of Chvátal’s art gallery theorem. His proof (of Cauchy’s interlacing theorem, not the art gallery theorem) can be found here. Fisk’s proof, as you can see, is based on a result of Hermite, which claims that the roots of two polynomials f and g interlace if and only if all the roots of f+\alpha g are real, for all \alpha\in\mathbb{R}. For a complete proof of this result, from which the eigenvalues result is deduced as a corollary, see Fisk’s “Polynomials, roots, and interlacing“.

Another proof uses the min-max theorem, of which a full account is given by Terry Tao.

A little experiment

We shall use Matlab/Octave here, and instead of general Hermitian matrices we shall restrict ourselves to real symmetric matrices.

If D is a diagonal matrix, then its entries will be its eigenvalues, and if Q is an orthogonal matrix of the same size as D then

QDQ^T

will be a symmetric matrix with the same eigenvalues as D. We can construct an orthogonal matrix using this result of Cayley: if S is a skew-symmetric matrix, then

(S-I)^{-1}(S+I)

is orthogonal. The proof requires some elementary matrix algebra; see here. Here’s the experiment:

>> n = 4;
>> D = diag(1:n);
>> R = rand(n);
>> S = R - R';
>> Q = inv(S-eye(n))*(S+eye(n));
>> M = Q*D*Q'
M =

   2.1146941  -0.8411071  -0.5795664   0.1635006
  -0.8411071   3.0320881  -0.7959547  -0.7357826
  -0.5795664  -0.7959547   2.6104766  -0.0025486
   0.1635006  -0.7357826  -0.0025486   2.2427411

>> sort(eig(full(M))')
ans =

   1.0000   2.0000   3.0000   4.0000

>> i = ceil(n*rand);
>> N = M; N(i,:)=[]; N(:,i)=[];
>> sort(eig(full(N))')
ans =

   1.5021   2.0328   3.8546

Note that each eigenvalue of N is between two consecutive eigenvalues of M.

A tangent identity

We all know that the Chebyshev polynomials of the first kind can be defined as

T_n(x)=\cos(n\cos^{-1}x).

When I was fiddling about with identities associated with Machin’s formula I came across an identity for tangents, which I’m sure is well known, but which I’d never seen before. Here it is:

\displaystyle{\tan(n\tan^{-1}x)=\frac{\displaystyle{{n\choose 1}x-{n\choose 3}x^3+{n\choose 5}x^5-\cdots}}{\displaystyle{{n\choose 0}-{n\choose 2}x^2+{n\choose 4}x^4-\cdots}}}

This can be more precisely written as follows:

\displaystyle{\tan(2k\tan^{-1}x)=\frac{\displaystyle{\sum_{i=0}^{k-1}(-1)^i{2k\choose 2i+1}x^{2i+1}}}{\displaystyle{\sum_{i=0}^k(-1)^i{2k\choose 2i}x^{2i}}}}

and

\displaystyle{\tan((2k+1)\tan^{-1}x)=\frac{\displaystyle{\sum_{i=0}^k(-1)^i{2k+1\choose 2i+1}x^{2i+1}}}{\displaystyle{\sum_{i=0}^k(-1)^i{2k+1\choose 2i}x^{2i}}}}

And these two identities can be shoehorned into one ugly (but general!) expression:

\displaystyle{\tan(n\tan^{-1}x)=\frac{\displaystyle{\sum_{i=0}^{\lfloor (n-1)/2\rfloor}(-1)^i{n\choose 2i+1}x^{2i+1}}}{\displaystyle{\sum_{i=0}^{\lfloor n/2 \rfloor}(-1)^i{n\choose 2i}x^{2i}}}}

The identity is not hard to prove by induction. As T is already used for the Chebyshev polynomials, we shall write G_k(x) for \tan(k\tan^{-1}x) and then write

\displaystyle{G_k(x)=\frac{N_k(x)}{D_k(x)}}

(where N,D stands for numerator and denominator respectively.) Then by the addition formula for \tan(x), we have

\displaystyle{G_{k+1}(x)=\frac{x+G_k(x)}{1-xG_k(x)}}.

Writing G_k on the right in terms of N_k and D_k, and then multiplying through by D_k produces

\displaystyle{G_{k+1}(x)=\frac{xD_k(x)+N_k(x)}{D_k(x)-xN_k(x)}}.

Recall that

\displaystyle{N_k(x)={k\choose 1}x-{k\choose 3}x^3+{k\choose 5}x^5-\cdots}

and

\displaystyle{D_k(x)={k\choose 0}-{k\choose 2}x^2+{k\choose 4}x^4-\cdots}

Now consider the the numerator of G_{k+1}(x):

xD_k(x)+N_k(x).

The coefficient of x^{2i+1} will be

\displaystyle{(-1)^i{k\choose 2i}+(-1)^i{k\choose 2i+1}=(-1)^i{k+1\choose 2i+1}}.

The denominator of G_{k+1}(x) is

D_k(x)-xN_k(x)

and the coefficient of x^{2i} will be

\displaystyle{(-1)^i{k\choose 2i}+(-1)^i{k\choose 2i-1}=(-1)^i{k+1\choose 2i}}.

Note that the constant coefficient is

\displaystyle{{k\choose 0}={k+1\choose 0}}.

These results, plus the trivial result for k=1, prove the identity.

Machin’s formula

I suppose everybody will have come across Machin’s formula:

\displaystyle{4\tan^{-1}\left(\frac{1}{5}\right)-\tan^{-1}\left(\frac{1}{239}\right)=\frac{\pi}{4}}

or in its cotangent form:

\displaystyle{4\cot^{-1}5-\cot^{-1}239=\frac{\pi}{4}}.

This formula was used by Machin to calculate \pi to 100 decimal places: an impressive feat of hand computation.

According to this excellent article on the history of the formula, Machin started by first observing that

\displaystyle{\tan\frac{\pi}{16}\approx\frac{1}{5}}

and so that

\displaystyle{4\tan^{-1}\left(\frac{1}{5}\right)}

should be close to \pi/4. In passing, it’s worth noting that John Machin (1680 – 1751) was far more than just a name attached to a single formula. He was the Professor of Astronomy at Gresham College, London; supplied Newton with some Lunar material for the third edition of his Principia; and was considered by Newton to be the “best geometer”.

In fact it’s not hard to prove Machin’s formula directly. Here’s a simple-minded approach using the cotangent version and the standard addition formula:

\displaystyle{\cot(a\pm b)=\frac{\cot a\cot b\mp 1}{\cot b\pm\cot a}}.

What we’ll do is to take the cotangent of both sides:

\cot(4\cot^{-1}5-\cot^{-1}239)=1

and use the cotangent addition formula to expand the left hand side.

First note that if a=b=5 then

\begin{array}{crl}  \cot(2\cot^{-1}5)&=&\displaystyle{\frac{25-1}{10}}\\[4mm]  &=&\displaystyle{\frac{12}{5}}.  \end{array}

Then it follows that

\begin{array}{crl}  \cot(4\cot^{-1}5)&=&\displaystyle{\frac{\displaystyle{\frac{144}{25}-1}}{\displaystyle{\frac{24}{5}}}}\\[4mm]  &=&\displaystyle{\frac{144-25}{120}}\\[4mm]  &=&\displaystyle{\frac{119}{120}.}  \end{array}

Finally, applying the cotangent to the left hand side of Machin’s formula:

\begin{array}{crl}  \cot(4\cot^{-1}5-\cot^{-1}239)&=&\displaystyle{\frac{\displaystyle{\frac{119}{120}239+1}}{\displaystyle{239-\frac{119}{120}}}}\\[4mm]  &=&\displaystyle{\frac{(119)(239)+120}{(120)(239)-119}}\\[4mm]  &=&1.  \end{array}

New formulas from old

There are an infinite number of “Machin-type” formulas, consisting of linear combinations of cotangents of integers whose sum is \pi/4. And up until recently they were the favorite method of computing \pi to large numbers of digits. (More recently, techniques based on the arithmetic-geometric men have been used.)

Using the useful notation \{x\} for \cot^{-1} x Machin’s formula can be written as

4\{5\}-\{239\}=\{1\}.

Another formula, which is almost trivial and is attributed to Euler, is

\{2\}+\{3\}=\{1\}.

To form new formulas from old, the following general result has been attributed to the ever-surprising Charles Lutwidge Dodgson:

\{x+\alpha\}+\{x+\beta\}=\{x\} if \alpha\beta=x^2+1.

(That formula, and its attribution, can be found in “The enhancement of Machin’s formula by Todd’s process” by Michael Wetherfield, The Mathematical Gazette, Vol. 80, No. 488 (Jul., 1996), pp. 333-344.)

Here, for example, are a few “Dodgson formulas” (the first one of which is Euler’s):

  1. \{2\}+\{3\}=\{1\}.
  2. \{3\}+\{7\}=\{2\}.
  3. \{5\}+\{8\}=\{3\}.
  4. \{7\}+\{18\}=\{5\}.
  5. \{8\}+\{57\}=\{7\}.

By a happy merry-go-round of substitutions, infinite numbers of Machin-type formulas can be developed. Here is an example from Wetherfield’s paper.

  1. Substitute equation 2 above into equation 1 to obtain

    2\{3\}+\{7\}=\{1\}.

    (This is known as Hutton’s formula.)

  2. Now substitute in equation 3:

    2(\{5\}+\{8\})+\{7\}=\{1\}.

    and eliminate \{8\} by using equation 5:

    2(\{5\}+\{7\}-\{57\})+\{7\}=\{1\}

    or

    2\{5\}+3\{7\}-2\{57\}=\{1\}.

  3. Now use equation 4 to eliminate \{7\}:

    2\{5\}+3(\{5\}-\{18\})-2\{57\}=\{1\}.

    or

    5\{5\}-3\{18\}-2\{57\}=\{1\}.

  4. Multiply by 4:

    20\{5\}-12\{18\}-8\{57\}=4\{1\}.

  5. and eliminate \{5\} by using Machin’s formula:

    5(\{1\}+\{239\})-12\{18\}-8\{57\}=4\{1\}

This last formula can be written

12\{18\}+8\{57\}-5\{239\}=\{1\}

which is a Machin-type formula known as Gauss’s formula. Masses of other formulas can be found here.

Follow

Get every new post delivered to your Inbox.