The rise and fall of a good idea

Last year (late 2008) I applied for a Teaching and Learning grant to embed mathematical software in all aspects of one of my first year subjects. (This subject is mostly calculus, with a smattering of combinatorics and complex numbers thrown in.) My University for some time has had a lab license for Maple; at the time we originally purchased Maple we had tried Mathematica, and been disappointed with the level of support from Wolfram Inc; and the freeware software such as Maxima, Axiom or Sage, were not as robust as they are now. Anyway, we had built up a fair amount of local expertise in the use of Maple as an adjunct to lectures and tutorial, in lab classes.

The idea of this grant was to embed Maple more solidly in the subject: to include it in the notes, and to use Maple T.A. for assessment. Basically, the entire subject was to built around Maple. I should point out here that I have no special alliance to Maple or its parent company; this was more of an historical decision, based on the current expertise of our mathematics staff.

Maple and Maple T.A.

Maple is a computer algebra system; that is, a computer system which as well as allowing numerical calculations, also allows symbolic and graphical work, all nicely integrated into a “worksheet” interface:

elementary Maple

A symbolic example with differential equations can be seen here. And here’s one with Laplace transforms:

Maple can also draw nice graphs (the spelling error is not mine!):

Maple graphic

which can be rotated, zoomed into and out from, in real time all with the mouse. Here’s another picture:

Maple 3d plots

Maple T.A. (the “T.A.” stands for “Teaching and Assessment”) is an add-on package, sold separately, which allows mathematical tests to be administered online. Tests may contain open-ended questions, multiple-choice questions, questions whose answer may be a number or a formula, questions which may contain graphics, questoins built of random elements so that each student gets a different question. Maple T.A. is designed to work with Maple in the background, so the mathematical “engine” of Maple checks to see if the answer entered by the student was correct or not. Maple T.A. comes with a powerful authoring system, where questions can be created by filling in blanks, as you can see here. And of course it does automatic grading, and collation and analysis of results. It is supposed to scale well, so whether you have 30 students or 3000, Maple T.A. should be able to handle them with ease.

My project

I applied for $10,000 – not a plethoric amount – and interestingly, one of my colleagues, to whom I showed the application, commented that he thought it was somewhat over-ambitious, given the money and time I’d allocated for it.

Initially the project went well. I got a new set of lab materials up and running, and ran the labs with a group mostly of Education students, all training to become high school maths teachers. I decided not to include too much Maple scattered through the notes; what I did was produce a new chapter introducing Maple, and used Maple as often as I could in lectures to illustrate a point (limits for example). I bought the new version of Maple T.A., and hired a research assistant (whose blog you can read here), who was able to help investigate creating some question banks to be used with Maple T.A. It’s been my experience that the students and subjects we have are not well served by “off the shelf” materials, either in terms of textbooks, or any other material. So to best serve the needs of the students and their diverse backgrounds, specially written material was needed.

The running of the labs brought up an aspect of my teaching – especially in the use of technology – which I need to fight against, and that’s being too ambitious with the students, pushing them at too fast a pace, and expecting too much of them. It was becoming apparent after a few weeks of Maple labs that I was simply expecting too much Maple expertise of the students. This was a turn-off for them. The idea of any computer algebra system (such as Maple) is that it should allow the students the opportunity to explore, to play with mathematical concepts, and let the system handle all the messy algebra. But I had made the mistake of making my lab sheets require too many different Maple commands, and so the students spent most of the labs wrestling with Maple, instead of playing with it.

Our first stumbling block was the extraordinary difficulty of installing Maple T.A. on our servers. Being commercial software, we couldn’t just compile source code, using our own libraries, we had to shoehorn Maple T.A. onto our running system. This turned out to cause endless errors and total confusion on the part of our very experienced and knowledgeable systems administrator, who had installed, maintained and upgraded many many larger and more diverse packages than this. Over a month of anguished phone calls and emails went between our sys admin and the local distributors of Maple T.A., who were at all times helpful, professional, and approachable. However, they were as much at a loss as we were. Eventually, though, but vastly behind time (according to the timeline in my grant application) we were able to get Maple T.A. installed, and could attempt to use it.

My plan was to get Maple T.A. installed and tested in first semester, and write some question banks, so that we could “roll it out” with the cohort of students in second semester. (This subject was running in both semesters, with different students each time.)

It turned out that this new version of Maple T.A. was hopelessly buggy. The software should allow mathematical formulas to be properly typeset:

Maple T.A. sums

and the formulas displayed may be built up of random elements, so each student will obtain a different question, as you can see here. Our experience was that equations and formulas were not so typeset, and to obtain typeset formulas on the screen, they had to be created with separate software, and pasted in place – a messy, time-consuming, and annoying process, and impossible to do with random elements. So immediately part of the power and promise of Maple T.A. was lost to us.

To add to the difficulties, I started spending more and more time fighting the University’s egregious Change Plans, of which you can read my comments here and here. The extraordinary worries caused by these Plans made concerted work impossible for me and for my colleagues.

Of the $10,000, I’d allocated some to buying Maple T.A. (to be precise, upgrading it, as we’d bought a license some years ago with another grant), some for a little time-release for me, and some for the research assistant. My own time-release was quickly used up in planning, implementing, and supporting the students in the labs, which, as I said above, I made too hard. My research assistant, who did about three times the amount of work I paid him, wrestled with Maple T.A. until he was totally sick of it. By the end of year, we had run out of time and money, and still didn’t have an assessment system we could easily use.

Another Change Plan – this time for technical staff – meant that suddenly all our computing support vanished, and in particular our sys admin, who’d so valiantly fought the good fight with Maple T.A. was relieved of his position. So at the time of writing, Maple T.A. is off-line, and can’t be brought back, as the one person who knows how to do it no longer works here.

Failures

Well, we have certainly not managed to achieve our aims. We have not incorporated Maple T.A. into the assessment of the subject. A small pilot study I ran two years ago (with another grant) demonstrated that this was certainly feasible, but it was not brought to fruition with this current project.

Successes

Apart from my tendency to get over-excited with software, and make things too hard for the students, we have successfully demonstrated the powerful use of mathematical tools in first year subjects. The students, once over their initial lack of enthusiasm (read: “animosity”) to the software, became happier with it over the semester, and by the end were very pleased to be using it. In a sense, the major outcome of this project was a paradigm shift, from software as being an adjunct to a subject, to becoming an integral part of it, even if not in the way I’d envisaged in the grant application.

Future plans

Now that the project has finished, we need to look ahead. To y way of thinking, teaching mathematics without using modern software tools would be like teaching carpentry without power tools. However, it is clear that Maple and Maple T.A. are not the way to go. There are several reasons for this:

  1. Being commercial software, we would be locked into expensive licensing agreements, which the University would be unwilling to maintain.
  2. Our current license (for 40 concurrent users) is not really sufficient for projected demand. We thus need a system which is highly scalable, without a huge increase in cost.
  3. Commercial also means possible difficulties in installation and maintenance, as we have seen.
  4. So far, I haven’t met anybody who’s used Maple T.A. and liked it.

So, some possibilities for further work include:

  1. Using WebWork, or STACK, which uses Maxima as its engine. Both of these are open source, and both are known to work well.
  2. Create an entirely new product which uses Sage as its engine. This may be done by rewriting STACK so that the system calls are to Sage, instead of Maxima. Given the current high interest in Sage, this may be a preferred solution.

Fermat factorization

For some reason, when I discuss factoring (very briefly!) in my cryptography subject, I’ve never discussed this method. But it’s very easy, and especially suitable for when a number n is the product of two primes p and q which are nearly equal.

We start by noting that if

n=pq

with p and q both large (and hence odd) primes, then the sum p+q and difference, p-q will both be even. Then we can write

p+q=2r

p-q=2k

with k a small value. From these two equations we have

p=r+k

q=r-k

and so

n=pq=(r+k)(r-k)=r^2-k^2

and so

n+k^2=r^2.

To factor by this method simply requires adding square numbers k^2 to n until another square r^2 is reached. Then the factors will be r\pm k.

For example, suppose n=5609. Then:

\begin{array}[h]{cccc}k&n+k^2&\mbox{Square root}\\\hline 1&5610&\mbox{Not a square}\\2&5613&\mbox{Not a square}\\3&5618&\mbox{Not a square}\\4&5625&75\end{array}

This means the two factors of n are

75\pm 4=71, 79.

There are several variants of this basic method, but they all rely on expressing n as a difference of squares.

Vigenère and Kasiski

Most writers of cryptography texts, to my mind, spend a disproportionate amount of time and space carefully discussing the cryptanalysis of the Vigenère cipher. Maybe it’s because this is the first “non-trivial” cipher most students learn, and its cryptanalysis is also slightly non-trivial. Anyway, who am I to buck this trend? So this post shows how to do it in Sage.

First, the cipher itself. It’s a polyalphabetic cipher, where each letter of the plaintext is shifted by an amount given by a keyword; this key being repeated as much as required:


T H I S I S T H E P L A I N T E X T T O B E E N C R Y P T E D
K E Y W O R D K E Y W O R D K E Y W O R D K E Y W O R D K E Y

In this example, and using the correspondence A=0, B=1 up to Y=24, Z=25, we see that every seventh letter is shifted by the same amount: letters in the 1st, 8th, 15th, 22nd positions are shifted by K=10; letters in the 2nd, 9th, 16th 23rd positions are shifted by E=4, and so on. The resulting ciphertext is


D L G O W J W R I N H O Z Q D I V P H F E O I L Y F P S D I B

What makes this cipher seem so strong is that similar letters in the plaintext are not necessarily encrypted to the same letters in the ciphertext: notice for example that the first two I’s are encrypted to Y and O; and that similar letters in the ciphertext do not necessarily correspond to the same letters in the plaintext.

However, if the length n of the keyword can be determined, then we know that every nth letter is shifted by the same amount. If there are enough letters to perform a frequency analysis (and using the fact that the most common letter in English text is E), then the value of the shift can be found. One way of determining the keyword length is to shift the ciphertext by a given amount and check all the coincidences (equality of letters) between the ciphertext and its shift. A shift with a large number of coincidences may be the length of the keyword. This method is called Kasiski’s method after its 19th century discoverer.

For an example, I have a very long ciphertext which has been obtained with a Vigenère cipher. You can see it here. Anyway, it’s nearly 20,000 characters long. To find the length of the keyword which was used, the first step is to write a little program to perform a cyclic shift of a string:

def cshift(str,n):
   slen=len(str)
   return str[n:slen+1]+str[0:n]

Now to find the coincidences with different shifts:

sage: clen=len(ct)
sage: for i in range(20):
    ctx=cshift(ct,i)
    coin=0
    for j in range(clen):
        if ct[j]==ctx[j]:
            coin=coin+1
    print i,coin

and this returns the output:


0 19369
1 683
2 782
3 791
4 728
5 675
6 655
7 1284
8 712
9 734
10 718
11 764
12 708
13 718
14 1192
15 709
16 716
17 734
18 792
19 697

Neglecting the first output, we see that the greatest number of coincidences occur for shifts of 7 and 14. This would seem to indicate that the keyword has length 7.

Now we break up the ciphertext into seven groups:

sage: ct7s=['','','','','','','']
sage: sage: for i in range(clen):
    ct7s[i%7]=ct7s[i%7]+ct[i]

The next step is to find which of the letters is most common in each group. Here’s one way:

sage: alph='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
sage: for i in alph:
    print i, ct7s[0].count(i)

which produces

A 1
B 71
C 1
D 217
E 41
F 78
G 123
H 366
I 64
J 52
K 160
L 192
M 2
N 27
O 102
P 86
Q 187
R 196
S 40
T 2
U 167
V 188
W 236
X 74
Y 23
Z 71

from which it is obvious that the most common letter in this group is H. Repeating this same procedure for the other six groups enables us to build up a table of the most common letter in each group:

0 H
1 M
2 G
3 O
4 I
5 R
6 W

(Of course, this entire process can be easily automated; on the other hand it’s quite nice to do everything separately one step at a time.) Now, the most common letter in English is E, which has the value 4. If H corresponds to E, that means that for group 0, there has been a shift of 3, which corresponds to the letter D. This is the first letter of the keyword. And in fact every other letter of the keyword can be obtained by shifting back by 4 from each common letter. This produces the keyword

DICKENS.

Applying this to decrypt the ciphertext produces:

WHETHERISHALLTURNOUTTOBETHEHEROOFMYOWNLIFEORWHETHERTHATSTATION
WILLBEHELDBYANYBODYELSETHESEPAGESMUSTSHOW
...
BOURNEOFALLSUCHTRAVELLERSANDTHEMOUNDABOVETHEASHESANDTHEDUST
THATONCEWASHEWITHOUTWHOMIHADNEVERBEEN

- it’s the first chapter of “David Copperfield”, by Charles Dickens, in uppercase with all punctuation removed.

A graphical and numerical approach to teaching calculus

I’m thinking about teaching calculus using Matlab. This is far from being a new idea – people have been using Matlab in their teaching for as long as Matlab has existed – but it must be admitted that as far as calculus is concerned, Matlab is probably less well equipped than a computer algebra system such as Maple, Mathematica, or the open source Maxima and Sage. This is of course because Matlab is primarily a numeric and computational system, whereas Maple, Mathematica and the others are symbolic systems. And calculus, being primarily analytical and symbolic, is better served by them.

That being said, I think that in fact Matlab – used only for computation and graphics – could be used to great effect in a calculus course. Note that the new versions of Matlab contain the Symbolic Math Toolbox, which provides a Matlab interface to the CAS MuPAD. However, I’m rather against using MuPAD-in-Matlab. For one thing, it seems to be a fairly clumsy interface. And you have to declare variables as being symbolic, so as not to confuse Matlab with its own (numeric) variables. I think if you want to use a symbolic system, then do just that. If you want a numeric system, do just that.

My feeling is that using plain old unadorned Matlab would be a great help to engineering students, for whom the bulk of their computing is Matlab based. I think introducing symbolic computation would only add to their cognitive load, without necessarily helping their mathematical understanding. So my thoughts are to introduce the standard calculus material – limits, derivatives, integrals and their applications – analytically, and then use the numeric and graphical capabilities of Matlab to explore and enhance that material. And in fact there’s a lot you can do.

Here’s a small sample.

Limits

Limits can easily be explored numerically in Matlab. For example, the old warhorse

\displaystyle{\lim_{x\to 0}\frac{\sin x}{x}}

You could simply, for example, enter:

>> x = 0.1
>> sin(x)/x

and then replace x with 0.01, 0.001, and so on, and see what happens to the values of the function. You could first set

>> format long

to give you more decimal places.

Rather than entering a new x value each time, you could do it all in one go:

>> x = [0.1; 0.01; 0.001; 0.0001; 0.00001]
>> sin(x)./x

but this presupposes that students understand the “dot” notation of Matlab. Maybe you could define the function first:

f = @(x) sin(x)/x

and then apply it to your vector x:

arrayfun(f,x)

And of course if you wanted you could create the vector x using

x = 0.1.^[1:5]'

but this last assumes a certain knowledge of Matlab’s workings. I would certainly teach function definition (such as above), and the command arrayfun.

Newton’s method

Who does not love and teach

\displaystyle{x\leftarrow x-\frac{f(x)}{f'(x)}}?

Matlab is at a disadvantage here – without the symbolic toolbox, it can’t compute symbolic derivatives. However, it can perform differentiation of polynomials (treated as vectors of their coefficients). So we start here. For example, let’s solve

x^5+x^2-1=0

using Newton’s method, with a starting value of x=0.8 First, we define the function and its derivative:

>> p = [1 0 0 1 0 -1]
>> f = @(x) polyval(p,x)
>> df = @(x) polyval(polyder(p),x)

Putting this together into a Newton’s rule function:

>> nr = @(x) x-f(x)/df(x)

Now we can start using it. First the easy way:

>> 0.8
>> nr(ans)
   0.808859649122807
>> nr(ans)
   0.808730628358884
>> nr(ans)
   0.808730600479393

for as long as we like.

Next, a slightly more sophisticated way:

>> a = [0.8]
>> for i=1:6 a=[a(:);nr(a(i))];end
>> a

   0.80000000000000
   0.80885964912281
   0.80873062835888
   0.80873060047939
   0.80873060047939
   0.80873060047939
   0.80873060047939

For Newton’s method applied to non polynomial functions, you could enter the function and its derivative yourself:

>> f = @(x) exp(x)-x^2
>> df = @(x) exp(x)-2*x

and then proceed as above.

Drawing tangent lines

Again, first enter the function and its derivative:

>> f = @(x) exp(-x^2)
>> df = @(x) -2*x*exp(-x^2)

Given an x value, we can determine the values of the function and derivative at that point, and construct the tangent line:

>> a = 1.2
>> fa = f(a)
>> dfa = df(a)
>> t = @(x) dfa*(x-a)+fa

and we can sketch both together:

>> ezplot(f, [0,2])
>> hold on
>> ezplot(t,[0,2])

Simpson’s rule

There are, at a conservative estimate, about 42,897 versions of Simpson’s rule in Matlab. I want to keep it as simple as possible, and not use any clever Matlab tricks. So here’s one way to calculate

\displaystyle{\int^1_0e^{-x^2}\,dx}.

First set up the function, the limits of integration, and the number of times Simpson’s rule will be used:

>> a = 0
>> b = 1
>> f = @(x) exp(-x^2)
>> n = 4

Now we set up the nodes (where the integral will be evaluated):

>> x = linspace(a,b,2*n+1)
>> h = (b-a)/(2*n)

and the weights – we can either just enter them by hand:

>> w = [1 4 2 4 2 4 2 4 2 4 1]

or

>> w = zeros(1,2*n+1)
>> for i=1:n w(2*i-1:2*i+1)=w(2*i-1:2*i+1)+[1 4 1];end

Now put it all together:

>> s = sum(w.*arrayfun(f,x))*h/3

     0.74682612052747

The last could of course be done as a single matrix product:

>> s = w*arrayfun(f,x))'*h/3

but I think the first method is conceptually easier.

Animated graphics in mathematics education

As far as I know, there’s been little research done on this topic, although it seems to me that animations could be enormously helpful in facilitating the learning of some mathematics. Think, for example, of the “limit of secants” method for defining a derivative:

\displaystyle{f'(a)=\lim_{x\to a}\frac{f(x)-f(a)}{x-a}}

I know that when I teach elementary calculus I draw a curve, and try to show by means of diagrams that as x slides along the curve to a, the secant “approaches” the tangent to the curve at a. Some students, if not fully convinced, give the impression that there is merit in this argument; other students sit there with their “when will this be finished?” look on their faces. This seems to be a place where some interactive or animated graphics would be very helpful.

Most computer algebra systems now provide some support for animation; if they don’t they should. Interestingly, when CAS’s are compared, graphics usually are not considered. Way back in the late 1990’s, when Michael Wester was producing his monumental review of CAS’s (of which an earlier version can be found here), they were treated as problem solving black boxes: problem goes in; solution comes out. The best CAS was the one which solved (correctly!) the greatest number of problems from the broadest number of topics. And this style seems to have permeated CAS comparisons ever since. As far as I know, very few reviews have investigated the graphics capabilities of such systems. One which did was Barry Simon’s “Symbolic Math Powerhouses Revisited”, which is available as the first chapter in Wester’s Computer Algebra Systems, A Practical Guide.

I think that most mathematics educators are simply unwilling to come to grips with the technology which enables such graphics to be produced, but in fact most modern software makes this very easy.

As an example, I’m going to show how to produce a cycloid in Sage. Sage contains an “animate” command, which simply runs though a list of graphics objects, displaying them one after the other. To draw my cycloid, I’m going to need three such lists:

  1. The moving circle.
  2. The point on the circle’s circumference.
  3. The cycloid as drawn by that point.

And of course I need the equation of the cycloid, which is easiest given parametrically:

x=t-\sin(t)
y=1-\cos(t)

So here’s how to produce the three graphics objects above:

step = 0.3
v = []
for t in srange(0,2*pi,step):
    v.append(circle((t,1),1))
a = animate(v, xmin=-1, ymin=0, xmax=8, ymax=2, figsize=[9,2])

There should be no surprises here; “v” is a list, which is filled up with circles all with different centres. And the points:

w = []
for t in srange(0,2*pi,step):
    w.append(point((t-sin(t),1-cos(t)),pointsize=20))
b = animate(w, xmin=-1, ymin=0, xmax=8, ymax=2, figsize=[9,2])

Note here we use the parametric equations to plot the points. And finally the cycloid itself: we draw it as a sequence of lines from the previous to the current point:

L = Graphics()
x = []
for t in srange(0,2*pi,step):
    L += line([(t-step-sin(t-step),1-cos(t-step)),(t-sin(t),1-cos(t))], rgbcolor=(1,0,0), thickness=2)
    x.append(L)
c = animate(x, xmin=-1, ymin=0, xmax=8, ymax=2, figsize=[9,2])

To display this animation, we simply display all of a, b and c together:

(a+b+c).show()

You can see the animation here.

Now that wasn’t so hard, was it?

Book Review: “A Computational Introduction to Number Theory and Algebra” by Victor Shoup, 2nd ed.

I wrote this review a few months ago for “Computing Reviews”, who’ve published it. But for the benefit of those who don’t have access to these reviews, here it is.


Occasionally it’s a pleasure to find a book which is so masterful, so well written, that it has all the hallmarks of the classic. This is such a book. Shoup has set himself a difficult task – to bring the reader up to speed with number theory and algebra, starting “from scratch” – and he has succeeded magnificently. My main complaint with the book is that it is not longer, but as Shoup himself regretfully observes in the introduction, to keep to a reasonable size, some important topics had to be omitted.

Many books on computational number theory present the theory as a sort of smorgasbord of algorithms: primality testing, factorization, discrete logarithms, modular square and n-th roots. Shoup steers clear of this recipe based approach, and instead places the entire theory into a formal algebraic setting. This allows not only for elegance of exposition and a remarkable clarity, but provides the entire book with a structural homogeneity.

Even though “some topics could simply not be covered”, the range of topics presented is wide. The book is geared towards students of cryptography and coding theory, and the material has been chosen to be most relevant to those disciplines.

The books starts with several of standard integer based number theory: divisibility, congruences and modular arithmetic, including quadratic residues (but reciprocity is treated later), large integer arithmetic, Euclid’s algorithm and its association with the Chinese remainder theorem, and a brief discussion of the RSA cryptosystem, including a particularly elegant proof of its correctness. All this material is standard to many other texts, yet rarely treated with as much care as here, in spite of the relative brevity. These first few chapters contain as much mathematics as many cryptography texts, and yet at this stage we are not yet one fifth into the book! Another chapter discusses the distribution of primes, including a proof of Bertrand’s postulate and a discussion of the prime number theorem. Given the importance of primes to modern cryptography, these may be considered vital topics, and it is refreshing to see them treated so well.

These first chapters set the scene, so to speak, for the number theory with which the text is concerned. However, much of the subsequent material is discussed in terms of the general theory of groups and rings. Primitive roots, for example, are not discussed as such, but in terms of generators of the non-zero residues of integers modulo a prime. Although this approach might seem at first to be unnecessarily obtuse, it is in fact the most natural way of introducing these algorithms, as it places them squarely in a generalized algebraic theory. The text, as we would expect, contains several chapters discussing the basic theory of abelian groups and rings, including a fine proof of the fundamental theorem of the structure of finite abelian groups as
a sum of cyclic groups.

What makes this book unique is the way that several different mathematical strands – number theory and algebra – are interwoven and made into a masterful whole. As well as rings and fields, there is much linear algebra (modules, vector spaces and matrices), as well as a considerable amount on probability distributions and probabilistic algorithms, culminating in the Miller-Rabin test for primality and a few applications.

The book ends with some chapters on finite fields and their various algorithms, and a chapter on the AKS deterministic primality test, for which the author carefully observes that the probabilistic Miller-Rabin test is much faster, and hence should be preferred for all practical purposes. However, as an ingenious use of much number theory and algebra, the AKS algorithm is a lovely example with which to finish the text.

Although the text requires not much specific mathematical background, I would hesitate to use it except in an advanced class, or for students whose mathematical ability was already high. The material moves swiftly – while never compromising rigour – and the multiple strands assume considerable ability on the part of the reader. I was pleased to see copious exercises; a student who has completed the book and mastered the exercises will be in a very strong position to embark on some advanced studies. The author does not recommend any specific chapter sequences for a semester’s study, but clearly an astute teacher could pull some parts from this text for an initial course of study, and complete the text in an advanced course.

One regrettable omission is that of the use of any computational tools, either the author’s own C++ NTL library for computational number theory and algebra, or the use of a computer algebra system. A companion volume, or material on the author’s website, discussing some implementation issues, would be most welcome. We note in passing that thanks to the generosity of the publishers, the entire text is available under a Creative Commons licence on the author’s site http://www.shoup.net.

This is a truly magnificent text, deserving of a place on the shelves of any mathematician or computer scientist working in these areas. I hope it has a long life and many further editions.

An anagram measure

Those like me with a love of the byways of the English language, or with a love of cryptic crosswords, will no doubt have collected over the years a private trove of single word anagrams. One of my favourites (because both words are common) is

ORCHESTRA, CARTHORSE

Clearly smaller words are likely to have more anagrams than longer ones:

OPST, POTS, POST, STOP, SPOT, TOPS

From the Anagrams FAQ comes this lovely example, of almost familiar words (at least, they are not scientific or technical terms):

INTERROGATIVES, REINVESTIGATOR, TERGIVERSATION

From an anagrams programming page:

ANGOR, ARGON, GORAN, GRANO, GROAN, NAGOR, ORANG, ORGAN, ROGAN

several of which are unfamiliar to me as English words. I’m not allowing proper names.

How can we measure the “anagrammability” of a set of letters? Some account should be given to the number of anagrams, and to the length of the word. A long word with only two anagrams should get a higher score than a small word with many. Let n be the number of letters, and k the number of (single word) English anagrams. My anagrammability measure (discussed with my eldest son) is:

(k-1)n^2.

This gives a score of zero if there are no anagrams other than the word itself, and is weighted for large words.

With the examples above:

ORCHESTRA: 81
OPST: 80
INTERROGATIVES: 392
ANGOR: 200

According to the Anagrams FAQ again, the set of letters with the most one word anagrams is AEINRST with:

ANESTRI, ASTERIN, ERANIST, NASTIER, RATINES, RESIANT, RESTAIN, RETAINS, RETINAS, RETSINA, SAINTER, STAINER, STARNIE, STEARIN

for a score of 637. (I’ve never seen some of these words before, but apparently they can all be found in Merriam-Webster’s unabridged dictionary).

It’s not hard to experiment with anagrams in Sage; all you need is a wordlist, which you can read in, turn into a list and strip the trailing non-printing characters. You can download a list of 109582 words here, then save it as, say,

words.txt

.
Then:

f = open("words.txt","r")
nwords = f.readlines()
words = []
for i in nwords:
   words.append(i.strip('\r\n'))

Then we can write a simple program which given any string, checks if each permutation is in the list we’ve just created:

def find_anagrams(myword):
   anagrams=[]
   for i in permutations(myword):
       anagrams.append(join(i,""))
   for i in anagrams:
       if i in words:
           print i

For example:

sage: find_anagrams('opst')
opts
post
pots
spot
stop
tops
sage: find_anagrams('aeirnst')
retains
retinas
retsina
nastier
stainer
stearin
sage: find_anagrams('organ')
orang
organ
groan
argon

Clearly this wordlist doesn’t include all the words listed earlier. Clearly also this method is hideously inefficient – it creates all the possible anagrams first, and then checks if each one is in the wordlist. A much better method, for words of say, 8 letters or longer, would be to first set up the wordlist into sublists of words with the same number of letters. Then for your entered word, you simply check if each word in your list is an anagram of your entered word.

And here’s how to do that. First we discover that the longest word in our list has 28 letters:

words = [[] for i in range(28)]
for i in nwords:
   li=len(i)
   words[li-3].append(i.strip('\r\n'))

and a little procedure to test anagrams:

def isanagram(w1,w2):
    l1=list(w1)
    l2=list(w2)
    l1.sort()
    l2.sort()
    return l1==l2

and finally the program to find anagrams in our New Improved List:

def find2_anagrams(myword):
    lm=len(myword);
    for i in words[lm-1]:
        if isanagram(i,myword):
            print i

A quick test of it:

sage: find2_anagrams('retains')
nastier
retains
retinas
retsina
stainer
stearin

This is much faster than the first method.

Addendum

Running this over the entire word list, the highest scoring group of letters I found was AEIGLNRT, with

ALERTING, ALTERING, INTEGRAL, RELATING, TANGLIER, TRIANGLE

for a score of 320.

A much longer list of words is that created as part of the Moby Project; “Moby Words” can be downloaded from this page, and the file single.txt contains a splendid 354,984 words.

Provably secure hash functions

As you may know, a cryptographic hash function is a function H(m) which produces a string of fixed length irrespective of the size of the input m. Security of such functions consists of three requirements:

  1. The function must be pre-image resistant. That is, given a value h=H(m), it should not be computationally feasible to find the input m.
  2. The function must satisfy weak collision resistance; given an input m and hash h=H(m) it should be computatoinally infeasible to find another input m' for which H(m')=H(m).
  3. The function must satisfy strong collision resistance; it should be computatoinally infeasible to find any two inputs m and m' for which H(m')=H(m).

In addition, for practical purposes, such a function should be fast to compute.

One place where hash functions are used are in message signatures. Suppose Alice wishes to send a message m to Bob, and wants to sign it. Because of the low efficiency of message signing, it’s much faster to sign a hash of the message. Here is how this is done:

  1. Alice hashes the message h=H(m), and signs h with her private key to obtain the signature s. She sends m and s to Bob.
  2. Bob hashes the received message m' to obtain h'=H(m') and verifies the signature s against the hash h' using Alice’s public key. If it works out he accepts the message as being genuinely from Alice.

If the hash function were not collision resistant, then an intermediate malicious person, Mallory, could produce a message m'' with the same hash as m, and send m'' and s along to Bob. Bob would verify this new message m'' because hashing it would produce a value verifiable by the signature s.

We note that for any hash function, there are only a finite number of hashes, yet there are an infinite number of possible messages, so collisions certainly exist. The point is that such collisions must be hard to find. In general, given a hash function of n bits, we should not expect to be able to find a collision in less than 2^{n/2} brute-force trials.

Most hash functions are built on an ad hoc basis, where the bits of the message are nicely mixed to produce the hash. But this means that security is hard to prove. And in fact only a few years ago, one of the most popular hash functions, SHA-1, was shown to be less secure than its length (160 bits) suggested: collisions could be found in only 2^{69} tests, rather than the brute-force number of 2^{80}. So the search for a “good” hash function has become a hot topic, with various different functions undergoing intense scrutiny and analysis to become the new SHA-3. See The SHA-3 Zoo for details. At the same time there is renewed interest in hash functions which can be proven to be secure by being based on well-known “hard” problems, such as integer factorization, discrete logarithms, or the subset-sum problem.

Shamir’s hash function

This is one of the oldest, and simplest such functions. Let p and q be two large primes, and let \alpha have order \lambda(n)=\mbox{lcm}(p-1,q-1) in \Bbb{Z}_n. For an input x we define

H(x)=\alpha^x\pmod{n}.

To see that this is collision resistant, suppose we have two inputs x and y for which

\alpha^x=\alpha^y\pmod{n}.

It follows that

\alpha^{x-y}=1\pmod{n}

or that

x-y=0\pmod{\lambda(n)}.

In other words, finding a collision is equivalent to determining the value of \lambda(n), which requires factorising n.

Chaum, van Heijst, Pfitzmann hash function

Another classic: let q be a large prime such that p=2q+1 is also prime, and let \alpha, \beta be two primitive roots of p for which the discrete log log_\alpha\beta\pmod{p} is computationally difficult. Inputs to this function are pairs (x,y) where x<q and y<q. Then:

H(x)=\alpha^x\beta^b\pmod{p}.

Following Buchmann, we can show that finding a collision is equivalent to finding the discrete log. Suppose we have a collision; that is two pairs (x,y) and (w,z) with the same hash:

\alpha^x\beta^y=\alpha^w\beta^z\pmod{p}.

This can be rewritten as

\alpha^{x-w}=\beta^{z-y}\pmod{p}.

Suppose that \lambda=log_\alpha\beta\pmod{p}, so that

\alpha^{x-w}=\alpha^{\lambda(z-y)}\pmod{p}.

Since \alpha is a primitive root of p, it follows that

x-w=\lambda(z-y)\pmod{p-1} (*).

This means that d=\mbox{gcd}(z-b,p-1) must divide x-w. Since each of z and y is less than q, then |z-y|<q, and since p-1=2q, the only two possible values for d are 1 or 2. From Buchmann (with slightly changed notation):

"If d=1, the equation (*) has a unique solution modulo p-1. The discrete logarithm \lambda can be determined as the smallest nonnegative solution of this congruence. If d=2 the congruence has two different solutions mod p-1 and the discrete logarithm can be found by trying both."

Knapsack-based hash functions

The Merkle-Hellman additive knapsack system is something of a modern classic: one of the first public key cryptosystems to be proposed; very fast; based on a well known NP-complete problem (the subset-sum problem); and one of the first to be broken. But knapsacks, because of their elegance and because of the difficulty of the underlying problem, are profoundly attractive to cryptographers, and although every knapsack system is broken usually within a very short time of its publication (an exception was the Chor-Rivest system, which held out a bit longer before it fell), new ones are constantly being proposed.

An early knapsack based hash function was that of Damgård. Let a_i be a list of length n of randomly chosen integers, and let the input be a binary string m_1m_2m_3\ldots m_n of the same length. Then the hash is defined as

\sum_{i=1}^nm_ia_i.

By the difficulty of the subset-sum problem, this should be secure. However, it is vulnerable to the same attack which brought down the original Merkle-Hellman system.

More recently, Impagliazzo and Naor have shown how to construct knapsack hashes which are as provably secure as the subset-sum problem. In particular, they require that if there are n numbers, then the length l(n) of each number be less than n. The attack mentioned above is not valid in this case. The paper shows that finding collisions is equivalent to solving the subsem-sum problem for the set a_i of numbers and the target sum T.

The Zémor-Tillich hash function

This is one of the newest provably secure hash functions, and has been the subject of some intense research and investigation. You can read the original paper here. It remains, with some reservations, a very strong hash function. It is easy to describe.

Let \mathbb{Z}_2(x)/q(x) be a finite field of order 2^n, and let \alpha be a root of q(x). Define the two matrices

A_0=\left[\begin{array}{cc}\alpha&1\\1&0\end{array}\right], A_1=\left[\begin{array}{cc}\alpha&\alpha+1\\1&1\end{array}\right].

Let the input be a binary string m_1m_2m_3\ldots m_k of arbitrary length. Then the hash is defined as

\prod_{i=1}^kA_{m_i}.

The collision resistance can be shown to be based on the property that matrix multiplication is associative but not commutative (which has led to some generalizations over other non-abelian semi-groups.)

Here’s a little example in Sage:


sage: F. = GF(2)[]
sage: K.<a> = GF(2^10, name='a', modulus=x^10+x^7+1)
sage: A=[matrix([[a,1],[1,0]]),matrix([[a,a+1],[1,1]])]
sage: pl="Now is the winter of our discontent, made glorious summer by this sun of York."
sage: ps=map(ord,pl)
sage: pf=flatten(map(lambda x:Integer(x).bits(),ps))
sage: prod([A[pf[i]] for i in range(len(pf))])

giving as hash the matrix

\left[\begin{array}{ll}a^{9} + a^{6} + a^{4} + a^{3} + a^{2} + a & a^{9} + a^{8} + a^{7} + a^{6} + a^{5} + a^{4} + a + 1 \\a^{9} + a^{3} + a^{2} + a & a^{8} + a^{7} + a^{5} + a + 1\end{array}\right]

The Digital Signature Algorithm in Maxima and Sage

The Digital Signature Algorithm, also known as the Digital Signature Standard is, as it name implies, a standard for digital signatures. Most digital signature algorithms work by reversing a public key cryptosystem: a message is signed with the sender’s private key, and the signature is verified using the sender’s public key. The DSA is based on the El Gamal scheme, with a few extras thrown in for extra security, and to make the signatures smaller.

It is set up with four values: a large prime p, a prime q which is a factor of p-1, a primitive root a of p, and a value g defined by

g=a^{(p-1)/q} \bmod p.

For security, p is recommended to be at least 154 digits, and q at least 48 digits.

Alice, the sender, chooses as her private key any value A<q-1 and calculates

B=g^A \bmod p

as her public key. This is secure, by the discrete logarithm problem. The public key consists of the values (p,q,g,B) and the private key is the value A.

Given a message m, a signature is computed as follows:

Alice chooses at random k for which 0<k<q-1. She then computes:

r = (g^k\pmod{p})\pmod{q}

s = k^{-1}(m+Ar)\pmod{q}

and the two values (r,s) are the signature of the message m. We are assuming here that m<p; as in general most messages will be much larger, it is customary to sign not the message itself, but a cryptographic hash of the message, which will be a string of some fixed length (and shorter than p ).

To verify the signature, Bob (the receiver) calculates the following values:

x = s^{-1}m\pmod{q},

y = s^{-1}r\pmod{q},

v = (g^xB^y\pmod{p})\pmod{q}.

If v=r then he accepts the signature.

To see why this works, note that from the definition of s, we can write

m=(-Ar+ks)\pmod{q}

and by multiplying through by s^{-1} we obtain:

s^{-1}m=(-Ars^{-1}+k)\pmod{q}.

This last equation can be written

k= s^{-1}m+Ars^{-1}\pmod{q}

= x+Ay\pmod{q}.

Now we have

r = (g^k\pmod{p})\pmod{q}

  = (g^{x+Ay}\pmod{p})\pmod{q}

 = (g^x(g^A)^y\pmod{p})\pmod{q}

 = (g^xB^y\pmod{p})\pmod{q}

 = v.

This algorithm has the advantage that its security is of order length of p, but the signature values are much smaller – the size of q.

Now let’s look at this algorithm in both Maxima and Sage.

Maxima

We start by creating p and q. We will need to factor p-1 both to find a value of q, and to find its primitive root. So we start by attempting to factor a few randomly chosen large primes, until we have a prime p for which p-1 can be factored, and which also has a reasonably large prime factor q.

We will use smaller values both of p and q than the algorithm requires, just to show how it works.

(%i1) p:next_prime(random(10^80));
(%i2) factor(p-1);

After a few tries, I found

p = 182842970179003336959233156794188485625560973915508949565601
89183057229685434897
q = 525970797581619193760592144581011744537

Maxima doesn’t have a built-in command to find primitive roots, but one can easily be written, using the fact that a primitive root a is a number for which

a^{(p-1)/t}  \neq 1 \bmod{p}

for all prime factors t of p-1. Here is one such program:

primroot(p):=block(
[f:ifactors(p-1),n,i:1],
if not(primep(p)) then error("Your number is not prime"),
n:length(f),
do (
i:i+1,
if not member(1,makelist(power_mod(i,(p-1)/f[j][1],p),j,1,n)) then
return(i)
)
);

So:

(%i3) a:primroot(p);
(%o3) 5
(%i4) g:power_mod(a,(p-1)/q,p);

The private/public key pairs:

(%i5) A:random(q-1);
(%i6) B:power_mod(g,A,p);

Now we can choose a random message a sign it (for ease we won’t show the outputs, which are just long numbers):

(%i7) m:random(p);
(%i8) k:random(q-1);
(%i9) r:mod(power_mod(g,k,p),q);
(%i10) s:mod(inv_mod(k,q)*(m+A*r),q);

Now to verify the signature:

(%i11) x:mod(inv_mod(s,q)*m,q);
(%i12) y:mod(inv_mod(s,q)*r,q);
(%i13) v:mod(mod(power_mod(g,x,p)*power_mod(B,y,p),p),q);
(%i14) is(v=r)
(%o14) true

Sage

As with Maxima, we start by finding p and q. We simply repeat

sage: p=next_prime(randint(1,10^80))
sage: factor(p-1)

until we find values we want. I found

p =
530156743088749972013047250493987281452419479410412138
17513510820559804089810293
q= 2199623526308059394919085303004156101

Then

sage: a=Mod(primitive_root(p),p)

returns 5. The extra Mod( ,p) ensures that the type of a is an element of the
ring of integers modulo p \mathbb{Z}_p:

sage: parent(a)
Ring of integers modulo 530156743088749972013047250493987281
45241947941041213817513510820559804089810293

Now we can calculate the other values we need:

sage: g=a^((p-1)/q)
sage: g
530156743088749972013047250493987281211397896967807817
18309906946321326881261201

and the private/public key pairs:

sage: A=randint(1,q-1);A
617088481431564693290032924639855166L,
sage: B=g^A;B
1052251014343945276262934677066619589736007844895332445
1397254657133312260200438

Note that because of Sage’s handling of types, we don’t have to fuss with “power_mod”; since the type of a is “element of \mathbb{Z}_p“, so automatically are g and B, and powers are thus computed quickly, using the modular exponentiation algorithm.

Let’s choose a message, which will be any value less than p:

sage: m=randint(1,p)

and sign it (as with Maxima we won’t show any of the values):

sage: k=randint(1,q-1)
sage: r=mod(g^k,q)
sage: s=(m+A*r)/k

since g is in the ring \mathbb{Z}_p, we need to change r to be in the ring \mathbb{Z}_q. This
will force s to be also in this ring, so there is no need for any explicit calls to number theoretic functions.

Now we can verify the signature:

sage: x=m/s
sage: y=r/s
sage: v=mod(g^x*B^y,q)
sage: v==r
True

Neat.

Both Maxima and Sage provide all the necessary functionality to compute and verify a digital signature (well, nearly all; we had to write our own program for primitive roots in Maxima), but of the two, Sage certainly allows for easier commands, with its mathematical types. For example:

Maxima: y:mod(inv_mod(s,q)*r,q);
Sage: y=r/s

Maxima: v:mod(mod(power_mod(g,x,p)*power_mod(B,y,p),p),q);
Sage: v=mod(g^x*B^y,q)

Addendum: the use of “modulus” in Maxima

Richard Fateman (see comments below) pointed out that my comparison above is in fact incorrect. Maxima allows for very neat commands by use of “modulus”, which if set to any prime number, will effect all subsequent rational expressions. For example:

(%i1) modulus:97;
(%i2) a:rat(5);
(%i3) a^75;
(%o3)                  -34

We see that the output is given in “balanced” form, where the residues are balanced about zero. However,. in Maxima modulus is a global property – it applies to all rational expressions, whereas in Sage different variables can have different modular types. However, by a judicious change of modulus, we can simplify the Maxima commands considerably. To show how to do this, we will deal with some very small numbers, so as to be able to show the outputs. We will use p=1031, q=103, and the primitive root a=14 of p. First, the setup phase. Since everything here is done modulo p, we will use this as the modulus value:

(%i1) p:1031;
(%o1)                   1031
(%i2) q:103;
(%o2)                    103
(%i3) modulus:p;
(%o3)                   1031
(%i4) a:rat(14);
(%o4)                     14
(%i5) g:a^((p-1)/q);
(%o5)                    320
(%i6) A:rat(70);
(%o6)                     70
(%i7) B:g^A;
(%o8)                     48

Now to sign a message m=500 using the “random” value k=25:

(%i9) m:500;
(%o9)                    500
(%i10) k:25;
(%o10)                    25
(%i11) r:mod(g^k,q);
(%o11)                    95

At this stage we are going to start performing operations modulo q, so we change the modulus:

(%i12) modulus:q;
(%o12)                  103
(%i13) s:(m+A*r)/k;
(%o13)                  -23

We have thus created the signature (with balanced modulo values); now for the verification:

(%i14) x:m/s;
(%o14)                   32
(%i15) y:r/s;
(%o15)                  -31

To compute the final value we’ll change the modulus once more:

(%i16) modulus:p;
(%o16)                1031
(%i17) v:mod(g^x*B^y,q);
(%o17)                  95
(%i18) is(v=r);
(%o18)                true

and we’re done, all with very simple commands.

Solving quadratic equations geometrically

I vaguely recall some years ago having seen a nonogram for solving quadratic equations, and I thought it may be a fun thing to do with my students. I couldn’t find what I was looking for, but I did come across a lovely result which seems to have been first noted by a French artillery captain named M. E. Lill (see http://www.pballew.net/Lill_cir.doc for a more complete history). It can be stated as follows:

Let the quadratic x^2+bx+c=0 have real roots. Then the circle whose diameter has endpoints (0,1) and (-b,c) cuts the x-axis at the root values.

We note in passing that the quadratic given is completely general, for any quadratic can be reduced to that form by dividing through by the leading coefficient. Here are “Lill circles” for the quadratic equations x^2-3x+2=0:

lill1

and x^2-2x-3=0:

lill2

I like this result. It seems (at first glance) to be slightly mysterious; it is a wonderful mixture of algebra and geometry, and it’s not quite obvious. In fact, the proof is very simple, and is an elementary application of Euclid’s result that the angle in a semi-circle is a right angle.

Consider the first diagram above, without the circle, but with two extra lines:

lill3

Since the angle AOB is a right angle, the bottom two triangles are similar, and so

\displaystyle{\frac{x}{1}=\frac{c}{b-x}}.

Multiplying this fraction out produces x(b-x)=c which can be rearranged to produce x^2-bx+c=0. Since in this diagram the upper right point is (-(-b),c)=(b,c) this finishes the proof.