Vigenère and Kasiski

Most writers of cryptography texts, to my mind, spend a disproportionate amount of time and space carefully discussing the cryptanalysis of the Vigenère cipher. Maybe it’s because this is the first “non-trivial” cipher most students learn, and its cryptanalysis is also slightly non-trivial. Anyway, who am I to buck this trend? So this post shows how to do it in Sage.

First, the cipher itself. It’s a polyalphabetic cipher, where each letter of the plaintext is shifted by an amount given by a keyword; this key being repeated as much as required:


T H I S I S T H E P L A I N T E X T T O B E E N C R Y P T E D
K E Y W O R D K E Y W O R D K E Y W O R D K E Y W O R D K E Y

In this example, and using the correspondence A=0, B=1 up to Y=24, Z=25, we see that every seventh letter is shifted by the same amount: letters in the 1st, 8th, 15th, 22nd positions are shifted by K=10; letters in the 2nd, 9th, 16th 23rd positions are shifted by E=4, and so on. The resulting ciphertext is


D L G O W J W R I N H O Z Q D I V P H F E O I L Y F P S D I B

What makes this cipher seem so strong is that similar letters in the plaintext are not necessarily encrypted to the same letters in the ciphertext: notice for example that the first two I’s are encrypted to Y and O; and that similar letters in the ciphertext do not necessarily correspond to the same letters in the plaintext.

However, if the length n of the keyword can be determined, then we know that every nth letter is shifted by the same amount. If there are enough letters to perform a frequency analysis (and using the fact that the most common letter in English text is E), then the value of the shift can be found. One way of determining the keyword length is to shift the ciphertext by a given amount and check all the coincidences (equality of letters) between the ciphertext and its shift. A shift with a large number of coincidences may be the length of the keyword. This method is called Kasiski’s method after its 19th century discoverer.

For an example, I have a very long ciphertext which has been obtained with a Vigenère cipher. You can see it here. Anyway, it’s nearly 20,000 characters long. To find the length of the keyword which was used, the first step is to write a little program to perform a cyclic shift of a string:

def cshift(str,n):
   slen=len(str)
   return str[n:slen+1]+str[0:n]

Now to find the coincidences with different shifts:

sage: clen=len(ct)
sage: for i in range(20):
    ctx=cshift(ct,i)
    coin=0
    for j in range(clen):
        if ct[j]==ctx[j]:
            coin=coin+1
    print i,coin

and this returns the output:


0 19369
1 683
2 782
3 791
4 728
5 675
6 655
7 1284
8 712
9 734
10 718
11 764
12 708
13 718
14 1192
15 709
16 716
17 734
18 792
19 697

Neglecting the first output, we see that the greatest number of coincidences occur for shifts of 7 and 14. This would seem to indicate that the keyword has length 7.

Now we break up the ciphertext into seven groups:

sage: ct7s=['','','','','','','']
sage: sage: for i in range(clen):
    ct7s[i%7]=ct7s[i%7]+ct[i]

The next step is to find which of the letters is most common in each group. Here’s one way:

sage: alph='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
sage: for i in alph:
    print i, ct7s[0].count(i)

which produces

A 1
B 71
C 1
D 217
E 41
F 78
G 123
H 366
I 64
J 52
K 160
L 192
M 2
N 27
O 102
P 86
Q 187
R 196
S 40
T 2
U 167
V 188
W 236
X 74
Y 23
Z 71

from which it is obvious that the most common letter in this group is H. Repeating this same procedure for the other six groups enables us to build up a table of the most common letter in each group:

0 H
1 M
2 G
3 O
4 I
5 R
6 W

(Of course, this entire process can be easily automated; on the other hand it’s quite nice to do everything separately one step at a time.) Now, the most common letter in English is E, which has the value 4. If H corresponds to E, that means that for group 0, there has been a shift of 3, which corresponds to the letter D. This is the first letter of the keyword. And in fact every other letter of the keyword can be obtained by shifting back by 4 from each common letter. This produces the keyword

DICKENS.

Applying this to decrypt the ciphertext produces:

WHETHERISHALLTURNOUTTOBETHEHEROOFMYOWNLIFEORWHETHERTHATSTATION
WILLBEHELDBYANYBODYELSETHESEPAGESMUSTSHOW
...
BOURNEOFALLSUCHTRAVELLERSANDTHEMOUNDABOVETHEASHESANDTHEDUST
THATONCEWASHEWITHOUTWHOMIHADNEVERBEEN

- it’s the first chapter of “David Copperfield”, by Charles Dickens, in uppercase with all punctuation removed.

A graphical and numerical approach to teaching calculus

I’m thinking about teaching calculus using Matlab. This is far from being a new idea – people have been using Matlab in their teaching for as long as Matlab has existed – but it must be admitted that as far as calculus is concerned, Matlab is probably less well equipped than a computer algebra system such as Maple, Mathematica, or the open source Maxima and Sage. This is of course because Matlab is primarily a numeric and computational system, whereas Maple, Mathematica and the others are symbolic systems. And calculus, being primarily analytical and symbolic, is better served by them.

That being said, I think that in fact Matlab – used only for computation and graphics – could be used to great effect in a calculus course. Note that the new versions of Matlab contain the Symbolic Math Toolbox, which provides a Matlab interface to the CAS MuPAD. However, I’m rather against using MuPAD-in-Matlab. For one thing, it seems to be a fairly clumsy interface. And you have to declare variables as being symbolic, so as not to confuse Matlab with its own (numeric) variables. I think if you want to use a symbolic system, then do just that. If you want a numeric system, do just that.

My feeling is that using plain old unadorned Matlab would be a great help to engineering students, for whom the bulk of their computing is Matlab based. I think introducing symbolic computation would only add to their cognitive load, without necessarily helping their mathematical understanding. So my thoughts are to introduce the standard calculus material – limits, derivatives, integrals and their applications – analytically, and then use the numeric and graphical capabilities of Matlab to explore and enhance that material. And in fact there’s a lot you can do.

Here’s a small sample.

Limits

Limits can easily be explored numerically in Matlab. For example, the old warhorse

\displaystyle{\lim_{x\to 0}\frac{\sin x}{x}}

You could simply, for example, enter:

>> x = 0.1
>> sin(x)/x

and then replace x with 0.01, 0.001, and so on, and see what happens to the values of the function. You could first set

>> format long

to give you more decimal places.

Rather than entering a new x value each time, you could do it all in one go:

>> x = [0.1; 0.01; 0.001; 0.0001; 0.00001]
>> sin(x)./x

but this presupposes that students understand the “dot” notation of Matlab. Maybe you could define the function first:

f = @(x) sin(x)/x

and then apply it to your vector x:

arrayfun(f,x)

And of course if you wanted you could create the vector x using

x = 0.1.^[1:5]'

but this last assumes a certain knowledge of Matlab’s workings. I would certainly teach function definition (such as above), and the command arrayfun.

Newton’s method

Who does not love and teach

\displaystyle{x\leftarrow x-\frac{f(x)}{f'(x)}}?

Matlab is at a disadvantage here – without the symbolic toolbox, it can’t compute symbolic derivatives. However, it can perform differentiation of polynomials (treated as vectors of their coefficients). So we start here. For example, let’s solve

x^5+x^2-1=0

using Newton’s method, with a starting value of x=0.8 First, we define the function and its derivative:

>> p = [1 0 0 1 0 -1]
>> f = @(x) polyval(p,x)
>> df = @(x) polyval(polyder(p),x)

Putting this together into a Newton’s rule function:

>> nr = @(x) x-f(x)/df(x)

Now we can start using it. First the easy way:

>> 0.8
>> nr(ans)
   0.808859649122807
>> nr(ans)
   0.808730628358884
>> nr(ans)
   0.808730600479393

for as long as we like.

Next, a slightly more sophisticated way:

>> a = [0.8]
>> for i=1:6 a=[a(:);nr(a(i))];end
>> a

   0.80000000000000
   0.80885964912281
   0.80873062835888
   0.80873060047939
   0.80873060047939
   0.80873060047939
   0.80873060047939

For Newton’s method applied to non polynomial functions, you could enter the function and its derivative yourself:

>> f = @(x) exp(x)-x^2
>> df = @(x) exp(x)-2*x

and then proceed as above.

Drawing tangent lines

Again, first enter the function and its derivative:

>> f = @(x) exp(-x^2)
>> df = @(x) -2*x*exp(-x^2)

Given an x value, we can determine the values of the function and derivative at that point, and construct the tangent line:

>> a = 1.2
>> fa = f(a)
>> dfa = df(a)
>> t = @(x) dfa*(x-a)+fa

and we can sketch both together:

>> ezplot(f, [0,2])
>> hold on
>> ezplot(t,[0,2])

Simpson’s rule

There are, at a conservative estimate, about 42,897 versions of Simpson’s rule in Matlab. I want to keep it as simple as possible, and not use any clever Matlab tricks. So here’s one way to calculate

\displaystyle{\int^1_0e^{-x^2}\,dx}.

First set up the function, the limits of integration, and the number of times Simpson’s rule will be used:

>> a = 0
>> b = 1
>> f = @(x) exp(-x^2)
>> n = 4

Now we set up the nodes (where the integral will be evaluated):

>> x = linspace(a,b,2*n+1)
>> h = (b-a)/(2*n)

and the weights – we can either just enter them by hand:

>> w = [1 4 2 4 2 4 2 4 2 4 1]

or

>> w = zeros(1,2*n+1)
>> for i=1:n w(2*i-1:2*i+1)=w(2*i-1:2*i+1)+[1 4 1];end

Now put it all together:

>> s = sum(w.*arrayfun(f,x))*h/3

     0.74682612052747

The last could of course be done as a single matrix product:

>> s = w*arrayfun(f,x))'*h/3

but I think the first method is conceptually easier.

Animated graphics in mathematics education

As far as I know, there’s been little research done on this topic, although it seems to me that animations could be enormously helpful in facilitating the learning of some mathematics. Think, for example, of the “limit of secants” method for defining a derivative:

\displaystyle{f'(a)=\lim_{x\to a}\frac{f(x)-f(a)}{x-a}}

I know that when I teach elementary calculus I draw a curve, and try to show by means of diagrams that as x slides along the curve to a, the secant “approaches” the tangent to the curve at a. Some students, if not fully convinced, give the impression that there is merit in this argument; other students sit there with their “when will this be finished?” look on their faces. This seems to be a place where some interactive or animated graphics would be very helpful.

Most computer algebra systems now provide some support for animation; if they don’t they should. Interestingly, when CAS’s are compared, graphics usually are not considered. Way back in the late 1990’s, when Michael Wester was producing his monumental review of CAS’s (of which an earlier version can be found here), they were treated as problem solving black boxes: problem goes in; solution comes out. The best CAS was the one which solved (correctly!) the greatest number of problems from the broadest number of topics. And this style seems to have permeated CAS comparisons ever since. As far as I know, very few reviews have investigated the graphics capabilities of such systems. One which did was Barry Simon’s “Symbolic Math Powerhouses Revisited”, which is available as the first chapter in Wester’s Computer Algebra Systems, A Practical Guide.

I think that most mathematics educators are simply unwilling to come to grips with the technology which enables such graphics to be produced, but in fact most modern software makes this very easy.

As an example, I’m going to show how to produce a cycloid in Sage. Sage contains an “animate” command, which simply runs though a list of graphics objects, displaying them one after the other. To draw my cycloid, I’m going to need three such lists:

  1. The moving circle.
  2. The point on the circle’s circumference.
  3. The cycloid as drawn by that point.

And of course I need the equation of the cycloid, which is easiest given parametrically:

x=t-\sin(t)
y=1-\cos(t)

So here’s how to produce the three graphics objects above:

step = 0.3
v = []
for t in srange(0,2*pi,step):
    v.append(circle((t,1),1))
a = animate(v, xmin=-1, ymin=0, xmax=8, ymax=2, figsize=[9,2])

There should be no surprises here; “v” is a list, which is filled up with circles all with different centres. And the points:

w = []
for t in srange(0,2*pi,step):
    w.append(point((t-sin(t),1-cos(t)),pointsize=20))
b = animate(w, xmin=-1, ymin=0, xmax=8, ymax=2, figsize=[9,2])

Note here we use the parametric equations to plot the points. And finally the cycloid itself: we draw it as a sequence of lines from the previous to the current point:

L = Graphics()
x = []
for t in srange(0,2*pi,step):
    L += line([(t-step-sin(t-step),1-cos(t-step)),(t-sin(t),1-cos(t))], rgbcolor=(1,0,0), thickness=2)
    x.append(L)
c = animate(x, xmin=-1, ymin=0, xmax=8, ymax=2, figsize=[9,2])

To display this animation, we simply display all of a, b and c together:

(a+b+c).show()

You can see the animation here.

Now that wasn’t so hard, was it?

Book Review: “A Computational Introduction to Number Theory and Algebra” by Victor Shoup, 2nd ed.

I wrote this review a few months ago for “Computing Reviews”, who’ve published it. But for the benefit of those who don’t have access to these reviews, here it is.


Occasionally it’s a pleasure to find a book which is so masterful, so well written, that it has all the hallmarks of the classic. This is such a book. Shoup has set himself a difficult task – to bring the reader up to speed with number theory and algebra, starting “from scratch” – and he has succeeded magnificently. My main complaint with the book is that it is not longer, but as Shoup himself regretfully observes in the introduction, to keep to a reasonable size, some important topics had to be omitted.

Many books on computational number theory present the theory as a sort of smorgasbord of algorithms: primality testing, factorization, discrete logarithms, modular square and n-th roots. Shoup steers clear of this recipe based approach, and instead places the entire theory into a formal algebraic setting. This allows not only for elegance of exposition and a remarkable clarity, but provides the entire book with a structural homogeneity.

Even though “some topics could simply not be covered”, the range of topics presented is wide. The book is geared towards students of cryptography and coding theory, and the material has been chosen to be most relevant to those disciplines.

The books starts with several of standard integer based number theory: divisibility, congruences and modular arithmetic, including quadratic residues (but reciprocity is treated later), large integer arithmetic, Euclid’s algorithm and its association with the Chinese remainder theorem, and a brief discussion of the RSA cryptosystem, including a particularly elegant proof of its correctness. All this material is standard to many other texts, yet rarely treated with as much care as here, in spite of the relative brevity. These first few chapters contain as much mathematics as many cryptography texts, and yet at this stage we are not yet one fifth into the book! Another chapter discusses the distribution of primes, including a proof of Bertrand’s postulate and a discussion of the prime number theorem. Given the importance of primes to modern cryptography, these may be considered vital topics, and it is refreshing to see them treated so well.

These first chapters set the scene, so to speak, for the number theory with which the text is concerned. However, much of the subsequent material is discussed in terms of the general theory of groups and rings. Primitive roots, for example, are not discussed as such, but in terms of generators of the non-zero residues of integers modulo a prime. Although this approach might seem at first to be unnecessarily obtuse, it is in fact the most natural way of introducing these algorithms, as it places them squarely in a generalized algebraic theory. The text, as we would expect, contains several chapters discussing the basic theory of abelian groups and rings, including a fine proof of the fundamental theorem of the structure of finite abelian groups as
a sum of cyclic groups.

What makes this book unique is the way that several different mathematical strands – number theory and algebra – are interwoven and made into a masterful whole. As well as rings and fields, there is much linear algebra (modules, vector spaces and matrices), as well as a considerable amount on probability distributions and probabilistic algorithms, culminating in the Miller-Rabin test for primality and a few applications.

The book ends with some chapters on finite fields and their various algorithms, and a chapter on the AKS deterministic primality test, for which the author carefully observes that the probabilistic Miller-Rabin test is much faster, and hence should be preferred for all practical purposes. However, as an ingenious use of much number theory and algebra, the AKS algorithm is a lovely example with which to finish the text.

Although the text requires not much specific mathematical background, I would hesitate to use it except in an advanced class, or for students whose mathematical ability was already high. The material moves swiftly – while never compromising rigour – and the multiple strands assume considerable ability on the part of the reader. I was pleased to see copious exercises; a student who has completed the book and mastered the exercises will be in a very strong position to embark on some advanced studies. The author does not recommend any specific chapter sequences for a semester’s study, but clearly an astute teacher could pull some parts from this text for an initial course of study, and complete the text in an advanced course.

One regrettable omission is that of the use of any computational tools, either the author’s own C++ NTL library for computational number theory and algebra, or the use of a computer algebra system. A companion volume, or material on the author’s website, discussing some implementation issues, would be most welcome. We note in passing that thanks to the generosity of the publishers, the entire text is available under a Creative Commons licence on the author’s site http://www.shoup.net.

This is a truly magnificent text, deserving of a place on the shelves of any mathematician or computer scientist working in these areas. I hope it has a long life and many further editions.

An anagram measure

Those like me with a love of the byways of the English language, or with a love of cryptic crosswords, will no doubt have collected over the years a private trove of single word anagrams. One of my favourites (because both words are common) is

ORCHESTRA, CARTHORSE

Clearly smaller words are likely to have more anagrams than longer ones:

OPST, POTS, POST, STOP, SPOT, TOPS

From the Anagrams FAQ comes this lovely example, of almost familiar words (at least, they are not scientific or technical terms):

INTERROGATIVES, REINVESTIGATOR, TERGIVERSATION

From an anagrams programming page:

ANGOR, ARGON, GORAN, GRANO, GROAN, NAGOR, ORANG, ORGAN, ROGAN

several of which are unfamiliar to me as English words. I’m not allowing proper names.

How can we measure the “anagrammability” of a set of letters? Some account should be given to the number of anagrams, and to the length of the word. A long word with only two anagrams should get a higher score than a small word with many. Let n be the number of letters, and k the number of (single word) English anagrams. My anagrammability measure (discussed with my eldest son) is:

(k-1)n^2.

This gives a score of zero if there are no anagrams other than the word itself, and is weighted for large words.

With the examples above:

ORCHESTRA: 81
OPST: 80
INTERROGATIVES: 392
ANGOR: 200

According to the Anagrams FAQ again, the set of letters with the most one word anagrams is AEINRST with:

ANESTRI, ASTERIN, ERANIST, NASTIER, RATINES, RESIANT, RESTAIN, RETAINS, RETINAS, RETSINA, SAINTER, STAINER, STARNIE, STEARIN

for a score of 637. (I’ve never seen some of these words before, but apparently they can all be found in Merriam-Webster’s unabridged dictionary).

It’s not hard to experiment with anagrams in Sage; all you need is a wordlist, which you can read in, turn into a list and strip the trailing non-printing characters. You can download a list of 109582 words here, then save it as, say,

words.txt

.
Then:

f = open("words.txt","r")
nwords = f.readlines()
words = []
for i in nwords:
   words.append(i.strip('\r\n'))

Then we can write a simple program which given any string, checks if each permutation is in the list we’ve just created:

def find_anagrams(myword):
   anagrams=[]
   for i in permutations(myword):
       anagrams.append(join(i,""))
   for i in anagrams:
       if i in words:
           print i

For example:

sage: find_anagrams('opst')
opts
post
pots
spot
stop
tops
sage: find_anagrams('aeirnst')
retains
retinas
retsina
nastier
stainer
stearin
sage: find_anagrams('organ')
orang
organ
groan
argon

Clearly this wordlist doesn’t include all the words listed earlier. Clearly also this method is hideously inefficient – it creates all the possible anagrams first, and then checks if each one is in the wordlist. A much better method, for words of say, 8 letters or longer, would be to first set up the wordlist into sublists of words with the same number of letters. Then for your entered word, you simply check if each word in your list is an anagram of your entered word.

And here’s how to do that. First we discover that the longest word in our list has 28 letters:

words = [[] for i in range(28)]
for i in nwords:
   li=len(i)
   words[li-3].append(i.strip('\r\n'))

and a little procedure to test anagrams:

def isanagram(w1,w2):
    l1=list(w1)
    l2=list(w2)
    l1.sort()
    l2.sort()
    return l1==l2

and finally the program to find anagrams in our New Improved List:

def find2_anagrams(myword):
    lm=len(myword);
    for i in words[lm-1]:
        if isanagram(i,myword):
            print i

A quick test of it:

sage: find2_anagrams('retains')
nastier
retains
retinas
retsina
stainer
stearin

This is much faster than the first method.

Addendum

Running this over the entire word list, the highest scoring group of letters I found was AEIGLNRT, with

ALERTING, ALTERING, INTEGRAL, RELATING, TANGLIER, TRIANGLE

for a score of 320.

A much longer list of words is that created as part of the Moby Project; “Moby Words” can be downloaded from this page, and the file single.txt contains a splendid 354,984 words.

Provably secure hash functions

As you may know, a cryptographic hash function is a function H(m) which produces a string of fixed length irrespective of the size of the input m. Security of such functions consists of three requirements:

  1. The function must be pre-image resistant. That is, given a value h=H(m), it should not be computationally feasible to find the input m.
  2. The function must satisfy weak collision resistance; given an input m and hash h=H(m) it should be computatoinally infeasible to find another input m' for which H(m')=H(m).
  3. The function must satisfy strong collision resistance; it should be computatoinally infeasible to find any two inputs m and m' for which H(m')=H(m).

In addition, for practical purposes, such a function should be fast to compute.

One place where hash functions are used are in message signatures. Suppose Alice wishes to send a message m to Bob, and wants to sign it. Because of the low efficiency of message signing, it’s much faster to sign a hash of the message. Here is how this is done:

  1. Alice hashes the message h=H(m), and signs h with her private key to obtain the signature s. She sends m and s to Bob.
  2. Bob hashes the received message m' to obtain h'=H(m') and verifies the signature s against the hash h' using Alice’s public key. If it works out he accepts the message as being genuinely from Alice.

If the hash function were not collision resistant, then an intermediate malicious person, Mallory, could produce a message m'' with the same hash as m, and send m'' and s along to Bob. Bob would verify this new message m'' because hashing it would produce a value verifiable by the signature s.

We note that for any hash function, there are only a finite number of hashes, yet there are an infinite number of possible messages, so collisions certainly exist. The point is that such collisions must be hard to find. In general, given a hash function of n bits, we should not expect to be able to find a collision in less than 2^{n/2} brute-force trials.

Most hash functions are built on an ad hoc basis, where the bits of the message are nicely mixed to produce the hash. But this means that security is hard to prove. And in fact only a few years ago, one of the most popular hash functions, SHA-1, was shown to be less secure than its length (160 bits) suggested: collisions could be found in only 2^{69} tests, rather than the brute-force number of 2^{80}. So the search for a “good” hash function has become a hot topic, with various different functions undergoing intense scrutiny and analysis to become the new SHA-3. See The SHA-3 Zoo for details. At the same time there is renewed interest in hash functions which can be proven to be secure by being based on well-known “hard” problems, such as integer factorization, discrete logarithms, or the subset-sum problem.

Shamir’s hash function

This is one of the oldest, and simplest such functions. Let p and q be two large primes, and let \alpha have order \lambda(n)=\mbox{lcm}(p-1,q-1) in \Bbb{Z}_n. For an input x we define

H(x)=\alpha^x\pmod{n}.

To see that this is collision resistant, suppose we have two inputs x and y for which

\alpha^x=\alpha^y\pmod{n}.

It follows that

\alpha^{x-y}=1\pmod{n}

or that

x-y=0\pmod{\lambda(n)}.

In other words, finding a collision is equivalent to determining the value of \lambda(n), which requires factorising n.

Chaum, van Heijst, Pfitzmann hash function

Another classic: let q be a large prime such that p=2q+1 is also prime, and let \alpha, \beta be two primitive roots of p for which the discrete log log_\alpha\beta\pmod{p} is computationally difficult. Inputs to this function are pairs (x,y) where x<q and y<q. Then:

H(x)=\alpha^x\beta^b\pmod{p}.

Following Buchmann, we can show that finding a collision is equivalent to finding the discrete log. Suppose we have a collision; that is two pairs (x,y) and (w,z) with the same hash:

\alpha^x\beta^y=\alpha^w\beta^z\pmod{p}.

This can be rewritten as

\alpha^{x-w}=\beta^{z-y}\pmod{p}.

Suppose that \lambda=log_\alpha\beta\pmod{p}, so that

\alpha^{x-w}=\alpha^{\lambda(z-y)}\pmod{p}.

Since \alpha is a primitive root of p, it follows that

x-w=\lambda(z-y)\pmod{p-1} (*).

This means that d=\mbox{gcd}(z-b,p-1) must divide x-w. Since each of z and y is less than q, then |z-y|<q, and since p-1=2q, the only two possible values for d are 1 or 2. From Buchmann (with slightly changed notation):

"If d=1, the equation (*) has a unique solution modulo p-1. The discrete logarithm \lambda can be determined as the smallest nonnegative solution of this congruence. If d=2 the congruence has two different solutions mod p-1 and the discrete logarithm can be found by trying both."

Knapsack-based hash functions

The Merkle-Hellman additive knapsack system is something of a modern classic: one of the first public key cryptosystems to be proposed; very fast; based on a well known NP-complete problem (the subset-sum problem); and one of the first to be broken. But knapsacks, because of their elegance and because of the difficulty of the underlying problem, are profoundly attractive to cryptographers, and although every knapsack system is broken usually within a very short time of its publication (an exception was the Chor-Rivest system, which held out a bit longer before it fell), new ones are constantly being proposed.

An early knapsack based hash function was that of Damgård. Let a_i be a list of length n of randomly chosen integers, and let the input be a binary string m_1m_2m_3\ldots m_n of the same length. Then the hash is defined as

\sum_{i=1}^nm_ia_i.

By the difficulty of the subset-sum problem, this should be secure. However, it is vulnerable to the same attack which brought down the original Merkle-Hellman system.

More recently, Impagliazzo and Naor have shown how to construct knapsack hashes which are as provably secure as the subset-sum problem. In particular, they require that if there are n numbers, then the length l(n) of each number be less than n. The attack mentioned above is not valid in this case. The paper shows that finding collisions is equivalent to solving the subsem-sum problem for the set a_i of numbers and the target sum T.

The Zémor-Tillich hash function

This is one of the newest provably secure hash functions, and has been the subject of some intense research and investigation. You can read the original paper here. It remains, with some reservations, a very strong hash function. It is easy to describe.

Let \mathbb{Z}_2(x)/q(x) be a finite field of order 2^n, and let \alpha be a root of q(x). Define the two matrices

A_0=\left[\begin{array}{cc}\alpha&1\\1&0\end{array}\right], A_1=\left[\begin{array}{cc}\alpha&\alpha+1\\1&1\end{array}\right].

Let the input be a binary string m_1m_2m_3\ldots m_k of arbitrary length. Then the hash is defined as

\prod_{i=1}^kA_{m_i}.

The collision resistance can be shown to be based on the property that matrix multiplication is associative but not commutative (which has led to some generalizations over other non-abelian semi-groups.)

Here’s a little example in Sage:


sage: F. = GF(2)[]
sage: K.<a> = GF(2^10, name='a', modulus=x^10+x^7+1)
sage: A=[matrix([[a,1],[1,0]]),matrix([[a,a+1],[1,1]])]
sage: pl="Now is the winter of our discontent, made glorious summer by this sun of York."
sage: ps=map(ord,pl)
sage: pf=flatten(map(lambda x:Integer(x).bits(),ps))
sage: prod([A[pf[i]] for i in range(len(pf))])

giving as hash the matrix

\left[\begin{array}{ll}a^{9} + a^{6} + a^{4} + a^{3} + a^{2} + a & a^{9} + a^{8} + a^{7} + a^{6} + a^{5} + a^{4} + a + 1 \\a^{9} + a^{3} + a^{2} + a & a^{8} + a^{7} + a^{5} + a + 1\end{array}\right]

The Digital Signature Algorithm in Maxima and Sage

The Digital Signature Algorithm, also known as the Digital Signature Standard is, as it name implies, a standard for digital signatures. Most digital signature algorithms work by reversing a public key cryptosystem: a message is signed with the sender’s private key, and the signature is verified using the sender’s public key. The DSA is based on the El Gamal scheme, with a few extras thrown in for extra security, and to make the signatures smaller.

It is set up with four values: a large prime p, a prime q which is a factor of p-1, a primitive root a of p, and a value g defined by

g=a^{(p-1)/q} \bmod p.

For security, p is recommended to be at least 154 digits, and q at least 48 digits.

Alice, the sender, chooses as her private key any value A<q-1 and calculates

B=g^A \bmod p

as her public key. This is secure, by the discrete logarithm problem. The public key consists of the values (p,q,g,B) and the private key is the value A.

Given a message m, a signature is computed as follows:

Alice chooses at random k for which 0<k<q-1. She then computes:

r = (g^k\pmod{p})\pmod{q}

s = k^{-1}(m+Ar)\pmod{q}

and the two values (r,s) are the signature of the message m. We are assuming here that m<p; as in general most messages will be much larger, it is customary to sign not the message itself, but a cryptographic hash of the message, which will be a string of some fixed length (and shorter than p ).

To verify the signature, Bob (the receiver) calculates the following values:

x = s^{-1}m\pmod{q},

y = s^{-1}r\pmod{q},

v = (g^xB^y\pmod{p})\pmod{q}.

If v=r then he accepts the signature.

To see why this works, note that from the definition of s, we can write

m=(-Ar+ks)\pmod{q}

and by multiplying through by s^{-1} we obtain:

s^{-1}m=(-Ars^{-1}+k)\pmod{q}.

This last equation can be written

k= s^{-1}m+Ars^{-1}\pmod{q}

= x+Ay\pmod{q}.

Now we have

r = (g^k\pmod{p})\pmod{q}

  = (g^{x+Ay}\pmod{p})\pmod{q}

 = (g^x(g^A)^y\pmod{p})\pmod{q}

 = (g^xB^y\pmod{p})\pmod{q}

 = v.

This algorithm has the advantage that its security is of order length of p, but the signature values are much smaller – the size of q.

Now let’s look at this algorithm in both Maxima and Sage.

Maxima

We start by creating p and q. We will need to factor p-1 both to find a value of q, and to find its primitive root. So we start by attempting to factor a few randomly chosen large primes, until we have a prime p for which p-1 can be factored, and which also has a reasonably large prime factor q.

We will use smaller values both of p and q than the algorithm requires, just to show how it works.

(%i1) p:next_prime(random(10^80));
(%i2) factor(p-1);

After a few tries, I found

p = 182842970179003336959233156794188485625560973915508949565601
89183057229685434897
q = 525970797581619193760592144581011744537

Maxima doesn’t have a built-in command to find primitive roots, but one can easily be written, using the fact that a primitive root a is a number for which

a^{(p-1)/t}  \neq 1 \bmod{p}

for all prime factors t of p-1. Here is one such program:

primroot(p):=block(
[f:ifactors(p-1),n,i:1],
if not(primep(p)) then error("Your number is not prime"),
n:length(f),
do (
i:i+1,
if not member(1,makelist(power_mod(i,(p-1)/f[j][1],p),j,1,n)) then
return(i)
)
);

So:

(%i3) a:primroot(p);
(%o3) 5
(%i4) g:power_mod(a,(p-1)/q,p);

The private/public key pairs:

(%i5) A:random(q-1);
(%i6) B:power_mod(g,A,p);

Now we can choose a random message a sign it (for ease we won’t show the outputs, which are just long numbers):

(%i7) m:random(p);
(%i8) k:random(q-1);
(%i9) r:mod(power_mod(g,k,p),q);
(%i10) s:mod(inv_mod(k,q)*(m+A*r),q);

Now to verify the signature:

(%i11) x:mod(inv_mod(s,q)*m,q);
(%i12) y:mod(inv_mod(s,q)*r,q);
(%i13) v:mod(mod(power_mod(g,x,p)*power_mod(B,y,p),p),q);
(%i14) is(v=r)
(%o14) true

Sage

As with Maxima, we start by finding p and q. We simply repeat

sage: p=next_prime(randint(1,10^80))
sage: factor(p-1)

until we find values we want. I found

p =
530156743088749972013047250493987281452419479410412138
17513510820559804089810293
q= 2199623526308059394919085303004156101

Then

sage: a=Mod(primitive_root(p),p)

returns 5. The extra Mod( ,p) ensures that the type of a is an element of the
ring of integers modulo p \mathbb{Z}_p:

sage: parent(a)
Ring of integers modulo 530156743088749972013047250493987281
45241947941041213817513510820559804089810293

Now we can calculate the other values we need:

sage: g=a^((p-1)/q)
sage: g
530156743088749972013047250493987281211397896967807817
18309906946321326881261201

and the private/public key pairs:

sage: A=randint(1,q-1);A
617088481431564693290032924639855166L,
sage: B=g^A;B
1052251014343945276262934677066619589736007844895332445
1397254657133312260200438

Note that because of Sage’s handling of types, we don’t have to fuss with “power_mod”; since the type of a is “element of \mathbb{Z}_p“, so automatically are g and B, and powers are thus computed quickly, using the modular exponentiation algorithm.

Let’s choose a message, which will be any value less than p:

sage: m=randint(1,p)

and sign it (as with Maxima we won’t show any of the values):

sage: k=randint(1,q-1)
sage: r=mod(g^k,q)
sage: s=(m+A*r)/k

since g is in the ring \mathbb{Z}_p, we need to change r to be in the ring \mathbb{Z}_q. This
will force s to be also in this ring, so there is no need for any explicit calls to number theoretic functions.

Now we can verify the signature:

sage: x=m/s
sage: y=r/s
sage: v=mod(g^x*B^y,q)
sage: v==r
True

Neat.

Both Maxima and Sage provide all the necessary functionality to compute and verify a digital signature (well, nearly all; we had to write our own program for primitive roots in Maxima), but of the two, Sage certainly allows for easier commands, with its mathematical types. For example:

Maxima: y:mod(inv_mod(s,q)*r,q);
Sage: y=r/s

Maxima: v:mod(mod(power_mod(g,x,p)*power_mod(B,y,p),p),q);
Sage: v=mod(g^x*B^y,q)

Addendum: the use of “modulus” in Maxima

Richard Fateman (see comments below) pointed out that my comparison above is in fact incorrect. Maxima allows for very neat commands by use of “modulus”, which if set to any prime number, will effect all subsequent rational expressions. For example:

(%i1) modulus:97;
(%i2) a:rat(5);
(%i3) a^75;
(%o3)                  -34

We see that the output is given in “balanced” form, where the residues are balanced about zero. However,. in Maxima modulus is a global property – it applies to all rational expressions, whereas in Sage different variables can have different modular types. However, by a judicious change of modulus, we can simplify the Maxima commands considerably. To show how to do this, we will deal with some very small numbers, so as to be able to show the outputs. We will use p=1031, q=103, and the primitive root a=14 of p. First, the setup phase. Since everything here is done modulo p, we will use this as the modulus value:

(%i1) p:1031;
(%o1)                   1031
(%i2) q:103;
(%o2)                    103
(%i3) modulus:p;
(%o3)                   1031
(%i4) a:rat(14);
(%o4)                     14
(%i5) g:a^((p-1)/q);
(%o5)                    320
(%i6) A:rat(70);
(%o6)                     70
(%i7) B:g^A;
(%o8)                     48

Now to sign a message m=500 using the “random” value k=25:

(%i9) m:500;
(%o9)                    500
(%i10) k:25;
(%o10)                    25
(%i11) r:mod(g^k,q);
(%o11)                    95

At this stage we are going to start performing operations modulo q, so we change the modulus:

(%i12) modulus:q;
(%o12)                  103
(%i13) s:(m+A*r)/k;
(%o13)                  -23

We have thus created the signature (with balanced modulo values); now for the verification:

(%i14) x:m/s;
(%o14)                   32
(%i15) y:r/s;
(%o15)                  -31

To compute the final value we’ll change the modulus once more:

(%i16) modulus:p;
(%o16)                1031
(%i17) v:mod(g^x*B^y,q);
(%o17)                  95
(%i18) is(v=r);
(%o18)                true

and we’re done, all with very simple commands.

Solving quadratic equations geometrically

I vaguely recall some years ago having seen a nonogram for solving quadratic equations, and I thought it may be a fun thing to do with my students. I couldn’t find what I was looking for, but I did come across a lovely result which seems to have been first noted by a French artillery captain named M. E. Lill (see http://www.pballew.net/Lill_cir.doc for a more complete history). It can be stated as follows:

Let the quadratic x^2+bx+c=0 have real roots. Then the circle whose diameter has endpoints (0,1) and (-b,c) cuts the x-axis at the root values.

We note in passing that the quadratic given is completely general, for any quadratic can be reduced to that form by dividing through by the leading coefficient. Here are “Lill circles” for the quadratic equations x^2-3x+2=0:

lill1

and x^2-2x-3=0:

lill2

I like this result. It seems (at first glance) to be slightly mysterious; it is a wonderful mixture of algebra and geometry, and it’s not quite obvious. In fact, the proof is very simple, and is an elementary application of Euclid’s result that the angle in a semi-circle is a right angle.

Consider the first diagram above, without the circle, but with two extra lines:

lill3

Since the angle AOB is a right angle, the bottom two triangles are similar, and so

\displaystyle{\frac{x}{1}=\frac{c}{b-x}}.

Multiplying this fraction out produces x(b-x)=c which can be rearranged to produce x^2-bx+c=0. Since in this diagram the upper right point is (-(-b),c)=(b,c) this finishes the proof.

The Mathematics Change Plan, take 2

or,

Why upper management is innumerate

I haven’t written much lately, because I’m still smarting over my University’s intransigence over the Mathematics Change Plan. The basis of the argument seemed to be: enrolments in Computer Science have been falling (yes, that’s true), and since Mathematics staff make up about half of the old Department of Computer Science and Mathematics, their numbers must be vastly in excess; that is “surplus to requirements”.

However, that argument holds no water, because the bulk of mathematics teaching is to external students: engineering students, teacher education students, science (biology, chemistry) students, nursing students, and so on. And in fact almost every subject we have taught over the past four or five years has obtained the bulk of its students from cohorts other than computer science. In fact, if comp sci folded, it would make negligible difference to the student numbers in mathematics/statistics subjects, and no difference at all to (maths) staffing requirements.

I have spent some time putting together a spreadsheet, which provides the breakdown of all staffing hours and student numbers. It turns out that this year we are teaching a total of 2550 hours (a far cry from the 2100 in the Plan). When our Head of School was presented by our Union with our spreadsheet, and asked to justify his own numbers, he was unable to do so. Now call me old-fashioned, but I would have thought that if you are signing a document which speaks of sacking staff, you would be pretty sure about the numbers on which your argument was based. Apparently not here.

It also turns out that under the new funding for students in maths/stats subjects, every student we teach brings $1185 in income to the School. Multiply that by the 1410 students we teach, and you get $1,670,850. Now, a rule of thumb is that staff salaries should amount to no more than 80% of income. We cost about $1,200,000 (give or take a bit – and that includes on-costs), which is far less than 80% of our income.

Let’s suppose that we still get rid of three full-time effective (FTE) staff, which is the current plan, according to a form letter from our Head. Currently we have nine staff teaching maths, of which group some also teach physics, some computer science and some engineering. So the total FTE maths staff is about 7.5. Reduce that by three and you get 4.5. Not rocket science, is it? And how do you cover 2550 hours/year with 4.5 staff? – by using vast numbers of sessionals, that’s how! But the University requires less than 20% of its numbers to be covered by sessionals.

The Union is mystified about the University’s continued persistence with the Plan – we have patiently explained why the original Plan was based on faulty reasoning derived from incorrect numbers – and our arguments are clear, and what’s more, backed up with fully justifiable figures derived from the University’s timetables and student database.

And yet the madness continues…

The “Mathematics Change Plan”

With various attempts to save money, my university has started issuing “change plans” for various disciplines and employment groups. Just this last week the administration has released a Mathematics Change Plan, a Nursing and Midwifery Change Plan, a Science Change Plan, and a Technical Staff Change Plan. My concern is with the Mathematics Change Plan, because it affects me more than the others.

I have rarely seen a document so specious, badly thought out, and based on such poor reasoning. All the change plans are based on the same template, the main differences between them being the numbers used to draw their conclusions. For mathematics, the main changes are to reduce staff by five: a full professor and a lecturer associated with the RGMIA research group, and three full time teaching staff (or equivalent) from the rest.

Now, there is no doubt that student numbers, and associated teaching hours have fallen over the past few years – but less so in Mathematics, with its service teaching to engineering, science, and education than in some other disciplines – and probably some changes are required. But the loss of five staff?

Let’s first look at the teaching hours. The document makes note of a change in total teaching hours from 2178 to 1944. That’s a loss of 234 hours. Given two 12 week semesters, and an average teaching load of 12 contact hours per week, that’s less than one full time staff member. Note that an original report (we’ve had two reports in the past year) counts the number of mathematics teaching staff at 10.4 (effectively 10, as the 0.4 counts for a staff member with no teaching), so if the change plan were to be implemented we would have five staff teaching 1944 hours. That works out, over 24 teaching weeks in the year, at 16.2 contact hours per week – an impossible work load for any academic with the hope of doing any research, planning or development.

One of the two reports recommends “disaggregating” mathematics and computer science – up until recently, we were all together in the School of Computer Science and Mathematics. Leaving aside the wisdom of this idea (which is quite lost on me) it is in practice almost impossible to achieve, as many teaching staff, such as myself, have taught into both disciplines. Over the past few years I’ve taught calculus, discrete mathematics, mathematical cryptography, digital image processing, network security, as well as supervising research students, mainly from our coursework Masters in Computer Science. So am I a mathematician or a computer scientist? I’ve avoided being pigeon-holed, and I’m not alone – many of my colleagues also teach as much computing as mathematics.

Student numbers. Because the disaggregation has not yet happened, it is very difficult to separate mathematics numbers from computer science. And in fact the biggest single drop in numbers (which the Change Plan notes) was in fact caused by a drop in numbers from the Comp Sci Masters program a few years ago. But this is a program which contains no mathematics! No report, or this Change Plan, has made the effort to go beyond total student numbers to numbers specific to individual disciplines.

RGMIA. This is not a formal Research Centre of the University, but rather an informal grouping of local and international academics (about 1300 of them) all with similar interests. A refereed journal is published, and the Chair of RGMIA is an internationally renowned scholar, with hundreds of papers and books to his credit. He was appointed as a research and teaching academic some years ago, and has built up RGMIA with his many international contacts, and by his own research excellence, obtaining nothing from the University aside from a lecturer to help run the Journal. How is he being repaid for his selfless dedication to the University? – by having his position being made redundant even while the Change Plan is officially still in draft status! Even though he himself, officially, is not being made redundant, his position is, which means that either he has to reapply for another position, or take a “voluntary departure” package. Either way, it’s shoddy treatment.

Equity. The University has had a mission to provide high quality post-secondary education to the Western suburbs of Melbourne. These are suburbs with a high proportion of blue-collar workers, and low socio-economic status, and the University, with its value-added programs, has always had many students from this area. In fact the University has the highest percentage of any Australian university of students who are the first in their family to attend university. The Mathematics staff have for many years worked diligently to provide multiple entry points into the university’s programs, as well as some highly regarded outreach programs, and some programs to upskill secondary school mathematics teachers. All these programs and good work will die if the Change Plan is to proceed.

The university seems bent on destroying mathematics as a valid, valued and valuable teaching and learning discipline, and in ruining the goodwill of its teaching staff, and engendering a spirit of disinterestedness and cynicism. Well, it’s doing an excellent job.


I’ve been doing some work on the number of hours taught over last few years, and looking very carefully at the staffing spreadsheets. Well, here are the actual numbers:

2006: 2388
2007: 2140
2008: 2340

So much for falling numbers! And owing to the funding cuts, three staff members who have retired during that time have not been replaced.