# A quick-and-dirty cryptographic hash function

Cryptographic hash functions sit at the core of many secure applications. They are vital to digital signatures for example: rather than signing a message, the hash of the message is signed. A cryptographic hash may be considered as a sort of digital fingerprint: it identifies a message in the same way that a fingerprint identifies a person by associating a message – no matter its size – with a string of bits of fixed length. To do this, such a hash must satisfy the following:

1. It must be fast to compute.
2. Given a hash value $h$, it should not be possible to find a message $m$ for which $h=H(m)$. This is called pre-image resistance.
3. Given a message $m$ and its hash value $h$, it should not be possible to find another message $m'\ne m$ for which $h=H(m')$. This is called weak collision resistance.
4. It should not be possible to find any two different messages $m$ and $m'$ for which $H(m)=H(m')$. This is called strong collision resistance.

Most cryptographic hash functions are based on bit operations, and are designed to be blisteringly fast, as well as satisfying the other properties above. See the links on the wikipedia page for some examples.

However, if we are prepared to trade speed for simplicity, a very nice hash function was developed by Adi Shamir. Take two large primes $p$ and $q$, and a number $a$ whose multiplicative order is large in the ring $\mathbb{Z}/ \mathbb{Z}_{pq}$. If a message $m$ can be encoded as a large integer, the hash is defined as

$H(m)=a^m\pmod{pq}$.

By the discrete logarithm problem, this is pre-image resistant. It is also collision resistant, for if

$a^{m'}=a^m\pmod{pq}$

then we would require

$m-m'=0\pmod{\phi(pq)}$.

But determining $\phi(pq)$ requires factoring $pq$.

This hash function can be easily implemented. First we have a little function which turns any ASCII text into a large number, by treating the ASCII values of each character as digits in base 256:

def str2num(s):
return ZZ(map(ord,s),256)

Next, to find $a$ which has a large multiplicative order. The largest possible order modulo $pq$ is $\mbox{lcm}(p-1,q-1)$, and this will be obtained if $a$ is a primitive root of both $p$ and $q$. Given that a number can be tested to be a primitive root by checking that

$a^{(p-1)/d}\ne 1$

for all prime factors $d$ of $p-1$, we have the following little program for finding a common primitive root:

def common_primroot(p,q):
# Finds a primitive root common to both p and q
ps = [f for f,e in factor(p-1)]
qs = [f for f,e in factor(q-1)]
a=2
while true:
outp = true
for d in ps:
if power_mod(a,(p-1)//d,p)==1:
outp=false
outq =  true
for d in qs:
if power_mod(a,(q-1)//d,q)==1:
outp=false
if outp and outq:
return a
break
else:
a+=1

Now the hash, for say, 160 bits. This can be computed with these steps:

1. Choose two primes $p$ and $q$ and a suitable value $a$.
2. Append the string “xx” to the text. This will ensure that the empty string has a non-trivial hash.
3. Turn the text into a number $m$ and compute

$H(m)=a^m\pmod{pq}$

4. Take the residue modulo $2^{160}$, turn the result into hexadecimal, and pad out to 40 hexadecimal characters.

All this can be done in one fell swoop:

def qd_hash(pl,a,p,q):
# Implements Shamir's hash function h = a^m (mod p*q) where a is a
# number of high multiplicative order mod p*q.  This works well if
# a is chosen to be a common primitive root of both p and q
return hex(power_mod(a,str2num(pl+'xx'),p*q)%(2^160)).zfill(40)

Let’s try it out, we’ll choose two 150-bit primes, and find a common primitive root:

sage: p=next_prime(randint(2^150,2^151));p
1681637307496557466781353395191134574935425217
sage: q=next_prime(randint(2^150,2^151));q
1592026707039533251493751206235271693959169369
sage[142]: a=common_primroot(p,q); a
7

For the message text, we choose the first few lines of Shakespeare’s “Richard III”. Also take another text differing by one character, and the empty string:

sage: pl="Now is the winter of our discontent/ Made glorious summer by this
sun of York;/ And all the clouds that lour'd upon our house/ In the deep
bosom of the ocean buried."
sage: pl1="Now is the winter of our discontent/ Made glorious summer by this
sun of York;/ And all the clouds that lour'd upon our house/ In the deep
bosom of the ocean buried-"
sage: pl2=''

Now hash them:

sage: qd_hash(pl,a,p,q)
'53ed334a9d8937f0b42bec194b0be729de2a7da3'
sage: qd_hash(pl1,a,p,q)
'15b5735208cdb11ab27f1b52f5da4f0132f88137'
sage: qd_hash(pl2,a,p,q)
'28c2ec5fb67e2b1af21ce8dd1a79f4c347197c34'

A tiny change in the plaintext causes a complete change in the hash, as we would hope.