Any book on cryptography written for a more-or-less lay audience must inevitably face comparisons to The Code Book, written in 1999 by Simon Singh, the king of distilling complex subjects to a few hundred pages of understandable writing. While Singh’s book is a pretty thorough history of codes and codebreaking[1] through the centuries with plenty of the maths thrown in, The Mathematics of Secrets is tilted (and indeed titled) more towards a fuller explanation of the mathematical techniques underlying the various ciphers. Although Holden’s book follows a basically chronological path, you won’t find too much interest in pre-computer ciphers here: Enigma is cracked on page seventy, and the name Alan Turing does not appear in the book.
The Mathematics of Secrets kicks off with a pretty decent chunk of introductory linear algebra in the service of basic substitution ciphers, preceded by a few pages of terminology. This is introduced apologetically as an unfortunate necessity, but some of the explanation could be handled better. Part of the rundown is handed off to the following quote from David Kahn, none of the terms in which have previously been defined (the ellipses are the book’s, not mine)[2]:
A code consists of thousands of words, phrases, letters, and syllables with the codewords or codenumbers…that replace these plaintext elements….In ciphers, on the other hand, the basic unit is the letter, sometimes the letter-pair…, very rarely larger groups of letters….
Clear? Once the book gets going, it offers a pretty good explanation of a variety of different encryption techniques from simple substitution ciphers through to modern stream ciphers and key exchange systems, as well as the strategies used to attack them. In one of the most accessible sections, Holden explains a particularly elegant system for cracking ‘polyalphabetic’ substitution ciphers: ones where the encoded ‘ciphertext’ is produced by switching to a different substitution cipher ‘alphabet’ after each letter, in particular the case where a relatively small number of alphabets are used in rotation.
The first thing to work out is the length of the cycle: how long before the encoding alphabets repeat? One method of finding this is to compute the “index of coincidence”: the probability that two randomly chosen letters of the ciphertext are the same. For a normal ‘monoalphabetic’ cipher this would be the same as for unencoded text: about $0.066$ for English. The more alphabets are used, the nearer this number gets to $1/26 \approx 0.038$, the value for a string of random letters. The value your ciphertext gives you suggests a rough value for the number of alphabets, give or take one or two.
A second approach is to look for short strings of three or four letters that show up twice in the ciphertext. Some of these will be coincidental, but more will arise from repetitions in the message that happen to get encoded using the same sequence of alphabets. In that case, the distance between the repetitions must be a multiple of the number of alphabets. So any particularly prevalent factor among the distances-between-repeats is a good bet for the number you’re looking for. The two methods combine perfectly, since the second will give you candidates that are unlikely to be close in value, so the approximate value gained from the first should settle which is the correct one.
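For anyone who would rather let a computer do the counting, here is a rough sketch of both period-finding ideas in Python. It is my own illustration rather than anything taken from the book: the function names, the choice of three-letter repeats and the `max_period` cut-off are all assumptions made for the example.

```python
from collections import Counter


def only_letters(text: str) -> str:
    """Uppercase the text and keep only the letters A-Z."""
    return "".join(c for c in text.upper() if "A" <= c <= "Z")


def index_of_coincidence(text: str) -> float:
    """Probability that two different randomly chosen letters of text match."""
    letters = only_letters(text)
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    # Number of matching ordered pairs divided by number of ordered pairs.
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def repeat_distances(ciphertext: str, length: int = 3) -> list[int]:
    """Distances between successive occurrences of each repeated n-gram."""
    letters = only_letters(ciphertext)
    last_seen: dict[str, int] = {}
    distances = []
    for i in range(len(letters) - length + 1):
        gram = letters[i:i + length]
        if gram in last_seen:
            distances.append(i - last_seen[gram])
        last_seen[gram] = i
    return distances


def factor_votes(distances: list[int], max_period: int = 20) -> Counter:
    """Count how many of the distances each candidate period divides."""
    votes: Counter = Counter()
    for d in distances:
        for p in range(2, max_period + 1):
            if d % p == 0:
                votes[p] += 1
    return votes
```

An index of coincidence near $0.066$ points to a single alphabet and one near $0.038$ to many. Bear in mind that `factor_votes` flatters small numbers, since every divisor of the true period collects at least as many votes as the period itself, so the rough estimate from the index of coincidence is still needed to choose between candidates.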
Once you know the number of alphabets, you can split the message up and attack the parts with normal frequency analysis. Holden explains this clearly, with the full details and a couple of examples in case you want to have a go yourself. (I don’t know how challenging these informal exercises are, since I had a review to get on with, but I’m guessing they’re less intense than The Code Book’s cipher challenge, which stood unsolved for thirteen months despite a £10,000 prize.)
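The splitting step is simple enough to sketch in a few lines of Python. Again this is my own illustration and not the book's: the function names are made up, and `guess_shift` leans on the strong assumption that each alphabet is just a shifted copy of the normal one (as in a Vigenère cipher) and that the commonest letter in each part stands for E. A general substitution alphabet would need a full frequency analysis instead.

```python
from collections import Counter


def split_columns(ciphertext: str, period: int) -> list[str]:
    """Group the letters by which of the `period` alphabets encoded them."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    # Letters at positions i, i + period, i + 2 * period, ... share an alphabet.
    return ["".join(letters[i::period]) for i in range(period)]


def guess_shift(column: str, assumed_plain: str = "E") -> int:
    """Guess a shift by assuming the commonest letter encodes `assumed_plain`."""
    if not column:
        raise ValueError("cannot guess a shift from an empty column")
    most_common_letter, _ = Counter(column).most_common(1)[0]
    return (ord(most_common_letter) - ord(assumed_plain)) % 26
```

Calling `[guess_shift(col) for col in split_columns(ciphertext, period)]` gives one candidate shift per alphabet. On short messages the commonest letter in a part will often not be E, so treat the result as a first guess to be corrected by eye.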
Towards the middle of the book, some of the discussion of real-world implementations of modern cipher systems can get a little mystifying. A flurry of ‘key schedules’, ‘S-boxes’ and ‘P-boxes’ seems to be chained together in specific arrangements it’s hard to muster up much enthusiasm for. Perhaps some of these details could have been relegated to appendices and more time spent on the general aims of these systems, which I think I am still a little hazy on. The version of the book the publishers sent us was also marred in a couple of places by typographical annoyances. Ciphertext is marked out in the main prose in small caps, but plaintext (unencoded messages) is not distinguished at all, producing the momentarily baffling sentence on page 46 “The plaintext letters at repeat 4 times”. And in one unfortunate instance, factorial signs have been omitted, leading to the startling claim that $12 = 479,001,600$.
Things get going again when it’s time for the fun stuff. The seemingly-impossible shenanigans of public-key encryption, where the two parties can concoct some secret numbers that only the two of them know despite all their communication being entirely public, are well explained. As are the newfangled elliptic-curve-based systems that might one day replace all that current mucking about with massive semiprimes, and the exciting world of quantum cryptography, where the preposterous properties of photons are corralled into a fundamentally unbreakable code system. This more theoretical stuff seems to me a lot more interesting than worrying about why a P-box was added at the beginning and end of the DES standard (even the author seems unsure: “Apparently, the P-boxes are merely there to make the data easier to handle on the original chip”). Unless you have a particular interest in the gnarlier details, a few pages might prove skippable, but otherwise this is a decent tour of cryptography for anyone who wants to go a bit deeper than Simon Singh took them.
Joshua Holden: The secrets behind secret messages: press release from the publishers with an interview with the author
The Mathematics of Secrets at Princeton University Press
1. I will in this review unapologetically make no attempt to maintain any distinction between the terms code and cipher; cryptography, cryptanalysis and codebreaking, etc.
2. Kahn is introduced as “author of perhaps the definitive account of the history of cryptography”. Obviously, I beg to differ.