Friday, Feb. 16, 1968

HOW TO SOLVE A CIPHER

The expert cryptanalyst and the Sunday-puzzle expert alike rely on the fact that letters have their own personalities. As David Kahn writes: "To the casual observer, they may look as alike as troops lined up for inspection, but just as the sergeant knows his men as 'the gold-brick,' 'the kid,' 'the reliable soldier,' so the cryptanalyst knows the letters of the alphabet."

Consider this simple cryptogram:

GJXXN GGOTZ NUCOT WMOHY JTKTA MTXOB YNFGO GINUG JFNZV QHYNG NEAJF HYOTW GOTHY NAFZN FTUIN ZANFG NLNFU TXNXU FNEJC INHYA ZGAEU TUCQG OGOTH JOHOA TCJXK HYNUV OCOHQ UHCNU GHHAF NUZHY NCUTW JUWNA EHYNA FOWOT UCHNP HOGLN FQZNG OFUVC NZJHT AHNGG NTHOU CGJXY OGHTN ABNTO TWGNT HNTXN AEBUF KNFYO HHGIU TJUCE AFHYN GACJH OATAE IOCOH UFOXO BYNFG

First, count the frequency of each letter in the text:

LETTER: A B C D E F G H I J K L M

FREQUENCY: 17 4 13 0 7 17 23 26 5 12 3 2 2

LETTER: N O P Q R S T U V W X Y Z

FREQUENCY: 36 25 1 5 0 0 23 20 3 6 9 13 8
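The counting step is easy to mechanize. The following is a minimal Python sketch, not part of the original article; the function name `letter_frequencies` and the choice to use only the first six cipher groups as a sample are assumptions made for illustration.

```python
from collections import Counter
import string

# First six cipher groups of the cryptogram, used here as a short sample.
SAMPLE = "GJXXN GGOTZ NUCOT WMOHY JTKTA MTXOB"

def letter_frequencies(ciphertext):
    """Return a dict mapping every letter A-Z to its count, ignoring spaces."""
    counts = Counter(ch for ch in ciphertext.upper() if ch in string.ascii_uppercase)
    # Include letters that never occur, so the table has all 26 entries.
    return {letter: counts.get(letter, 0) for letter in string.ascii_uppercase}

freqs = letter_frequencies(SAMPLE)

# Print the letters that actually occur, most frequent first.
for letter, n in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
    if n:
        print(letter, n)
```

Run over the whole cryptogram instead of the sample, the same function would produce a table of the kind shown above.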

For a 200-letter cryptogram, an appropriate frequency table for English looks like this:

LETTER: a b c d e f g h i j k l m

FREQUENCY: 16 3 6 8 21 4 3 12 13 1 1 7 6

LETTER: n o p q r s t u v w x y z

FREQUENCY: 14 16 4 1/2 13 12 18 6 2 3 1 4 1/2

Since "every letter has a cluster of preferred associations that constitute its most distinguishing characteristic," the next step is to set up a contact chart. This shows how often each letter precedes and is succeeded by each other letter. From this it is clear that N in the ciphertext stands for plaintext e: it is the most frequently used letter, and associates more often with more different characters than does any other letter.
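A contact chart can be sketched in a few lines of Python. This is an illustrative assumption about how such a chart might be represented (two counters per letter), not a description of Kahn's own layout; the names `contact_chart` and `variety` are hypothetical. Contacts are counted across the five-letter group boundaries, since the grouping is only a printing convention.

```python
from collections import Counter, defaultdict

# First six cipher groups of the cryptogram, used here as a short sample.
SAMPLE = "GJXXN GGOTZ NUCOT WMOHY JTKTA MTXOB"

def contact_chart(ciphertext):
    """Map each letter to counters of the letters seen just before and just after it."""
    stream = [ch for ch in ciphertext.upper() if ch.isalpha()]
    chart = defaultdict(lambda: {"before": Counter(), "after": Counter()})
    for left, right in zip(stream, stream[1:]):
        chart[left]["after"][right] += 1
        chart[right]["before"][left] += 1
    return chart

def variety(chart, letter):
    """Number of distinct letters that touch `letter` on either side."""
    entry = chart[letter]
    return len(set(entry["before"]) | set(entry["after"]))

chart = contact_chart(SAMPLE)
print(dict(chart["T"]["before"]))  # letters that precede T in the sample
print(variety(chart, "T"))         # how many different letters T touches
```

A letter that is both very frequent and high in `variety` behaves the way the article describes plaintext e.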

The other high-frequency plaintext vowels, a, i and o, tend to avoid one another. A contact chart would show that three of the most common letters in the ciphertext (O, U and A) are the most mutually exclusive. OA appears twice, OU once, and UO, UA, AO and AU not at all. But NU appears five times in the cryptogram. It happens that the most frequent English vowel digraph is ea. Thus it is a good bet that U = a. Similarly, since the combination io is most frequent among the three dissident vowels in English, assume that it is represented in the cipher by OA. Therefore O = i and A = o.
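The vowel test amounts to counting ordered pairs among a handful of candidate letters. Here is a hedged Python sketch of that count; `digraph_counts` is a hypothetical helper, and the sample is only the first six cipher groups, so its numbers are far smaller than the full-text figures quoted above.

```python
from collections import Counter
from itertools import permutations

# First six cipher groups of the cryptogram, used here as a short sample.
SAMPLE = "GJXXN GGOTZ NUCOT WMOHY JTKTA MTXOB"

def digraph_counts(ciphertext, letters):
    """Count every ordered pair of distinct `letters` that occurs adjacently."""
    stream = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    seen = Counter(stream[i:i + 2] for i in range(len(stream) - 1))
    # A Counter returns 0 for pairs that never occur.
    return {a + b: seen[a + b] for a, b in permutations(letters, 2)}

vowel_pairs = digraph_counts(SAMPLE, "NOUA")
for pair, n in vowel_pairs.items():
    print(pair, n)
```

Pairs that stay at or near zero over the whole text mark letters that avoid each other, while a pair that stands out, as NU does in the full cryptogram, is a candidate for a common vowel digraph.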

So far, the four most common vowels have been tentatively identified. Now for consonants. An easy-to-spot characteristic of plaintext n is that it is preceded 80% of the time by vowels. The contact chart shows that ciphertext T is preceded 17 times out of 23 by ciphertext N, O, U, or A. Put T down for n.
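The "preceded by vowels" test can be checked mechanically. The sketch below is an assumption about one way to do it; `preceded_by` is a hypothetical name, and on the six-group sample it yields much smaller numbers than the 17 out of 23 (about 74%) reported for the full cryptogram.

```python
# First six cipher groups of the cryptogram, used here as a short sample.
SAMPLE = "GJXXN GGOTZ NUCOT WMOHY JTKTA MTXOB"

def preceded_by(ciphertext, letter, candidates):
    """Return (hits, total): how often `letter` follows one of `candidates`,
    out of all occurrences of `letter` that have a predecessor."""
    stream = [ch for ch in ciphertext.upper() if ch.isalpha()]
    hits = total = 0
    for prev, cur in zip(stream, stream[1:]):
        if cur == letter:
            total += 1
            if prev in candidates:
                hits += 1
    return hits, total

# Candidate vowels identified so far: N = e, O = i, U = a, A = o.
hits, total = preceded_by(SAMPLE, "T", set("NOUA"))
print(f"T is preceded by a candidate vowel {hits} times out of {total}")
```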

To the cryptanalyst, ciphertext Y is also significant. In the cryptogram, it runs before N and never follows it; at the same time, it always follows H and never precedes it. This is the usual behavior of plaintext h: the digraph he is commonplace, but eh is unusual; th is the most frequent digraph of all, but ht less so. Therefore, in the cryptogram, Y = h and H should equal t.

So far, 160 of the 280 letters in the cipher have been tentatively identified. The solution (in the first six cipher groups) would look like this:

GJXXNGGOTZNUCOTWMOHYJTKTAMTXOB
----e--in-ea-in--ith-n-no-n-i-
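Applying a partial key is a simple letter-by-letter lookup. The following Python sketch is illustrative only: `apply_key` is a hypothetical helper, the hyphen as a placeholder for unknown letters is an assumption, and the tally uses the counts from the article's frequency table for the seven letters identified so far.

```python
# First six cipher groups of the cryptogram.
SAMPLE = "GJXXN GGOTZ NUCOT WMOHY JTKTA MTXOB"

# Tentative identifications made so far (ciphertext -> plaintext).
KEY = {"N": "e", "U": "a", "O": "i", "A": "o", "T": "n", "Y": "h", "H": "t"}

# Counts of those seven ciphertext letters, copied from the frequency table.
TABLE_COUNTS = {"N": 36, "U": 20, "O": 25, "A": 17, "T": 23, "Y": 13, "H": 26}

def apply_key(ciphertext, key, unknown="-"):
    """Translate known letters and mark every unknown letter with a placeholder."""
    return "".join(key.get(ch, unknown) for ch in ciphertext.upper() if ch.isalpha())

partial = apply_key(SAMPLE, KEY)
identified = sum(TABLE_COUNTS.values())

print(partial)
print(identified, "letters tentatively identified")
```

The tally reproduces the article's figure of 160 identified letters.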

Now it becomes possible to find familiar words in plain English. For example, the letters ith appear near the beginning. Guessing that this could stand for with, the analyst assumes that M = w. He tries that idea out in other places where M appears in the ciphertext. Down the line this produces the sequence with-n-nown. This suggests: with unknown, in which case J would equal u, and K would equal k.
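Trying out a guess means adding it to a copy of the key and re-reading the text. A minimal Python sketch of that loop follows; `apply_key` and `try_guess` are hypothetical helpers, and the hyphen placeholder is an assumption chosen to match the "with-n-nown" notation above.

```python
# First six cipher groups of the cryptogram.
SAMPLE = "GJXXN GGOTZ NUCOT WMOHY JTKTA MTXOB"

# Identifications made before any word guessing (ciphertext -> plaintext).
BASE_KEY = {"N": "e", "U": "a", "O": "i", "A": "o", "T": "n", "Y": "h", "H": "t"}

def apply_key(ciphertext, key, unknown="-"):
    """Translate known letters and mark every unknown letter with a placeholder."""
    return "".join(key.get(ch, unknown) for ch in ciphertext.upper() if ch.isalpha())

def try_guess(key, **guesses):
    """Return a copy of `key` extended with tentative guesses, leaving `key` intact."""
    trial = dict(key)
    trial.update(guesses)
    return trial

# Hypothesis 1: "ith" is the tail of "with", so M = w.
step1 = apply_key(SAMPLE, try_guess(BASE_KEY, M="w"))
# Hypothesis 2: "with-n-nown" is "with unknown", so J = u and K = k.
step2 = apply_key(SAMPLE, try_guess(BASE_KEY, M="w", J="u", K="k"))

print(step1)
print(step2)
```

Because each trial works on a copy, a guess that produces nonsense elsewhere in the text can be dropped without disturbing the identifications already made.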

By now, the only two high-frequency plaintext letters remaining are r and s. Assume that F stands for r and G for s. If this is so, then the first nine letters in the message would read:

GJXXNGGOT
su--essin

From this, "success in" leaps to mind, meaning that X = c. Each clue begets new clues until the cipher is solved. The cryptogram reads:

"Success in dealing with unknown ciphers is measured by these four things in the order named: perseverance, careful methods of analysis, intuition, luck. The ability at least to read the language of the original text is very desirable but not essential." Such is the opening sentence of Parker Hitt's Manual for the Solution of Military Ciphers.
