Cryptology, cryptography, and cryptanalysis


December 22, 2015

Ray Alderman

VITA Standards Organization

EVOLUTION OF WARFARE BLOG: Cryptography is a broad, sticky, and mathematically complex subject, but an interesting one, and an integral part of the evolution of warfare. So let’s get some definitions out of the way first. Cryptology is the study of codes, both creating and solving them. Cryptography is the art of creating codes. Cryptanalysis is the art of breaking codes: surreptitiously revealing the contents of coded messages that were not intended for you as a recipient.

Secondly, there are nomenclators and ciphers. Nomenclators are letters or numbers that represent words or phrases: 103A, for example, might mean “meet me at 4PM.” Ciphers transform the individual letters or numbers of a message using some sequential coding process and a key. For this essay, we will refer to both as codes. Also, for our purposes, enciphered, encrypted, and encoded mean the same thing.

Additionally, there is plain text. This is the original message that is readable and understandable, uncoded or unencrypted. Once it goes through the coding process and is encrypted, the output is readable but not understandable. There are a bunch of other terms like steganography, homophones, polyphones, digraphs, bigrams, and polygrams, but they are just variations of coding and decoding techniques.

A little history

Cryptology, the study of coded messages, dates back to Egypt about 1900 BC, when a scribe carved some hieroglyphic symbols into a rock at the tomb of Khnumhotep II. Cryptology wasn’t that hard back then, since most of the people were illiterate and only the elite could read any written language. Pharaohs and potentates, kings and queens, presidents and dictators, and military commanders have used cryptology to hide their communications from their enemies ever since. An interesting example is Mary, Queen of Scots. In 1577, Don Juan of Austria, then governor of the Spanish Netherlands, had worked out a plan to invade England, dethrone and kill Queen Elizabeth, put Mary on the throne, and marry her. But Elizabeth’s minister, Francis Walsingham, smelled a rat and had Mary confined to castles away from Elizabeth.

Mary’s page, Anthony Babington, began plotting in 1586 to have Elizabeth assassinated, counting on help from Philip II, the king of Spain. Babington employed Gilbert Gifford to smuggle encrypted communications between him and Mary in beer barrels. As it turns out, Gifford was an agent working for Walsingham, so those messages got delivered to Walsingham immediately. Walsingham employed Thomas Phelippes to decrypt the messages, which thoroughly implicated Mary in the plot to kill Elizabeth, and she was convicted of high treason. So, at 8AM on 8 February 1587, the executioner chopped off Mary’s head with an axe. Such is the importance of ensuring your code is hard to break. Thomas Phelippes became England’s first cryptanalyst, establishing the role of codebreakers in the future.

Code types

There have been numerous types of codes throughout history: mono-alphabetic, poly-alphabetic, columnar transposition, S-boxes and P-boxes, Polybius squares, etc. Many primitive machines were made to encrypt and decrypt messages. One of the first used a wooden dowel about an inch in diameter and two feet long. A 1/4-inch-wide piece of parchment, about 3 feet long, was wrapped around the dowel, edge to edge like a barber pole. The plaintext message was then written vertically, along the length of the dowel, one letter on each turn of the strip. When unraveled from the pole and laid out flat, the strip would show a series of letters that made no sense. Only when the intended recipient wrapped the 1/4-inch strip of parchment around his own 1-inch wooden dowel could the characters be read in the proper order. Other early mechanisms were identical cipher wheels and cipher drums held by both the sender and receiver. Other codes used simple key words shared by two people, and an agreement on how the code would be structured. Those could be worked out by hand on paper, without the use of a mechanical device. As you can see, many early codes and code machines were cumbersome, and it took a lot of time to encrypt or decrypt a message.
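The dowel-and-strip device described above (usually called a scytale) can be modeled as a simple transposition. The sketch below is one way to picture it, not a reconstruction of any historical procedure: the function names, the `width` parameter (how many letters are written in each line along the dowel), and the `X` padding are all assumptions made for illustration.

```python
def scytale_encrypt(plaintext: str, width: int, pad: str = "X") -> str:
    """Write the message in rows of `width` letters, then unwind the strip.

    Unwinding the strip is modeled as reading the grid column by column.
    `width` plays the role of the dowel size shared by sender and receiver.
    """
    letters = [c for c in plaintext.upper() if c.isalpha()]
    # Pad the final row so the grid is rectangular (an illustrative choice).
    while len(letters) % width:
        letters.append(pad)
    return "".join("".join(letters[col::width]) for col in range(width))


def scytale_decrypt(ciphertext: str, width: int) -> str:
    """Re-wrap the strip on a dowel of the same size and read the rows."""
    rows = len(ciphertext) // width
    return "".join(ciphertext[row::rows] for row in range(rows))


secret = scytale_encrypt("MEET ME AT FOUR PM", width=4)
print(secret)                      # MMFPEEOMEAUXTTRX
print(scytale_decrypt(secret, 4))  # MEETMEATFOURPMXX
```

A receiver with the wrong dowel size (the wrong `width`) gets the letters back in the wrong order, which is the whole of this device's security.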

Although the early codes were unsophisticated and primitive, they taught many lessons. The first is that the only people with the knowledge to create secure codes are the cryptanalysts: they know the weaknesses of the codes they have already broken. The second lesson is that the most important element of a code is the key: it does not matter if your enemy knows how your code is structured if you have complex keys. Third, the longer your key, the more difficult your code is to break. If your key is the same length as your message, and is random and never reused, your message is basically unbreakable. But as early cryptographers have stated, no code is unbreakable. Fourth, the longer your message, the easier it is to break. And finally, always send the message in five-letter groups, to hide the number and length of words.
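To make the second and fifth lessons concrete, here is a minimal sketch of a keyword cipher in the Vigenère style, with the output sent in five-letter groups. The choice of Vigenère is an assumption made for illustration (the essay does not name a specific cipher), and the function names are hypothetical.

```python
from itertools import cycle


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the matching key letter (A=0, B=1, ...).

    The method is public; only `key` is secret. The key must contain
    only the letters A-Z. Non-letters in `text` are dropped.
    """
    sign = -1 if decrypt else 1
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    shifts = cycle(ord(k) - ord("A") for k in key.upper())
    return "".join(
        chr((ord(c) - ord("A") + sign * s) % 26 + ord("A"))
        for c, s in zip(letters, shifts)
    )


def five_letter_groups(text: str) -> str:
    """Break the ciphertext into groups of five to hide word boundaries."""
    return " ".join(text[i:i + 5] for i in range(0, len(text), 5))


cipher = vigenere("ATTACK AT DAWN", "LEMON")
print(five_letter_groups(cipher))               # LXFOP VEFRN HR
print(vigenere(cipher, "LEMON", decrypt=True))  # ATTACKATDAWN
```

Changing the keyword changes the ciphertext completely while the procedure stays the same. The same arithmetic with a random, never-reused key as long as the message is the "key as long as the message" case from the third lesson.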

Externals and internals

There are two parts to any encrypted message: the externals and the internals. Cryptanalysis must discover the internals, the process by which the encrypted message was created, by using the external characteristics of the message. The first technique is character frequency analysis. The National Security Agency (NSA) has run millions of pages of newspapers, books, periodicals, and magazines through their computers, establishing the letter frequency charts of the English language. These are the percentages of each letter in a page of text: a = 8.167 percent, e = 12.702 percent, i = 6.966 percent, o = 7.507 percent, u = 2.758 percent, and so on. With an encrypted message using a simple substitution code where x=e, z=a, etc., it’s child’s play to decrypt it just using the character frequency tables. Our intelligence people have such tables for all the significant languages on the planet.
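The frequency-table idea above can be sketched in a few lines. To keep the example short it attacks only the simplest substitution, a fixed alphabet shift (a Caesar cipher), which is an assumption made here; a general substitution needs more work, but starts from the same counts. The table holds commonly published English letter percentages (the five vowels match the figures quoted above), and the function names are hypothetical.

```python
import string
from collections import Counter

# Commonly published English letter percentages, A through Z.
ENGLISH_PERCENT = dict(zip(string.ascii_uppercase, [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]))


def letter_percentages(text: str) -> dict:
    """Percentage of each letter A-Z among the letters of `text`."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    counts = Counter(letters)
    total = len(letters) or 1  # avoid dividing by zero on empty input
    return {c: 100 * counts[c] / total for c in string.ascii_uppercase}


def caesar_shift(text: str, shift: int) -> str:
    """Shift every letter by `shift` places; leave other characters alone."""
    return "".join(
        chr((ord(c) - 65 + shift) % 26 + 65) if c in string.ascii_uppercase else c
        for c in text.upper()
    )


def guess_shift(ciphertext: str) -> int:
    """Try all 26 shifts; keep the one whose letter counts look most English."""
    def mismatch(shift: int) -> float:
        seen = letter_percentages(caesar_shift(ciphertext, -shift))
        return sum(
            (seen[c] - ENGLISH_PERCENT[c]) ** 2 / ENGLISH_PERCENT[c]
            for c in string.ascii_uppercase
        )
    return min(range(26), key=mismatch)
```

Given a paragraph of ordinary English encrypted with `caesar_shift(text, 3)`, `guess_shift` should recover the shift from the letter counts alone. Very short or unusual messages give the statistics less to work with, which echoes the earlier lesson that longer messages are easier to break.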

Next are the first-letter-of-a-word frequency charts: we know how often words in a page of text start with each letter (a=11.602 percent, b=4.702 percent, etc.). Then there are letter proximity charts: TT, CC, QU, TH, CH, ING, ION, ED, and ON all string together regularly, and we know how often each combination occurs. There are also word frequency charts: the frequencies of AND, CAN, IS, IF, OF, WITH, OVER, UNDER, ABOUT, and other common words are well known. A cryptanalyst can also make letter interval charts, which record the distance between repeated letters in the encrypted message. With a repeating keyword, the same plaintext enciphered by the same part of the key produces the same output, so the gap between repeats tends to be a multiple of the key length. If an M appears in the encrypted message twice, and the two are 12 letters apart, then the keyword is probably either 6 or 12 characters long (2, 3, and 4 also divide 12, but keys that short are not very secure). This is a process called “key discovery through symmetry of position”. There are additional frequency and interval analysis tools, too many to recount here, that can reveal how an encrypted message was constructed.
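The letter interval chart can be sketched as follows. This version looks for repeated three-letter sequences rather than single letters, the approach usually called Kasiski examination in the literature; treating that as the same idea as the interval charts described above is an assumption, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def repeat_distances(ciphertext: str, length: int = 3) -> list:
    """Distances between successive occurrences of each repeated sequence."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    distances = []
    for where in positions.values():
        distances.extend(b - a for a, b in zip(where, where[1:]))
    return distances


def likely_key_lengths(distances: list, max_len: int = 20) -> list:
    """Count how many distances each candidate key length divides evenly."""
    votes = Counter()
    for d in distances:
        for k in range(2, max_len + 1):
            if d % k == 0:
                votes[k] += 1
    return votes.most_common()


# One repeat, 12 letters apart: every divisor of 12 is a candidate.
print(likely_key_lengths(repeat_distances("ABCDEFGHIJKLABC")))
```

A single repeat only narrows the key length to the divisors of the gap. With many repeats, the true key length is the candidate that keeps collecting votes.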

Once a good cryptanalyst applies the externals analysis tools, recovers the key, and decrypts the message, he can then reconstruct the encryption process used for that message. That process, and the key, are the internals of the code. Also, once we have the patterns for the externals of a certain code, we can analyze thousands or millions of other intercepted encrypted messages against that known pattern, to see if they fit. If they do, we know the internals of the code, so it’s just a matter of using letter interval charts to uncover the symmetry of position of the letters in the encrypted message, retrieve the key, and decrypt them.

NSA efforts

In 1969, there were 96,000 people working for the NSA, including my old Army Security Agency (ASA) unit, who were intercepting enemy messages, running the frequency/proximity/symmetry charts against those messages, decrypting them, analyzing them, cataloging them, and disseminating the intelligence to our military commanders. Today, that process is done with acres of NSA supercomputers, discussed in the previous installment of this warfare series. Now you know why NSA has the most computing power on the planet.

With massive amounts of computer power, relatively sophisticated codes can be broken in a short time with brute-force methods. Today’s advanced codes, such as discrete-logarithm and elliptic-curve schemes, are very secure but not unbreakable. By common estimates, it would take conventional computers millions of years to discover the two prime numbers multiplied together to create a key several hundred digits long. A large enough quantum computer could do it in an hour, according to cryptanalysts, which is why NSA has been funding quantum computer research and development since the 1990s. Turn that decryption concept around, and quantum computers could create massive keys, as long as the message itself, that make messages invulnerable to cryptanalysis techniques and decoding by our enemies. Auguste Kerckhoffs, the author of La Cryptographie militaire, once said, “…secrecy must reside solely in the keys.” As you can see from the examples of frequency analysis above, all encrypting techniques, and their resulting coded messages, have certain idiosyncrasies or patterns that offer a good cryptanalyst a crack to get his fingers into. With a bit of work, by him or computers, he can widen that crack and get inside, recover the keys and the coding scheme, and break it.
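The prime-number problem mentioned above can be shown at toy scale. The sketch below recovers the two primes behind a small product by trial division, the most naive method available. The function name is hypothetical, and real attacks on large keys use far better algorithms than this, so take it only as a picture of why the work grows with the size of the number.

```python
import math


def factor_semiprime(n: int) -> tuple:
    """Return (p, q) with p * q == n, found by trying divisors in order.

    The loop runs up to the square root of n, so the work grows roughly
    tenfold for every two digits added to n.
    """
    if n % 2 == 0:
        return 2, n // 2
    for p in range(3, math.isqrt(n) + 1, 2):
        if n % p == 0:
            return p, n // p
    raise ValueError("n has no nontrivial factors")


print(factor_semiprime(3233))   # (53, 61)
print(factor_semiprime(10403))  # (101, 103)
```

A four-digit product falls in a few dozen steps. For a product hundreds of digits long, the same loop would need more steps than any conventional computer could ever run, which is the gap a quantum computer is expected to close.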

Time shifting

If quantum cryptology doesn’t impress you, let’s explore time shifting. The scientists at Purdue University have figured out how to hide data in a light beam. They take two beams and modulate them slightly out of phase with each other on an optical fiber. That creates “holes in time” between the two beam phases. Into that hole, they drop the plaintext data. If you look at the modulated beam, you see a flat line, no light, where the data is now located. The receiver, knowing the modulation patterns, can simply demodulate the beam and recover the data. The modulation patterns are the key to the encoding scheme. Read: http://www.forbes.com/sites/alexknapp/2013/06/06/take-that-nsa-scientists-hide-communications-using-a-hole-in-time/.

For many years, computers have been doing the in-stream encryption and decryption for our military and government communications. That makes the process fast and secure. When the Pueblo was captured by the North Koreans in January 1968, the President of the United States knew about it within a few minutes via secure encrypted communications from ASA intelligence people in the Pacific to NSA at Fort Meade.

The Pueblo had top secret crypto machines onboard, the same models used by NSA and ASA, and the North Koreans grabbed them intact (they shared them with the Russians too). Did the U.S. intelligence groups have to throw away all the crypto machines and replace them with new ones? No. All we had to do was change the keys and the captured machines were basically worthless to the North Koreans and the Russians. So, Kerckhoffs was correct: the keys are the key.

If you want to dig deeper into this topic, the best book on the history of cryptology is “The Codebreakers” by David Kahn (if you can find a copy). It’s over 1,000 pages long but worth a read. The chapters about breaking the German Enigma machines and the Japanese Purple code in WWII are astonishingly detailed. If you want to understand basic encryption techniques and more history, read “The Code Book” by Simon Singh. If you are not mathematically challenged, read “Cryptanalysis” by Helen Fouché Gaines. Same goes for “Decrypted Secrets” by F. L. Bauer, which is very heavy on formulas. And finally, there’s the ever-popular “Modern Cryptanalysis” by Christopher Swenson (this is a textbook with exam questions, problems to solve, and software examples in Python, so you can program your analysis techniques into your computer). I’ll warn you now: these books are not light reading.

For our next installment, we will take a look at the military capabilities of our closest enemies, through the eyes of our intelligence folks. That will give you an informed idea of where we stand in a war with any of them.