The use of pencil-and-paper one-time pads is limited by practical and logistical issues and by the low message volume they can process. One-time pads were widely used by foreign service communicators until the 1980s, often in combination with code books. These code books contained all kinds of words or entire phrases, each represented by a three- or four-figure code. For special names or expressions not listed in the codebook, codes representing single letters were included so that words could be spelled out. There was a book to encode, sorted by alphabet and/or category, and a book to decode, sorted by numbers. These books were valid for a long period of time and served not only to encode the message - which would be a poor encryption method by itself - but especially to reduce its length for transmission over commercial cable or telex.
Once the message was converted into numbers, the communicator enciphered these numbers with the one-time pad. Usually there was a set of two different pads, one for incoming and one for outgoing messages. Although a one-time pad normally has only two copies of a key, one for sender and one for receiver, some systems used more than two copies to address multiple receivers. The pads were like note blocks with random numbers on each small page, but with the edges sealed. One could only read the next pad by tearing off the previous pad. Each pad was used only once and destroyed immediately. This system enabled absolutely secure communication. An excellent description of Canadian Foreign Service one-time pads is found on Jerry Proc's website.
Intelligence agencies use one-time pads to communicate with their agents in the field. The perfect and long-term security protects the identity of covert agents, their assets and operations abroad. With one-time pads, spies don't have to carry crypto systems or use insecure computer software. They can carry a large number of one-time pad keys in very small booklets, on microfilm or even printed on clothing. These are easy to hide and to destroy. One way to send one-time pad encrypted messages to agents in the field is via numbers stations. To do so, the message text is converted into digits prior to encryption.
Below, on the left, a one-time pad booklet with reciprocal encryption table from a Western agent, seized by the East-German MfS (Ministerium für Staatssicherheit or Stasi). The second image is a one-time pad sheet (preserved in a 35 mm slide frame) from an East-German agent, found by the West-German BfV (Bundesamt für Verfassungsschutz, the federal domestic intelligence).
The right-most image is a one-time pad of a Western agent, found by the MfS (also preserved in a 35 mm slide frame). The pad itself is only about 15 mm or 0.6 inch wide (thus even smaller than depicted) and virtually impossible to read with the naked eye! I even had difficulty photographing it clearly. Such miniature one-time pads were used by illegal agents, operating in foreign countries, and were hidden inside innocent-looking household items like cigarette lighters, fake batteries or ashtrays.
The three photos above were taken at the Detlev Vreisleben collection.
Until the 1980s, one-time-tapes were widely used to secure Telex communications. The Telex machines used Vernam's original one-time-tape (OTT) principle. The system was simple but solid. It required two identical reels of punched paper tape with truly random five-bit values, the so-called one-time tapes. These were distributed beforehand to both sender and receiver. Usually, the message was prepared (punched) in plain onto paper tape.
Next, the message was transmitted on a Telex machine with the help of a tape reader, and one copy of the secret one-time tape ran synchronously with the message tape on a second tape reader. Before exiting the machine, the five-bit signals of both tape readers were mixed by performing an Exclusive OR (XOR) function, thus scrambling the output.
On the other end of the line, the scrambled signal entered the receiving machine and was mixed, again by XOR, with the second copy of the secret one-time tape. Finally, the resulting readable five-bit signal was printed or perforated on the receiving machine. The picture explains one-time tape encryption for Telex (TTY Murray).
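The mixing step described above can be sketched in a few lines of Python. This is only an illustration: the five-bit message values are arbitrary placeholders (not real Murray code), `make_tape` and `mix` are hypothetical helper names, and the `secrets` module merely stands in for the truly random source a real one-time tape requires.

```python
import secrets

def make_tape(length):
    """Return `length` five-bit key values (0..31), standing in for a one-time tape."""
    # Assumption: the OS random source replaces a true hardware noise source here.
    return [secrets.randbelow(32) for _ in range(length)]

def mix(signal, tape):
    """XOR each five-bit value with the tape value at the same position."""
    if len(tape) < len(signal):
        raise ValueError("one-time tape is shorter than the message")
    return [s ^ k for s, k in zip(signal, tape)]

message = [0b10000, 0b00110, 0b01001, 0b11111]  # arbitrary five-bit values
tape = make_tape(len(message))

scrambled = mix(message, tape)    # sender: message tape XOR key tape
recovered = mix(scrambled, tape)  # receiver: same operation with the second tape copy
print(recovered == message)
```

Because XOR is its own inverse, the same `mix` function serves both ends of the line, which is why sender and receiver only need identical tapes running in step.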
A unique advantage of the punched paper tape keys was that copying them quickly was virtually impossible. The long tapes (which were sealed in plastic before use) were on a reel and printed with serial numbers and other markings on the side. Unwinding the tape, copying it and rewinding it again with perfectly aligned print was very unlikely to succeed, and such one-time tapes were therefore more secure than other key sheets that could be copied quickly by taking a photo or writing them over by hand.
A famous example of one-time pad's security is the Washington/Moscow hotline with the ETCRRM II, a standard commercial one-time tape mixer for Telex. Although simple and cheap, it provided absolute security and unbreakable communications between Washington and the Kremlin, without disclosing any secret crypto technology.
Some other cipher machines that used the principle of one-time pad are the American TELEKRYPTON, SIGSALY (noise as one-time pad), B-2 PYTHON and SIGTOT, the British BID-590 NOREEN and 5-UCO, the Canadian ROCKEX, the Dutch ECOLEX series, the Swiss Hagelin CD-57 RT, CX-52 RT and T-55 with a superencipherment option, the German Siemens T-37-ICA and M-190, the East German T-304 LEGUAN, the Czech SD1, the Russian M-100 SMARAGD and M-105 N AGAT and the Polish T-352/T-353 DUDEK. There were also many teletype or ciphering device configurations in combination with a tape reader, for one-time tape encryption or superencipherment.
Below are three images of the famous Washington-Moscow hotline, encrypted with one-time tapes. The Hotline became operational in 1963 and was a full duplex teleprinter (Telex) circuit. Although the Hotline always was shown as a red telephone in movies and popular culture, the option of a speech link was turned down immediately as it was believed that spontaneous verbal communications could lead to miscommunications, misperceptions, incorrect translation or unwise spontaneous remarks, which are serious disadvantages in times of crisis. Nevertheless, the red phone myth lived a long life.
The real hot line was a direct cable link, routed from Washington over London, Copenhagen, Stockholm and Helsinki to Moscow. It was a double link with commercial teleprinters, one link with a Teletype Corp Model 28 ASR teleprinter with English characters and the other link with East German T-63 teleprinters with Cyrillic characters. The links were encrypted with one-time tapes by means of four ETCRRMs (Electronic Teleprinter Cryptographic Regenerative Repeater Mixer). The one-time tape encryption provided unbreakable encryption, absolute security and privacy. Although a highly secure system, the unclassified standard teleprinters and ETCRRMs were sold by commercial firms and therefore did not disclose any secret crypto technology to the opponent.
Hotline images with kind permission of the National Security Agency, copyright NSA
One-time tapes and one-time pads remained very popular for many decades, because of their absolute security, unequalled by any other crypto machine or algorithm. Today, digital versions of the one-time pad enable the storage of huge quantities of random key data, allowing secure encryption of large volumes of data. One-time encryption still is, and will continue to be, the only system that can offer absolute message security.
There are many different ways to apply one-time pad encryption. All of them are absolutely secure if the rules of one-time pad are followed. We can apply one-time pad with letters or numbers. In our first example we will demonstrate the use of letters. The OTP keys are called OTLP (One-Time Letter Pad) and the result of encryption is always a letters-only ciphertext. This letters-only system is less flexible than a numbers-based system but requires only one ciphertext letter for each plaintext letter and a single encryption step, which makes it pretty fast for a manual system. It's therefore ideal when the messages mostly consist of letters and rarely require spelling out numbers or punctuation.
Punctuation and figures are usually spelled out. However, to limit the message length you generally omit punctuation where it doesn't affect readability. Alternatively, you could use rare letter combinations as a prefix to convert figures or punctuation into letters, for instance QQ or XX. In that case XXF to switch to figures and XXL to switch to letters, with ABCDEFGHIJ representing the digits 1234567890. Thus, 2581 would become XXFBEHAXXL (or XXFBBEEHHAAXXL to exclude errors), which is more economical than having to write out 2581 in letters. XXP could be a period, XXK a comma and XXS a slant. XXC could be Code, a prefix for three or four-letter codes to replace long words or sentences, like XXCABC, where ABC represents "Request further information" or "My location is..."
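The figure-shift convention from this paragraph can be sketched as follows. The function name `figures_to_letters` and the `repeat` parameter are illustrative assumptions; only the XXF/XXL prefixes and the ABCDEFGHIJ mapping come from the text.

```python
# Digits 1234567890 are represented by the letters ABCDEFGHIJ, as in the text.
DIGIT_LETTERS = dict(zip("1234567890", "ABCDEFGHIJ"))

def figures_to_letters(figures, repeat=1):
    """Wrap a run of digits in XXF ... XXL; repeat=2 doubles each letter to exclude errors."""
    body = "".join(DIGIT_LETTERS[d] * repeat for d in figures)
    return "XXF" + body + "XXL"

print(figures_to_letters("2581"))            # XXFBEHAXXL
print(figures_to_letters("2581", repeat=2))  # XXFBBEEHHAAXXL
```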
We need a Vigenère table, also called tabula recta, to encrypt a message. Its result is identical to a modulo 26 addition of letters A=00 through Z = 25.
To encrypt a letter, we write the key underneath the plaintext. We take the plaintext letter at the top and the key letter on the left. The cross section of those two letters is the ciphertext. In the first letter of our example below, the crossing between the plaintext T and key X is ciphertext Q.
To decrypt a letter, we take the key letter on the left and find the ciphertext letter in that row. The plaintext letter is at the top of the column where you found the ciphertext letter. In our example, we take the X row, find the Q in that row and see the plain T on top of that column. As a mnemonic we can consider the column header as plaintext, the row header as key and the square field as ciphertext.
Finding the proper letter at the cross section can be cumbersome. There are several more practical versions of the Vigenère table, like a Vigenère disk or Vigenère slider. These images can be saved by right-clicking and then printed and cut out.
Another way to calculate letter one-time pads without a Vigenère table, although more elaborate, is to perform a modulo 26 calculation. We assign each letter a numerical value (e.g. A=0, B=1, C=2 and so on through Z=25). Note that we start with A=0 and not A=1 to enable the use of modulo 26. Text and key values are added together (this time with carry!), with modulo 26: if a value is more than 25, we subtract 26 from that value. Finally, we convert the result back into letters. To decipher the message, we convert the ciphertext and one-time pad key into numerical values and subtract, by modulo 26, the one-time pad from ciphertext (if a value is less than 0 we add 26 to that value).
You can use a little help table to make the calculations easier. To encrypt by addition we take for example T(19) + X(23). The total of 42 in the conversion table represents the letter Q which is the encryption result. To decrypt by subtracting we take Q(16) - X(23). If the result would give a negative value (which is the case here) we take the greater equivalent of Q(16), which is (42) in the conversion table. We can now find the deciphered letter with Q(42) - X(23) = T(19)
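The modulo 26 calculation can also be written as a short Python sketch, assuming A=0 through Z=25 and uppercase letters only. The function names are illustrative; the T, X and Q values are the ones used in the text.

```python
A = ord("A")

def encrypt_letters(plain, key):
    """Add key to plaintext letter by letter, modulo 26 (A=0 ... Z=25)."""
    if len(key) < len(plain):
        raise ValueError("key must be at least as long as the message")
    return "".join(chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plain, key))

def decrypt_letters(cipher, key):
    """Subtract key from ciphertext letter by letter, modulo 26."""
    if len(key) < len(cipher):
        raise ValueError("key must be at least as long as the message")
    return "".join(chr((ord(c) - ord(k)) % 26 + A) for c, k in zip(cipher, key))

print(encrypt_letters("T", "X"))  # Q, because 19 + 23 = 42 and 42 - 26 = 16
print(decrypt_letters("Q", "X"))  # T, because 16 - 23 = -7 and -7 + 26 = 19
```

Python's `%` operator already returns a non-negative remainder for negative operands, so no separate "add 26" step is needed when decrypting.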
There are, however, more convenient and faster systems to encrypt letters. One such system is a table with reciprocal alphabets, which is much faster than a Vigenère table and therefore ideal to encrypt larger volumes of messages by hand. The manual DIANA crypto system, used by U.S. Special Forces during the Vietnam War, is one such system that uses a reciprocal table. See also NSA's History of Communications Security, page 22.
For each column letter there is a normal alphabet and a reversed alphabet. For each column, the reversed alphabet is shifted one position against the previous reversed alphabet. Such reciprocal tables come in various formats but they all use the same principle. Thanks to its reciprocal properties, encryption and decryption are identical and require only a single column. The order of plain, key and cipher letter doesn't matter and may even differ for sender and receiver. The table is easy to use and it is virtually impossible to make a mistake. Note that this table is not compatible with the Vigenère table! You can download the reciprocal one-time pad table as .txt file.
In the example below we wrote the plaintext above the key. To encrypt T with X, find column T in the table, go downward to letter X and find cipher letter j at its right. Thanks to the reciprocal system it doesn't matter whether you combine T with X or X with T. Quite handy!
To decrypt, take column X, go downward to J and find plain letter t at its right. Again, the order of key and cipher letter doesn't matter. The beauty of this system is the ease and speed of finding plain and cipher letters in whatever order you like best.
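The table lookup can be imitated arithmetically. The sketch below assumes the reciprocal table is equivalent to the rule result = (25 - a - b) mod 26 with A=0 through Z=25; that rule is inferred from the T, X and J example above and is not taken from the table itself. The function name `reciprocal` is hypothetical.

```python
def reciprocal(a, b):
    """Combine two letters the way the reciprocal table does (assumed rule: 25 - a - b mod 26)."""
    x, y = ord(a) - 65, ord(b) - 65
    return chr((25 - x - y) % 26 + 65)

print(reciprocal("T", "X"))  # J (encrypt plain T with key X)
print(reciprocal("X", "T"))  # J (order of the two letters does not matter)
print(reciprocal("X", "J"))  # T (decrypt with the very same operation)
```

Under this rule, encryption and decryption are one and the same function, which reflects the reciprocal property the text describes.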
There is also a method to memorize the reciprocal table and speed up the process even more. When encrypting F + G = O, we decrypt this as O + G = F, but also as G + O = F. We call this the trigram combination FGO. Because of the reciprocal property, we can use the trigram FGO for any possible combination, that is, FGO, FOG, OFG, OGF, GFO and GOF. Thus, if you encrypt or decrypt any letter from the trigram with another letter from the trigram you will always get the remaining letter of the trigram. We therefore only need to remember the trigram FGO and instantly know every variation of the trigram. This reduces the number of combinations to memorize from 676 to 126.
Any user can create his list of mnemonics by memorizing the 126 possible trigrams in any desired order. FGO can easily be remembered as the word "FOG". Some other examples are TAG (derived from AGT), AIR, HRB (HR Bureau), NNZ (Northern New Zealand), OXO (the game), AMN (a-mu-nition) or BGS (Better Get Smart), to name a few examples. Everyone picks his own connotations to easily remember the trigrams. Trained operators can encrypt and decrypt on-the-fly at high speed without using a table. You can download the full list of reciprocal one-time pad trigrams as .txt file.
The full list of 126 reciprocal trigrams to be memorized in any order (e.g. ABY is also AYB, BAY, BYA, YAB and YBA)
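Such a list can be regenerated with a few lines of Python. The sketch assumes that three letters form a reciprocal trigram when their values (A=0 through Z=25) add up to 25 modulo 26, a condition inferred from examples such as FGO and ABY rather than quoted from the original table. Each trigram is emitted once, in alphabetical letter order.

```python
from itertools import combinations_with_replacement
from string import ascii_uppercase as LETTERS

def reciprocal_trigrams():
    """List every unordered letter triple whose values sum to 25 modulo 26 (assumed rule)."""
    trigrams = []
    for a, b, c in combinations_with_replacement(range(26), 3):
        if (a + b + c) % 26 == 25:
            trigrams.append(LETTERS[a] + LETTERS[b] + LETTERS[c])
    return trigrams

trigrams = reciprocal_trigrams()
print(len(trigrams))   # 126
print(trigrams[:5])
```

The count of 126 breaks down into 100 trigrams with three different letters, 25 with one doubled letter (such as NNZ) and a single tripled letter.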
This is the most flexible system that allows many variations. The OTP keys are also called OTFP (One-Time Figure Pad) and the result of encryption is always a digits-only ciphertext. Usually, encryption is performed by subtracting the random one-time pad key from the plaintext and decryption by adding the ciphertext and key together. Note that enciphering by addition and deciphering by subtraction works just as well, as long as sender and receiver agree upon using the opposite calculations.
Before we can perform the calculations with the plaintext and key we need to convert the text into digits. A most basic but less efficient method is to assign a two-digit value to each letter (e.g. A=01, B=02 and so on through Z=26). A more economic system is a so-called straddling checkerboard that converts the most frequently used letters into one-digit values and the other letters into two-digit values. This results in a ciphertext that is considerably smaller than a basic two-digit system. Various checkerboards exist with different character sets and symbols, optimized for different languages and particular purposes.
Note that this text-to-digit conversion itself is by no means secure and must be followed by an encryption! Therefore, we call the converted text a plaincode, to stress that the digits are still in readable form.
The first row of the checkerboards contains the most frequent characters with some blanks between them. The following rows (as many as there were blanks in the top row) contain the remaining letters. These following rows are designated by the digits above the blanks in the top row. Checkerboards are memorized by the top row letters, which can depend on the language it is optimized for.
Some example mnemonics are "AT-ONE-SIR" and "ESTONIA---" (English), "DEIN--STAR" and "DES--TIRAN" (German), "SENORITA--" and "ENDIOSAR--" (Spanish), "RADIO-NET-" (Dutch) or "ZA---OWIES" (Polish). Such word combinations are easily composed with an anagram generator. More blanks in the top row give more additional rows and thus more characters. There's no need to keep this table secret or scramble the order of the digits or letters because one-time encryption follows.
In our example we use a basic checkerboard with the "AT-ONE-SIR" mnemonic, optimized for English. More checkerboards are found on this page.
The top row letters are converted into the one-digit values right above them. All other letters are converted into two-digit values by taking the row header and the column header. To convert figures, we use "FIG" before and after the digits and write out each digit three times to exclude errors.
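The conversion rules can be sketched in Python. Only the top row "AT-ONE-SIR" (with blanks above columns 2 and 6) is taken from the text; the contents of rows 2 and 6, including the positions of the period and the FIG code (written here as `#`), are an assumed layout for illustration. Spaces are simply dropped in this sketch.

```python
TOP_ROW = "AT ONE SIR"  # column digits 0-9; the blanks sit above columns 2 and 6
# Assumed layout of the two lower rows; '#' stands for the FIG code.
LOWER_ROWS = {"2": "BCDFGHJKLM", "6": "PQUVWXYZ.#"}

ENCODE = {ch: str(col) for col, ch in enumerate(TOP_ROW) if ch != " "}
for row, chars in LOWER_ROWS.items():
    for col, ch in enumerate(chars):
        ENCODE[ch] = row + str(col)
DECODE = {code: ch for ch, code in ENCODE.items()}

def to_plaincode(text):
    """Convert text to digits; figures go between FIG codes, each digit written three times."""
    out, in_figures = [], False
    for ch in text.upper().replace(" ", ""):
        if ch.isdigit() != in_figures:  # entering or leaving a run of figures
            out.append(ENCODE["#"])
            in_figures = not in_figures
        out.append(ch * 3 if in_figures else ENCODE[ch])
    if in_figures:
        out.append(ENCODE["#"])
    return "".join(out)

def from_plaincode(digits):
    """Convert plaincode digits back into text."""
    out, i, in_figures = [], 0, False
    while i < len(digits):
        if in_figures and not digits.startswith(ENCODE["#"], i):
            out.append(digits[i])  # tripled figure: keep one copy
            i += 3
            continue
        width = 2 if digits[i] in LOWER_ROWS else 1  # row digits 2 and 6 start two-digit codes
        ch = DECODE[digits[i:i + width]]
        i += width
        if ch == "#":
            in_figures = not in_figures
        else:
            out.append(ch)
    return "".join(out)

print(to_plaincode("TEA"))  # 150
print(from_plaincode(to_plaincode("PLEASE CONTACT ME AT 1200H.")))
```

With a different lower-row layout only the two-digit codes change; the one-digit codes of the top row letters stay the same.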
Let us convert the text "PLEASE CONTACT ME AT 1200H." with the checkerboard:
To encrypt the message, we complete the last group with zeros and write the one-time pad key underneath the plaintext. Since we use digits, the ciphertext must be calculated from the plaintext and key by modulo 10. This modulo 10 is essential to the security of the encryption! Therefore, we subtract the key without borrowing (e.g. 3 - 7 = 13 - 7 = 6, and don't borrow 10 from the digit's next-left neighbor).
To decrypt the message, we add the ciphertext and one-time pad key together without carry (e.g. 5 + 7 = 2 and not 12, and don't carry 10 to next-left digit). Next, we re-convert the digits back into text. It's easy to separate the one-digit values from the two-digit values. If a digit combination starts with row number 2 or 6, it is a two-digit code and another digit follows. In all other cases it's a one-digit code.
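A minimal Python sketch of the digit arithmetic, assuming plaincode and key are given as strings of digits. The helper names and the group size of five are assumptions for illustration; the text does not state a group size.

```python
def pad_groups(plaincode, size=5):
    """Complete the last group with zeros (group size of five is an assumption)."""
    return plaincode + "0" * (-len(plaincode) % size)

def encrypt_digits(plaincode, key):
    """Subtract the key digit by digit without borrowing (modulo 10)."""
    if len(key) < len(plaincode):
        raise ValueError("key must be at least as long as the message")
    return "".join(str((int(p) - int(k)) % 10) for p, k in zip(plaincode, key))

def decrypt_digits(cipher, key):
    """Add the key digit by digit without carry (modulo 10)."""
    if len(key) < len(cipher):
        raise ValueError("key must be at least as long as the message")
    return "".join(str((int(c) + int(k)) % 10) for c, k in zip(cipher, key))

print(encrypt_digits("3", "7"))  # 6, because 13 - 7 = 6
print(decrypt_digits("5", "7"))  # 2, not 12
```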
Sometimes a codebook or code sheet is used to reduce ciphertext length and transmission time. Such a codebook can contain all kinds of words and/or small phrases about message handling and operational, technical or tactical expressions. A codebook system does not always require a large book with thousands of expressions. Even a single code table can contain enough practical information to reduce the message length enormously. Below are images of a seized Korean code table sheet, the instructions on how to convert the table content into digits and how to calculate the ciphertext.
As a little exercise we will decipher a recording of an actual numbers station (see important note below). You can play the audio or download the recording. The broadcast starts with a repeated call sign melody and the receiver's call sign "39715", followed by six tones and the actual message. All message groups are spoken twice to ensure correct reception. Write down the message groups once (skip the call sign).
Once you have the complete message, write the given one-time pad key underneath it. Add message and key together, digit by digit, from left to right, without carry (e.g. 6 + 9 = 5 and not 15). Finally, convert the digits back into text with the help of the "AT-ONE-SIR" straddling checkerboard as shown in the previous section. Make sure to separate one-digit and two-digit characters correctly. This little exercise shows exactly how secret agents can receive messages in an absolutely secure manner, with only one-time pads, a small short-wave receiver and pencil and paper.
Although we use a recording from an actual numbers station (Lincolnshire Poacher, E3 Voice), the one-time pad key is fictitious and reverse-calculated (key = plaintext - ciphertext) so that a readable but fictitious message is obtained when using this key. In reality, we don't know which key was used, whether we must add or subtract and there is no way to decipher the original message. In fact, since a one-time pad key is truly random, you can calculate any plaintext from a given ciphertext, as long as you use the 'right' wrong key. You will never know whether you used the proper key or the wrong one, unless you have the original key. That's exactly why one-time pad encryption is unbreakable.
Sometimes, one-time letter pads are used, but the message should be transmitted with a device that accepts only digits, like a burst transmitter, a special digital carrier or steganography based on digits or numbers. In that case, we have to convert the encrypted letters into digits. This is possible with a simple conversion table. Such messages might appear to be one-time pad encryption with numbers, where text was first converted into digits and then encrypted with a one-time figure pad, but they are not!
With the table below, you can easily encode the encrypted ciphertext letters EK into 411 or TL into 942. Always find the first letter in the top row and the second letter in the left column. It's just as easy to convert 411 and 942 back into the original ciphertext letter pair as the numbers are in series. Note that this letter-to-digit conversion is no type of encryption or part of the encryption, and does not provide any additional security whatsoever. This is merely a conversion of letters into numbers!
Because that table converts the bigrams (letter pairs) from the encrypted message into three-digit numbers, there will be a bias in the distribution of the digits, regardless of the randomness of the bigrams that were used, because there are more possible three-digit combinations than bigram combinations (1000 numbers against 676 letter pairs), so some combinations remain unused. In the above table, there are far fewer occurrences of digit 9 (104) than of digit 1 (256) and you could even design a table where one number is missing. This bias however does not affect the security of the encrypted message itself in any way, as the message was already securely encrypted with a one-time pad.
There is a special way to use one-time pad where the key is not to be destroyed. When information should be available only when two people agree to reveal that information, we can use secret splitting. The secret information is encrypted with a single one-time pad whereupon the original plaintext is destroyed. One user receives the encrypted message and the other user the key. In fact, it doesn't matter who gets which, since both pieces of information can be seen as equal, encrypted parts of the original information.
The split parts are both called keys. Both these keys are useless without each other. This is called secret splitting. One could encrypt for example the combination to a safe and give the split ciphertext to two different individuals. Only when they both agree upon opening the safe, will it be possible to decipher the combination to the safe. You could even split information into three or more pieces by using two or more keys.
In this little example Charlie splits his secret safe combination 21 46 03 88. A random key is subtracted digit by digit, without carry, from the combination numbers. Alice and Bob both receive one piece of the information from Charlie. It's mathematically impossible for either Alice or Bob to retrieve the combination numbers alone; they must share their keys. This is done by simply adding the keys (without carry).
Of course, we could also use secret splitting on text to encrypt passwords and such. Just convert the text into numbers (e.g. A=01, B=02 and so on through Z=26) or use a straddling checkerboard. To split the secret into more parts, just add a one-time key for each of the new persons. For three persons you must subtract two keys (without carry) from the plaintext to obtain the ciphertext (e.g. 2 - 4 - 9 = 9 because 2 - 4 = 12 - 4 = 8 and 8 - 9 = 18 - 9 = 9).
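Secret splitting with digits can be sketched as below. The function names are hypothetical, and Python's `secrets` module is used only as a stand-in for a truly random key source, which is an assumption of this sketch.

```python
import secrets

def split_secret(digits, shares=2):
    """Split a digit string into `shares` parts: random keys plus the remaining ciphertext."""
    keys = ["".join(str(secrets.randbelow(10)) for _ in digits) for _ in range(shares - 1)]
    remainder = digits
    for key in keys:
        # Subtract each key digit by digit, without borrowing (modulo 10).
        remainder = "".join(str((int(d) - int(k)) % 10) for d, k in zip(remainder, key))
    return keys + [remainder]

def join_shares(parts):
    """Recover the secret by adding all parts digit by digit, without carry (modulo 10)."""
    total = "0" * len(parts[0])
    for part in parts:
        total = "".join(str((int(a) + int(b)) % 10) for a, b in zip(total, part))
    return total

alice, bob = split_secret("21460388")
print(join_shares([alice, bob]))     # 21460388
print(join_shares(["4", "9", "9"]))  # 2, the three-person example: 2 - 4 - 9 = 9
```

Every part is needed: leaving out any single share produces digits that say nothing about the secret.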
Instead of keeping your secret password in an envelope, you could split it and give the shares to different persons, of which at least one is trusted. One person could never act on his own and approval of a second person is always required. When granddad, old and sick, splits the secret combination from the safe that contains his money and gives each of his children one part, they can only get their hands on his money if they all agree (not that this will make him live longer).
However, since this system is unbreakable, all information is lost if one of the shares goes missing. There's no way back if a share is lost or destroyed by accident! It might be useful to have one extra copy of your share somewhere in a secure location.
More about Secret Splitting on this page.
Modular arithmetic has interesting properties that play a vital role in cryptography and it is also essential to the security of one-time pad encryption. The result of an encryption process could reveal information about the key or the plaintext. Such information might either point to possible solutions or enable the codebreaker to discard some wrong assumptions. The codebreaker will use this information as a lever to break open the encrypted message. By using modular arithmetic on the result of a calculation we can obscure the original values that were used to calculate that result.
In mathematics, modulo x is the remainder after the division of a positive number by x. Some examples: 16 modulo 12 = 4 because 16 divided by 12 is 1 and this leaves a remainder of 4. Also, 16 modulo 10 is 6 because 16 divided by 10 is 1 and thus leaves a remainder of 6. Fortunately, there's a far easier way to understand and work with modular arithmetic.
Modular arithmetic works similarly to counting hours, but on a decimal clock. If the hand of our clock is at 7 and we add 4 by advancing clockwise, we pass the 0 and arrive at 1. Likewise, when the clock shows 2 and we subtract 4, advancing anticlockwise, we arrive at 8. Modular arithmetic is very valuable to cryptography because the result value reveals absolutely no information about the two values that were added or subtracted. If the result of a modulo 10 addition is 4, we have no idea whether this is the result of 0 + 4, 1 + 3, 2 + 2, 3 + 1, 4 + 0, 5 + 9, 6 + 8, 7 + 7, 8 + 6 or 9 + 5. The value 4 is the result of an equation with two unknowns, which is impossible to solve.
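The clock arithmetic and the list of possible additions can be checked with a short sketch. The function name `pairs_for` is illustrative only.

```python
def pairs_for(result, modulus=10):
    """Return every (a, b) with (a + b) mod modulus == result."""
    return [(a, (result - a) % modulus) for a in range(modulus)]

print((7 + 4) % 10)  # 1: advancing 4 from 7 passes the 0 and arrives at 1
print((2 - 4) % 10)  # 8: going back 4 from 2 arrives at 8
print(pairs_for(4))  # ten pairs, one for every possible first value
```

For every result there are exactly as many candidate pairs as there are elements, so the result alone rules out nothing.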
The modulus should have the same value as the number of different elements, with 0 designated to the first element:
Modulo 10 is very easy to perform by adding without carry and subtracting without borrowing, which basically means discarding all but the rightmost digit of the result. It could not be easier for one-time pad encryption with digits.
Performing modulo calculations on letters is a bit more complex and requires conversion into numerical values. If we combine the letter X (23) with key Z (25) modulo 26, the result will be 22 (W) because (23 + 25) mod 26 = 22. That's way more elaborate and slower than decimal modulo 10. Fortunately, we can use the Vigenère Square or a circular Vigenère cipher disk to perform modulo 26 easily without any calculations. Note that you should never assign the values 1 through 26 to the letters because the result of a modulo calculation can be zero, for example (25 + 1) mod 26 = 0.
Modular calculations with bits and bytes are actually Exclusive OR operations (XOR) in Boolean modular arithmetic. XOR is used in computer programming to combine a data bit with a random key bit or to combine a data byte with a random key byte.
Let's show the danger of not using modular arithmetic. With normal addition, the ciphertext result 0 can only mean that both key and plaintext have the value 0. A ciphertext result of 1 means that the two unknowns can only be 0 + 1 or 1 + 0. With result 2, the unknowns can only be 0 + 2, 1 + 1 or 2 + 0. Thus, for some ciphertext result values we can either immediately determine the unknowns or we can see which unknowns of the equation could be possible or impossible.
Suppose we add the letter X (23) with key Z (25) without modulo. In that case, the result would be ciphertext 48, as we cannot convert 48 into a letter. However, although both plain letter and truly random key are unknown, we can draw some important conclusions: the total of 48 is only possible with combinations X (23) + Z (25), Y (24) + Y (24) or Z (25) + X (23). By merely looking at the ciphertext, we can discard all letters A through W as possible candidates for that particular plaintext and key letter.
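The difference between normal and modular addition can be made visible by counting the plaintext and key combinations that remain possible for a given ciphertext value. This is a small sketch with a hypothetical `candidates` function, assuming A=0 through Z=25.

```python
from string import ascii_uppercase as LETTERS

def candidates(total, modular, size=26):
    """List (plain, key) value pairs that could have produced `total`."""
    pairs = []
    for p in range(size):
        k = (total - p) % size if modular else total - p
        if 0 <= k < size:
            pairs.append((p, k))
    return pairs

leaky = candidates(48, modular=False)
print([LETTERS[p] + LETTERS[k] for p, k in leaky])  # ['XZ', 'YY', 'ZX']
print(len(candidates(22, modular=True)))            # 26: every plaintext letter stays possible
```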
This is also the reason why you should never use text that is converted into digits as numerical key for a one-time pad (some book ciphers use this system). The result will never be random as it consists of a limited range of 26 elements (0-25 or 1-26) instead of 10 elements (0-9) or 100 (0-99), resulting in a completely insecure ciphertext with a tremendous bias.
These simple examples show how a ciphertext can leak information that is very valuable to the codebreaker, simply because normal instead of modular arithmetic was used to calculate the ciphertext. Not using modular arithmetic always causes a biased ciphertext instead of the truly random ciphertext result from modular arithmetic. Any bias is as valuable as gold to the codebreaker. Modular arithmetic is therefore vital to the security of the one-time pad. Never use one-time pad encryption without applying modular arithmetic!
Is one-time pad encryption absolutely secure and unbreakable when all rules are applied correctly? Yes! It's also easy to show why, because the system is simple and transparent. It all comes down to one basic fact that is easily understood. The one time pad system is an equation with two unknowns, one of which is truly random. This is mathematically unsolvable. When a truly random key is combined with a plaintext, the result is a truly random ciphertext. An adversary only has the random ciphertext at his disposal to find key or plaintext.
There is also no mathematical, statistical or linguistic relation whatsoever between the individual ciphertext characters or between different ciphertext messages because each individual key letter or digit is truly random. The modulo 26 (one-time pad with letters) or modulo 10 (one time pad with digits) also ensures that the ciphertext does not reveal any information about the two unknowns in the equation. These properties render useless all existing cryptanalytic tools that are available to the codebreaker.
Suppose we have the piece of ciphertext "QJKES", enciphered with a one-time letter pad. If someone had infinite computational power he could go through all possible keys (a brute force attack). He would find out that applying the key XVHEU on ciphertext QJKES would produce the (correct) word TODAY. Unfortunately, he would also find out that the key FJRAB would produce the word LATER, and even worse, DFPAB would produce the word NEVER. He has no idea which key is the correct one. In fact, you can produce any desired word or phrase from any one-time pad-encrypted message, as long as you use the "proper" wrong key. There is no way to verify if a solution is the right one. Therefore, the one-time pad system is proven completely secure.
Three of the many possible solutions:
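These solutions can be reproduced with a short sketch that reverse-calculates the key for any wanted plaintext, assuming encryption by modulo 26 addition with A=0 through Z=25. The function name `key_for` is illustrative.

```python
def key_for(cipher, wanted):
    """Return the key that turns `cipher` into `wanted` (key = ciphertext - plaintext, mod 26)."""
    return "".join(chr((ord(c) - ord(w)) % 26 + 65) for c, w in zip(cipher, wanted))

for word in ("TODAY", "LATER", "NEVER"):
    print(word, key_for("QJKES", word))
# TODAY XVHEU
# LATER FJRAB
# NEVER DFPAB
```

Any five-letter word can be passed as `wanted`, and a matching key always exists, which is exactly why the ciphertext alone cannot confirm a solution.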
Let us give an example with one-time pad encryption, based on digits. For encryption, plain and key are subtracted. For decryption, the key is added to the ciphertext. The following straddling checkerboard is used for text to digit conversion.
Suppose we intercepted the following ciphertext fragment:
Let's crack the message with the following key:
However, there is a second solution with a different key:
Unfortunately, there is no way to check which of the two keys and resulting plaintext are correct. Well, here is the bad news: both solutions are incorrect. The actual message is found below, but we will never know for sure whether this is the actual message, unless we have the original key at our disposal.
These examples again show that we can produce any plaintext from any ciphertext, as long as we apply the proper wrong key. Since the plaintext is determined by a series of truly random key digits, mathematically unrelated to each other, we have absolutely no idea whether the chosen key is correct. Any readable solution is mathematically and statistically equally possible and appears valid. There is no way to verify the solution, as it originates from random digits. The system is therefore information-theoretically secure. You have an unbreakable cipher. It's the only existing unbreakable cipher and it will stay unbreakable forever, regardless of any future mathematical or technological advances or infinite time available to the codebreaker.
The one-time pad encryption scheme itself is mathematically unbreakable. The attacker will therefore focus on breaking the key instead of the ciphertext. That's why a truly random key is essential. If the key is generated by a deterministic algorithm the attacker could find a method to predict the output of the key generator. If for instance a crypto algorithm is used to generate a random key, the security of the one-time pad is lowered to the security of the used algorithm and is no longer mathematically unbreakable.
If a one-time pad key, even a truly random one, is used more than once, simple cryptanalysis can recover the key. Using the same key twice will result in a relation between the two ciphertexts and consequently also between the two plaintexts. The different ciphertext messages are no longer truly random and it's possible to recover both plaintexts by heuristic analysis. Another unacceptable risk of using one-time pad keys more than once is the known-plaintext attack. If the plaintext version of a one-time pad encrypted message is known, it is of course no problem to calculate the key. This means that if the content of one message is known, all messages that are encrypted with the same key are also compromised.
Using a one-time pad more than once will always compromise the one-time pad and all ciphertext, enciphered with that one-time pad. To exploit reused one-time pads we can use a heuristic method of trial and error. This simple method enables the complete, or at least partial, deciphering of all messages. This can even be done with pencil and paper, although it is a slow and cumbersome process.
The principle is as follows: a crib, which is a presumed piece of the first plaintext, is used to reverse-calculate a piece of the key. This presumed key is then applied at the same position on the second ciphertext. If the presumed crib was correct, this will reveal a readable part of the second plaintext and provide clues to expand the cribs. In the following example we will demonstrate the breaking of two messages, only with the aid of pencil and paper.
We have two completely different ciphertext messages, "A" and "B". They are both enciphered with the same one-time pad, but we have no knowledge of that key. Let us begin with assuming that the letters are converted into digits by assigning them the values A=01 through Z=26, that the enciphering is performed by subtracting the key from the plaintext without borrowing (5 - 8 = 15 - 8 = 7) and that deciphering is performed by adding ciphertext and key together without carry (7 + 6 = 3 and not 13). This is a standard and unbreakable application of one-time pad, if only they had never used that one-time pad twice! The reason I use the basic A=01 to Z=26 is to make it easier to see the separate letters. The described heuristic analysis also works with a straddling checkerboard (one-digit and two-digit conversions).
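The digit arithmetic described above can be sketched in a few lines of Python. The helper names and the ten key digits are made up for illustration. Only the A=01 to Z=26 conversion and the subtraction without borrowing and addition without carry come from the text.

```python
def letters_to_digits(text: str) -> str:
    """Convert A..Z to the two-digit values 01..26."""
    return "".join(f"{ord(ch) - ord('A') + 1:02d}" for ch in text)


def digits_to_letters(digits: str) -> str:
    """Convert two-digit values 01..26 back to letters; '?' marks an invalid pair."""
    out = []
    for i in range(0, len(digits) - 1, 2):
        value = int(digits[i:i + 2])
        out.append(chr(ord('A') + value - 1) if 1 <= value <= 26 else "?")
    return "".join(out)


def sub_digits(a: str, b: str) -> str:
    """Digit-wise a - b without borrowing (each digit modulo 10)."""
    return "".join(str((int(x) - int(y)) % 10) for x, y in zip(a, b))


def add_digits(a: str, b: str) -> str:
    """Digit-wise a + b without carry (each digit modulo 10)."""
    return "".join(str((int(x) + int(y)) % 10) for x, y in zip(a, b))


plain = letters_to_digits("TODAY")   # "2015040125"
key = "8316049275"                   # illustrative key digits, not from the text
cipher = sub_digits(plain, key)      # encipher: plaintext - key
print(cipher)                        # 4709001950
print(digits_to_letters(add_digits(cipher, key)))  # decipher: TODAY
```

Working digit by digit modulo 10 is what makes the system practical by hand: no digit ever influences its neighbour.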
First, we must search for a crib. A crib is an assumed piece of plaintext that corresponds to a given ciphertext. These can be commonly used words, parts of words, or frequently used trigrams or bigrams. Some examples of frequent trigrams in the English language are "THE", "AND", "ING", "HER" and "HAT". Frequent bigrams are "TH", "AN", "TO", "HE", "OF" and "IN". Of course, a crib should be as long as possible. If you know who sent the message and what he might be talking about you could try out complete words.
In our example, we don't have any presumed words, so we'll have to use some other group of letters. Let's try the crib "THE", which is the most frequently used trigram in the English language. Now, in this example we only have one small piece of ciphertext. In real life, you might have a few hundred digits at your disposal for testing, which makes a successful crib more likely.
We align the letters "THE" with every position of ciphertext "A" and subtract the ciphertext from the crib. The result is the assumed one-time key. In heuristic terms, this is our trial. To test it, we add the assumed key to ciphertext "B" to recover plaintext "B". Unfortunately, as shown underneath the first "THE" of the example, we get our heuristic error. We continue to try out all positions. For the sake of simplicity, I only show three example positions of the crib. Our trial and error will show us that the 9th character position (17th digit) provides a possible correct plaintext "B", the trigram "OCU".
There are a few, but not too many, solutions to complete this "OCU" piece of plaintext, and we'll have to try them all out. So, let's try out the obvious "DOCUMENT". This assumption has to pass our trial and error again. Therefore, here below, we use "DOCUMENT" as a crib for plaintext "B" at exactly the same place. We subtract ciphertext B from the assumed plaintext "DOCUMENT" to again recover a new portion of the presumed key. Our presumed key is now already expanded to 16 digits.
We add this presumed key to ciphertext "A" to hopefully recover something readable and indeed, "OTHESTAT" could well be a correct solution, thus confirming the used crib. Can we make this crib any longer? "THE STAT" could be part of "THE STATUS", "THE STATION" or "THE STATIC", and "O THE" might be expandable to "TO THE", as "TO" is a popular bigram that ends with the letter O. Again we must test these solutions by recovering the related assumed key and try that key out on the other ciphertext. If correct, this will again reveal another little readable piece of plaintext. Remember we started only with the assumption that there could be a "THE" in one of the messages and already end up with "DOCUMENT" and "TO THE STAT..." after only two heuristic steps!
This process is repeated over and over. Some new cribs will prove to be dead ends and others will result in readable words or parts of words (trigrams or bigrams). More plaintext means better assumptions and the puzzle will become easier and easier. Thanks to the two ciphertexts, you can verify the solutions of one plaintext with its counterpart ciphertext, over and over again, until the deciphering is completed.
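The trial-and-error procedure can be sketched in Python. The two plaintexts below are hypothetical stand-ins, chosen so that "THE" and "DOCUMENT" fall at the positions mentioned in the walkthrough. They are not the original example messages. The pad comes from Python's secrets module purely for the demonstration. The attack never looks at it.

```python
import secrets


def to_digits(text: str) -> str:
    """A=01 ... Z=26."""
    return "".join(f"{ord(c) - 64:02d}" for c in text)


def to_letters(digits: str) -> str:
    """Two-digit values back to letters; '?' marks a value outside 01..26."""
    pairs = (int(digits[i:i + 2]) for i in range(0, len(digits) - 1, 2))
    return "".join(chr(64 + v) if 1 <= v <= 26 else "?" for v in pairs)


def sub(a: str, b: str) -> str:
    """Digit-wise subtraction without borrowing."""
    return "".join(str((int(x) - int(y)) % 10) for x, y in zip(a, b))


def add(a: str, b: str) -> str:
    """Digit-wise addition without carry."""
    return "".join(str((int(x) + int(y)) % 10) for x, y in zip(a, b))


def drag_crib(crib: str, cipher_a: str, cipher_b: str):
    """Try the crib at every letter position of A; yield (position, text revealed in B)."""
    crib_digits = to_digits(crib)
    for pos in range(0, len(cipher_a) - len(crib_digits) + 1, 2):
        window = slice(pos, pos + len(crib_digits))
        trial_key = sub(crib_digits, cipher_a[window])   # key = plaintext - ciphertext
        revealed = to_letters(add(cipher_b[window], trial_key))
        if "?" not in revealed:                          # keep only valid letter values
            yield pos // 2 + 1, revealed                 # 1-based character position


# Hypothetical messages, both enciphered with the SAME pad (the fatal mistake)
plain_a = to_digits("SENDITTOTHESTATION")
plain_b = to_digits("BURNTHEDOCUMENTNOW")
pad = "".join(str(secrets.randbelow(10)) for _ in plain_a)
cipher_a, cipher_b = sub(plain_a, pad), sub(plain_b, pad)

for position, text in drag_crib("THE", cipher_a, cipher_b):
    print(position, text)    # candidates; position 9 reveals OCU
```

The filter only rejects trials that produce impossible letter values, so several candidates usually survive. Judging which of them looks like language, and extending it into a longer crib, remains the human (or statistical) part of the work.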
Finally, we'll give the solution, just to verify the results of our trial and error:
A little fragment like "FORMA" is easily expanded to "INFORMATION", gaining 6 additional letters as a crib. "RANSP" is most likely "TRANSPORT" or, with some luck, "TRANSPORTATION", providing 9 additional letters, a quite large crib. Sometimes, the already recovered text provides clues about the words that precede or follow it, or will help to get ideas for words in other places in the message. It's a slow and tedious process, but the patchwork will gradually grow. Slow, cumbersome and tedious pays off in this line of work. This method is also usable when the text is converted into digits with a straddling checkerboard or any other text-to-digit conversion system.
Of course, this example is short and simple. In reality, there could be all kinds of complications that require many more trials. What system is used to convert text into digits? What language is used? Did they use abbreviations or slang? Are there words available as cribs or do we need to piece together trigrams or even bigrams until we have a word to get launched? Does the message contain actual words or are there only codes from a codebook? Is the one-time pad reused completely or only partially, and do they start at the same position in both messages?
All these problems can slow down the heuristic process and require a vast number of trials, with associated dead ends and errors, before the job is done. Success is not guaranteed, but in most cases, the reuse of one-time pads will result in a successful deciphering. This is certainly the case with today's computer power, enabling fast heuristic testing.
History provides various examples of bad implementation or negligent use of one-time pads. The breaking of the war-time German diplomatic message traffic is a fine example of flawed implementation. The German foreign office could have been the first to implement the perfectly secure one-time pad system but instead decided to generate the keys with a simple mechanical machine. By doing so, they neglected the first crucial rule of one-time pad that the key should be truly random. The U.S. Army Security Agency did not actually break German one-time pad encrypted traffic but basically exploited a flawed pseudo-random stream cipher.
The VENONA project is probably the most notorious and well-known example of how important it is to follow the basic rules of one-time pad. Soviet Intelligence historically always relied heavily on one-time pad encryption, with good reason and success. Soviet communications have always proved extremely secure. However, during the Second World War, the Soviets had to create and distribute enormous quantities of one-time pad keys. Time pressure and tactical circumstances led in some cases to the distribution of more than two copies of certain keys.
In the early 1940s, the United States and Great Britain analyzed and stored enormous quantities of encrypted messages, intercepted during the war. American codebreakers discovered by cryptanalysis that a very small portion of the tens of thousands of KGB and GRU messages between Moscow and Washington were enciphered with reused one-time pads. The messages were encoded with codebooks prior to enciphering with one-time pad, making the task immensely harder for the codebreakers. Finding out which key was reused on what message, the reconstruction of the codebooks and recovering the plaintext were enormous challenges that took years.
Eventually they managed to reconstruct more than 3,000 KGB and GRU messages, just because of a distribution error by the Soviets. VENONA was crucial in solving many spy cases. Although VENONA is often mistakenly referred to as the project that broke Soviet one-time pads, they never actually broke one-time pad, but exploited implementation mistakes as described above.
Make no mistake, it will never be possible to break one-time pad with current or any future technology, when properly applied. This example only shows how to exploit the most deadly of all mistakes: reusing a one-time pad. Another fatal mistake is not using truly random numbers for the key.
The use of a truly random key, as long as the plaintext, is an essential part of the one-time pad. Since the one-time algorithm itself is mathematically secure, the codebreaker cannot retrieve the plaintext by examining the ciphertext. Therefore, he will try to retrieve the key. If the random values for the one-time key are not truly random but generated by a deterministic mechanism or algorithm it could be possible to predict the key. Thus, selecting a good random number generator is the most important part of the system.
In the pre-electronic era, true randomness was generated mechanically or electro-mechanically. Some of the most curious devices were developed to produce random values. Today, there are several options to generate truly random numbers. Hardware Random Number Generators (RNGs) are based on the unpredictability of physical events. Some semiconductors such as Zener diodes produce electrical noise in certain conditions. The amplitude of the noise is sampled at fixed time intervals and translated into binary zeros and ones.
Another unpredictable source is the tolerance of electronic component properties and their behavior under changing electrical and temperature conditions. Some examples are ring oscillators that operate at a very high frequency, the drift, caused by resistors, capacitors and other components in oscillators or time drift of computer hardware. Photons, single light particles, are another perfect source of randomness. In such systems, a single photon is sent through a filter, and its state is measured. The quality of such randomness sources can be verified with statistical tests to detect failure of the system.
Even when hardware-based true random generators are used, it will be necessary in some cases to improve their properties, for instance to prevent unequal distribution of zeros or ones in a sequence. One simple way to improve or whiten a single bit output is to sample two consecutive bits. The value sequence 01 would result in an output bit 0 and the value sequence 10 would give output 1. The repetitive values 00 and 11 are discarded. Some hardware RNGs are the Quantis QRNG, based on the unpredictable state of photons, the CPU clock jitter based ComScire generators, and the VIA Nano processor with its integrated dual quantum RNGs.
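The two-bit whitening rule just described (commonly known as the von Neumann corrector) fits in a few lines of Python. This is a minimal sketch that assumes the raw bits are independent of each other. The sample input is made up.

```python
def von_neumann_whiten(bits):
    """Yield whitened bits: pair 01 -> 0, pair 10 -> 1, pairs 00 and 11 are discarded."""
    it = iter(bits)
    for first, second in zip(it, it):   # consume the stream two bits at a time
        if first != second:
            yield first                 # 01 gives 0, 10 gives 1


raw = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1]   # illustrative raw samples
print(list(von_neumann_whiten(raw)))          # [0, 1, 1, 0]
```

The price of the correction is throughput: at least half of the raw bits are thrown away, and more when the source is strongly biased.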
Another option is the manual generation of numbers. Of course, this time-consuming method is only possible for small volumes of keys or key pads. Nevertheless, it's possible to produce truly random numbers. You could use five ten-sided dice (see image right). With each throw, you have a new five-digit group. Such dice are available in toy stores.
Never ever simply use normal six-sided dice by adding the values of two dice. This method is statistically unsuitable to produce values from 0 to 9 and thus absolutely insecure (the total of 7 will occur about 6 times more often than the values 2 or 12). Instead, use one black and one white die and assign a value to each of the 36 combinations, taking into account the order/color of the dice (see table below). This way, each combination has a probability of 1 in 36 (about 0.0278). We can produce three series of values between 0 and 9. The remaining 6 combinations (with a black 6) are simply disregarded, which doesn't affect the probability of the other combinations.
You could also assign the letters A through Z and numbers 0 through 9 to all 36 dice combinations, again taking into account the order/color as in the table above. This way, you can create one-time pads that contain both letters and numbers. Such one-time pads can be used in combination with a Vigenère square, similar to the one described above, but with a 36 x 36 grid where each row contains the complete alphabet, followed by all digits. This will also produce a ciphertext with both letters and numbers. An advantage is that your plaintext can contain figures.
You can also use lotto balls. However, after extracting a number, that ball must always be mixed again with the other balls before extracting the next ball. If random bit values are required you can use one or more coins that are flipped, with one side representing the zeros and the other side the ones. With 8 coins you could compose an 8-bit value (byte) in one throw. Many other manual systems can be devised, as long as statistical randomness is assured. These simple but effective and secure methods are suitable for small one-time pads or small keys that are used to protect passwords (see Secret Splitting).
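The following Python sketch illustrates why the ordered black/white method is uniform while the sum of two dice is not. The exact assignment of digits to combinations is an assumption made here for illustration. Any table that gives each digit exactly three of the 30 retained combinations works equally well.

```python
from collections import Counter
from itertools import product


def dice_to_digit(black: int, white: int):
    """Map an ordered (black, white) throw to a digit 0-9, or None for a re-throw."""
    if black == 6:                          # the 6 disregarded combinations
        return None
    index = (black - 1) * 6 + (white - 1)   # 0..29, one index per retained combination
    return index % 10                       # assumed assignment: three series of 0..9


all_throws = list(product(range(1, 7), repeat=2))   # the 36 ordered combinations

# Every digit is produced by exactly 3 of the 36 combinations
counts = Counter(dice_to_digit(b, w) for b, w in all_throws)
print(sorted((d, n) for d, n in counts.items() if d is not None))

# By contrast, the sum of two dice is far from uniform
sums = Counter(b + w for b, w in all_throws)
print(sums[7], sums[2], sums[12])   # 6 1 1
```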
Another alternative is the use of a software-based generator. However, software random number generators will never provide absolute security because of their deterministic nature. Cryptographically secure pseudo-random number generators (CSPRNGs) produce a random output that is determined by a key or seed. A large (unlimited) amount of random values is derived from a seed or key with a limited size, and seed and output are related to each other. In fact, you're no longer using one-time encryption, but an encryption with a small sized key. Brute forcing the seed by trying out all possible seeds, or analysis of the output or parts of the output could compromise the generator.
There are techniques to improve the output of CSPRNGs. Using a truly random and very large seed is essential. This could be done by accurate time or movement measurements of human interaction with the computer, for instance mouse movements, or by measuring the timing drift of computer processes (note that a normal computer RND function is totally insecure). Another technique to drastically improve a CSPRNG is to combine the generator output with the output of multiple other generators, the so-called "whitening". This will make analysis of the output much more difficult because each generator output obscures information about the other generator outputs. In the end, however, only one-time pad encryption, based on truly random keys, is really unbreakable. More information about the secure generation of randomness is found in RFC 4086, Randomness Requirements for Security.
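A small Python sketch makes the seed problem concrete. It uses Python's built-in random module, which is an ordinary (non-cryptographic) generator comparable to the RND function of other languages. The seed values and the deliberately tiny seed space are hypothetical, chosen so that the search finishes instantly.

```python
import random


def pad_from_seed(seed: int, length: int) -> str:
    """Derive 'key' digits from a seed with a deterministic software generator."""
    rng = random.Random(seed)
    return "".join(str(rng.randrange(10)) for _ in range(length))


# Whoever learns or guesses the seed reproduces the entire pad
assert pad_from_seed(123456, 50) == pad_from_seed(123456, 50)


def brute_force_seed(known_fragment: str, max_seed: int):
    """Try every seed until one reproduces a known fragment of the pad."""
    for seed in range(max_seed):
        if pad_from_seed(seed, len(known_fragment)) == known_fragment:
            return seed
    return None


fragment = pad_from_seed(4242, 20)           # e.g. obtained via a known plaintext
print(brute_force_seed(fragment, 10_000))
```

A real CSPRNG with a large seed makes this search infeasible in practice, but the security then rests on the size of the seed and the strength of the algorithm, not on the pad itself.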
There is also the issue of secure computers to process, store or print the truly random numbers. Even the use of a hardware generator with truly random output, necessary for absolute security, is useless if the computer itself is not absolutely secure. Unfortunately, there's no such thing as a secure personal computer. The only absolutely secure computer is a physically separated computer, with restricted input/output peripherals, never connected to a network and securely stored with controlled access. Any other computer configuration will never guarantee absolute security. Cryptographic software is only secure on a stand-alone computer or dedicated crypto equipment.
One-time pad encryption is only possible if both sender and receiver are in possession of the same key. Therefore, the key must be securely exchanged beforehand, physically through a trusted courier, or electronically by a perfectly secure system like quantum key distribution. The secure communications are therefore expected and planned within a specific time frame. Enough key material must be available for all required communications until a new exchange of keys is possible. Depending upon the situation, a large volume of keys could be required for a short time period, or little key material could be sufficient for a very long time period, up to years or even decades.
One-time pads are especially interesting in circumstances where long-term security is essential. Once encrypted, no single future cryptanalytic attack or technology will ever be able to decrypt the data. In contrast, information that is encrypted with current traditional computer algorithms will not withstand future codebreaking technology and can compromise people or organizations years later.
Although one-time pad is the only perfect cipher, it has two disadvantages that complicate its use for some specific applications. The first problem is the generation of large quantities of random keys. We cannot produce true randomness with simple mechanical devices or computer algorithms like a computer RND function or stream ciphers. Hardware true random generators, usually based on noise, are the only secure option. The second problem is key distribution. The amount of key needed is equal to the amount of data that is encrypted and each key is for one-time use only.
Therefore, we need to distribute large amounts of keys to both sender and receivers in a highly secure way. Of course, it would be useless to send the one-time pads to the receiver by encrypting them with AES, IDEA or another strong algorithm. This would lower the unbreakable security of the pads to the security level of the algorithm that was used. These are practical problems, but solutions exist to solve these problems for certain applications.
Another disadvantage is that one-time encryption doesn't provide message authentication and integrity. Of course, you know that the sender is authentic, because he has the appropriate key and only he can produce a decipherable ciphertext, but you cannot verify if the message is corrupted, either by transmission errors or by an adversary. A solution is to use a hash algorithm on the plaintext and send the hash output value, encrypted along with the message, to the recipient (a hash value is a fixed-length value, derived from a message, that is unique for all practical purposes).
Only the person who has the proper one-time pad is able to correctly encrypt the message and corresponding hash. An adversary cannot predict the effect of his manipulations on the plaintext, nor on the hash value. Upon reception, the message is deciphered and its content checked by comparing the received hash value with a hash that is created from the received message. Unfortunately, a computer is required to calculate the hash value, making this method of authentication impossible for a purely manual encryption.
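The hash-then-encrypt scheme described above might look as follows in Python. This is a sketch under several assumptions. SHA-256 is chosen here as the hash, bytes are combined with XOR, and secrets.token_bytes stands in for a truly random pad that would in reality come from a hardware generator. The function names are illustrative.

```python
import hashlib
import secrets

HASH_LEN = 32   # SHA-256 digest size in bytes


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """Combine data with the pad byte by byte; the pad must cover all the data."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


def seal(message: bytes, pad: bytes) -> bytes:
    """Append the hash of the message, then encrypt message + hash with the pad."""
    return xor_bytes(message + hashlib.sha256(message).digest(), pad)


def open_sealed(ciphertext: bytes, pad: bytes) -> bytes:
    """Decrypt, split off the hash and compare it with a freshly computed one."""
    plain = xor_bytes(ciphertext, pad)
    message, digest = plain[:-HASH_LEN], plain[-HASH_LEN:]
    if hashlib.sha256(message).digest() != digest:
        raise ValueError("message corrupted or tampered with")
    return message


message = b"MEET AT NOON"
pad = secrets.token_bytes(len(message) + HASH_LEN)   # stand-in for a truly random pad
sealed = seal(message, pad)
print(open_sealed(sealed, pad))                      # b'MEET AT NOON'

tampered = bytes([sealed[0] ^ 0x01]) + sealed[1:]    # flip one bit in transit
try:
    open_sealed(tampered, pad)
except ValueError as err:
    print("rejected:", err)
```

Note that the pad has to cover the hash as well as the message, so every message consumes 32 extra key bytes in this sketch. The check relies on the adversary not knowing the plaintext, as the text assumes.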
One-time pad encryption nevertheless has an important future. Eventually, computational power and advances in technology will surpass the mathematical capabilities to provide strong encryption and only information-theoretically secure encryption will survive the evolution of cryptology. Just as classical pencil-and-paper ciphers were rendered useless with the advent of the computer, so will current computer algorithms, based on mathematical complexity, become victim to the evolution of technology, and that moment might creep up on us much faster than we expect. One-time pad will survive any evolution in codebreaking.
Technology and science must then provide more practical solutions for mass key distribution. This can be a modern mass storage version of the briefcase with handcuffs that can easily exchange many Terabytes of key bytes. Also, quantum key distribution (QKD) is already in use today for smaller quantities of keys. SECOQC in Vienna, Austria, was in 2008 the first ever QKD protected network. The current DARPA Quantum network has ten nodes. ID Quantique, QuintessenceLabs and SeQureNet are some of the commercial firms that currently offer QKD networks. Sending high volumes of one-time pad keys by QKD could offer a solution in the future, also to resist cryptanalysis with quantum computers, as securely sharing small keys for encryption with today's encryption algorithms won't suffice, just as current public key algorithms, based on mathematical complexity, won't remain secure.
The current precarious state of Internet security is where the limited use of one-time pad encryption for specific purposes comes into play. One might have found it ridiculous in our high-tech world, if it weren't for the current disastrous state our privacy is in today. Indeed, even the pencil and paper one-time pad still provides a practical encryption system for small volumes of critical private communications. The correspondents can perform all simple calculations by hand, safely send their encrypted message over any insecure channel and nobody will ever be able to decipher it. Not even three-letter organizations. It's also the only crypto algorithm that we can really trust today, because it doesn't require today's inherently insecure computers, connected to untrustworthy networks.