charlesreid1.com blog

Enigma Cipher Implementation: Part 4: Combinatorics

Posted in Enigma

permalink

In this, the fourth article in a series on implementing the Enigma cipher in Java, we use some big number libraries to explore the combinatorics of the Enigma encryption scheme and better understand the Enigma's strengths and weaknesses.

Table of Contents

The Key Space

Basically, what the Enigma did was to encrypt each character of a message one at a time, using a different, unique key for each character. One key corresponded to one particular scrambled version of the alphabet (one possible set of substitutions). The huge number of possible initial settings for the machine - the rotors, wiring, and reflector - meant that finding the very first key was extremely difficult. Furthermore, as the operator entered additional characters into the Enigma, the machine would rotate the rotor wheels, sequentially stepping through the space of possible keys in a totally random but deterministic way. Any operator with a matching Enigma machine and matching settings could replicate this "random walk" through the key space.

What we will do below is look at each component of the Enigma and determine the total number of unique settings for each component. A single machine setting corresponds to a single key, so the total number of possible settings of the machine yields the total number of possible keys for the Enigma.

The Switchboard

The switchboard at the front of the Enigma consisted of a set of plugs, one for each letter, connected by wires. The operator would connect pairs of wires to swap pairs of letters. If the letters A and K were connected, any A signal entering the keyboard would become a K signal leaving the keyboard, and any K signal entering the keyboard would become an A signal leaving the keyboard. Letters could not be connected to themselves, and a wire could only connect two letters together.

From these constraints, we can get the total number of cable configurations on the front of the machine. For a machine with \(S\) symbols (typically 26) and \(N\) patch cables, the total number of configurations is:

$$ C_{sw} = \dfrac{ S! }{ N! \times (S - 2N)! \times 2^N } $$

Let's break down where those terms are coming from.

One Cable

Let's consider a single cable connecting two letters.

There are S (or, 26) places to plug in the left end, and S places to plug in the right end, for a total of \(S^2\) combinations. But no cable can connect to itself, so there are actually \(S (S-1)\) possible combinations. Furthermore, each plug is symmetric (if A connects to B, then B connects to A), so half of the plug combinations are simply mirror images of the other half.

For a single plug, we start from a total of \(26 \times 26 = 676\) possible configurations. Ruling out any combinations that connect letters to themselves eliminates 26 possibilities (A connects to A, B connects to B, etc.) for a total of \(26 \times 25 = 650\) possible configurations. But half of those configurations are mirror images of the other half (if we connect A to B, by implication we connect B to A), so our number of choices is actually half that, or \(\frac{26 \times 25}{2} = 350\).

More Cables

If we plug in a second cable, there are now 2 choices occupied by the first letter, so there are \(S-2\) possible places to plug in the left end, and \(S-3\) possible places to plug in the right end, but each having half duplicate solutions, since the wires are two-way, for a total of \(\frac{(S-2)(S-3)}{2 \times 2}\) combinations.

Many Cables

Once \(N\) wires have been plugged in, there are \(S - 2N\) spaces remaining, and \(2N\) spaces occupied by plug ends. That is, we are reducing the number of choices by 2 letters with each wire placed.

Taking the product of these numbers explains part of the expression given above:

$$ S \times (S-1) \times (S-2) \times \dots \times (S - 2N + 1) = \dfrac{ S! }{ (S - 2N) ! } $$

Accounting for Duplicates

But where did the \(2^N\) and \(N!\) terms come from? They come from the fact that many choices of wiring configurations are duplicates.

Dividing by \(2^N\) comes from the fact that the wires are doubled up: if A connects to B, B connects to A. This means that when we choose our pair and connect A to B using a wire, we also connect B to A. Even though it looks like two choices, it is only one!

Meanwhile, the \(N!\) term accounts for the fact that order is not important when we select pairs and place wires - making the choice to connect A to B and then making the choice to connect C to D is entirely equivalent to connecting C to D, then connecting A to B. This means that \(N!\) of the \(S!\) possible solutions are duplicate configurations with the same connections chosen in a different order.

Switchboard Combinations

Here's how the number of possible combinations that result when we plug in various numbers of wires in:

S = 2       N = 1       C = 1
S = 26      N = 1       C = 325
S = 26      N = 2       C = 44,850
S = 26      N = 3       C = 3,453,450
S = 26      N = 4       C = 164,038,875
S = 26      N = 5       C = 5,019,589,575
S = 26      N = 6       C = 100,391,791,500
S = 26      N = 7       C = 1,305,093,289,500
S = 26      N = 8       C = 10,767,019,638,375
S = 26      N = 9       C = 53,835,098,191,875
S = 26      N = 10      C = 150,738,274,937,250
S = 26      N = 11      C = 205,552,193,096,250
S = 26      N = 12      C = 102,776,096,548,125
S = 26      N = 13      C = 7,905,853,580,625

Notice the bump in the shape of the distribution, meaning the use of 11 wires is much more secure than the use of 13 wires. Let's explore that.

More Interesting Observations About the Switchboard

Notice how the switchboard expression is not proportional to \(N\), it is inversely proportional to two different terms, each changing differently as \(N\) changes.

We are looking at the denominator of this expression:

$$ C_{sw} = \dfrac{ S! }{ N! \times (S - 2N)! \times 2^N } $$

The term \(N!\) on the bottom will increase as \(N\) increases, thereby decreasing the total number of possible keys \(C_{sw}\). However, the term \((S - 2N)!\) will decrease as \(N\) increases, thereby increasing the total number of possible keys \(C_{sw}\). The tradeoff can be visualized just by printing it out - here are the total number of combinations that are possible for an alphabet of \(S = 26\) symbols, using \(N = 1 \dots 13\) wires:

This pattern holds for other alphabet sizes. Here's a 52-character alphabet:

S = 52      N = 1       C = 1,326
S = 52      N = 2       C = 812,175
S = 52      N = 3       C = 305,377,800
S = 52      N = 4       C = 79,016,505,750
S = 52      N = 5       C = 14,949,922,887,900
S = 52      N = 6       C = 2,145,313,934,413,650
S = 52      N = 7       C = 239,049,266,977,521,000
S = 52      N = 8       C = 21,006,454,335,649,657,875
S = 52      N = 9       C = 1,470,451,803,495,476,051,250
S = 52      N = 10      C = 82,492,346,176,096,206,475,125
S = 52      N = 11      C = 3,719,654,882,122,156,219,242,000
S = 52      N = 12      C = 134,837,489,476,928,162,947,522,500
S = 52      N = 13      C = 3,920,659,309,406,065,045,704,885,000
S = 52      N = 14      C = 91,015,305,396,926,509,989,577,687,500
S = 52      N = 15      C = 1,674,681,619,303,447,783,808,229,450,000
S = 52      N = 16      C = 24,178,215,878,693,527,378,731,312,684,375
S = 52      N = 17      C = 270,227,118,644,221,776,585,820,553,531,250
S = 52      N = 18      C = 2,296,930,508,475,885,100,979,474,705,015,625
S = 52      N = 19      C = 14,506,929,527,216,116,427,238,787,610,625,000
S = 52      N = 20      C = 66,006,529,348,833,329,743,936,483,628,343,750
S = 52      N = 21      C = 207,449,092,239,190,464,909,514,662,831,937,500
S = 52      N = 22      C = 424,327,688,671,071,405,496,734,537,610,781,250
S = 52      N = 23      C = 516,572,838,382,173,884,952,546,393,613,125,000
S = 52      N = 24      C = 322,858,023,988,858,678,095,341,496,008,203,125
S = 52      N = 25      C = 77,485,925,757,326,082,742,881,959,041,968,750
S = 52      N = 26      C = 2,980,227,913,743,310,874,726,229,193,921,875

For a 52-symbol alphabet the optimum number of pairs is 23 keys.

Okay, here we go with an alphabet of 100 characters:

S = 100     N = 1       C = 4,950
S = 100     N = 2       C = 11,763,675
S = 100     N = 3       C = 17,880,786,000
S = 100     N = 4       C = 19,539,228,901,500
S = 100     N = 5       C = 16,358,242,436,335,800
S = 100     N = 6       C = 10,919,126,826,254,146,500
S = 100     N = 7       C = 5,971,202,498,700,124,686,000
S = 100     N = 8       C = 2,728,093,141,593,619,465,916,250
S = 100     N = 9       C = 1,056,681,410,177,261,939,798,227,500
S = 100     N = 10      C = 350,923,896,319,868,690,206,991,352,750
S = 100     N = 11      C = 100,810,864,760,980,460,095,826,606,790,000
S = 100     N = 12      C = 25,227,918,906,435,360,138,980,608,349,197,500
S = 100     N = 13      C = 5,530,736,067,949,290,492,007,287,215,016,375,000
S = 100     N = 14      C = 1,067,037,008,537,930,972,779,405,911,982,802,062,500
S = 100     N = 15      C = 181,823,106,254,863,437,761,610,767,401,869,471,450,000
S = 100     N = 16      C = 27,443,925,100,343,450,137,143,125,204,719,673,346,984,375
S = 100     N = 17      C = 3,677,485,963,446,022,318,377,178,777,432,436,228,495,906,250
S = 100     N = 18      C = 438,233,743,977,317,659,606,613,804,310,698,650,562,428,828,125
S = 100     N = 19      C = 46,498,906,729,382,757,987,733,338,394,229,919,975,466,132,500,000
S = 100     N = 20      C = 4,396,471,631,263,139,767,740,187,145,174,438,933,680,322,827,875,000
S = 100     N = 21      C = 370,559,751,777,893,208,995,244,345,093,274,138,695,912,924,063,750,000
S = 100     N = 22      C = 27,842,512,258,584,430,657,688,131,929,053,734,148,379,275,612,608,125,000
S = 100     N = 23      C = 1,864,237,777,313,914,052,732,161,876,988,815,242,978,438,454,061,587,500,000
S = 100     N = 24      C = 111,155,177,472,342,125,394,155,151,915,458,108,862,589,392,823,422,154,687,500
S = 100     N = 25      C = 5,895,670,613,133,026,330,905,989,257,595,898,094,071,741,395,354,311,084,625,000
S = 100     N = 26      C = 277,776,788,503,382,971,359,993,724,636,729,814,047,610,892,665,731,964,564,062,500
S = 100     N = 27      C = 11,604,896,941,919,110,803,484,182,273,712,267,786,877,966,182,479,468,741,787,500,000
S = 100     N = 28      C = 428,966,726,245,938,560,057,361,737,617,578,469,979,239,107,102,366,076,705,359,375,000
S = 100     N = 29      C = 13,993,190,449,264,064,752,216,007,027,111,352,848,288,282,597,201,320,984,940,343,750,000
S = 100     N = 30      C = 401,604,565,893,878,658,388,599,401,678,095,826,745,873,710,539,677,912,267,787,865,625,000
S = 100     N = 31      C = 10,104,889,077,329,850,114,293,791,397,061,765,963,283,274,007,127,379,728,028,210,812,500,000
S = 100     N = 32      C = 221,991,781,917,590,144,698,391,729,754,200,671,005,879,425,844,079,623,400,119,756,287,109,375
S = 100     N = 33      C = 4,238,024,927,517,630,035,151,114,840,762,012,810,112,243,584,296,065,537,638,649,892,753,906,250
S = 100     N = 34      C = 69,927,411,304,040,895,579,993,394,872,573,211,366,852,019,140,885,081,371,037,723,230,439,453,125
S = 100     N = 35      C = 990,971,314,480,122,405,933,620,681,622,751,795,370,245,756,967,971,438,858,134,592,065,656,250,000
S = 100     N = 36      C = 11,974,236,716,634,812,405,031,249,902,941,584,194,057,136,230,029,654,886,202,459,654,126,679,687,500
S = 100     N = 37      C = 122,331,391,321,296,191,597,346,282,792,214,022,306,853,986,350,032,690,459,041,344,574,591,484,375,000
S = 100     N = 38      C = 1,046,255,320,511,085,849,187,830,050,196,567,296,045,461,725,362,121,694,715,485,183,861,637,695,312,500
S = 100     N = 39      C = 7,404,268,422,078,453,701,944,643,432,160,322,402,783,267,594,870,399,685,678,818,224,251,589,843,750,000
S = 100     N = 40      C = 42,759,650,137,503,070,128,730,315,820,725,861,876,073,370,360,376,558,184,795,175,245,052,931,347,656,250
S = 100     N = 41      C = 198,154,476,246,965,446,938,018,536,730,193,018,450,096,106,548,086,489,149,050,812,111,220,901,367,187,500
S = 100     N = 42      C = 721,848,449,185,374,128,131,353,240,945,703,138,639,635,816,710,886,496,185,827,958,405,161,854,980,468,750
S = 100     N = 43      C = 2,014,460,788,424,299,892,459,590,439,848,473,875,273,402,279,193,171,617,262,775,697,874,870,292,968,750,000
S = 100     N = 44      C = 4,166,271,176,059,347,504,859,607,500,595,707,332,951,809,259,240,423,117,520,740,647,877,572,651,367,187,500
S = 100     N = 45      C = 6,110,531,058,220,376,340,460,757,667,540,370,754,995,986,913,552,620,572,363,752,950,220,439,888,671,875,000
S = 100     N = 46      C = 5,977,693,426,519,933,376,537,697,718,246,014,869,017,813,284,997,128,820,790,627,886,085,212,934,570,312,500
S = 100     N = 47      C = 3,561,179,062,607,619,883,894,798,640,657,200,347,499,973,871,913,183,127,279,522,995,965,658,769,531,250,000
S = 100     N = 48      C = 1,112,868,457,064,881,213,717,124,575,205,375,108,593,741,834,972,869,727,274,850,936,239,268,365,478,515,625
S = 100     N = 49      C = 136,269,606,987,536,475,149,035,662,270,045,931,664,539,816,527,290,170,686,716,441,172,155,310,058,593,750
S = 100     N = 50      C = 2,725,392,139,750,729,502,980,713,245,400,918,633,290,796,330,545,803,413,734,328,823,443,106,201,171,875

Optimum number of pairs? 45.

Final Combination Count Switchboard

For a switchboard with holes for each of \(S\) symbols, with \(N\) unique pairs of letters chosen from among the symbols to be swapped by the switchboard, the number of possible combinations of ciphers with \(N\) wires is:

$$ C_{sw} = \dfrac{ S! }{ N! \times (S - 2N)! \times 2^N } $$

The Rotors

Typical Enigma machines had three rotors, with each rotor implementing a different scrambled alphabet. Assuming there are P possible rotors to choose from, the number of choices when selecting R rotors from P possible rotors is given by:

$$ C_{rot} = \dfrac{P!}{(P-R)!} $$

If the rotors are known, \(P\) and \(R\) are small numbers like 8 and 3, yielding a modest number of possible rotor combinations (336). If the number of rotors is unknown, however, P becomes the set of all possible rotors (the set of all possible alphabet scrambles), which is S!. Then we take the factorial of this number,

$$ C_{rot} = \frac{(S!)!}{(S!-R)!} $$

Note that the numerator \((S!)!\) is a double factorial. Here's how Wolfram Alpha describes 26 double-factorial (26!)!

$$ 10^{10^{28}} $$

This can also be written as \(403291461126605635584000000!\). This is a number with 10^28 digits. That's probably the biggest number you've ever seen in your life. The denominator is also pretty big, though. For small values of R, this is approximately \((S!)^R\). For a 26-character alphabet with 3 rotors, that is approximately

$$ (403,291,461,126,605,635,584,000,000)^3 $$

or

$$ 65,592,937,459,144,468,297,405,473,968,303,761,468,794,234,820,105,359,750,856,704,000,000,000,000,000,000 $$

which is a key space with more keys than there are protons in the universe (the Eddington number).

In addition, each wheel had notches at different locations. The notches change the path the Enigma takes through the key space. For \(R\) rotors containing \(S\) symbols, the total combinations increases by a factor of \({S}^{R-1}\). If each wheel has \(M\) notches, that factor is \({(MS)}^{R-1}\). (The \(R-1\) comes from the fact that the location of the notch on the last wheel has no effect.)

Final Combination Count for Rotors

The total number of combinations for \(R\) rotors with \(S\) symbols and \(M\) notches (which advance the neighboring left wheel by one), chosen from among \(P\) possible choices of rotors, is given by:

$$ C_{rot} = S^{R-1} \dfrac{P!}{(P-R)!} $$

NOTE: For a very large set of possible rotors \(P\), (\(P >> R\)),

$$ \dfrac{P!}{(P-R)!} \approx P^R $$

so it follows that

$$ C_{rot} \approx S^{R-1} P^R \qquad P >> R $$

(For example, if \(P = S!\), the set of all possible rotors.)

Reflector

Like the switchboard on the front of the Enigma, the reflector connected pairs of letters. It could only be changed by swapping it out like a rotor, so there were a small number of mechanically produced reflectors chosen from the set of all possible reflectors.

If a reflector pairs all letters with another letter, it makes \(N = \frac{S}{2}\) possible pairs. Using the analysis we performed above for the switchboard, and plugging that in, and using \(0!=1\):

$$ C_{rfl} = \dfrac{ S! }{ (\frac{S}{2})! \times (S - 2(\frac{S}{2}))! \times 2^S } = \dfrac{S!}{(\frac{S}{2})! 2^S} $$

For the 26 characters in the English alphabet, that's:

$$ C_{rfl} = \frac{26!}{13! * 2^26 } = 965,070,017 $$

This is the total number of possible reflectors. (Curiously enough, the number of possible keys goes down as N goes from 11 to 12 and 12 to 13, making the switchboard on the front, which swapped 10 pairs of letters, more secure than the rotor, which swapped 13 pairs of letters.

However, like the rotors, there were a finite number of reflectors in use. Supposing there were Q reflectors in use, that would make for Q possible sets of the 13 letter pairings. Since the reflectors did not rotate, this would lead to only 1 possible reflector position, meaning a choice from among Q reflectors only multiplied the number of possible combinations by Q.

Final Combination Count for Reflector

Here is the final expression for the total number of combinations resulting from all possible rotors:

$$ C_{rfl} = \dfrac{S!}{(\frac{S}{2})! 2^S} $$

and here is the final expression for the total number of combinations if there is 1 rotor chosen from among \(Q\) rotors:

$$ C_{rfl} = Q $$

Final Enigma Combination Count

Putting all of this together results in the following monstrosity of an expression for the Enigma's complete key space:

$$ C_{enigma-full} = \left( \dfrac{S!}{N! (S-2N)! 2^N} \right) \left( \dfrac{(S!)!}{(S!-R)!} \right) \left( \dfrac{ S! }{ \left( \dfrac{S}{2} \right)! 2^S } \right) $$

Note that this assumes the attacker has no idea which rotors or reflectors are actually used. If instead the attacker has knowledge that \(R\) rotors out of \(P\) possible rotors and 1 reflector out of \(Q\) possible reflectors are being used, this key space reduces to:

$$ C_{enigma-small} = \left( \dfrac{S!}{N! (S-2N)! 2^N} \right) \left( S^{R-1} \dfrac{P!}{(P-R)!} \right) Q $$

When you evaluate the above expressions for the following values of \(N\), \(P\), \(Q\), \(R\), and \(S\), here are the actual numbers you get:

N = 10; // number of switchboard wires
P = 5;  // number of possible rotors
Q = 5;  // number of possible reflectors
R = 3;  // number of rotors
S = 26; // number of symbols

For the case of utilizing the \(P\) known rotors and \(Q\) known reflectors, the key space is:

$$ C_{enigma-small} = 537,293,436,636,253,096,800,000 $$

For the completely unknown case of all possible rotors and reflectors, the key space increases to an astronomical number:

$$ C_{enigma-full} = 422,732,921,460,335,478,939,047,043,039,799,222,455,533,136,281,221,092,624,796,865,514,111,348,059,884,989,972,480,000,000,000,000,000,000,000,000 $$

There are plenty of additional big numbers related to the Enigma, and more math around the weaknesses in the system and how it was cracked. There is also yet more math around how Alan Turing managed to crack the German Navy's Enigma cipher, which utilized various protective steps like bigram replacement that made cracking via frequency analysis much more difficult.

But that's enough for one post!

Below you can find a Java program that uses the BigInteger class, part of the Java API, to perform calculations with extremely large numbers.

Java BigInteger Program

import java.math.BigInteger;
import java.text.*;

/** Cryptanalysis of the Enigma Machine.
 *
 * This program uses combinatorics and big integers
 * to analyze the cryptographic strength of the Enigma machine.
 *
 * Author: Charles Reid
 * Date: March 2017
 */

public class Combos {
    public static void main(String[] args) { 

        // This involves a double factorial. PREPARE YOUR CPU
        boolean doBig = true;



        /////////////////////////////////
        // Git Ready

        DecimalFormat formatter = new DecimalFormat("#,###");

        BigInteger small_combos = new BigInteger("1");
        BigInteger big_combos = new BigInteger("1");

        /// Useful temp variable 
        BigInteger next;



        /////////////////////////////////
        // Constants

        int N = 10; // number of switchboard wires
        int P = 5;  // number of possible rotors
        int Q = 5;  // number of possible reflectors
        int R = 3;  // number of rotors
        int S = 26; // number of symbols




        /////////////////////////////////
        // Plugboard combinations

        BigInteger plugboard = new BigInteger("1");

        // S!/(S-2N)!
        BigInteger num = BigInteger.valueOf(1);
        for(int i=S; i>(S-2*N); i--) {
            next = BigInteger.valueOf(i);
            num = num.multiply(next);
        }

        // divided by 2^N
        BigInteger pdenom = BigInteger.valueOf(2);
        pdenom = pdenom.pow(N);

        // divided by N!
        for(int j = N; j>1; j--) { 
            next = BigInteger.valueOf(j);
            pdenom = pdenom.multiply(next);
        }

        plugboard = num.divide(pdenom);



        /////////////////////////////////
        // Rotor combinations

        BigInteger small_rotors = new BigInteger("1");
        BigInteger big_rotors = new BigInteger("1");

        // -----------------
        // Small case:
        // R rotors selected from P possible known rotors

        // Rotor wheel combinations
        for(int j=P; j>(P-R); j--) { 
            next = BigInteger.valueOf(j);
            small_rotors = small_rotors.multiply(next);
        }

        // Rotor wheel notch positions (assume 1 per wheel). 
        // Ignore left-most wheel.
        for(int k=0; k<(R-1); k++ ) {
            next = BigInteger.valueOf(S);
            small_rotors = small_rotors.multiply(next);
        }

        // Rotor wheel starting positions
        for(int k=0; k<R; k++ ) {
            next = BigInteger.valueOf(S);
            small_rotors = small_rotors.multiply(next);
        }

        // -----------------
        // Big case:
        // R rotors selected from S! possible rotors

        if(doBig) { 
            BigInteger s_rm1 = BigInteger.valueOf(S).pow(R-1);
            BigInteger sfact_r = BigInteger.valueOf(1);
            for(int j=S; j>=1; j--) { 
                next = BigInteger.valueOf(j);
                sfact_r = sfact_r.multiply(next);
            }
            sfact_r = sfact_r.pow(R);
            big_rotors = s_rm1.multiply(sfact_r);
        }

        /////////////////////////////////
        // Reflector combinations

        BigInteger small_reflector = BigInteger.valueOf(Q);
        BigInteger big_reflector = new BigInteger("1");

        // (S!)/((S/2)!)
        for(int j=S; j>(S/2); j--) {
            next = BigInteger.valueOf(j);
            big_reflector = big_reflector.multiply(next);
        }

        // divided by 2^N
        BigInteger rfldenom = BigInteger.valueOf(2);
        rfldenom = rfldenom.pow(N);

        if(doBig) { 
            big_reflector = big_reflector.divide(rfldenom);
        }



        /////////////////////////////////
        // Final combinations

        small_combos = small_combos.multiply(plugboard);
        small_combos = small_combos.multiply(small_rotors);
        small_combos = small_combos.multiply(small_reflector);

        if(doBig) {
            big_combos = big_combos.multiply(plugboard);
            big_combos = big_combos.multiply(big_rotors);
            big_combos = big_combos.multiply(big_reflector);
        }

        System.out.println("Final number of (small) Enigma combinations: ");
        System.out.println(formatter.format(small_combos));

        if(doBig) { 
            System.out.println("Final number of (big) Enigma combinations: ");
            System.out.println(formatter.format(big_combos));
        }

    }
}

Output of the program:

Final number of (small) Enigma combinations:
537,293,436,636,253,096,800,000
Final number of (big) Enigma combinations:
422,732,921,460,335,478,939,047,043,039,799,222,455,533,136,281,221,092,624,796,865,514,111,348,059,884,989,972,480,000,000,000,000,000,000,000,000

Sources

  1. "The Enigma Cipher". Tony Sale and Andrew Hodges. Publication date unknown. Accessed 18 March 2017. <https://web.archive.org/web/20170320081639/http://www.codesandciphers.org.uk/enigma/index.htm>

  2. "BigInteger". Oracle Corporation. Copyright 1993-2016, Publication date unknown. Accessed 22 March 2017. <https://docs.oracle.com/javase/8/docs/api/java/math/BigInteger.html>

Tags:    ciphers    enigma    encryption    java   

Enigma Cipher Implementation: Part 3: Enigma in Java Without Objects

Posted in Enigma

permalink

As the title suggests, we're continuing with the third in a series of posts exploring a verb-oriented approach to programming - in an attempt to free ourselves from the fetishization of objects, we are attempting to learn how to use languages against their will.

This is all inspired by Steve Yegge's 2006 blog post, "Execution in the Kingdom of Nouns," an excellent read that inspired me to explore the subject more deeply.

Java Pseudocode

In the last post, we ran through the pseudocode for an Enigma machine based entirely upon Strings, iterators, and integer indexes, leading to a vastly simpler abstraction of the Enigma machine than would have resulted if we had implicitly chosen a noun-centric approach, divided the entire Enigma encryption process into its component nouns like rotor wheels and reflectors, and implemented each as an object.

Here was the pseudocode:

define plaintext message
define normal alphabet and scrambled alphabets
define list of switchboard swap pairs
define list of reflector swap pairs
for each character in plaintext message:

    # Apply switchboard transformation
    for each pair in switchboard swap pairs:
        if character in swap pair, swap its value

    # Apply forward rotor transformation
    for each scrambled alphabet:
        get index of character in normal alphabet
        get new character at that index in scrambled alphabet
        replace character with new character 

    # Apply reflector transformation
    for each pair in reflector swap pairs:
        if character in swap pair, swap its value

    # Apply reverse rotor transformation
    for each scrambled alphabet:
        get index of input character in scrambled alphabet
        get new character at that index in normal alphabet
        replace character with new character 

    # Apply switchboard transformation
    for each pair in switchboard swap pairs:
        if character in swap pair, swap its value

    concatenate transformed input character to ciphertext message 

    # Increment rotor wheels
    for each rotor/scrambled alphabet, left to right:
        get index of left notch in left alphabet
        get index of right notch in right alphabet
        if left index equals right index:
            cycle left alphabet forward 1 character
    cycle right-most alphabet forward 1 character

Java Code

The Enigma code is defined in the Java program as follows:

  • The Enigma class defines a set of constants for historically accurate cipher settings.
  • The main method contains the encryption procedure.
  • There is one static helper method called rotateString.

Everything is in a public class. Starting with the definitions of constants:

public class Enigma {

    public static final String ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Historically accurate rotor scrambles
    // See http://www.codesandciphers.org.uk/enigma/rotorspec.htm
    public static final String[] WHEEL = { ALPHA,
                        "EKMFLGDQVZNTOWYHXUSPAIBRCJ", // Rotor I    - Royal
                        "AJDKSIRUXBLHWTMCQGZNPYFVOE", // Rotor II   - Flags
                        "BDFHJLCPRTXVZNYEIWGAKMUSQO", // Rotor III  - Wave
                        "ESOVPZJAYQUIRHXLNFTGKDCMWB", // Rotor IV   - Kings
                        "VZBRGITYUPSDNHLXAWMJQOFECK", // Rotor V    - Above
                        "JPGVOUMFYQBENHZRDKASXLICTW",
                        "NZJHGRCXMYSWBOUFAIVLPEKQDT",
                        "FKQHTLXOCBJSPDZRAMEWNIUYGV"};

    // Knocking (notch and clasp) advances the wheel to the left
    public static final String[] NOTCH = {"",  // No notch
                                          "R", // Royal
                                          "F", // Flags
                                          "W", // Wave
                                          "K", // Kings
                                          "A", // Above
                                          "AN",
                                          "AN",
                                          "AN"};

    // Reflectors
    public static final String REFLECTOR_ALPHA = "AY:BR:CU:DH:EQ:FS:GL:IP:JX:KN:MO:TZ:VW";

Main Method: User Settings

The next part of the code is the main method, where we begin by defining variables that correspond to settings that the Enigma operator would set from the daily Enigma code book. These included:

  • The numbering and ordering of wheels (e.g., IV II I)
  • The initial rotor settings for each wheel (position 0-25)
  • The pairs of letters connected on the switch board

The wheels are specified using the WHEEL array of Strings, above. Each element of the WHEEL array contains a different scrambled alphabet, corresponding to the alphabet scrambles hard-coded into the historical rotor wheels. These go into rotorAlpha, which stores each rotor's alphabet in a String.

The locations of the notches that advance neighboring wheels are fixed by the choice of wheels, and are available through the NOTCH array. The notch locations implemented in NOTCH are historically accurate for each rotor wheel.

The initial rotor settings were also contained in the code book as a sequence of 3 numbers, each 0-25, indicating how many turns each wheel was rotated before starting.

The plugboard pairs specify the wired connections on the front of the machine. These plugboard pairs were also daily machine settings specified in the daily Enigma code book. The plugboard pairs are input as a single string, with pairs of letters separated by a ":", like this: AB:CD:EF:GH. Pairs must be unique (no letter can connect to itself). Letters cannot be repeated (no letter can connect to more than 1 other letter).

    public static void main(String[] args) { 

        // These two strings should encrypt/decrypt to each other when you run them through the Enigma.
        //String message = "ABCDE FG HIJKL MNOP QRS TUVWXYZ"; 
        String message = "TVVFT KS UNVYJ FAFV NPC DZJPWEJ";


        //////////////////////////////////
        // Operator Settings
        // 
        // Enigma operators have code sheets that specify: 
        //  - The numbering/ordering of wheels 
        //  - The initial rotor settings
        //  - The plugboard pairs

        // Rotor scrambles are applied right-to-left
        //              {LAST, MIDDLE, FIRST}
        String[] rotorAlpha = {WHEEL[1],WHEEL[2],WHEEL[3]};
        String[] rotorNotch = {NOTCH[1],NOTCH[2],NOTCH[3]};
        int[] rotorInit = {0,0,0};

        String plugboardPairs = "IR:HQ:NT:WZ:VC:OY:GP:LF:BX:AK";

        String coded = enigma(message, rotorAlpha, rotorNotch, rotorInit, plugboardPairs);
        System.out.println(coded);
    }

Cipher Procedure

The next bit of code is the meat of the Enigma method. Notice that this is purely procedural code, and makes no use of objects other than the built-in String type. This is the kind of verb-oriented code we are striving for when we write noun-free Java code.

We also pass in any information that's required. Normally we would wrap all of these quantities in an object, to keep the list of parameters short, but this implementation is entirely object-free.

    public static String enigma(String message,
                                String[] rotorAlpha,
                                String[] rotorNotch,
                                int[] rotorInit,
                                String[] plugboardPairs) { 

        StringBuilder message_final = new StringBuilder();

        //////////////////////////////////////
        // Enigma Cipher

        // Apply each transformation in sequence
        for(int i=0; i<message.length(); i++) {

            // Starting char
            char c_orig = message.charAt(i);
            char c = c_orig;

            // Perform plugboard swap
            for(String pair : plugboardPairs.split(":")) {
                if(c==pair.charAt(0)) {
                    c = pair.charAt(1);
                } else if(c==pair.charAt(1)) {
                    c = pair.charAt(0);
                }
            }

            // Perform rotor letter substitutions
            // (forward order: right-to-left)
            int ix = -100;
            for(int j=(rotorAlpha.length-1); j>=0; j--) { 
                ix = ALPHA.indexOf(c);
                String thisAlpha = rotorAlpha[j];
                if(ix>=0) { 
                    c = thisAlpha.charAt(ix);
                } else {
                    c = c_orig;
                }
            }

            // Perform reflection
            for(String pair : REFLECTOR_ALPHA.split(":")) {
                if(c==pair.charAt(0)) {
                    c = pair.charAt(1);
                } else if(c==pair.charAt(1)) {
                    c = pair.charAt(0);
                }
            }

            // Perform rotor letter substitutions
            // (backwards order: left-to-right) 
            ix = -100;
            for(int j=0; j<rotorAlpha.length; j++) { 
                String thisAlpha = rotorAlpha[j];
                ix = thisAlpha.indexOf(c);
                if(ix>=0) { 
                    c = ALPHA.charAt(ix);
                } else {
                    c = c_orig;
                }
            }


            // Perform plugboard swap
            for(String pair : plugboardPairs.split(":")) {
                if(c==pair.charAt(0)) {
                    c = pair.charAt(1);
                } else if(c==pair.charAt(1)) {
                    c = pair.charAt(0);
                }
            }


            // Final text
            if( c>='A' && c<='Z') { 
                message_final.append(c);
            } else {
                // Could not resolve 
                message_final.append(c_orig);
            }

            // Increment rotors
            for(int j=0; j<(rotorAlpha.length-1); j++) {
                String alphaL = rotorAlpha[j];
                int ixL = alphaL.indexOf(rotorNotch[j]);
                String alphaR = rotorAlpha[j+1];
                int ixR = alphaR.indexOf(rotorNotch[j+1]);
                if(ixL!=ixR) { 
                    rotorAlpha[j] = rotateString(rotorAlpha[j]);
                }
            }
            // Always increment the right-most rotor
            int lenny = rotorAlpha.length;
            rotorAlpha[lenny-1] = rotateString(rotorAlpha[lenny-1]);
        }

        return message_final.toString();
    }

Utility Method: String Rotator

One last piece that's needed to emulate the rotation of the rotor wheels is a method to rotate strings forward 1 character. Here's that method:

    /// Rotate a string forward by 1 character, so "ABCDEF" becomes "FABCDE"
    public static String rotateString(String original) {
        int lenny = original.length();
        StringBuilder rotated = new StringBuilder();
        rotated.append(original.charAt(lenny-1));
        for(int i=0;i<lenny-1;i++) { 
            rotated.append(original.charAt(i));
        }
        return rotated.toString();
    }

} // end Enigma class

Complete Enigma Implementation

Here is a link to the complete Enigma code on git.charlesreid1.com: Enigma.java

Now that the Enigma machine implementation is finished, we can test it out. One feature of the Enigma that makes it easy to test is, it is symmetric. If we feed a plain text into the Enigma and get the corresponding ciphertext, we can feed that ciphertext through the Enigma (with the same initial settings) and recover the original plain text.

Running the alphabet through the Enigma yields:

$ java Enigma
IN:  ABCDE FG HIJKL MNOP QRS TUVWXYZ
OUT: TVVFT KS UNVYJ FAFV NPC DZJPWEJ

Running this back through the Enigma yields:

$ java Enigma
IN:  TVVFT KS UNVYJ FAFV NPC DZJPWEJ
OUT: ABCDE FG HIJKL MNOP QRS TUVWXYZ

NOTE: This code modifies the Enigma machine's settings in-place. This means multiple sequential calls to the enigma() method will not reset the rotors. The following code will not recover the original plain text message:

// This won't work:
String coded = enigma(message, rotorAlpha, rotorNotch, rotorInit, plugboardPairs);
String original2 = enigma(coded, rotorAlpha, rotorNotch, rotorInit, plugboardPairs);

To do this correctly, we would need multiple copies of the input arrays rotorAlpha, rotorNotch, and rotorInit:

String plugboardPairs = "IR:HQ:NT:WZ:VC:OY:GP:LF:BX:AK";

String[] rotorAlpha = {WHEEL[1],WHEEL[2],WHEEL[3]};
String[] rotorNotch = {NOTCH[1],NOTCH[2],NOTCH[3]};
int[] rotorInit = {0,0,0};
String coded = enigma(message, rotorAlpha, rotorNotch, rotorInit, plugboardPairs);

String[] rotorAlpha2 = {WHEEL[1],WHEEL[2],WHEEL[3]};
String[] rotorNotch2 = {NOTCH[1],NOTCH[2],NOTCH[3]};
int[] rotorInit2 = {0,0,0};
String original = enigma(coded, rotorAlpha2, rotorNotch2, rotorInit2, plugboardPairs);

System.out.println("ORIGINAL:  "+message);
System.out.println("RECOVERED: "+original);

This works fine:

ORIGINAL:  ABCDE FG HIJKL MNOP QRS TUVWXYZ
RECOVERED: ABCDE FG HIJKL MNOP QRS TUVWXYZ 

Sources

  1. "Execution in the Kingdom of Nouns". Steve Yegge. March 2006. Accessed 18 March 2017. <https://web.archive.org/web/20170320081755/https://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html>

  2. "The Enigma Cipher". Tony Sale and Andrew Hodges. Publication date unknown. Accessed 18 March 2017. <https://web.archive.org/web/20170320081639/http://www.codesandciphers.org.uk/enigma/index.htm>

Tags:    ciphers    enigma    encryption    java   

Enigma Cipher Implementation: Part 2: Pseudocode

Posted in Enigma

permalink

This is the second of several posts walking through an implementation of the Enigma cipher in code.

Thanks to the website of the late Tony Sale for providing a wealth of detailed, accurate information entirely free of graduate level mathematics, and includes some very clear examples for luddites like me who need everything spelled out really clearly.

The Goal

The goal of analyzing the Enigma machine is to better understand the workings of a device that played an important role in the history of computing. It is also an excellent system to better understand some of the design decisions we make when creating a code representation of a problem.

The intention is to replicate some of the encryption mechanisms of the original Enigma. From the Part 1 post, which covered more about how Enigma works, we have built up a basic understanding of what each step does. In this post we discuss how to implement this functionality.

The basic idea is to start with a plaintext input (typed by the operator) and apply a rotating cripher to encrypt it, resulting in a ciphertext output:

                _________
               |         |
PLAINTEXT ---> | ENIGMA  | ----> CIPHERTEXT
               |_________|

Function or Object?

One of the first decisions typically made (sometimes implicitly) is whether to implement the Enigma as a function or an object. Is the Enigma a noun, or a verb?

The Noun Approach

Programming in a language like Java or C++, the noun approach seems perfectly natural: start with an Enigma object (the noun), and create more objects to represent more of the nouns (the switchboard, the rotor wheels, and the reflector). Each component is modeled as a black box function taking a character in and returning a character out. Each component stores and organizes information important for it to perform its particular transformation. For example, the rotors would store the scrambled version of the alphabet that they implement, while the switchboard and the reflector would store the connected letter pairs.

This approach reflects the kind of engineering approach that was taken to the design of the Enigma: many simple components, working together in concert, result in a more complex integrated system.

But one of the reasons the Enigma machine is an interesting system for implementing in code is because of the simplicity of the mechanical operations performed. This can help identify where a person actually begins the software design process. If the design process starts with one foot in the world of objects already, the object approach will be adopted by default.

With this approach, we are reshaping our data structures to fit the problem. This makes the implementation more modular and the driver easier to read at a high level. However, reshaping the data structures to fit the problem and our abstraction of it can lead to more inefficient code and implementations.

The Verb Approach

Instead, we can start by examining the encryption process itself - the verb of encryption. The action of encryption consists of elementary steps - swapping out two characters. Each of these actions is simple enough that it can be implemented using built-in string manipulation methods, and all of the quantities being stored are likewise simple enough that no exotic data structures are required.

The verb approach requires considering the problem up-front (possibly recasting it in different terms), and thinking through the actions involved in order to best utilize simple, built-in data structures and functionality.

With this approach we essentially reshape the problem to fit our data structures, rather than the other way around.

Applying the Verb Approach to the Rotors

Rotors

To get a better sense of what the verb approach looks like, let's look at the Enigma rotors.

Each rotor implements a particular permutations of the 26 letters of the alphabet:

Enigma rotor

This is equivalent to matching up two strings. To make things slightly more concrete, let's look at Rotor I, nicknamed Royal (due to the location of its notch at the letter R), implemented the scrambled alphabet "EKMFLGDQVZNTOWYHXUSPAIBRCJ". With an offset of 0, this would therefore implement the following shift:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
||||||||||||||||||||||||||
EKMFLGDQVZNTOWYHXUSPAIBRCJ

This operation, when broken down, is trivial: we are matching two characters from two strings, both at a particular location.

Starting with a char, representing the character coming into the rotor, we can perform a two-step operation: 1. Get the index of the incoming character in the normal alphabet ABCDEF.... 2. Get the corresponding character at that index in the scrambled alphabet EKMFLG....

We will also need to perform the reverse operation when we return the signal back through the machine from the reflector. In that case, we're actually performing the opposite lookup, for the same wheel:

EKMFLGDQVZNTOWYHXUSPAIBRCJ
||||||||||||||||||||||||||
ABCDEFGHIJKLMNOPQRSTUVWXYZ

When we were going through Rotor I the first time, A became E, and E became L; when going through in reverse, E now becomes A, and L now becomes E.

The procedure can also be reversed: 1. Get the index of the incoming character in the scrambled alphabet EKMFLG.... 2. Get the corresponding character at that index in the normal alphabet ABCDEF....

Applying this multiple times in sequence applies multiple scrambles and replicates multiple rotors.

Example of Verb Approach: Rotor Rotation

One of the reasons an object-oriented approach may seem natural, besides the chosen language suggesting it, is because the Enigma is an object with a state. Objects provide a natural way of representing things with an internal state, or information specific to that entity and required for its operation.

For this reason, it may seem at first blush that the Engima requires an object-oriented implementation. However, we can continue with our verb-centric thinking, and examine how the operations change when the wheels are rotated.

After 2 rotations:

ABCDEFGHIJKLMNOPQRSTUVWXYZAB
  ||||||||||||||||||||||||||
  EKMFLGDQVZNTOWYHXUSPAIBRCJ

After 4 rotations:

ABCDEFGHIJKLMNOPQRSTUVWXYZABCD
    ||||||||||||||||||||||||||
    EKMFLGDQVZNTOWYHXUSPAIBRCJ

As the wheels rotate, we are still performing the same index lookup operations, we are just rotating each scrambled alphabet by one character. Again, this is a trivial operation that is probably built in for string types.

Rotor Pseudocode

Putting all of this together, here's pseudocode for a single forward transformation by an Engima rotor:

define plaintext message
define normal alphabet and scrambled alphabet 
for each character in plaintext message:
    get index of input character in normal alphabet
    get new character at that index in scrambled alphabet
    concatenate transformed character to ciphertext message 

Because the Enigma has multiple rotors, we also want pseudocode for multiple rotors:

define plaintext message
define normal alphabet and scrambled alphabets
for each character in plaintext message:
    for each scrambled alphabet:
        get index of character in normal alphabet
        get new character at that index in scrambled alphabet
        replace character with new character 
    concatenate transformed character to ciphertext message 

Likewise, to apply multiple rotations in reverse, we just swap the alphabets out:

define plaintext message
define normal alphabet and scrambled alphabets
for each character in plaintext message:
    for each scrambled alphabet:
        get index of character in scrambled alphabet
        get new character at that index in normal alphabet
        replace character with new character 
    concatenate transformed character to ciphertext message 

A More Complete Enigma Pseudocode

To combine these operations, or add additional operations, we just add them before concatenating the final transformed character to the ciphertext message. Thus, a more complete pseudocode would include other steps:

define plaintext message
define normal alphabet and scrambled alphabets
for each character in plaintext message:

    # Apply switchboard transformation

    # Apply forward rotor transformation
    for each scrambled alphabet:
        get index of character in normal alphabet
        get new character at that index in scrambled alphabet
        replace character with new character 

    # Apply reflector transformation

    # Apply reverse rotor transformation
    for each scrambled alphabet:
        get index of input character in scrambled alphabet
        get new character at that index in normal alphabet
        replace character with new character 

    # Apply switchboard transformation

    concatenate transformed character to ciphertext message 

Now let's cover the switchboard and reflector transformations.

The Switchboard

We can take the same verb-based approach to the switchboard as we took for the rotors. When we think through the actual operation being performed by the switchboard, it is even simpler than the operation performed by the rotor wheels. We have pairs of letters, and we are simply looking for one letter in a pair, and swapping it out with the other letter in the pair. In the case of the switchboard, Enigma operators utilized 7-10 wires to connect pairs of letters; the remaining letters were unchanged. So the switchboard step is very simple:

define plaintext message
define list of switchboard swap pairs
for each character in message:
    for each pair in swap pairs:
        if character in swap pair, swap its value
    add character to ciphertext

That's it! No need for a reverse function, since the swap procedure is symmetric. (Note that if there is no wire connecting a letter to another letter, there is no swap operation, and the character is added to the ciphertext directly.)

The Reflector

The reflector pseudocode looks identical to the switchboard pseudocode, except the reflector defines pairings for all 13 posible letter pairs, instead of only 7-10.

define plaintext message
define list of reflector swap pairs
for each character in message:
    for each pair in swap pairs:
        if character in swap pair, swap its value
    add character to ciphertext

Again, no need for a reverse version, since the reflector is a symmetric transformation.

Nearing a Complete Enigma Pseudocode

Almost there. We have one more thing going on - those rotor wheels are moving. Add the "increment rotor wheels" verb, and define that below.

define plaintext message
define normal alphabet and scrambled alphabets
define list of switchboard swap pairs
define list of reflector swap pairs
for each character in plaintext message:

    # Apply switchboard transformation
    for each pair in switchboard swap pairs:
        if character in swap pair, swap its value

    # Apply forward rotor transformation
    for each rotor/scrambled alphabet:
        get index of character in normal alphabet
        get new character at that index in scrambled alphabet
        replace character with new character 

    # Apply reflector transformation
    for each pair in reflector swap pairs:
        if character in swap pair, swap its value

    # Apply reverse rotor transformation
    for each scrambled alphabet:
        get index of input character in scrambled alphabet
        get new character at that index in normal alphabet
        replace character with new character 

    # Apply switchboard transformation
    for each pair in switchboard swap pairs:
        if character in swap pair, swap its value

    concatenate transformed input character to ciphertext message 

    increment rotor wheels

Incrementing the Rotor Wheels

Each rotor wheel has a notch located at a particular letter. The wheels were identified by the letter on which the notch was located (Rotor I was "Royal" because the notch was located at "R", and so on).

The notches were designed to catch on the notches of other rotor wheels, in such a way that the wheels would turn together periodically. The right-most wheel would rotate once per keypress. Once per 26 letters (if S = 26), the notch would catch the notch of the next rotor over and advance it forward by 1 letter. It was this mechanism that kept the machine constantly skipping through the space of possible keys, mapping each character to each other character, with one distinct key (alphabet scramble) used per letter of the message.

To increment the wheels in pseudocode:

function increment rotors:
    for each rotor/scrambled alphabet, left to right:
        get index of left notch in left alphabet
        get index of right notch in right alphabet
        if left index equals right index:
            cycle left alphabet forward 1 character
    cycle right-most alphabet forward 1 character

Enigma Pseudocode

Almost there. We have one more thing going on - those rotor wheels are moving. Add the "increment rotor wheels" verb, and define that below.

define plaintext message
define normal alphabet and scrambled alphabets
define list of switchboard swap pairs
define list of reflector swap pairs
for each character in plaintext message:

    # Apply switchboard transformation
    for each pair in switchboard swap pairs:
        if character in swap pair, swap its value

    # Apply forward rotor transformation
    for each rotor/scrambled alphabet:
        get index of character in normal alphabet
        get new character at that index in scrambled alphabet
        replace character with new character 

    # Apply reflector transformation
    for each pair in reflector swap pairs:
        if character in swap pair, swap its value

    # Apply reverse rotor transformation
    for each scrambled alphabet:
        get index of input character in scrambled alphabet
        get new character at that index in normal alphabet
        replace character with new character 

    # Apply switchboard transformation
    for each pair in switchboard swap pairs:
        if character in swap pair, swap its value

    concatenate transformed input character to ciphertext message 

    # Increment rotor wheels
    for each rotor/scrambled alphabet, left to right:
        get index of left notch in left alphabet
        get index of right notch in right alphabet
        if left index equals right index:
            cycle left alphabet forward 1 character
    cycle right-most alphabet forward 1 character

Sources

  1. "The Enigma Cipher". Tony Sale and Andrew Hodges. Publication date unknown. Accessed 18 March 2017. <https://web.archive.org/web/20170320081639/http://www.codesandciphers.org.uk/enigma/index.htm>

Tags:    ciphers    enigma    encryption   

March 2022

How to Read Ulysses

July 2020

Applied Gitflow

September 2019

Mocking AWS in Unit Tests

May 2018

Current Projects