computability, arithmetic

Addition is usually defined formally by:

\[ \begin{array}{lcl} x + 0 &=& x \\ x + (y+1) &=& (x + y) + 1 \\ \end{array} \] where the “plus one” function is assumed to be a part of the very construction of the natural numbers themselves.

This definition played a fascinating role in the history of computation. It predates both the formal definition of computation and the formal theory of arithmetic, and it played a role in shaping both.

This is the story of the **primitive recursive functions**.
I like to call this collection of functions, tongue-in-cheek, the first
total functional programming language.
They have been studied extensively, and what follows is my own exposition.

- background
- definitions
- limitations
- logical formalisms
- generalization

The ancient Greeks drew a distinction between *actual infinity* and *potential infinity*.
The distinction may sound like hair-splitting at first, but it turns out to be fruitful.
Actual infinity involves the existence of infinite objects,
while potential infinity involves only an unending process.

The ancient Greeks were comfortable with potential infinity, but not actual infinity. Aristotle is often credited with the slogan “infinitum actu non datur”, or “infinity is not actually given”. For example, Aristotle would not have approved of a construction on the set of even numbers, but would have been fine with a construction manipulating an arbitrary even number. Mathematicians got along just fine with potential infinity for a long time.

The reluctance to embrace actual infinity changed drastically in the 1800’s when it became clear that individual real numbers required actual infinity. At the same time, it became clear how unintuitive infinite sets can be. Georg Cantor and his explorations of trigonometric series led the math world into a renewed debate about actual infinity.

Amidst these changes and uncertainty grew an increased desire to place mathematics
on more solid foundations.
In particular, many mathematicians sought to make *reasoning* about infinite objects
reducible to finite methods, and to free the structure of mathematical proofs from
actual infinity.
Out of this maelstrom came the idea of the modern computer.

First-order logic was invented in the late 1800s, independently by Charles Sanders Peirce and by Gottlob Frege. Frege was attempting to ground mathematics in first-order logic, but in 1901 Bertrand Russell discovered a major flaw in his system. This is now known as Russell’s paradox, even though the paradox was known to Zermelo two years earlier.

Russell went on, together with Alfred North Whitehead, to describe a logical formalism for mathematics
in *Principia Mathematica*.

The backdrop is now set for Thoralf Skolem’s primitive recursive functions.
Principia Mathematica used quantifiers to define functions such as addition.
Quantifiers, which embody the phrases “for all” and “there exists”, assume actual infinity.
In his paper “The foundations of elementary arithmetic established by means of the recursive
mode of thought, without the use of apparent variables ranging over infinite domains”^{1},
Skolem defines, by way of examples, the primitive recursive functions.
These functions, Skolem argues, provide an alternate foundation for mathematics
which relies only on potential infinity.

Skolem’s paper is very accessible and fun to read.

Similar to the natural numbers, the primitive recursive functions have an inductive definition. The domain of every primitive recursive function is the set of \(k\)-tuples of natural numbers (for a fixed number \(k\)), which is denoted \(\mathbb{N}^k\). The range is always \(\mathbb{N}\).

The **basic** primitive recursive functions are:

\[ \begin{array}{|l|l|l|} \hline name & notation & function \\ \hline constant & C^k_y & (x_1, \ldots, x_k)\mapsto y \\ \hline successor & S & x\mapsto x+1 \\ \hline projection & P^k_i & (x_1, \ldots, x_k)\mapsto x_i\text{, where } 1\leq i\leq k \\ \hline \end{array} \]

A function is a **primitive recursive function** if it can be constructed using only
the basic functions, composition, and primitive recursion:

\[ \begin{array}{|l|l|l|l|} \hline name & given & notation & function \\ \hline composition & g:\mathbb{N}^m\to\mathbb{N} & g \circ (h_1, \ldots, h_m) & x_1,\ldots,x_k \mapsto g(h_1(x_1, \ldots, x_k), \ldots, h_m(x_1, \ldots, x_k)) \\ & h_1,\ldots,h_m:\mathbb{N}^k\to\mathbb{N} & & \\ \hline primitive~recursion & g:\mathbb{N}^k\to\mathbb{N} & R(g, h) & f(0, x_1, \ldots, x_k) \mapsto g(x_1, \ldots, x_k) \\ & h:\mathbb{N}^{k+2}\to\mathbb{N} & & f(x_0+1, x_1, \ldots, x_k) \mapsto h(x_0, f(x_0, x_1, \ldots, x_k), x_1, \ldots, x_k) \\ \hline \end{array} \]

Addition is a primitive recursive function, as evidenced by:

\[\mathsf{add} \equiv R(P^1_1, S\circ P^3_2) \]

This is equivalent to the definition that we gave at the beginning (except that the recursion is performed on the left summand).
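Unfolding the schemes confirms this. Applying the definition of \(R\) to \(\mathsf{add} = R(P^1_1, S\circ P^3_2)\) gives:

\[ \begin{array}{lcl} \mathsf{add}(0, x) &=& P^1_1(x) = x \\ \mathsf{add}(y+1, x) &=& (S\circ P^3_2)(y, \mathsf{add}(y, x), x) = \mathsf{add}(y, x)+1 \\ \end{array} \]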

Multiplication is a primitive recursive function:

\[\mathsf{mult} \equiv R(C^1_0, \mathsf{add} \circ (P^3_2, P^3_3)) \]

As we will explore, many functions are primitive recursive.
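To make the schemes concrete, here is a small evaluator sketch of my own (all names are mine, not from the post); it represents a \(k\)-ary function as a function on lists of naturals:

```haskell
-- A sketch of an evaluator for primitive recursive definitions.
-- A k-ary function is represented as a function [Integer] -> Integer.
type PR = [Integer] -> Integer

c :: Integer -> PR          -- constant C^k_y (the arity is implicit)
c y _ = y

s :: PR                     -- successor S
s [x] = x + 1
s _   = error "s: expects exactly one argument"

p :: Int -> PR              -- projection P^k_i (1-based index)
p i xs = xs !! (i - 1)

comp :: PR -> [PR] -> PR    -- composition g ∘ (h_1, …, h_m)
comp g hs xs = g [h xs | h <- hs]

r :: PR -> PR -> PR         -- primitive recursion R(g, h)
r g _ (0 : xs)  = g xs
r g h (x0 : xs) = h ((x0 - 1) : r g h ((x0 - 1) : xs) : xs)
r _ _ []        = error "r: expects a nonempty argument list"

-- the two definitions from the post
add :: PR
add = r (p 1) (comp s [p 2])

mult :: PR
mult = r (c 0) (comp add [p 2, p 3])
```

Running `add [3, 4]` yields `7` and `mult [3, 4]` yields `12`, matching the unfoldings above.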

We can represent the primitive recursive functions as an imperative programming language^{2}
which contains:

- variable assignments:
`X=0`

`X=X+1`

`X=Y`

- bounded loops:
`LOOP X ... END`

For example, we can implement truncated subtraction as

```
LOOP Y
A = 0
LOOP X
X = A
A = A + 1
END
END
```

where we run the program with `X=x`, `Y=y`, and the program terminates with `x∸y` in `X`.
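The program can be transcribed into Haskell almost mechanically (a sketch; `truncSub` and `pred'` are my names, not from the post). The inner loop computes a truncated predecessor, and the outer loop applies it `y` times:

```haskell
-- A direct transcription of the LOOP program for truncated subtraction.
truncSub :: Int -> Int -> Int
truncSub x y = iterate pred' x !! y   -- outer LOOP Y: apply pred' y times
  where
    -- inner LOOP X: run "X = A; A = A + 1" n times starting from A = 0,
    -- which leaves the truncated predecessor of n in X
    pred' n = snd (iterate (\(a, _) -> (a + 1, a)) (0, n) !! n)
```

For example, `truncSub 5 2` is `3` and `truncSub 2 5` is `0`.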

Before the Church-Turing thesis was widely accepted, there was no precise and commonly accepted definition of algorithm (even though we have written records of algorithms from ancient Mesopotamia). For a brief window of time, the primitive recursive functions looked like a reasonable candidate for the definition of algorithm.

Two students of Hilbert, Ackermann and Sudan, each produced the description of a function which everyone would recognize as an algorithm, but which failed to be primitive recursive.

We now know that the addition of one more operator, namely the minimization operator \(\mu\), is enough to construct every function that meets our intuitive understanding of an algorithm.

The Ackermann and Sudan examples both employ “double recursion”, creating functions which grow faster than any primitive recursive function.

The Ackermann function is defined as:

\[ \begin{array}{lcl} A(0, n) & = & n + 1 \\ A(m+1, 0) & = & A(m, 1) \\ A(m+1, n+1) & = & A(m, A(m+1, n)) \\ \end{array} \]

The “diagonal” of the Ackermann function, \(d(m)=A(m, m)\), grows incredibly fast.
One way to see this is by comparison to the explosive up-arrow notation from Knuth.
A single arrow represents exponentiation:
\(
x\uparrow y = x^y
\).
Two arrows represents repeated exponentiation, so that \(x\uparrow\uparrow y\)
is an exponential tower of \(y\)-many \(x\)’s.
For example, \(2\uparrow\uparrow 3 = 2^{2^2} = 16\) (this is sometimes called tetration).
Three arrows represents iteration of the double arrow, etc.

Fixing \(m=3\), the function \(A(3, n)\) grows like \(2\uparrow n\).

Fixing \(m=4\), the function \(A(4, n)\) grows like \(2\uparrow\uparrow n\).

In general, for \(m\geq 3\):

\[ A(m, n) = 2\uparrow^{m-2}(n+3) - 3 \] where \(\uparrow^n\) denotes \(n\)-many \(\uparrow\)’s.

This is a mind-bogglingly fast-growing function.
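We can spot-check these growth claims for small inputs with naive implementations (a sketch; `ack` and `up` are my names, and I use the convention \(x\uparrow^k 0 = 1\)):

```haskell
-- Naive reference implementations of the Ackermann function and
-- Knuth's up-arrow, usable only for very small inputs.
ack :: Integer -> Integer -> Integer
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))

-- up k x y computes x ↑^k y (k-many arrows)
up :: Int -> Integer -> Integer -> Integer
up 1 x y = x ^ y                     -- one arrow is exponentiation
up _ _ 0 = 1                         -- convention for zero iterations
up k x y = up (k - 1) x (up k x (y - 1))
```

For instance, `ack 3 1` is `13`, and `ack 4 0` equals `up 2 2 3 - 3`, as the identity \(A(m, n) = 2\uparrow^{m-2}(n+3) - 3\) predicts.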

It is possible to show that each individual primitive recursive function is eventually bounded by a “row” of the Ackermann function. In other words, if \(f\) is primitive recursive, then there exist numbers \(m_0, n_0\) such that for all \(n\geq n_0\), \(f(n) \leq A(m_0, n)\).

Closely related to these “Ackermann rows” is the Grzegorczyk hierarchy, which categorizes some
fast-growing functions.^{3}

As a humorous aside, the *graph* of the Ackermann function is itself primitive recursive.

In other words, the function \[ \mathsf{AG}(m, n, y) = \begin{cases} 1 & \text{if } A(m, n) = y \\ 0 & \text{otherwise} \end{cases} \] is primitive recursive.

The trick is to notice that the size of the data needed to validate \(y\)
as a solution is not that large *relative to* \(y\).

To see this, notice that \(A\) is strictly monotonic in each coordinate. Moreover, in order to verify whether \(A(m, n)=y\), you can first compute the table of all values \(A(a, b)\) for \(0\leq a,b\leq y\), marking with a dot any value whose computation requires a recursive call falling outside the table.

For example, in order to verify \(A(3, 1)=13\), we can compute this table:

| \(m\backslash n\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | \(\cdot\) |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | \(\cdot\) | \(\cdot\) |
| 2 | 3 | 5 | 7 | 9 | 11 | 13 | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) |
| 3 | 5 | 13 | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) | \(\cdot\) |
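The bounded-table idea can be sketched directly (my own sketch; `entry y m n` returns `Just` the value of \(A(m, n)\) exactly when every intermediate value stays at most `y`, and `Nothing` corresponds to a dot):

```haskell
-- Validate A(m, n) = y using only values bounded by y.
entry :: Integer -> Integer -> Integer -> Maybe Integer
entry y 0 n = if n + 1 <= y then Just (n + 1) else Nothing
entry y m 0 = entry y (m - 1) 1
entry y m n = do
  v <- entry y m (n - 1)   -- the value one column to the left
  entry y (m - 1) v        -- look it up in the previous row
```

For example, `entry 13 3 1` is `Just 13`, matching the table above, while `entry 13 3 2` is `Nothing` because \(A(3, 2) = 29 > 13\).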

In general, a function is primitive recursive *if and only if*:

- its graph is primitive recursive
- it is bounded by a primitive recursive function

There is another fundamental property of the primitive recursive functions
which prevents them from capturing the notion of algorithm: *totality*.

Totality is a very nice property, but unfortunately any collection of total functions which can be enumerated by an algorithm cannot contain all the algorithms.

A function that is defined on all inputs is called **total**.
This notion does not come up a lot in “normal” mathematics,
since “on all inputs” is usually trivially true
(by defining the inputs to be those where the function is defined).
In the computable world, however, it is difficult to determine which potential inputs are in the
domain.
Take, for example, the function TP which maps *n* to the *n*-th pair of twin primes.
No one alive today knows if this function’s domain is ℕ (all natural numbers)
or just {0, 1, …, *N*} for some big *N*.

It is easy to see that the primitive recursive functions are all total,
meaning they are defined on all ℕ^{k}
(and we’ll have more to say about this later, regarding Parsons’ theorem).

The same diagonal argument that shows there are more real numbers than rational numbers, and that the halting set is not computable, can also demonstrate that there is an algorithm which is not primitive recursive.

It is easy to see that we can enumerate the primitive recursive functions:

\[ P_0, P_1, P_2, \ldots \]

Consider the following algorithm \(d\) defined as:

\[d(n) = P_n(n)+1 \]

By construction, \(d\) is different from every primitive recursive function.

If we add one new construction technique to the primitive recursive functions, we can describe every algorithm. This new collection of functions is called the general recursive functions.

\[ \begin{array}{|l|l|l|l|} \hline name & given & notation & function \\ \hline minimization & g:\mathbb{N}^{k+1}\to\mathbb{N} & \mu g & x_1,\ldots,x_k\mapsto \\ & & & \text{ the least }y\text{ such that } \\ & & & \text{ }g(y, x_1,\ldots, x_k)=0 \\ \hline \end{array} \]

Notice that for any given \(x_1, \ldots, x_k\), the function \(\mu g\) is only defined if there is at least one \(y\) such that \(g(y, x_1, \ldots, x_k)=0\). For this reason, the diagonal argument cannot be applied to the general recursive functions.
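To make minimization concrete, here is a sketch in Haskell (`mu` and `isqrtUp` are my names, not from the post); when no witness exists, the search loops forever, which is exactly the partiality just described:

```haskell
-- A sketch of the minimization operator μ over list-represented functions.
mu :: ([Integer] -> Integer) -> [Integer] -> Integer
mu g xs = head [y | y <- [0 ..], g (y : xs) == 0]

-- Example use: the least y with y * y >= x (a rounded-up square root).
isqrtUp :: Integer -> Integer
isqrtUp x = mu (\(y : x' : _) -> if y * y >= x' then 0 else 1) [x]
```

For instance, `isqrtUp 10` is `4`, the least number whose square reaches 10.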

A letter from Herbrand in 1931 prompted Gödel to define the general recursive functions in 1934.

The story of primitive recursion actually predates Skolem. Hermann Grassmann’s 1861 book “Lehrbuch der Arithmetik” seems to be the first appearance of the formal definition of addition and multiplication by primitive recursion. In the 1880’s Grassmann’s ideas were adopted by Charles Peirce, Richard Dedekind, and Giuseppe Peano. This group of mathematicians used these ideas for a logical foundation for arithmetic, more similar to Principia Mathematica than to Skolem’s paper. This culminated in what we now call the Peano axioms, which roughly capture the fact that the natural numbers form a semiring. Peano arithmetic is the name of the first-order logical theory that assumes the Peano axioms.

Ackermann’s 1924 dissertation^{4} was the first to create a logical theory directly
capturing the semantics of the primitive recursive functions.
We now call this system **primitive recursive arithmetic**.

The early 1920’s also saw the rise of Hilbert’s program, which sought to find a solid foundation for mathematics resting on “finitistic” and formal grounds. Ackermann was one of Hilbert’s students, and his primitive recursive arithmetic was the kind of system that Hilbert hoped could provide a foundation for math. Peano arithmetic was another contender. Hilbert’s position was that mathematical results involving actual infinity could be trusted as long as the proofs themselves relied only on potential infinity.

Gödel showed in 1931 that the full scope of Hilbert’s program was *impossible*^{5}.
He was exquisitely careful to perform most of the argument using primitive recursive functions
(including a delightful use of the Chinese remainder theorem).

Emil Post very nearly discovered the same incompleteness theorem himself ten years earlier,
but failed to publish the results.^{6}

Note that the Church-Turing thesis would not even be proposed for another five years,
and that the definition of computation was still unsettled.
It was crucial that the proof techniques that Gödel used were incontrovertible,
and had to appeal to our intuitive notion of computation.
Gödel himself remained somewhat skeptical of his 1931 result until he saw
Turing’s famous results.^{7} ^{8}

In 1941, Haskell Curry devised a formalization of primitive recursive arithmetic
using only equality of terms (and no logical connectives)^{9}.

Here is a timeline to help visualize the story. I’ve drawn two vertical lines, one to mark when the Church-Turing thesis was announced and one to mark when the ENIAC was created.

The primitive recursive functions also show up naturally inside of Peano Arithmetic. In order to describe this connection, we need to define what it means for a function to be provably total.

We say that a function \(f:\mathbb{N}\to\mathbb{N}\) is described by a formula \(F\) in Peano Arithmetic if:

\[ f(x) = y \Longleftrightarrow F(x, y) \text{ is true} \]

where “F is true” means that the formula holds when you interpret the variables as natural numbers (and the addition symbol as addition, etc).

We say that a function \(f:\mathbb{N}\to\mathbb{N}\) is
**provably total** (inside Peano Arithmetic) if there exists
a formula \(F\) in Peano Arithmetic which describes \(f\) and which can be proved to be total.
In other words, there are proofs of:

- For every \(x\in\mathbb{N}\), there exists \(y\in\mathbb{N}\) such that \(F(x, y)\).
- For every \(y_1, y_2, x\in\mathbb{N}\), if \(F(x, y_1)\) and \(F(x, y_2)\) hold, then \(y_1 = y_2\) also holds.
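As a toy example (mine, not from the original), the doubling function \(x\mapsto x + x\) is described by \(F(x, y)\equiv y = x + x\), and the two proof obligations, both elementary exercises in Peano Arithmetic, read:

\[ \forall x\,\exists y~(y = x + x) \qquad\qquad \forall x\,\forall y_1\,\forall y_2~\big((y_1 = x + x \land y_2 = x + x) \to y_1 = y_2\big) \]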

Parsons’ theorem states that the primitive recursive functions are exactly those
functions which are provably total in a certain subsystem of Peano arithmetic
named \(\mathsf{I}\Sigma_1\).
Peano arithmetic allows for induction over any formula,
while \(\mathsf{I}\Sigma_1\) restricts induction to formulas
of the form \(\exists x\,\phi(x)\), where \(\phi(x)\) contains only
bounded quantifiers^{10}.

Parsons’ theorem is often stated in the form of a conservation theorem: if \(\phi\) is a formula of the form \(\forall x\exists y\theta(x, y)\) (where \(\theta\) is bounded) and is provable in \(\mathsf{I}\Sigma_1\), then \(\phi\) is also provable in primitive recursive arithmetic.

These two statements are equivalent since the formula expressing the totality of a
primitive recursive function has the right form:
\(\forall x\exists y~f(x)=y\).
Note that “\(f(x)=y\)” is not necessarily captured by a bounded formula.
It is, however, captured by an \(\exists x\,\phi(x)\) formula by
Kleene’s normal form theorem
(and you can always collapse two consecutive existential quantifiers into one).
Indeed, tetration is primitive recursive but not definable by a bounded formula of PA.
(For a neat example of the difference between primitive recursion and bounded formulas,
you can consider the latter to be loop programs where the variable used to loop cannot change^{11}.)

While the incompleteness theorems did show that Hilbert’s program was impossible in its entirety,
a partial realization was found in 1976 by Harvey Friedman,
using primitive recursive arithmetic.^{12}

There is a fairly expressive logical system named \(\mathsf{WKL}_0\) which asserts the existence of actual infinity in some circumstances.

An illustrative example of Weak Kőnig’s lemma (for which this system is named) is the following. Suppose you have a collection of finite binary sequences which form a tree (meaning it is closed under initial subsequences). Suppose also that there is an algorithm which can determine which finite binary sequences are in the collection. The tree might be infinite, but the algorithm describes it as a potential infinity. Weak Kőnig’s lemma states that if the tree is infinite, then there exists an infinite path through the tree. Moreover, we can construct such a tree so that no infinite path through the tree is described by an algorithm, even though the sequences that comprise the tree are described by an algorithm. In this sense, Weak Kőnig’s lemma calls forth actual infinity out of potential infinity.

The partial realization of Hilbert’s program is:

Every formula of the form \(\forall x\exists y\,\theta(x, y)\)
(where \(\theta\) is bounded) which is provable in \(\mathsf{WKL}_0\)
is also provable in primitive recursive arithmetic.^{13}

In other words, the use of actual infinity inside \(\mathsf{WKL}_0\)
can often be replaced with “finitistic methods”.
To see many examples of important mathematical theorems that can be proved
with \(\mathsf{WKL}_0\), see Simpson’s book.^{14}

After seeing Parsons’ theorem, it is natural to wonder if there is a good description
of *all* the provably total functions of Peano Arithmetic.
Indeed there is, and one such description involves a generalization of the primitive recursive functions
called the **primitive recursive functionals**.

In 1958, Gödel used the primitive recursive functionals to prove
the consistency of Peano Arithmetic.^{15}
The primitive recursive functionals are terms in a logical calculus
that Gödel invented, now called System T.
One of Gödel’s results about System T is that every provably total function of
Peano Arithmetic is denoted by a term in System T.
The reverse is also true.

Interestingly, though Peano Arithmetic can prove that any given primitive recursive functional
is total, it **cannot** prove that
“for all primitive recursive functionals f, f is total”,
as this would contradict Gödel’s second incompleteness theorem.
In other words, this universal quantifier on the functionals cannot be moved
from outside the meta-theory to inside the meta-theory.

Note that Georg Kreisel characterized the provably total functions of Peano Arithmetic
in 1952 using a different classification,
namely the “ordinal recursive functionals below \(\epsilon_0\)”^{16}.

Before we define the primitive recursive functionals, we start with an example from functional programming.

The first thing to note is how similar the natural numbers are to lists. Numbers are described by “start with zero, then keep applying the successor function”. Lists are described by “start with the empty list, then keep appending elements”. In other words, the natural numbers are like lists whose values are ignored.

One of the most useful functions on lists is `fold`.
It is not a coincidence that `fold` is useful, as it just reverses the construction of a list.
Much like a proof by induction, fold takes the following:

- a base case (for the empty list)
- a step case (for handling one element of the list, in the context of an intermediate calculation)

With these two inputs, `fold` returns a function from an arbitrary list to
whatever type of data the two cases return.

In Haskell, (the right) fold looks like this:

```
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr step base [] = base
foldr step base (x:xs) = step x (foldr step base xs)
```

What would `fold` on the natural numbers look like?
It is nearly the same as `foldr`, except that it does not have to handle the list elements.
We write it out in Haskell, and give it the name `iter` for “iterator”.

```
-- Peano Numerals
data ℕ = Z | S ℕ
```

```
-- iterator on ℕ
iter :: (b -> b) -> b -> ℕ -> b
iter step base Z     = base
iter step base (S n) = step (iter step base n)
```

For the same reason that induction is the primary tool for proofs of statements about the natural numbers, the iterator is the primary tool for functions on the natural numbers. We can use it to define addition, multiplication, exponentiation, etc.

```
add :: ℕ -> ℕ -> ℕ
add m n = iter S m n
```

```
mult :: ℕ -> ℕ -> ℕ
mult m n = iter (add m) Z n
```

```
ex :: ℕ -> ℕ -> ℕ
ex m n = iter (mult m) (S Z) n
```
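To spot-check these definitions, here is a self-contained version with conversion helpers (`fromInt` and `toInt` are my names, not from the post):

```haskell
-- Self-contained spot-check of the iterator-based arithmetic.
data ℕ = Z | S ℕ

iter :: (b -> b) -> b -> ℕ -> b
iter step base Z     = base
iter step base (S n) = step (iter step base n)

add, mult, ex :: ℕ -> ℕ -> ℕ
add  m n = iter S m n               -- apply S to m, n times
mult m n = iter (add m) Z n         -- add m to zero, n times
ex   m n = iter (mult m) (S Z) n    -- multiply one by m, n times

fromInt :: Int -> ℕ
fromInt 0 = Z
fromInt k = S (fromInt (k - 1))

toInt :: ℕ -> Int
toInt = iter (+ 1) 0
```

For example, `toInt (ex (fromInt 2) (fromInt 10))` evaluates to `1024`.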

So far this looks a lot like primitive recursion.
This is because we’ve only used `iter` in the case where the type `b` is \(\mathbb{N}\).
Things get interesting when we let `b` have higher order.
In fact, we can use `iter` to define the Ackermann function,
and we will need to let `b` be \(\mathbb{N} \to \mathbb{N}\).

Recall the definition of the Ackermann function:

\[ \begin{array}{lcl} A(0, n) & = & n + 1 \\ A(m+1, 0) & = & A(m, 1) \\ A(m+1, n+1) & = & A(m, A(m+1, n)) \\ \end{array} \]

The last case looks like a straightforward use of the iterator (with `b` as \(\mathbb{N}\)).
So we can start by defining the Ackermann function using a single case statement:

```
one :: ℕ
one = S Z
```

```
ack' :: ℕ -> ℕ -> ℕ
ack' Z     = S
ack' (S m) = iter (ack' m) (ack' m one)
```

Written this way, we can see an opportunity to use the iterator a second time!
But instead of using `iter` to produce natural numbers,
we will use it to produce *functions* from natural numbers to natural numbers:

- base case - the successor function
- step case - the function obtained by iterating the previous “row” of the Ackermann function

Which looks like this:

```
ack :: ℕ -> ℕ -> ℕ
ack = iter step S
  where
    step ackm = iter ackm (ackm one)
```
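We can check this `ack` against known Ackermann values (a self-contained sketch repeating the needed definitions; the helper names `fromInt`/`toInt` are mine):

```haskell
-- Self-contained check of the iterator-defined Ackermann function.
data ℕ = Z | S ℕ

iter :: (b -> b) -> b -> ℕ -> b
iter step base Z     = base
iter step base (S n) = step (iter step base n)

one :: ℕ
one = S Z

ack :: ℕ -> ℕ -> ℕ
ack = iter step S
  where
    step ackm = iter ackm (ackm one)

fromInt :: Int -> ℕ
fromInt 0 = Z
fromInt k = S (fromInt (k - 1))

toInt :: ℕ -> Int
toInt = iter (+ 1) 0
```

For example, `toInt (ack (fromInt 3) (fromInt 3))` is `61`, agreeing with \(A(3,3) = 2^6 - 3\).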

The function `iter` is referred to as a *catamorphism* in the functional programming world.
The dual notion is called an *anamorphism*, and these ideas lead to all kinds of fun
in functional programming.^{17}

Loosely speaking, the primitive recursive functionals are the functions that you can define
using `iter`, where `b` is any type built up from \(\mathbb{N}\)
and the function arrow \(\to\)
(for example, \(\mathbb{N}\to(\mathbb{N}\to\mathbb{N})\to\mathbb{N}\)).
These types are usually called the “finite types”.

More precisely, the finite types are those defined by the grammar:

\[ \tau ::= \mathbb{N} ~|~ \tau\to\tau \]

The **primitive recursive functionals** are the functions that you make using:

- the constant function \(n\mapsto 0\)
- the successor function \(n\mapsto n+1\)
- the S, K, and R combinators over the finite types

I will write out definitions of these combinators using Haskell (ignoring the restriction to the finite types).

The K combinator is just the projection onto the first of its two arguments:

```
k :: a -> b -> a
k a b = a
```

The S combinator “fuses” (schmelzen in German, from Schönfinkel^{18}) the occurrences
of `x` in `(f x)(g x)`:

```
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g x = f x (g x)
```
```

Finally, the R combinator is the recursor, which is only slightly different from `iter`.
Note that `iter` does not pass the “counter” to the step function;
doing so results in what is called the “recursor”.

```
-- iterator of ℕ
iter :: (b -> b) -> b -> ℕ -> b
iter step base Z     = base
iter step base (S n) = step (iter step base n)

-- recursor of ℕ
recursor :: (ℕ -> b -> b) -> b -> ℕ -> b
recursor step base Z     = base
recursor step base (S n) = step n (recursor step base n)
```
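One classic payoff of the recursor is the predecessor function, which is immediate with `recursor` but needs a pair trick with `iter` alone (a self-contained sketch; helper names are mine):

```haskell
-- The predecessor falls out of the recursor directly: the step function
-- simply returns the counter and ignores the recursive result.
data ℕ = Z | S ℕ

recursor :: (ℕ -> b -> b) -> b -> ℕ -> b
recursor step base Z     = base
recursor step base (S n) = step n (recursor step base n)

predecessor :: ℕ -> ℕ
predecessor = recursor (\n _ -> n) Z

fromInt :: Int -> ℕ
fromInt 0 = Z
fromInt k = S (fromInt (k - 1))

toInt :: ℕ -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n
```

For example, `toInt (predecessor (fromInt 5))` is `4`, and the predecessor of `Z` is `Z`.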

I will end this post with a peek into another beautiful world, namely second-order arithmetic.

Second-order arithmetic adds the notion of sets of natural numbers to Peano arithmetic. In fact, the system \(\mathsf{WKL}_0\) mentioned earlier is a subsystem of second-order arithmetic. An incredible amount of interesting mathematics can be expressed in second-order arithmetic.

Moreover, in the same way that System T captures the provably total functions of Peano arithmetic, there is an analogous system for second-order arithmetic called System F.

System F was discovered independently by the computer scientist John Reynolds
and by the logician Jean-Yves Girard,
and it provides the theoretical underpinnings of Haskell^{19} and ML.

Thoralf Skolem. “The foundations of elementary arithmetic established by means of the recursive mode of thought without the use of apparent variables ranging over infinite domains”. An English translation is included in From Frege to Gödel.↩︎

A good exposition is: Piergiorgio Odifreddi. “Classical Recursion Theory, volume II”, VIII.7↩︎

Kurt Gödel. “On Formally Undecidable Propositions of Principia Mathematica and Related Systems”. An English translation is included in From Frege to Gödel.↩︎

John Stillwell. “Emil Post and His Anticipation of Gödel and Turing”↩︎

A bounded quantifier is either \((\forall x\leq n)\varphi\) or \((\exists x\leq n)\varphi\). The former is shorthand for \(\forall x. (x\leq n \to \varphi )\) and the latter is shorthand for \(\exists x. (x\leq n \land \varphi )\).↩︎

Stephen Simpson. “Partial realizations of Hilbert’s program”↩︎

A proof can be found in “Subsystems of Second Order Arithmetic”, Theorem IX.3.16 (page 381).↩︎

“Subsystems of Second Order Arithmetic”, Theorem I.10.3 (page 36).↩︎

A great exposition is given in Avigad and Feferman, “Gödel’s Functional (“Dialectica”) Interpretation”.↩︎

Georg Kreisel. “On the interpretation of non-finitist proofs – Part II”↩︎

Meijer, Fokkinga, Paterson. “Functional programming with bananas, lenses, envelopes and barbed wire”↩︎

Moses Schönfinkel. “On the building-blocks of mathematical logic”. An English translation is included in From Frege to Gödel.↩︎

time-lock-puzzles, number-theory

Time-lock puzzles were conceived by Timothy May in 1993^{1}
and brought into reality by Rivest, Shamir, and Wagner in 1996^{2}
(hence called the RSW puzzle).
The idea is to encrypt something that cannot be decrypted until
after a set time in the future.

The RSW puzzle achieves these goals by forcing the decryption process to perform a repetitive task, a task assumed to always take roughly the same amount of time. Crucially, the repetitive task cannot be performed in parallel.

The original RSW paper is very readable, but I am going to explain the construction here with the key details made more explicit.

The key idea is to use the same clever trick used in the RSA cryptosystem. If you know how RSA works, you might have fun stopping here and trying to invent the RSW puzzle for yourself.

Decryption of the RSW puzzle involves computing a large power of a fixed number *a*.
The person who produces the cipher text will have a “trapdoor” allowing them
to substitute a vastly smaller exponent.

We start with the product of two prime numbers *p* and *q*:

\[ n = pq \]

The factorization of *n* is the secret trapdoor.

We now fix numbers *a* and *t*
and compute *a* to the exponent 2^{t}, modulo *n*:

\[ a^{(2^t)}~(\mathsf{mod} {~n}) \]

There are two ways to compute this number, the slow way and the fast way. The slow way is what forces the secret to be revealed only in the future. The fast way is the means by which the secret is placed into the time-lock.

If one does not know the factorization of *n*, then computing \(a^{(2^t)}\)
amounts to the brute-force calculation of squaring *a* *t*-many times.
The value of *t* is chosen to be large enough to make this calculation
take the desired amount of time.
The puzzle resists parallel attack because no method is known for parallelizing this repeated squaring.

If, however, you know that *n* = *pq*, then you can make use of
Euler’s theorem:

\[ a^{\varphi(n)}\equiv 1~(\mathsf{mod} {~n}) \]

where *φ*(*n*) is
Euler’s totient function,
which counts the numbers up to *n* which are relatively prime to *n*.
In our particular case, *φ*(*n*) = (*p* − 1)(*q* − 1).

Letting *x* be the remainder of 2^{t} after division by (*p* − 1)(*q* − 1), by
Euler’s theorem
we have that

\[ a^{(2^t)} \equiv a^x ~(\mathsf{mod} {~n}) \]

Since *x* is vastly smaller than 2^{t},
*a*^{x} (modulo *n*) is fast to compute.
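Both routes can be sketched with toy numbers (my own sketch and names; a real puzzle would use primes hundreds of digits long and a huge \(t\), and *a* must be coprime to *n* for Euler’s theorem to apply):

```haskell
-- Square-and-multiply modular exponentiation.
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 _ = 1
powMod b e n
  | even e    = powMod (b * b `mod` n) (e `div` 2) n
  | otherwise = b * powMod b (e - 1) n `mod` n

-- The slow way: t sequential squarings, computing a^(2^t) mod n.
slowUnlock :: Integer -> Integer -> Integer -> Integer
slowUnlock n a t = iterate (\x -> x * x `mod` n) (a `mod` n) !! fromIntegral t

-- The fast way: with the trapdoor p, q, reduce the exponent mod φ(n).
fastUnlock :: Integer -> Integer -> Integer -> Integer -> Integer
fastUnlock p q a t = powMod a e n
  where
    n = p * q
    e = powMod 2 t ((p - 1) * (q - 1))   -- x = 2^t mod (p-1)(q-1), cheap
```

For example, with *p* = 5, *q* = 11, *a* = 3, and *t* = 10, both `slowUnlock 55 3 10` and `fastUnlock 5 11 3 10` produce the same residue.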

Instead of encrypting a long message *M*, we encrypt *M* with a private key *K*
from some other cryptosystem, and place *K* in a time-lock puzzle.
Let *C*_{M} be the cipher text corresponding to *M*.

Let \[ C_K = K + a^{(2^t)}~(\mathsf{mod} {~n})\]

Computing *C*_{K} makes use of the fact that *n* = *pq*.

We make public *C*_{K}, *C*_{M}, *n*, *a*, and *t*, and make sure that *p* and *q* remain secret.
The person who wishes to unlock the time-lock puzzle must compute \(a^{(2^t)}\)
in order to compute *K* from *C*_{K}.
Using *K*, they then decrypt *M* from *C*_{M}.

The RSW time-lock puzzle is only the beginning of the story.
For a recent survey, see the master’s thesis of Ceylin Doğan^{3}.

I’ve only found references to this dead link: http://www.hks.net/cpunks/cpunks- 0/1460.html↩︎

Rivest, R.L., Shamir, A., Wagner, D.A.: Time-lock puzzles and timed-release crypto↩︎

Dogan, C.: A Comprehensive Study of Time Lock Puzzles and Timed Signatures in Cryptography↩︎

math, logic, pigeonhole

The (finite) pigeonhole principle states that if you have more labels than objects (with everything finite) and you want to assign all the labels to the objects, at least one of the objects will have multiple labels. Or, using the imagery of pigeons, if you have more pigeons than holes, you cannot place all the pigeons into the holes without at least one hole having multiple pigeons. It is an obvious statement which turns out to be surprisingly useful. Shockingly useful! The statement is attributed to Dirichlet, who actually used a metaphor involving letters (the kind you mail).

There are many great examples of proofs that involve applying the pigeonhole principle. The Mutilated chessboard problem is one of my favorites (hint: each domino covers exactly one black and one white square), but perhaps my favorite is Fermat’s theorem on sums of two squares.

The theorem states: for any odd prime number \(p\), \(p\) can be written as the sum of two squares exactly when it is congruent to one modulo four. In symbols:

\[ p=x^{2}+y^{2} \quad\text{if and only if}\quad p \equiv 1 \pmod 4 \]

Before we get to the fun part of the proof, we must narrow things down. The “only if” direction of the proof is straightforward: since \(p\) is an odd prime, it cannot be congruent to 0 or 2 modulo 4, and 3 can be ruled out by a case analysis on all the sums of squares modulo 4.

The “if” direction is where the fun begins. We make one simplification, using Euler’s criterion, before bringing out the pigeons. Assuming that \(p \equiv 1 \pmod 4 \), Euler’s criterion tells us that there is an \(x\) such that \(x^{2}+1 \equiv 0 \pmod p \). The curious reader can look up the proof of Euler’s criterion (it is fairly short), but we’ll skip it so that we can jump straight to naming pigeons.

We declare one pigeon for each pair of natural numbers \(u, v\) such that \(0 \leq u,v < \sqrt p\). How many pigeons do we have? We have \((\lfloor \sqrt p \rfloor + 1)^{2}\) pigeons, which is definitely greater than \(p\). Foreshadowing!

The holes will be the numbers modulo \(p\), and we will place pigeon \(u, v\) into the hole \(ux - v \pmod p\). The pigeonhole principle tells us that there are numbers \(u_1, u_2, v_1, v_2\) such that \[u_1 x - v_1 \equiv u_2 x - v_2 \pmod p\] A little bit of algebra yields: \[(u_1 - u_2)x \equiv (v_1 - v_2) \pmod p\] \[(u_1 - u_2)^{2}x^{2} \equiv (v_1 - v_2)^2 \pmod p\] \[(u_1 - u_2)^{2}(-1) \equiv (v_1 - v_2)^2 \pmod p\] \[0 \equiv (u_1 - u_2)^{2} + (v_1 - v_2)^2 \pmod p\]

Note that either \((u_1 - u_2)\) or \((v_1 - v_2)\) must be nonzero since the numbers came from two distinct pigeons.

Therefore \[0 < (u_1 - u_2)^2 + (v_1 - v_2)^2 < (\sqrt p)^2 + (\sqrt p)^2 = 2p\] We have then that \((u_1 - u_2)^2 + (v_1 - v_2)^2\) is a multiple of \(p\) strictly between \(0\) and \(2p\), leaving only \(p\). This completes the proof.
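The theorem is also easy to sanity-check computationally. Here is a hedged Haskell sketch (names are mine) that searches for a decomposition and compares the result against the mod-4 condition:

```haskell
-- Brute-force search for x <= y with x^2 + y^2 == p.
twoSquares :: Int -> Maybe (Int, Int)
twoSquares p =
  case [(x, y) | x <- [0 .. r], y <- [x .. r], x * x + y * y == p] of
    (xy : _) -> Just xy
    []       -> Nothing
  where r = floor (sqrt (fromIntegral p :: Double))

-- The statement of the theorem, for an odd prime p.
fermatTwoSquares :: Int -> Bool
fermatTwoSquares p = (twoSquares p /= Nothing) == (p `mod` 4 == 1)
```

For instance, `twoSquares 13` finds \((2, 3)\), and `fermatTwoSquares` holds for every small odd prime.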

There is a strong connection between the pigeonhole principle and mathematical induction. People have studied weak proof systems and have found that adding principles of induction to these weak systems has the same effect as adding the pigeonhole principle as a new axiom. The connection goes quite deep, but requires building up some terminology around the syntax of formulas. The book Metamathematics of First-Order Arithmetic by Hájek and Pudlák explains this in detail (see Chapter 1, Section 2, part b).

The most intuitive explanation that I have found of this connection comes from the Hájek and Pudlák book. A failure of induction translates very easily into a failure of the pigeonhole principle.

A failure of induction, for a formula \(\phi (x)\), would look like this:

- \(\phi (0)\) is true
- for every \(x\), if \(\phi (x)\) is true, then so is \(\phi (x+1)\)
- \(\phi (a)\) is false for some \(a\)

We can translate this into a failure of the pigeonhole principle as follows:

Our pigeons are the numbers \(\leq a\). The holes are the numbers \(< a\). We place pigeons in holes as follows:

- if \(\phi (x)\) holds, put pigeon \(x\) into hole \(x\)
- if \(\phi (x)\) does not hold, put pigeon \(x\) into hole \(x - 1\)

(Note that the second case never needs hole \(-1\), since \(\phi (0)\) holds.) No hole ends up with two pigeons: hole \(x\) could only receive pigeon \(x\) (when \(\phi (x)\) holds) and pigeon \(x+1\) (when \(\phi (x+1)\) fails), and both happening at once would contradict the induction step. So all \(a+1\) pigeons sit in \(a\) holes with no hole repeated, violating the pigeonhole principle.

This intuition is not constructive, and I would love a more constructive view of the connection between induction and the pigeonhole principle.

It also turns out that induction is comparable to some bounding principles. If we view a formula \(\phi (x, y)\) as a partial, multi-valued function, thinking of \(f (x) = y\) exactly when \(\phi (x, y)\), then a bounding principle is a way of stating that if the domain of a function is bounded, then so is its range.

There is also an infinite version of the pigeonhole principle, which states that you cannot assign an infinite number of labels to a finite collection of objects without labeling at least one object with an infinite number of labels. The contrapositive states that a finite union of finite sets is finite.

According to Akihiro Kanamori, in The Mathematical Infinite as a Matter of Method, the pigeon hole principle may have had a role in making the definition of “infinite” rigorous:

In 1872 Dedekind was putting together Was sind und was sollen die Zahlen?, and he would be the first to define infinite set, with the definition being a set for which there is a one-to-one correspondence with a proper subset. This is just the negation of the Pigeonhole Principle. Dedekind in effect had inverted a negative aspect of finite cardinality into a positive existence definition of the infinite.

The connection between the pigeonhole principle, induction, and bounding principles holds in the infinite case as well. See Jeff Hirst’s thesis, theorem 6.4 on page 104.

Ramsey’s theorem is a generalization of the pigeonhole principle. It has both a finite and an infinite version, which itself has a host of generalizations and leads to a whole field of study.

One special case of Ramsey’s theorem states that any group of six people must contain three people such that one of the following holds:

- all three people know each other
- none of them know each other

We can re-phrase this in terms of graph theory: if you color the edges of the complete graph on six vertices with two colors, say red and blue, there is a sub-graph of size three such that either all the edges are red or all the edges are blue. The size three sub-graph is called homogeneous or monochromatic.
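This special case is small enough to verify exhaustively: there are only \(2^{15}\) two-colorings of the 15 edges of the complete graph on six vertices. Here is a quick sketch (my own code, with my own names):

```haskell
import Data.Bits (testBit)

-- The 15 edges of the complete graph on vertices 0..5, in a fixed order.
k6Edges :: [(Int, Int)]
k6Edges = [(i, j) | i <- [0 .. 5], j <- [i + 1 .. 5]]

-- A coloring is an Int whose bits give the color (red/blue) of each edge.
hasMonochromaticTriangle :: Int -> Bool
hasMonochromaticTriangle coloring =
  or [ color (a, b) == color (b, c) && color (b, c) == color (a, c)
     | a <- [0 .. 5], b <- [a + 1 .. 5], c <- [b + 1 .. 5] ]
  where
    color e = testBit coloring (head [i | (i, e') <- zip [0 ..] k6Edges, e' == e])

-- Every 2-coloring of K6 contains a monochromatic triangle.
ramsey33 :: Bool
ramsey33 = all hasMonochromaticTriangle [0 .. 2 ^ 15 - 1]
```

Running `ramsey33` checks all 32768 colorings and confirms the claim.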

The finite version of Ramsey’s theorem generalizes the above statement to any number of colors, any size of homogeneous set, and multi-dimensional graphs. The size of the complete graph needed to guarantee a homogeneous set of a given size grows extremely fast.

I’ll end this post with a fun quote from Paul Erdős:

Suppose aliens invade the earth and threaten to obliterate it in a year’s time unless human beings can find the Ramsey number for red five and blue five. We could marshal the world’s best minds and fastest computers, and within a year we could probably calculate the value. If the aliens demanded the Ramsey number for red six and blue six, however, we would have no choice but to launch a preemptive attack.

rubik's cube, group theory, haskell

Permutations and symmetry are central themes of group theory,
so it is perhaps not surprising that the Rubik’s Cube has a nice algebraic description.
The first such description was given by David Singmaster in his 1979 book
*Notes on Rubik’s Magic Cube*.
David Joyner used the algebra in Singmaster’s book to write an entire introductory
book on group theory, called *Adventures in Group Theory*. The book is fantastic!

Implementing the Rubik’s Cube Group in Haskell is quite simple, especially given all the algebra readily available in the language. What is more, it is really fun to play with the implementation and translate known Rubik’s Cube algorithms into group elements. Concepts like conjugation and commutators, which are important in the study of non-commutative groups, are helpful tools for solving the Rubik’s Cube.

I am going to describe the Rubik’s Cube group and how I implemented it in Haskell (available here). For more details, to dive deeper, or to read about other similar puzzles, check out Adventures in Group Theory.

Here is an outline:

- The illegal Rubik’s Cube group
- Corner and edge orientations and permutations
- Semidirect products
- Haskell implementation of the illegal cube group
- The (legal) Rubik’s Cube group
- Haskell implementation of the legal cube group

**I highly recommend grabbing a cube to use while you read!**

If you take the Rubik’s Cube apart (without peeling off any stickers)
and put it back together any way that the pieces will fit, you get a permutation of the
fifty-four stickers which may or may not be a solvable Rubik’s Cube anymore.
The collection of all such permutations is what Joyner calls the
**illegal Rubik’s Cube group**.
Investigating this group provides insight into the actual Rubik’s Cube group.
Later I will give a description of the *legal* Rubik’s Cube group as a subgroup of the illegal one.

Notice that the Rubik’s Cube is made out of eight corner pieces and twelve edge pieces. The key observation for the cube algebra is what Joyner calls the first fundamental theorem of cube theory (theorem 9.6.1):

A position of the Rubik’s Cube is completely determined by:

- how the corners are permuted
- how the corners are oriented
- how the edges are permuted
- how the edges are oriented

For example, consider the standard Singmaster move `R`:

In the spirit of Stefan Pochmann’s blindsolving mnemonics, let us give some corners memorable names:

- red-yellow-green - Robin
- white-blue-red - Papa Smurf
- white-red-green - the Grinch
- blue-yellow-red - Superman

I am using the standard “minus yellow” coloring scheme here, where green is opposite blue and white is opposite yellow. The corners are permuted by R according to the following cycle:

Superman ⮕ Robin ⮕ the Grinch ⮕ Papa Smurf ⮕ Superman

Just saying that Papa Smurf moves to Superman’s position is only half of the story. He lands in this position with the white sticker on the “blue face”. In some sequences of moves, he could land with the white sticker facing the “blue face”, the “yellow face”, or the “red face”. The point of the orientations is to specify these positions. Each corner will have three possible orientations and each edge will have two possible orientations.

The orientations are given relative to an arbitrary, but fixed, standard reference. For the corners, mark each white and yellow sticker with a plus sign. The standard reference for the corners is the position of the plus sign on the solved cube. Given an element of the illegal cube group, the orientation of a corner is the number of clockwise rotations needed to move the plus sign to match the standard reference. For example, consider again Papa Smurf after the move R. One clockwise turn is required to point the white sticker to the “yellow face”, so this corner has orientation 1 after the move R. Superman needs two clockwise turns to point the yellow sticker to the “yellow face” after the move R, so it has orientation 2. Similarly, the Grinch has orientation 2 after R and Robin has orientation 1.

The orientations for the edges are defined in the same way, relative to some standard reference.

Let \( \mathbb{Z}_m \) be the integers modulo \(m\), \( \mathbb{Z}_m^{n} \) the group of vectors over \( \mathbb{Z}_m \) of length \(n\), and \( \mathsf{S}_n \) be the group of permutations of \(n\)-element sets.

Then the **elements** of the **illegal Rubik’s Cube group** are described by
\[
(\mathbb{Z}_3^{8}\times\mathsf{S}_{8})
\times
(\mathbb{Z}_2^{12}\times\mathsf{S}_{12}),
\]

where \( \mathbb{Z}_3^8 \) gives the eight corner orientations, \( \mathsf{S}_8 \) gives the corner permutation, \( \mathbb{Z}_2^{12} \) gives the twelve edge orientations, and \( \mathsf{S}_{12} \) gives the edge permutation.

Since \( \mathbb{Z}_n^m \) and \( \mathsf{S}_n \) are both groups, a group operation can be defined coordinate-wise on \( (\mathbb{Z}_3^{8}\times\mathsf{S}_{8}) \times (\mathbb{Z}_2^{12}\times\mathsf{S}_{12}) \), but this operation does not match the Rubik’s Cube. Consider the moves U, R, and UR:


Focusing only on the corners, consider the group operation on \( \mathbb{Z}_3^{8}\times\mathsf{S}_{8} \). Write U in terms of its corner orientations and permutation as \( (U_o, U_p) \). Similarly, write R as \( (R_o, R_p) \). What pair of orientations and permutation does \( UR=(U_o, U_p) \bullet (R_o, R_p) \) result in?

The coordinate-wise operation in the second coordinate matches the Rubik’s Cube: notice that UR permutes the corners according to \( R_p\circ U_p\).

This situation is different, however, for the orientations.
Notice that U does not change the orientation of any of the corners,
and that from the solved state R adds 1 to Robin’s orientation.
*But*, from the solved state UR adds 2 to Robin’s orientation.

If you are holding the cube and watching UR in action, it is clear what is happening:
\( U_p \) moves Robin to Superman’s spot, and then \( R_o \)
adds 2 to the orientation of the corner now occupying Superman’s spot, namely Robin.
In other words, from the solved state, R adds 2 to Superman’s orientation,
but if some other corner is occupying that position, then R adds 2 to *that* corner instead.

The corner orientations of UR are therefore: \[ U_o \bullet R_o = U_o + R_o’ \] where \( + \) is the usual operation on \(\mathbb{Z}_3^8 \), and \( R_o’ \) is the result of permuting the indices of \( R_o \) according to \( U_p \).

In general, let \( \phi_p(v) \) denote the vector obtained by permuting the indices of \(v\) according to \(p\). The operation on \( \mathbb{Z}_3^{8}\times\mathsf{S}_{8} \) is defined by

\[ (v, p) \bullet (w, q) = (v+\phi_{p}(w),~q\circ p) \]

This group is written as
\[ \mathbb{Z}_3^{8}\rtimes_\phi\mathsf{S}_{8} \]
or sometimes just \( \mathbb{Z}_3^{8}\rtimes\mathsf{S}_{8} \),
and is called the (external) **semidirect product** of
\( \mathbb{Z}_3^{8} \) and \( \mathsf{S}_{8} \) with respect to \( \phi \).
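To make the twist concrete, here is a tiny list-based sketch of this operation, with plain lists standing in for \(\mathbb{Z}_3^n\) and \(\mathsf{S}_n\) (all names here are mine, not from the implementation described later):

```haskell
type OVec = [Int] -- an orientation vector over Z_3
type Prm  = [Int] -- a permutation: position i of the result reads index (p !! i)

-- phi p v permutes the indices of v according to p.
phi :: Prm -> OVec -> OVec
phi p v = map (v !!) p

-- The semidirect product operation: (v, p) • (w, q) = (v + phi_p w, q ∘ p).
-- In this index-list representation, the composition q ∘ p is `map (q !!) p`,
-- chosen so that phi (map (q !!) p) == phi p . phi q.
sdp :: (OVec, Prm) -> (OVec, Prm) -> (OVec, Prm)
sdp (v, p) (w, q) =
  (zipWith (\a b -> (a + b) `mod` 3) v (phi p w), map (q !!) p)
```

For example, combining a pure 3-cycle with a pure twist shows the twist landing on a permuted index: `sdp ([0,0,0], [1,2,0]) ([1,0,0], [0,1,2])` gives `([0,0,1], [1,2,0])`, just like Robin picking up orientation in Superman’s spot.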

The corners of the illegal Rubik’s Cube group are given by \( \mathbb{Z}_3^{8}\rtimes\mathsf{S}_{8} \) and similarly the edges are given by \( \mathbb{Z}_2^{12}\rtimes\mathsf{S}_{12} \). Moreover, the illegal Rubik’s Cube group is given by \[ (\mathbb{Z}_3^{8}\rtimes\mathsf{S}_{8}) \times (\mathbb{Z}_2^{12}\rtimes\mathsf{S}_{12}) \]

This is proposition 11.1.1 of Joyner.

The semidirect product is more general than the construction in the last section.

Let \(\mathsf{Aut}(G)\) denote the group of automorphisms of a group \(G\) (i.e. the isomorphisms of \(G\) with itself under composition).

Let \(\phi:H\to\mathsf{Aut}(G)\) be a group homomorphism. Then \[ G\rtimes_\phi H\] is the group whose elements are \( G\times H\) and whose operation is given by \[ (a, x) \bullet (b, y) = (a\bullet\phi_x(b),~x\bullet y) \]

The illegal Rubik’s Cube group was described using \( \mathbb{Z}_m^{n}\rtimes_\phi\mathsf{S}_{n} \), where \( \phi_p(v) \) is the vector obtained by permuting the indices of \(v\) according to \(p\). It is easy to check that this \(\phi\) is a homomorphism from \( \mathsf{S}_{n} \) to \( \mathsf{Aut}(\mathbb{Z}_m^{n}) \).

Sometimes semidirect products are described as direct products with a “twist”. The twist is the replacement of \(b\) by \(\phi_x(b)\).

This subsection can be skipped. The semidirect product used in the illegal Rubik’s Cube group is actually an example of a specific kind of semidirect product called a wreath product.

A (left) group action is a function \[\phi: H\to (X\to X)\] from a group \(H\) to functions on a set \(X\), such that the identity in \(H\) is mapped to the identity function on \(X\), and which respects the group operation: \[\phi_{gh}=\phi_g\circ\phi_h\]

If \(\phi\) is a group action on \(X\), and \(G\) is a group, then there is a homomorphism \[ \Phi : H\to\mathsf{Aut}\left(\prod_XG\right) \] where \(\prod_XG\) is the direct product of \(G\) with itself using \(X\) as an index set. The definition of \( \Phi_h(v) \) is exactly like the construction used in the Rubik’s Cube: it is the vector obtained by permuting the indices of \(v\) according to \(\phi_h\).

Therefore an action \(\phi\) of \(H\) on \(X\) can be used to form a semidirect product \[\prod_XG \rtimes_\Phi H\]

This construction is called a **wreath product** and can be written as:
\[G \wr_X H\]
or just \(G \wr H\).

Bringing it back to the illegal Rubik’s Cube, notice that there is a group action of \(\mathsf{S}_n\) on \(\{1, 2, \ldots, n\}\) given by applying the permutations to the set of \(n\) numbers. Therefore the illegal Rubik’s Cube group is described by a wreath product: \[ (\mathbb{Z}_3\wr\mathsf{S}_{8}) \times (\mathbb{Z}_2\wr\mathsf{S}_{12}) \]

The implementation mostly entailed gluing together existing libraries and providing an explicit translation of the Singmaster moves to \( (\mathbb{Z}_3^{8}\rtimes\mathsf{S}_{8}) \times (\mathbb{Z}_2^{12}\rtimes\mathsf{S}_{12}) \).

For \(\mathbb{Z}_m\), I used the modular arithmetic package. I wrapped it in a newtype in order to specify it as a group under addition.

`newtype Cyclic n = Cyclic (Mod Int n)`

For \(\mathbb{Z}_m^n\), I used fixed size vectors over \(\mathbb{Z}_m\).

For \(\mathsf{S}_n\) I used fixed size vectors of length \(n\) over \(\mathbb{Z}_n\). For example, the vector \([2, 1, 0]\) corresponds to the permutation \[ \sigma=\left( \begin{array}{ccc} 0 & 1 & 2\\ 2 & 1 & 0\end{array} \right) \]

`newtype Perm n = Perm (VecList n (Mod Int n))`

This type, unfortunately, admits instances that are not permutations,
since nothing prevents values from being repeated.
For example, the vector `[0, 0]` is not a permutation even though it type checks.

For this reason, I made a function `mkPerm` that only creates proper permutations.
The guarantee is achieved by only creating permutations from sequences of transpositions.
A transposition is defined as two modular integers:

`data Trnsp n = Trnsp (Mod Int n) (Mod Int n)`

which is technically not correct (nothing stops the two entries from being equal, giving a “fake” transposition that fixes everything), but does not do much harm. The biggest issue is that you must be careful when computing the sign of a permutation not to count the fake transpositions. Here is a way to construct all proper permutations:

```
mkPerm :: forall n. Arity n => [Trnsp n] -> Perm n
mkPerm sws = Perm $ DVF.map (\z -> foldl evalTrnsp z sws) (generate toMod)
  where
    evalTrnsp z (Trnsp x y)
      | z == x = y
      | z == y = x
      | otherwise = z
```

To make working with transpositions easier, I created two operators which mimic the usual cycle notation. The cycle \( (3~2~1~0) = (3~2)(3~1)(3~0) \) can be written as:

`(3 ~~> 2 ~> 1 ~> 0)`

which is turned into:

`[Trnsp 3 2, Trnsp 3 1, Trnsp 3 0]`

This is intended for creating cycles, though this is not enforced. For example,

`(0 ~~> 1 ~> 2 ~> 1 ~> 3)`

creates \( (0~1)(0~2)(0~1)(0~3) = (0~3)(1~2) \), etc.
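The post does not show the definitions of these operators, but their shape is easy to guess. Here is a hedged sketch over plain `Int`s, with a stand-in type `T` in place of `Trnsp` (all names here are mine):

```haskell
-- A stand-in transposition type over plain Ints (the real code uses Mod Int n).
data T = T Int Int deriving (Eq, Show)

infixl 5 ~~>, ~>

-- Start a cycle: x ~~> y is the single transposition (x y).
(~~>) :: Int -> Int -> [T]
x ~~> y = [T x y]

-- Extend a cycle: each new element pairs with the cycle's first element.
(~>) :: [T] -> Int -> [T]
ts ~> y = ts ++ [T x y]
  where T x _ = head ts
```

With these definitions, `3 ~~> 2 ~> 1 ~> 0` evaluates to `[T 3 2, T 3 1, T 3 0]`, matching the expansion \((3~2)(3~1)(3~0)\) above.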

Semidirect products are defined in
monoid extras.
The package provides the constructors `Semi` and `Action`, each of which takes two type parameters,
corresponding to the two components of the product.
Thinking of a semidirect product as a direct product with a twist,
`Action` is used to define the twist \(\phi\) in the first coordinate.
For the Rubik’s Cube, we saw the twist when adding two orientation vectors together,
since the second vector had to have its indices permuted according to the permutation
associated with the first vector.
(Recall Robin having his orientation changed while in Superman’s original position.)

Once an instance of `Action` is given, and provided the two types are monoids,
the corresponding instance of `Semi` will have the semidirect product operation.

Therefore \( \mathbb{Z}_m^{n}\rtimes\mathsf{S}_{n} \) can be implemented with:

```
instance Arity n => Action (Perm n) (VecList n (Cyclic m)) where
  act (Perm p) v = map (v !) p
```

```
instance (Arity n, KnownNat m) => Group (Semi (VecList n (Cyclic m)) (Perm n)) where
  invert g = tag (act p' (invert v)) p'
    where
      (v, p) = unSemi g
      p' = invert p
```

Note that the action `act` *is not safe* for an arbitrary `Perm`,
but it is safe for permutations made with `mkPerm`.

Note also that inverses in the semidirect product are given by
\[ (v, p)^{-1} = \left(\phi_{p^{-1}}(v^{-1}),~p^{-1}\right)\]
and **not** by \( (v^{-1},~p^{-1}) \). The “twist” must be unwound.
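The formula can be checked directly against the operation \( (v, p) \bullet (w, q) = (v+\phi_{p}(w),~q\circ p) \), using that \(\phi\) is a homomorphism (so \(\phi_p\circ\phi_{p^{-1}}\) is the identity):

\[
(v,~p)\bullet\left(\phi_{p^{-1}}(v^{-1}),~p^{-1}\right)
=\left(v+\phi_{p}\left(\phi_{p^{-1}}(v^{-1})\right),~p^{-1}\circ p\right)
=\left(v+v^{-1},~e\right)
=(e,~e)
\]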

The illegal Rubik’s Cube group is implemented as:

```
type Corners = Semi (VecList 8 (Cyclic 3)) (Perm 8)
type Edges = Semi (VecList 12 (Cyclic 2)) (Perm 12)
data IRubik = IRubik Corners Edges
```

The following function provides a convenient way to construct instances of `IRubik`:

```
mkIRubik :: [Int] -> [Trnsp 8] -> [Int] -> [Trnsp 12] -> IRubik
mkIRubik co cp eo ep =
  IRubik
    (tag (fromList (map (Cyclic . toMod) co)) (mkPerm cp))
    (tag (fromList (map (Cyclic . toMod) eo)) (mkPerm ep))
```

The elements of the illegal Rubik’s Cube group which are also *legal* moves
are exactly those moves which can be described as a sequence of the basic Singmaster moves:
F, U, R, B, D, and L.
Therefore the legal moves can be easily expressed in
\(
(\mathbb{Z}_3^{8}\rtimes\mathsf{S}_{8})
\times
(\mathbb{Z}_2^{12}\rtimes\mathsf{S}_{12})
\)
by translating the basic moves and making use of the group operation.

The inverses are also translated for convenience.

An abstract basic move is defined as:

`data Move = F | U | R | B | D | L | F' | U' | R' | B' | D' | L'`

Translating the moves involves the nitty-gritty details of labeling the corners and edges with numbers and specifying the standard references for the orientations. The Haddocks/comments of the implementation contain these details, but this is not necessary for a high-level understanding.

The forward move F is given by:

```
f :: IRubik
f = mkIRubik
      [1, 2, 0, 0, 2, 1, 0, 0] (0 ~~> 4 ~> 5 ~> 1)
      [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0] (0 ~~> 4 ~> 8 ~> 5)
```

From this you can see that F performs a four-cycle on both the corners and the edges, and you can see how the corner and edge orientations change.

Constraining the creation of elements of the illegal cube group to those given by sequences of basic moves gives an implementation of the legal cube group:

`newtype Rubik = Rubik { illegal :: IRubik }`

```
mkRubik :: [Move] -> Rubik
mkRubik = Rubik . foldMap moveToIR
  where
    moveToIR F = f
    moveToIR F' = invert f
    .
    .
    .
```

Much more can be said about the legal cube group. We now embark on the journey for a nice algebraic description.

The (legal) Rubik’s Cube group is the subgroup of the illegal Rubik’s Cube group consisting of all elements \[ (v,~r,~w,~s)\in (\mathbb{Z}_3^{8}\rtimes\mathsf{S}_{8}) \times (\mathbb{Z}_2^{12}\rtimes\mathsf{S}_{12}) \] satisfying:

- “equal parity as permutations”: \[ \mathsf{sign}~r=\mathsf{sign}~s\]
- “conservation of total twists”: \[ v_1+\ldots+v_8\equiv 0~(\mathsf{mod}~3)\]
- “conservation of total flips”: \[ w_1+\ldots+w_{12}\equiv 0~(\mathsf{mod}~2)\]

This is theorem 11.2.1 of Joyner, also called the second fundamental theorem of cube theory, attributed to Ann Scott, which we now prove.

Proving that the moves of the legal cube group satisfy the properties above is fairly straightforward. First check that each of the basic moves satisfies the properties. Then show that an arbitrary sequence of basic moves \(X_0 X_1 \ldots X_k\) also satisfies them. Proving “equal parity as permutations” amounts to noticing that everything in sight is a homomorphism. The conservation properties are proved by induction on the length \(k\).

The other direction is more interesting. We must show that any element of the illegal Rubik’s Cube group which satisfies the three properties can be written as a sequence of the standard moves. This amounts to providing a sequence of basic moves which returns the cube to the solved state. This will be done in two steps. First we show how to return the corners and edges to their original position in the solved cube, while preserving the three properties. Then we show how to reorient the corners and edges without permuting anything.

Given an arbitrary element of the illegal group satisfying the properties, we must show how to place every corner and edge in the correct position using only basic moves, while preserving the properties.

The following three sequences will be instrumental (written as unit tests to make it clear what they do):

```
corner3cycle :: Bool
corner3cycle =
  illegal (mkRubik [U, R, U', L', U, R', U', L]) ==
  mkIRubik [0, 2, 2, 2, 0, 0, 0, 0] (1 ~~> 3 ~> 2)
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] []
```

```
edge3cycle :: Bool
edge3cycle =
  illegal (mkRubik [R, R, U, R, U, R', U', R', U', R', U, R']) ==
  mkIRubik [0, 0, 0, 0, 0, 0, 0, 0] []
           [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] (0 ~~> 1 ~> 3)
```

```
edgeSwapCornerSwap :: Bool
edgeSwapCornerSwap =
  illegal (mkRubik [R', U, L', U, U, R, U', R', U, U, R, L, U']) ==
  mkIRubik [0, 0, 0, 0, 0, 0, 0, 0] (2 ~~> 3)
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] (1 ~~> 2)
```

These three moves, together with something called conjugation, let us do the following:

- perform a three-cycle on any three corners without permuting anything else
- perform a three-cycle on any three edges without permuting anything else
- swap two corners and swap two edges without permuting anything else

Moreover, each of these maneuvers preserves the properties.

In an arbitrary group \(G\), two elements \(a,~b\in G\) are conjugate if
\[ a = cbc^{-1} \] for some \(c\in G\).
In the Rubik’s Cube, conjugation turns the specific 3-cycle in `corner3cycle`
into an arbitrary 3-cycle.
If \(X\) is an arbitrary sequence of basic moves, and \(Y\) is the
move in `corner3cycle`, then \(XYX^{-1}\) permutes and reorients only
the corners that \(X\) moves to positions 1, 2 and 3, in exactly the same way
that \(Y\) affects the positions 1, 2 and 3 from the solved state.
Similarly, `edge3cycle` together with conjugation allows us to perform arbitrary 3-cycles on the
edges, and `edgeSwapCornerSwap` allows us to swap any two corners and two edges.

Step 1 is almost complete and relies on the fact that the alternating group
\(A_n\) (the even permutations of \(S_n\)) is generated by the 3-cycles.
Let \((v,~r,~w,~s)\) be an arbitrary element of the illegal cube group satisfying the three
properties.
We can assume that the permutations \(r\) and \(s\) are both even, since otherwise they are
both odd and we can apply `edgeSwapCornerSwap` to get two even permutations.
Since we can generate arbitrary 3-cycles with basic moves, we can generate any even permutation,
including \(r\) and \(s\).

First we show that the “conservation of total twists” property guarantees that the corners can be reoriented to the solved positions without permuting anything.

There is a sequence of basic moves that twists one corner clockwise and another corner counter-clockwise, and does nothing else:

```
reorientCorners :: Bool
reorientCorners =
  illegal (mkRubik [R', D', R, D, R', D', R, D, U, D', R', D, R, D', R', D, R, U']) ==
  mkIRubik [1, 0, 0, 2, 0, 0, 0, 0] []
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] []
```

This sequence, together with conjugation, lets us solve the corner orientations of any element satisfying the conservation of twists property.

Similarly the “conservation of total flips” property guarantees that the edges can be reoriented to the solved positions without permuting anything. The following sequence, together with its conjugates, demonstrates this fact:

```
reorientEdges :: Bool
reorientEdges =
  illegal (mkRubik [F, R', F', R', F, F, L, D, R, D', L', R', F, F, R, R]) ==
  mkIRubik [0, 0, 0, 0, 0, 0, 0, 0] []
           [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0] []
```

This completes the proof.

**Fun tangent**: the `reorientCorners` move is interesting algebraically.
For a given group \(G\), the commutator of two elements \(g, h\in G\) is defined as
\[ [g, h]=g^{-1}h^{-1}gh \]
For Abelian groups the commutators are always just the
identity element, but for non-Abelian groups they are very useful.
For the Rubik’s Cube, commutators are useful since \([X, Y]\) will only
affect the corners and edges which are affected by both \(X\) and \(Y\).
The sequence in `reorientCorners` is the commutator \([X, U']\),
where \(X\) is the commutator \([R, D]\). Note that \(X\) has order 3,
and so \(XX=X^{-1}\).

We can write the legal Rubik’s Cube group in a more compact way that absorbs the properties.

First note that we can rewrite the group \[ \left\{ (v,~p)\in\mathbb{Z}_m^{n+1}\rtimes_\phi\mathsf{S}_{n+1} ~\mid~v_1+\ldots+v_{n+1}\equiv0~(\mathsf{mod}~m) \right\}\] as \[ \mathbb{Z}_m^{n}\rtimes_\phi\mathsf{S}_{n+1} \] since \(v_1+\ldots+v_{n+1}\equiv0~(\mathsf{mod}~m) \) means that \(v_{n+1}\) is always equal to \( -(v_1+\ldots+v_{n}) \) and can be left implicit.

Therefore, after a bit of rearranging, we can write the legal Rubik’s Cube group as \[ \left\{ (v,~r,~w,~s)\in (\mathbb{Z}_3^{7}\times\mathbb{Z}_2^{11}) \rtimes (\mathsf{S}_{8}\times\mathsf{S}_{12}) ~\mid~ \mathsf{sign}~r=\mathsf{sign}~s \right\} \]

Let \[ R_p = \left\{ (r,~s)\in\mathsf{S}_{8}\times\mathsf{S}_{12} ~\mid~ \mathsf{sign}~r=\mathsf{sign}~s \right\} \] Notice that \[ \mathsf{A}_{8} \times\mathsf{A}_{12} < R_p < \mathsf{S}_{8}\times\mathsf{S}_{12} \] Since \(\mathsf{A}_{8} \times\mathsf{A}_{12}\) has index 4 in \(\mathsf{S}_{8}\times\mathsf{S}_{12} \), it has index 2 in \(R_p\) and is therefore normal. Any pair of transpositions \((r,~s)\in\mathsf{S}_{8}\times\mathsf{S}_{12} \) has order 2, and so generates a subgroup of \(R_p\) that is isomorphic to \(\mathbb{Z}_2\) and only intersects \(\mathsf{A}_{8} \times\mathsf{A}_{12}\) at the identity. Moreover these two subgroups generate all of \(R_p\).

The properties that have just been stated for \( \mathsf{A}_{8} \times\mathsf{A}_{12} \) and \( \mathbb{Z}_2 \) inside \(R_p\), namely that there are two subgroups such that:

- one of them is normal
- they have trivial intersection
- they generate the entire group

give an “internal” version of the earlier definition of an “external” semidirect product. It is not obvious, but it can be proved that these two constructions are the same.

Interestingly, the “twist” \(\phi\) in the external construction always ends up being conjugation inside the group. If \(H\rtimes K\) is an external semidirect product, we can identify \(H\) with \(\tilde{H}=\{(h,~e)~\mid~h\in H\}\) inside the product, and similarly for \(K\) in the second coordinate. Then \[ (1,~k)(h,~1)(1,~k)^{-1} =(\phi_k h,~k)(1,~k^{-1}) =(\phi_k h\cdot\phi_k 1,~kk^{-1}) =(\phi_k h,~1) \] On the other hand, if \(H,~K\) are subgroups of a group \(G\) such that \(H\) is normal, then conjugation is a homomorphism \(\phi:K\to\mathsf{Aut}(H)\).

Back to the Rubik’s Cube, \(R_p\) is an (internal) semidirect product: \[R_p=(\mathsf{A}_{8} \times\mathsf{A}_{12})\rtimes\mathbb{Z}_2\]

Putting it all together, the legal Rubik’s Cube group is \[ (\mathbb{Z}_3^{7}\times\mathbb{Z}_2^{11}) \rtimes \big((\mathsf{A}_{8} \times\mathsf{A}_{12})\rtimes\mathbb{Z}_2\big) \] How many states does the Rubik’s cube have? It is now easy to answer: \((3^7\cdot 2^{11}\cdot 8!\cdot 12!)~/~2 = 43{,}252{,}003{,}274{,}489{,}856{,}000\)
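As a quick sanity check on the arithmetic (a throwaway sketch, not part of the implementation):

```haskell
-- Number of legal Rubik's Cube states, straight from the group decomposition:
-- 7 free corner twists, 11 free edge flips, all permutations, halved by parity.
cubeStates :: Integer
cubeStates = (3 ^ 7 * 2 ^ 11 * factorial 8 * factorial 12) `div` 2
  where factorial n = product [1 .. n]
```

Evaluating `cubeStates` gives the famous 43,252,003,274,489,856,000.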

I do not yet have comments enabled on this blog, so feel free to chat by way of an issue on my blog or on the implementation. Happy cubing!

I used VisualCube to create the Rubik’s Cube images.