A Universal FRACTRAN Interpreter in FRACTRAN

A Universal FRACTRAN Interpreter in FRACTRAN

Chris Lomont, May 1, 2017

Introduction

FRACTRAN is a Turing-complete esoteric programming language invented by the prolific mathematician John Conway in 1972 [Con72, Con87].

From Wikipedia (with slight edits):

A FRACTRAN program is an ordered list of positive (reduced) fractions together with an initial positive integer input s. The program is run by updating the integer s (which I call the state of the program) as follows:

  1. for the first fraction f in the list for which sf is an integer, replace s by sf.
  2. repeat this rule until no fraction in the list produces an integer when multiplied by s, then halt.

Note this means each time the process is started at the front of the list.

In The Book of Numbers [ConGuy86], John Conway and Richard Guy gave the following FRACTRAN program (and called it PRIMEGAME) :

{ 17/91 , 78/85 , 19/51 , 23/38 , 29/33 , 77/29 , 95/23 , 77/19 , 1/17 , 11/13 , 13/11 , 15/14 , 15/2 , 55/1 }

Starting with s=2, PRIMEGAME generates the following sequence of integers:
2, 15, 825, 725, 1925, 2275, 425, 390, 330, 290, 770, ... (sequence A007542 in the OEIS).

After the first 2, this sequence contains the following powers of 2 (spaced in the output):
2^2 = 4 , 2^3 = 8 , 2^5 = 32 , 2^7 = 128 , 2^11 = 2048 , 2^13 = 8192 , 2^17 = 131072 , 2 19 = 524288 , ...
(sequence A034785 in the OEIS).

The exponents of the powers of 2 are all the primes in order. PRIMEGAME generates all the prime numbers!

Here is Conway himself explaining it: http://www.uctv.tv/shows/Fractran-A-Ridiculous-Logical-Language-with-John-Conway-23320

A lesser known Conway FRACTRAN program is FIBONACCIGAME:

{17/65, 133/34, 17/19, 23/17, 2233/69, 23/29, 31/23, 74/341, 31/37, 41/31, 129/287, 41/43, 13/41, 1/13, 1/3}

Starting with 78*5^(n-1), it halts on 2^Fn where Fn is the nth Fibonacci number.

For years I have been fascinated by this, but have never had the time to dig into it. Friday night I decided to spend the weekend and really understand FRACTRAN, setting a motivational goal to write a nontrivial FRACTRAN program. After writing a small FRACTRAN interpreter in Mathematica to get a feel for it and to test ideas, I decided the write the FRACTRAN interpreter itself in FRACTRAN.

That’s right, I wrote FRACTRAN in FRACTRAN.

When I started, I did not know at least two others had done this, as recorded in this Code Golf challenge on StackOverflow. StackOverflow user Jesse Beder made one using 1779 fractions], and user Amadeus did it in 84 fractions. I discovered these while trying to understand FRACTRAN. Unfortunately, I did not find any FRACTRAN explanation as thorough as mine, so I decided to write it all up.

Mine ended up using 48 fractions, which, as far as I can tell, is the current record. I would not be surprised if my techniques can be reduced a few more fractions, which I may attempt if I return to this topic someday.

So, now to explain what I learned and how you too can waste a few days.

[Con72] Conway, J. H. "Unpredictable Iterations." In Proceedings of the 1972 Number Theory Conference Held at the University of Colorado, Boulder, Colo., Aug. 14-18, 1972. Boulder, CO: University of Colorado, pp. 49-52, 1972.

[Con87] Conway, J. H. "Fractran: A Simple Universal Programming Language for Arithmetic." Ch. 2 in Open Problems in Communication and Computation (Ed. T. M. Cover and B. Gopinath). New York: Springer-Verlag, pp. 4-26, 1987.

[ConGuy86] Conway, John H.; Guy, Richard K. (1996). The Book of Numbers. Springer-Verlag New York, Inc. ISBN 0-387-97993-X.

FRACTRAN 101

First, a few rules. The state s is required to be greater than 0, otherwise multiplying by any fraction would yield 0, which would repeat indefinitely. It’s not very interesting.

All fractions are required to be in reduced terms for clarity of analysis.

Finally, All fractions are required to have positive numerator and denominator, again to prevent uninteresting edge cases and to simplify the analysis.

Plenty of places work through PRIMEGAME, which is complicated so I will not repeat that. After grokking this post you will be able to reverse-engineer it, and write your own FRACTRAN programs.

To gain understanding, here are a few examples.

Example 1: Addition

The simplest FRACTRAN program is a single fraction.
{3/2}

What does this do to a state s? If s has a prime factor of 2, one such factor is removed, and a factor of 3 is added in, and the process repeats. If the state s does not have a prime factor of 2, the program halts.

So this program replaces factors of 2 with factors of 3, one per pass. Starting with 2^a 3^b will yield 3^(a+b) after enough steps.

So one way to view a single fraction with prime numerator and denominator is an adder of exponents.

Similarly for primes p and q, the fraction p/q sends p^a q^b to p^(a+b) after multiple steps.

Example 2: Subtraction

To subtract a constant k from the exponent of a prime p, the fraction 1/p^k will replace p^(k+a) with p^a. So subtracting by a constant is easy.

Suppose the state is p^a * q^b with and a >= b, and you want to compute p^(a-b)*q^anything. The idea is to subtract values from both p and q exponents, which is possible if both have positive exponents. The program {1/p*q} does this when repeatedly applied.

It will stop when the state is p^(a-b)*q^0. If a <=b to start, it ends on p^0 * q^(b-a). If you want to maintain the value in the exponent of p, then copy it elsewhere, perform the subtraction, and copy it back.

Now you can subtract by constants or by register values.

Example 3: Zero an exponent

Suppose you want to zero out an exponent. The adder gives the idea: take the program {1/2}. Each pass this removes any factor of 2 and replaces it with a 1, effectively removing a 2 from the factorization of s. After enough passes it will remove all factors of 2, setting the exponent of 2 in the prime factorization to 0.

Example 4: Move an exponent

Suppose you have some prime power q^d and you want to move the exponent d to a different prime p. Assume p has exponent 0 to start.

How do you do this?

Well, similar to the addition example, the program {p/q} will return the following sequence when run with state p^0 q^d:

p^0 * q^d, p^1 * q^(d-1), p^2 * q^(d-2), ..., p^d * q^0

at which point it halts. So we moved the value d, and zeroed the exponent of q.

Example 5 – Copy an exponent (and flow control!)

Example 4 showed how to move a value, but what if we want to copy it?

The solution is to make two copies at the same time, then copy one back. Use a third prime r as temporary storage, and assume it’s exponent is also 0. The program {p r / q} will take the initial state p^0 q^d through the steps

p^0 * r^0 * q^d, p^1 * r^1 * q^(d-1), p^2 * r^2 * q^(d-2), ..., p^d * r^d * q^0

at which point it stops.

The first idea to copy the exponent on r back to q is to follow up with the fraction q/r, but as soon as this gives q a positive exponent, the first fraction sends it back to r. We need a way to cause the first fraction to become inactive.

The trick now is to introduce primes that control the flow of the program. This lets the first fraction run to completion, then get "disabled", and run the second one till completion.

Since this gets messy in letters, here is a concrete version: given 2^d 7, convert this to 2^d 3^d * 13.

Take the time to understand how this works. It is a pretty illustrative program using flow control to effect an outcome.

Before we do more examples, an interlude:

Insight 1 : the register or variable viewpoint

A FRACTRAN state is completely characterized by the exponents of the prime powers in it’s factorized form. Thus a state s is

s = 2^r2 * 3^r3 * 5^r5 * 7^r7 * 11^r7 * .....

where the ri are non-negative integers, and all but a finite number are zero at any step.

A FRACTRAN program operates on these exponents, adding and subtracting from them.

A useful way to view a FRACTRAN program state is as an infinite tuple of non-negative integers (r2,r3,r5,r7,r11,...) where all but a finite number of entries are 0.

Thus the state 2^1*3^8*7^20 can be viewed as the following state:
(1, 3, 0, 20, 0, 0, ....).

A FRACTRAN program operates on this, adding and subtracting from the state, only modifying a finite number of locations at once.

So what does a fraction acting on a state do to the state?

Factoring the fraction f and state s into prime powers yields form

subject to

  1. Primes pi are disjoint from primes qj since f is reduced.
  2. All ri and sj are >= 1
  3. All ai and bj are >= 0.
  4. Other prime powers may occur in s, but are not affected by this operation.

fs is in integer if and only if all sj <= bj. This is the condition for the fraction to modify the state.

If fs is an integer, the new state is

Then the control flow goes back to the first fraction.

Instead of all this notation with primes, which are merely placeholders, look only at the state tuple of exponents exponents, and label the slots as v1, v2, v3, …. Note these numbers are not primes, they are the index of a prime. So v1 corresponds to prime 2, v2 corresponds to prime 3, v3 corresponds to prime 5, etc.

For all primes in the denominator of f, let I be the set of indices. For all primes in the numerator of f, let J be the set of indices. I and J are disjoint since f is reduced. Let ki be the exponent of the ith prime, which is a constant defined by the fraction. vi holds the corresponding slot in the state s.

Then multiplication of f by s becomes (// is a comment)

A FRACTRAN program is a list of such operations. This is the only operation.

Hopefully this last viewpoint is more in line with programmer-speak, making FRACTRAN more transparent.

Now some more examples.

Example: loops

The copy example showed a simple loop. The same idea generalizes and can be used to loop over many statements.

For example, if you know each line will work, here is an infinite loop

Example: Subroutines

Start with state s=41

This halts after 64 steps with the state 2^14. Surprisingly the largest state value during execution was 653663743948999527918944098014904289807469568.

Example: Multiplication

Multiplication of a times b can be done by adding a to a counter c, and doing it b many times. The resulting is a double loop (and necessary copying).

With input 2^a 3^b the program {455/33,11/13,1/11,3/7,11/2,1/3} produces output 5^(ab) .

It is worked through on the web many places. Wikipedia has a particularly good explanation, so I won’t detail it here.

Example: Div and Remainder

Division and remainder can be implemented with pseudo-code like

The only hard part is the while loop. This can be done by trying to subtract d from r as in the subtraction example, and checking if d was >= 1 at the end, in which case the subtraction failed. In the failure care, fix up the answer and return. If subtraction succeeded, loop again.

Now with these examples in mind, on to the overall goal, writing a FRACTRAN interpreter in FRACTRAN.

FRACTRAN FRACTRAN

The goal is to make a FRACTRAN program that runs any other FRACTRAN program.

The process

  1. Start with normal pseudo-code
  2. Transform into a register based version
  3. Label control flow tokens
  4. Replace registers and flow tokens with primes

The end result is fairly pretty. I hope you like it. It was a full weekend working all this out.

Notation

I’ll call the desired FRACTRAN interpreter CLF-INTERPRET (for Chris Lomont’s FRACTRAN INTERPRETer, in the spirit of Conway’s PRIMEGAME, and to distinguish it from the larger FRACTRAN interpreters mentioned in the introduction).

For CLF-INTERPRET to run any another FRACTRAN program, it will have to take as input state both the original FRACTRAN program and the original state encoded as an integer. The interpreter will decode the fractions, apply them to the simulated state, mark outputs, and halt if the original halts.

For example, if the FRACTRAN program to interpret is encoded as integer p, and the interpreted state is s, this could be stored for a CLF-INTERPRET state as

2^p 3^s

and CLF-INTERPRET can use other prime exponents for state to interpret the program in p.

PROGRAM-1 : Pseudo-code

Start with pseudo-code for a FRACTRAN interpreter, using C-like syntax without braces. Notation and assumptions:

  1. All integers are non-negative.
  2. a / b means divide integer a by b, round down, just like in C.
  3. a % b is modulus, returning the remainder of a divided by b.
  4. Indentation denotes blocks, like Python.
  5. // starts a comment that runs to the end of the line.
  6. Labels are of form label:, and are used for loops since goto is easy in FRACTRAN.
  7. Printing works. Later this will be done via a prime tag in the state.
  8. Multiplication, division, and modulus can be done.
  9. The initial program is passed in as a positive integer p.
  10. The initial state is passed in as a positive integer s.

Fleshing out the pseudo-code directs the encoding method. The initial idea is to encode digits from the fractions with "spacing" symbols to separate them. Using the symbols 0 to 9 for the digits in the original program and the symbol 10 to separate integers, a list of fractions can be encoded as a base 11 integer.

Then fractions are read from the encoded program one numerator or denominator at a time, digit by digit. Decoding is easier when the most significant digits of a numerator or denominator extract first, and extraction of digits from the encoded program is easiest from the lowest position of the encoded program. This means the digits of the original fractions are encoded in reverse order, with earlier fractions being lower significance in the encoded program.

An example:

To encode {3/5,40/23}, form the sequence {10,4,0,10,2,3,10,3,10,5 }, and treat these as the base-11 digits of a number, which is 24455007857 (in Mathematica or Wolfram Alpha FromDigits[{10, 4, 0, 10, 2, 3, 10, 3, 10, 5}, 11]).

Here is the pseudo-code for PROGRAM-1: FRACTRAN interpreter pseudo-code.

Pretty straightforward.

PROGRAM-2 : Register code

This part took the longest time to understand and get correct. Trying to get various math and flow of control things to work in some symbolic form amenable to conversion to FRACTRAN led me to many realizations, summarized here.

Flow control

Trying to tackle precise flow control simultaneously while converting the higher level pseudo-code was a mess, leading to a few aborted attempts. I learned to ignore flow control at this stage and do it afterwards.

Insight # 2 : A looping primitive

From Insight #1 above, performing an if is possible. We need loops, which can be done with an if, but converting the pseudo-code using only if statements was verbose. Quite often when a fraction modifies the state, that fraction does so many times in a row before it no longer results in an integer. This leads to a conceptual operation of the form

This is pretty much the same insight as #1, except it let me work on a slightly higher level. It abstracts the effective behavior of a fraction, making the program easier to reason about.

Observations

  1. The encoded program can be a huge number. This leads to many performance issues.
  2. Variable copies/moves/zeroing may require many steps, since only a constant can be moved per pass.
  3. Can prevent some copies by computing multiple copies the first time.
  4. A single fraction can act on any number of variables. Exploit it to lower the fraction count.
  5. Testing against 0 is tricky; this is avoided for simplicity. Testing for >= 1 is easy.
  6. Since the only operation adds or subtracts from variables, keeping them in known states is very useful.
  7. Zeroing registers is necessary before reusing them for the same process.
  8. Putting commented asserts for variables that should be 0 or nonzero helped debugging.
  9. Subroutines can often be avoided and inlined for shorter programs.
  10. Division, multiplication, and remainder are slow.
  11. Since FRACTRAN starts at the top each pass, organize code so the most common operations are higher whenever possible for performance.

Applying these observations led me to, after several aborted attempts to code ordering and algorithm form that was short and simple to reason about. The code form forced a new program encoding with padded, interleaved numerator and denominator digits. The new encoding is in the program comments. Instead of requiring base 11 program encoding as in the pseudo-code example, I used general base B in case I wanted to tweak it later. This allows speed, program size, and intermediate result size trade-offs.

In the end I was able to recast the pseudo-code into a set of nested loops using the while loop trick from insight #2.

Often querying a variable changes its value, so I use a trick of doing some things in pairs to avoid copying the variable first then restoring it. The trick is to do the work "over and back" in less fractions than all the original copying would require, and does twice the work without needing copy/restore of values. This led to the interleaved numerator/denominator format.

With all these things in mind, enjoy PROGRAM-2 : FRACTRAN interpreter register-code. It illustrates how to convert normal pseudo-code into a form that can be mechanically turned into FRACTRAN.

This program was implemented in Mathematica and used to interpret simple FRACTRAN programs. For anything too complex it is too slow, mostly due to the slow copies/moves. Replacing the while blocks with a fast version allowed larger programs to run, but then the simple division algorithm was the bottle neck. Replacing that with an equivalent routine made this algorithm able to successfully run all FRACTRAN programs tried, including Conway’s PRIMEGAME. So the algorithm is likely correct.

PROGRAM-3 : To fractions!

The next step towards CLF-INTERPRET is implementing the control flow of PROGRAM-2 using primes as flow control markers. This can be done by putting primes in the numerator and denominator of fractions, so the resulting instruction cannot possibly be executed except when that prime is a factor of the state s, and possible followup lines are dictated. These control flow primes dictate which lines are eligible for consideration each pass through the list of fractions.

Steps:

  1. Use the above program as comments.
  2. For each line in the original, write the corresponding fraction, using variables from the code.
  3. Replace the additive notation of the code to powers: p += k becomes p^k, p-=k becomes 1/p^k.
  4. Place a prime in each numerator and denominator to dictate flow.
  5. Use a flow token ([m], where m is an integer) to denote such a prime. Replace with primes later.
  6. Label loop controls and loop changes (see example below).
  7. Leave space in flow token numbering in case of errors.
  8. For labels, in the comment place the flow token to get there.

Consider an example. Take the block of code from PROGRAM-2:

Convert code to comments, remove unused lines, and make initial fractions (the goto is handled below).

For the first line, pick a flow token for the numerator and denominator. I picked [1] and [2]:

This line cannot execute without prime [1] as a factor of state. Whenever this fraction executes, the [1] is replaced by a [2] in the state. We want this to loop, so after it we could add [1]/[2], which would cause the line to executed until the rest of the denominator was not satisfied. Call such a fraction a "loop control" and place in the comments. It’s ok to place it before the line; it still works, and for loops over larger numbers of lines it results in less steps during execution.

We can also use this to cause multiple lines to loop. Consider this code (ignore the [?] and [!] for now)

If any of the fractions with a [1] in the denominator executes, then the top loop control resets the [2] with a [1], and the fractions are tested again.

There is a problem with the last line. If the last line used [2]/[1] like the previous, when it executed and reset to the top, then d would be positive, which would trigger a higher line. This would be incorrect.

The solution is to give the last line a new pair of loop controls, [3] and [4], and to insert a "loop change" construct that always executes after all lines in the loop cannot execute. This construct is [3]/[1] which switches to the new loop. This results in

And that’s all there is to it. Any time the next line will interfere with a previous line, switch to a new loop state. Tag program labels with a new state, and place tokens to effect gotos.

The final issue is print and program_end. These are handled by adding a unique flow token for each, so the state will have these factors when such events occur. Print is [101] and the interpreted program terminates when token [102] is a factor of the state (and the interpreter stops soon thereafter).

Finally, start the program with flow token [1].

Here is the resulting flow control tagged program.

The resulting program is 48 fractions, and has requires 32 unique primes (8 variables, 24 flow control tokens).

PROGRAM-4 : CLF-INTERPRET

The final step is very straightforward: replace each variable and flow token with a unique prime.

First pick a base B and replace it; I choose B=11.

The program needs 32 unique primes; the 32 smallest primes are those from 2 through 131. For fun I’ll assign letters to the first 26

My name is Chris Lomont. I’ll assign the first 10 primes needed at the program start to those with letters "CHRISLOMNT" for shameless self-promotion and trivial watermarking. This puts encoded program p as the exponent for prime 67, and start token [1] is 5.

Next I assign a few nice things: the interpreted state is prime 7. The print token 101 is prime 2, so anytime the state is even, the interpreted state is the exponent of the factor 7 of the CLF-INTERPRET state. The end token is prime 3, so anytime 3 divides the CLF-INTERPRET state, the interpreted program halted. CLF-INTERPRET will halt soon thereafter.

After this I assigned the lowest unused primes to program variables, then assigned primes to the flow control tokens.

The large fractions in the final form are due to having exponents of B and B-1 from the base expansions. Using a lower base like B=3 and redoing the work results in a version of CLF-INTERPRET with much smaller numbers, at the cost of slightly less clear encoding and needing more steps to extract numbers.

PROGRAM-5 : CLF-INTERPRET-FAST

CLF-INTERPRET, while theoretically correct (as far as I can test!) is too slow to run any decent programs, mostly due to the linear performance of lines such as this

The problem is an encoded program is a huge number, and lines like this reduce it in a linear manner. For example, PRIMEGAME encodes as the program

32753194753582418421057144093528848329987944476675050163790617367881883494565655231458924

which to extract the left most digit takes 1/11th as many subtractions. Clearly this line needs to be faster.

The solution is to take the resulting line as a fraction:

and expand it into multiple lines, with earlier ones extracting larger powers. This is mechanical to do. Separate the variables part from the flow control part, and simply raise the variable part to a set of decreasing powers:

For example, using powers of 10 to move a to b and c, this works for a up to a certain size.

For larger a it is still correct; it just becomes slow.

Lines with the comment :P: need enough expansion to reduce encoded programs, and lines with :S: need enough to reduce the state

For example, computing 5000 steps of PRIMEGAME, which outputs the primes 2,3,5,7,11,13, has a max state of 269070432954010009765625, a 77 bit number. Encoding PRIMEGAME gives a 295 bit number. So expanding the :P: into 295 lines using powers of 2 as exponents, and the :S: into 78 such lines gives an exponential speedup on these lines.

The above assumes the fractions of the interpreted program do not have excessively large integers in them, otherwise the same expansions need to be done to lines handling numerators and denominators.

The final issue is that the division algorithm, even with the above tricks, is still far too slow. Hence it was marked in it’s entirety with :D:.

The algorithm used, repeated subtraction, is basically (using the code notation)

The problem lies in that this loop takes s/d passes, which is far too large. The solution is to replace with a division algorithm that removes bigger multiples of d when possible (much like longhand school division).

The result is a much larger program, depending on how far one expands the necessary lines.

Since my weekend is over, and I need to stop working on this, I’ll stop here. Hopefully when I get time I’ll finish this part of the writeup, giving a parametrized version that easily expands to handle any (pre-chosen) size program. But this part of my code is still an experimental mess.

Sorry for the cliffhanger! The initial goal of CLF-INTERPRET was more of a success than I expected on Friday night!

THE END