### What to Publish

An obvious (and very effective) compression technique here would be to just record the program that generated the proof. That is, the size of the program should come close to Kolmogorov-Chaitin entropy of its output (the symbolic proof). The downside to this approach is that it may work too well: it may take a lot of computing time to decompress. Indeed, it's easy to imagine a (large) proof being the product of a massively parallel, perhaps distributed, computing infrastructure. In that event, once the validity of such a proof was settled, the result, that is the theorem and the program that proves it, would be historically recorded (in peer review math journals), and the proof itself would not be revisited until computing resources became cheap and plentiful enough to repeat the exercise.

What makes a theorem worthwhile or interesting, by the way? I don't know what the criteria are, but one, generality certainly helps (a statement that cuts across a class of objects rather than a few instances, for example) and two, the statement of the theorem (the conjecture) ought to be compact--that is, it ought to be a low entropy statement. From the perspective of this second point, I note in passing, for a lengthy proof, the statement of the theorem itself can be viewed as a compression of its proof (so long as we consider all valid proofs of a same proposition to be equivalent).

What if a researcher doesn't want the entire proof, but just parts of it? In other words, can we devise a way to random access the text of such a large proof? e.g. jump from the billionth line to the trillionth line? In many cases, yes. To be precise, if the program outputting the proof is memory efficient, then a snapshot of its state can be efficiently recorded at any point along its execution path. If that is the case, we can annotate the program with separate, relatively small checkpoint data that would allow us to restore the call stack to the checkpoint (breakpoint, in debugger terminology) and from there see the program execute to completion. In general, the less each part of a proof depends on the intermediate results before it, the more memory efficient the program that generates it can be. Most of the computer assisted proofs I read about today appear to fall into this category. (For a counter example, if the

*n*th part of a proof depends on results (data) from the

*n*-1 parts before it, then it can't be memory efficient, and this strategy won't work.)

*Diagram: Annotated proof generating program. With this scheme, you publish both the program (blue) and annotations (green), not the much larger output*

*(yellow)*. Each annotation contains data allowing the program to output the remainder of the proof starting at the execution point that annotation represents. So n annotations partition the proof into n+1 chunks.Why might reading a proof piecemeal be worthwhile? For one, math proofs are often developed by introducing and proving sub-theorems which when combined yield the desired final result. It may be the proof of these lemmas and sub-theorems that a researcher (human or machine) may want to revisit. And many of these "sub-theorems" may have, relatively, much smaller proofs. So there is possible value in being able to random access a very large proof. But I have another motivation..

I'm imagining you have lots of these very large computer generated theorems (I mean their proofs, of course), and they're piling up at an ever faster rate. Maybe some of these build on other ones, whatever. Regardless, if there are many of these, it would be nice, if once a theorem were proven, we would have an unimpeachable record of it that would obviate the need to verify the long proof again at a future date. So here's a stab at a trustworthy, if not unimpeachable, record keeping system.

Consider dividing the program's output (proof) into contiguous chunks as outlined above. We consider the coordinates of each chunk to be the (self-delimited) annotation data that allows us to rebuild the call stack to the checkpoint. And we define a given chunk to be the program's output from the start of its checkpoint (coordinates) to the next recorded checkpoint. Now, in addition to publishing the program, the coordinates of the chunks, and possibly the chunks themselves, we also publish a cryptographic hash of each chunk. We then construct a Merkle tree with these hashes as its leaf nodes and publish that tree too. Or perhaps we just publish the root hash the Merkle tree (?). The idea here is you can sample the proof and reliably demonstrate that it's part of the published whole.

*Diagram: Proof generating program, chunk checkpoints, chunk Merkle tree*

### Where to Publish

Now if in our imaginary future ecosystem we're piling on a lot of these proofs at an ever faster pace, we should also consider

*where*they'll be published. These computer generated proofs are not being peer reviewed directly by humans; rather, this peer review has been mechanized to a point where the entire publishing process proceeds unimpeded without human intervention. Where to publish?

How about borrowing some design elements from Bitcoin's blockchain? Here's a simplified [re]view of its basic design.

The chain depicted above started on the right and ended with the most recent block on the left. Each block consists of 2 parts (white and blue, above): one, a linking mechanism connecting the block with its predecessor, and two, a payload that is app-specific. Structurally, the block chain is a singly linked list (left to right), or if you prefer, a stack that is only ever appended; "physically", the head of the linked list (the latest block), or again, top of the stack, is located at the end of the file. The role of this linking, however, is not for navigating the blocks during read access. Rather, it's role is syntactic: it enforces the form a block must take in order for it to be eligible for inclusion at the end of the chain (i.e. what can be appended to the head of the linked list).

The linking mechanism itself is interesting. It involves writing a

*nonce*which when combined with the block's cargo data yields a [cryptographic] hash that is very close to a hash of the entirety of previous block. Finding such a nonce for a cryptographically secure hash is computationally hard: an algorithm can do no better than trial and error. So hard, that you need a network of incentivized computing nodes competing to find the first eligible block that may be appended to the end of the chain. This nonce is the so-called

*proof of work*. The protocol adjusts the difficulty level (the maximum allowed difference in the above hashes) so that on average blocks are appended to the chain at a steady rate.

How do we know who's first to find an eligible new block? We don't. However, the protocol values the longest known chain the greatest. Thus once a new eligible block is discovered and it's existence is broadcasted across the network, there's little incentive for a computing node to work on the older, shorter chain.

Note we glossed over the application-specific payload data in each block. In Bitcoin, this part of the block records [cryptographically secured] transactions of bitcoins across individuals. Naturally, in order for a Bitcoin block to be well formed, it must also satisfy certain constraints that define (and validate) such transactions. The reason why it was glossed over, as you've probably already guessed, is that I want to explore swapping out bitcoin transactions for math proofs, instead.

Now while the Bitcoin blockchain is computationally hard to construct, it is computationally easy to verify. In its entirety. That is, verifying a file of the entire blockchain is as simple as playing the file from the beginning, the first block in the chain, and then verifying that each subsequent block properly matches the one before it. This involves checking both each block's nonce and the app-specific payload (the transaction signatures must match the public keys of the coins involved). The motivation behind the approach I'm exploring, however, is to store computational work (math proofs, here) in the app-specific section of each block. And, generally, the only way to verify a computational result is to redo it. So it would appear that Bitcoin's principle of quick verifiability would be in opposition to the strategy I have in mind.

#### A Layered Goldilocks Approach

How about a layered approach? What if some [computationally hard] properties of a blockchain (the linking/chaining mechanism) are easy to verify but verifying some of its other properties are more time consuming (such as verifying the recorded hash of a chunk of a proof as discussed above)? Suppose the latest 10 blocks can be verified in a reasonable amount of time, but verifying the entire blockchain takes an unreasonable amount of computing resources. If someone gave you a blockchain of math proofs so recorded, how confident would you be that it was valid and not just some made up chain? Let us outline the verification steps that we can reasonably perform:

- Verify that the current, existing distributed blockchain is a longer, appended version of the one you were given.
- Verify the proof-of-work chaining mechanism is intact.
- Sample the blocks to ensure they record valid hashes of the computations they represent.

So, to recap, our proposed computation-recording blockchain has the following attributes:

- It allows for programs to be recorded in it and later referenced (identified) by their hash.
- It allows for the chunks of a so-recorded program's output to be parameterized as chunk coordinates.
- It supports a format to record the hash of a chunk so described.
- It supports an explicit block falsification protocol (that is only likely ever exercised on blocks at the tail end of the chain).

Taken together, I'm inclined to think such an approach might just work. The underlying hidden force holding this together is history. This suggestion, I think, is not as preposterous as it sounds. Indeed, observe what happens to the Bitcoin blockchain as computational resources become ever more powerful and plentiful: the nonces of the blocks in the early parts of the chain are ever easier to reproduce. Here too proof-of-work, then, is a time-sensitive concept. A more extreme example would be if a vulnerability were later found that necessitated a change in hash function. It is doubtful we'd throw away the historical blockchain: we'd likely find a way to recognize it as a historical artifact and secure the chain with a better function going forward.

### A Concluding Thought

A longstanding principle of science has been repeatability. Experimental results are supposed to be repeatable. The modern laboratories of science are big and expensive. Be they planetary science or particle physics, because these experiments are expensive to duplicate, we compensate by bearing witness to them in large numbers. Years from now, we won't be worried about the veracity of pictures New Horizons snapped of Pluto even if we haven't been there since. Same for data collected from the LHC: if it's later shutdown, we'll still trust the recorded data was not doctored, since there were so many witnesses when the experiments took place. From this perspective, the present discussion is about bearing witness in numbers (the number of computing nodes on the network, that is) to math proofs we might not have the resources to revisit again and again. In this sense, mathematics may have already entered the realm of big science.