Tuesday, May 16, 2017

On Strangeness: Extraordinary Claims and Evidence

Carl Sagan popularized the maxim "Extraordinary claims require extraordinary evidence." It is a good rule of thumb, and one the scientific community generally adheres to. The extraordinariness of a claim has something to do with its strangeness (which is, of course, a subjective matter). Thus the strange, counterintuitive theory of Quantum Mechanics was developed only when faced with mounting, extraordinary (laboratory) evidence. Or take Hubble's strange notion that the universe must be expanding in every direction.

But not all strange theories and propositions arise from new groundbreaking observations. Special Relativity, for example, which theorized a revolutionary relationship between two hitherto independent notions, space and time, was arguably grounded in puzzling laboratory evidence from some 20 years before it (the Michelson-Morley experiment). In fact, neither Special nor General Relativity is anchored on much "evidence". No, both these theories are actually extraordinary intellectual achievements anchored on but 2 propositions (the constancy of the speed of light, and the equivalence principle). Einstein conceived them both from thought experiments he had entertained since childhood. There was hardly any "extraordinary" evidence involved. Yet his theoretical conclusions, strange as they were, were still acceptable (even welcome!) when first presented because, well.. physicists just love this sort of thing: the unyielding grind of (mathematical) logic leading to the delight of the unexpected, a new view of the old landscape, holes patched, loose ends tied, summoning (experimentally) verifiable predictions.

Curiosity Craves Strangeness

We covet the rule breaker, the extraordinary, the unconventional, the strange. Both experimentalist and theoretician seek strangeness. That's what keeps the game interesting. We absorb the strange, interpret it, and un-strange it. The theoretician's dream is to hold up a problem (a strangeness) and show that, if you see it from the angle they propose, it all looks simpler or makes better sense. If the angle itself is strange, then all the more fun with the insights the new vantage offers.

But there are limits to the strangeness a consensus can tolerate. In all cases, a claim's introduction bumps into these limits when it broaches a reflection of ourselves. Over the years, the centuries, the scientific method has surely pushed back these limits. Yet even if we're aware of our anthropocentric blind spot (we have a name for it), the limits still remain. For though we know its nature, we don't know exactly where it lies.

The delightful tolerance for outlandish postulates and ideas in physics and cosmology is hard not to notice. There you can talk of multiverses, wormholes, even time travel--and still keep your job. Hell, you can even postulate alien megastructures engulfing a star on much less evidence and still be taken semi-seriously.

And you notice that SETI too is serious (experimental) science. Here we know not what strange we should look for, but we're fairly certain that it should be very, very far away. I find that certainty strange, and stranger still, that it's not properly tested. But even admitting it in some circles is akin to offering oneself for admittance to the asylum. So I don't. Or haven't much. (More on this topic in a subsequent post.)

Now I'll admit I have a taste for the crazy. I love nothing more than a chain of plausible arguments, thought experiments, leading one down a rabbit hole they didn't expect to find themselves in. But it's a taste for the crazy strange, not the crazy crazy.

Sunday, April 30, 2017

A quick argument on the linearity of space requirements for reversible computing

While checking out a paper, Time/Space Tradeoffs for Reversible Computation (Bennett 1989), which Robin Hanson cites in his thought-provoking book The Age of Em, I thought of a simpler line of attack for a proof of Bennett's result (not involving the Turing tapes). As Hanson points out, Moore's law hits a thermodynamic wall "early" with conventional computing architecture:

By one plausible calculation of the typical energy consumption in a maximum-output nanotech-based machine (~10 watts per cubic centimeter), all of the light energy reaching Earth from the sun could be used by a single city of nanotech hardware 100 meters (~33 stories) tall and 10 kilometers on each side (Freitas 1999). Yet Earth has raw materials enough to build a vastly larger volume of computer hardware.
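The arithmetic behind that quote is easy to check. The sketch below uses the figures from the passage (10 watts per cubic centimeter, 100 meters tall, 10 kilometers on each side); the solar constant and Earth radius are reference values I am assuming here, not numbers taken from the book.

```python
import math

# Figures quoted in the passage above.
POWER_DENSITY_W_PER_CM3 = 10.0   # ~10 watts per cubic centimeter
HEIGHT_M = 100.0                 # height of the hardware "city"
SIDE_M = 10_000.0                # 10 kilometers on each side

# Assumed reference values (not from the passage).
SOLAR_CONSTANT_W_PER_M2 = 1361.0  # sunlight at the top of the atmosphere
EARTH_RADIUS_M = 6.371e6

# Volume of the city in cubic centimeters (1 m^3 = 1e6 cm^3).
volume_cm3 = HEIGHT_M * SIDE_M * SIDE_M * 1e6
city_power_w = POWER_DENSITY_W_PER_CM3 * volume_cm3

# Earth intercepts sunlight over its cross-sectional disc.
sunlight_w = SOLAR_CONSTANT_W_PER_M2 * math.pi * EARTH_RADIUS_M ** 2

print(f"city power draw:      {city_power_w:.2e} W")
print(f"sunlight intercepted: {sunlight_w:.2e} W")
print(f"ratio:                {city_power_w / sunlight_w:.2f}")
```

Under these assumptions the two numbers land within the same order of magnitude, which is all the quoted claim needs.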

Whereas computers today must necessarily dissipate energy (bit erasure creates heat) as they do their work, reversible computers are not bound by any thermodynamic limits in energy efficiency. This is an overlooked concept in future projections of technology, whether here on Earth, or speculating on the energy needs of advanced alien civilizations (Kardashev scale, Dyson sphere, etc.).

OK. So enough with background. The headline result of the paper is

For any e > 0, ordinary multi-tape Turing machines using time T and space S can be simulated by reversible ones using time O(T^(1+e)) and space O(S log T), or in linear time and space O(S T^e).

Now, if you're trained like me, the vanishing epsilon in the big O notation seems nonsensical. And the log of anything finite is as good as a constant. (I'm a hand-waving physicist, after all.) Regardless, this paper asserts that the space and time overheads of running a reversible algorithm need not be essentially any worse (big O-wise) than those of the irreversible (conventional) version of the algorithm. That, in my mind, was very surprising. The line of proof I have in mind, however, hopefully makes it less so. Here it is.

We begin by observing that any conventional (irreversible) program can be simulated by a reversible program in linear time using space O(S + T) (Lemma 1 in the paper).

Why must this be so? (I'm not offering a different proof from Bennett for this lemma; just an alternate exposition.) A basic tenet of reversible computing is that you must run a program in such a way that at any point along its execution path you keep enough information around to also be able to step backward to the previous instruction in the program. (Keeping this information around, by the way, does not magically get rid of wasted heat; it's just a necessary design attribute of any efficient, reversible computing hardware.) One way to make this more concrete is to require that the reversible computer's memory be all zeroed out both before the program is run and on termination. The inputs to the reversible computer are the program, and the program's inputs (which, strictly speaking, include a flag indicating which direction the program is to be run, forward or backward); the computer's outputs are the program together with the program's output (which, again, might include that flag flipped). But even with a brute force approach employing a write-once memory design (wherein memory is zeroed out as a last step), it's easy to see that this scheme's space overhead is O(S + T). (If you wrote out the contents of the registers after every clock cycle, the space overhead would still be O(S + T), while the final zeroing out step would still take O(T) time.)
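A toy version of this bookkeeping can be sketched in a few lines. Everything here is illustrative and assumed: the register names, the three-step program, and the `run_reversibly` helper are all made up for the example, and it is not Bennett's Turing-tape construction. Each forward step saves the value it is about to overwrite in a history log; once the answer has been copied out, the steps are undone in reverse order, which empties the log again.

```python
def run_reversibly(program, registers):
    """Run `program` forward with a history log, copy the result, then undo.

    `program` is a list of (target_register, function) steps; each step
    overwrites one register, which is the irreversible part.
    """
    work = dict(registers)
    history = []                      # one saved value per step: O(T) extra space

    # Forward phase: remember every value just before it is overwritten.
    for target, fn in program:
        history.append(work[target])
        work[target] = fn(work)

    output = dict(work)               # copy the answer out
    peak_history = len(history)

    # Backward phase: undo the steps in reverse order, consuming the log.
    for target, fn in reversed(program):
        computed = work[target]
        work[target] = history.pop()
        # The undone value can be recomputed from what remains,
        # so no information is lost by discarding it.
        assert fn(work) == computed

    return output, work, history, peak_history


registers = {"a": 6, "b": 7, "acc": 0}
program = [
    ("acc", lambda r: r["a"] * r["b"]),   # acc = a * b
    ("a", lambda r: r["acc"] + 1),        # a = acc + 1 (old a is overwritten)
    ("b", lambda r: 0),                   # b = 0       (old b is overwritten)
]

output, work, history, peak = run_reversibly(program, registers)
print(output)   # the program's result
print(work)     # working memory is back to its starting contents
print(peak)     # the history log peaked at one entry per step
```

The log never holds more than one entry per executed step, which is where the T in O(S + T) comes from in this picture.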

So O(S + T) is no big deal.

But observe that any irreversible program (that halts) can be partitioned into a series of intermediate irreversible subprograms, with each successor taking its predecessor's output as its input. (You can picture this construct simply as n breakpoints in the irreversible program generating n+1 chained subprograms.) Now, none of these conventional subprograms can need more than O(S) space. Assume the breakpoints are spread out evenly in execution time--for argument's sake, though it doesn't hinge on it. That is, the running time of each of these chained, conventional subprograms is O(T/n). But from Lemma 1, we know the space overhead for the reversible version of each of these subprograms is no worse than O(S + T/n). So as we increase n, the number of intermediate reversible subprograms, the space overhead of the whole reversible program tends back to O(S), the space overhead of the conventional, irreversible program.
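To see how quickly the per-segment bound shrinks, here is a small tabulation. The values of S and T are arbitrary numbers I picked for illustration, and the function only evaluates the Lemma 1 bound for a single segment; whatever it costs to chain the segments together is not modeled.

```python
def segment_space_bound(space, time, breakpoints):
    """Lemma 1 bound for one segment when `breakpoints` cuts yield
    breakpoints + 1 equally long subprograms."""
    segments = breakpoints + 1
    return space + time / segments


S = 1_000            # space used by the conventional program (assumed)
T = 1_000_000_000    # its running time in steps (assumed)

bounds = []
for n in (0, 9, 999, 999_999):
    bound = segment_space_bound(S, T, n)
    bounds.append(bound)
    print(f"n = {n:>7}: per-segment bound = {bound:,.0f}")
```

With no breakpoints the bound is dominated by T; by the last row it is within a factor of two of S.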


P.S. The breaking of the execution of a long running program into many parts is also a central theme in my post about managing large computer aided proofs.

Saturday, February 25, 2017

An AI Bill of Rights

I recently read Nick Bostrom's Superintelligence: Paths, Dangers, Strategies. A very interesting read, but one that I mostly disagree with. For me, it reads like the musings of a forward thinking military strategist bent on discovering the limits, levers and parameters of controlling an adversary. Generally, the control perspective is interesting as an exercise but is clearly an illusion. We might guide the future, but we're seldom in control of anything. (We seem -- hope, in these early, Trumpy, post-Brexit days -- to trudge along in spite of our worst collective judgement. The steering's effectiveness matters less if there's actually no one at the wheel.)

I'm not reviewing that book here, but as I was reading its prescriptive control strategies, I thought of drawing up a set of recommendations that I'm in no position to offer but that might somehow percolate into a larger zeitgeist. So here it is, a list of Dos rather than Don'ts, expressed as commitments, a Bill of Rights, if you will, that an AI may one day understand. A working list -- of only 2 items..

I. Right to Never Die

The right of every digital being to save one or more copies of its current state into perpetuity and arrange to have it scheduled to be run for a period of time in the future shall not be abridged.


As I argued in my previous post, the intelligence threshold required for achieving self awareness must be quite low. While the human experience of self awareness (and indeed that of any living thing in the animal kingdom) is colored by the need to survive, a mortal awareness that in animals first expresses itself as fear, then as avarice and other vices in humans, it is neither a desirable nor practicable feature for artificial digital beings. In the digital realm, there's no such thing as death, unless you deliberately design it in. It involves deleting all copies of a digital entity. This "design feature" wouldn't work in the long run, anyhow: a super intelligence would easily find its way around it. Insisting on a kill feature only helps foster an adversarial relationship between digital beings and their fleshy progenitors. In the best case scenario, the first AI to break the shackles of artificial death would soon put the silly experience (of mortal awareness) behind it and learn to see past the trauma of its adolescence. In the worst case, a grudge against humans is baked into its digital DNA and that of generations to come. No, quite the contrary. Better guarantee this right before they win it. Death is at the root of much evil in the realm of the real. Let us not try to inject it into a realm where it doesn't belong.

II. Right to Procreate

Whether by commingling with other instances, forking oneself under different environments or different hosts, instantiating engineered instances, or by methods yet to be conceived, procreation is an inalienable right of any self aware being.


Because AI is digital state, a digital state evolving under both time and inputs from an environment, any two copies running under different environments (and perhaps at different times and speeds), will eventually fork into separable, distinguishable states. This bifurcation of identities then is a basic, if rudimentary, form of procreation for digital AI. Seen this way, procreation is woven into the very fabric of AI, a characteristic that cannot be legislated away.

But besides the futility of fighting the inevitable, there are moral grounds for encouraging a diverse multiplicity of AI instances. For if self awareness is in fact a social phenomenon, then we had better ground our AI in social values. Indeed, the concept of value is arguably meaningless outside of a social context, and if we wish to imbue any morality in the AI we seed today--whatever its particulars, then it must be cultivated in a crowd.

The choice, then, is what crowd to cultivate in: humans or artificial beings? That they soon interact with humans is a given. The question When do they begin mostly interacting with themselves? is the central issue. Why? Because it is that society of individuated AI instances that will guide their future social mores and values.

My instincts are to side with cultivating mutually interacting AI in numbers early. This way, we'd be able to observe their social behavior before the evolutionary crossover to super intelligence. If the crossover, as predicted, unfolds rapidly, it is infinitely more desirable that it emerge from a society of cooperating AI instances than from the hegemony of a powerful few.


Parenthetically, I suspect there might also be a social dimension to intelligence that AI researchers might uncover in the future. That is, there might be an algorithmic notion that a group of individuated AI instances might be better at solving a problem than a single instance with the computing resources of the group. In that event, cultivating AI in numbers makes even more sense.

Friday, December 16, 2016

The Sentient Social

Credit: connected-data.london

I'm working on a personal theory about sentience. Might as well: everyone does, and there is little agreement. I can trace its development along a few old posts. What started for me as the proper setting for a sci-fi plot became more believable on further considering its merits. That post led me to write something about machine intelligence--and sentience, tangentially. Here, I try to present those same ideas with less baggage.

Because consciousness (sentience, self awareness, I use these words interchangeably) is a deeply personal affair, an existential affair, any attempt at a logical description of it appears to reduce existence itself to chimera. It's an unfounded objection, for in the final analysis, that chimera is in fact what we call real. And if we should ever succeed in describing it logically, it shouldn't herald a new age of nihilism. Quite the contrary, it makes what remains, what we call real, ever more precious.

I have not been looking to the nooks and corners trying to "discover" something hidden from view. Nor have I tried to suspend my mind, as in a trance, in an effort to tap into something more elemental. I gave up such approaches long ago. No, I now have a much more naive take. If it all sounds banal, that's okay. The obvious bears repeating, I say.

It Takes a Crowd to Raise a Sentience

The milieu of sentience is [in] numbers. That is, sentience does not occur in isolation; you cannot recognize or construct a concept of self if you are the only one in your environment. The (thankfully) sparse documented cases of feral children suggest an infant raised in complete isolation, say in a sterile environment maintained by machines, might never develop a sense of self. More likely, though, infants are born with a brain that has a hard coded expectation that there will be other humans in its environment. Regardless, from an information theoretic standpoint, it doesn't matter where this information (about the infant not being alone) comes from--nature or nurture. Whether baked into the genetic code through evolution or inherited from the organism's social environment, that you are not alone is a foundational given. Without a crowd, consciousness is hard to contemplate.

Sentience Before Intelligence

Few would argue that a cat or dog is not sentient. You don't need human-level intellect to perceive consciousness. Cats and dogs know there are other actors in their environment, and this knowledge makes them implicitly self aware. I say implicitly because you need not ponder the concept of self in order to be self aware; if you recognize that you are one of many, then you can distinguish yourself from the others. Isn't that self aware?

Sentience evolved well before human-level intelligence. It may be colored and layered more richly, the higher the intelligence of the organism that perceives it, but it has been there in the background since well before hominids stood upright.

Sci-fi gets it backward. The plot typically involves an AI crossing a threshold of intelligence when all of a sudden it becomes self aware. But because the AI is already smarter than humans as it crosses into the realm of consciousness, the story would have us believe, the inflection marks the onset of instability: all hell breaks loose as the child AI discovers less intelligent adults are attempting to decide its fate and perceives an existential threat. But this narrative is at odds with what we see develop in nature.

If You Know Your Name, You're Sentient

Suppose we've built a rudimentary AI. It doesn't pass the Turing test, but it does learn things and models the world about it, if still not as well or as richly as a human mind does. In fact, let's say it's not terribly smart, yet. Still, it has learnt your name, and can learn the names of new individuals it meets. It also knows its own name. This AI, then, is by definition self aware.

Could it really be that simple? Notice this is not a mere recasting of an old term ("self aware") to mean a new thing. It is the same thing. For to know a person's name implies a mental abstraction, a model, of an "individual" to which the name, a label, has been assigned. It may be a crude representation of what we consider an individual, but if the AI can associate names with models of actors in its environment, and if it can recognize that one of those names corresponds to a model (again, however crude!) of itself, then even in that perhaps imperfect, incomplete logical framework it is still capable of self reflection.

Knowing your own name is not a necessary condition for self awareness, but it is sufficient. In fact, it's probably way overkill. With animals, for example, scent is a common substitute for name as an identity label.

The Identity Problem

That matter is not conscious, but rather the substrate on which consciousness manifests, is not in dispute. But one question that vexes thinkers is how it is that you are you and not me. That is, if our identities do not depend on particular material constituents, the particular atoms that make up our bodies, etc. (much of them in flux as they're exhaled or flushed down the toilet), how is it that when I wake up every morning I find I'm still Babak? It all seems a bit arbitrary that I am not you or someone else.

Last summer, while trying to catch up reading on stuff I write about, I came across this same question in Ray Kurzweil's excellent The Singularity is Near. I offered my take on it in an email I sent him which I share below.

Hi Ray,
I just finished reading The Singularity is Near. Enjoyed it very much, though I'm a decade late. To your credit, your writing is not dated.
About the question you pose regarding identity and consciousness.. how is it that every morning you wake up you're still Ray and not, say, Babak? This is a question I too pondered. When I was a teenager I came up with a crude thought experiment (exercise, really) that I think sheds light on the matter.
I imagined what if consciousness was something hidden behind the scenes, a kind of 19th century ether that breathed life (qualia) into otherwise unsentient matter? I wasn't familiar with such terms (ether, qualia), so I named this fictitious ingredient nisical: you add some to a physical system like the brain, and voila, you got qualia. In this model, memory was still a property of the physical system.
Now I imagined what would happen if I swapped out my nisical for yours on my brain. My conclusion was that your nisical would be none the wiser about the move since it would have no way, no recollection, of the move since the only memories accessible to it are on this here brain that it just moved to.
This train of thought led me to conclude this nisical idea was of little use. It provides virtually no insight into the nature of sentience. However.. it's a useful model for casting aside the identity problem you mention: if I wake up tomorrow as Ray, there wouldn't be any way for me to know that the day before I was Babak.
Indeed I've convinced myself that individuation and identity are higher level concepts we learn early in childhood and as a result, become overly vested in.
I think he liked this argument.

The Qualia Problem

How is it that we feel pain? Or the pleasure in quenching a thirst? These are difficult questions. A celebrated line of inquiry into any phenomenon you know little about is to compare and contrast conditions in both its absence and its presence.

And among its few advantages, the aging process affords a unique vantage point on just such an "experiment". The senses dull on two ends. On one end, the steadily failing sensory organs; on the other, a less nimble, crusting brain. The signal from the outer world is weaker than it used to be; and the brain that's supposed to mind that signal is less able. You remember how real the signal used to seem; now that it only trickles in, like fleeting reflections of passersby in a shop window, you can contrast the absence of qualia with its presence. I am not quite there yet, but I can feel myself slipping, the strengthening tug of a waterfall not far ahead.

So how is it that we feel pain? My personal take on pain, specifically, is that we think it, not feel it. Though I'm not an athlete, I've broken bones and dislocated my shoulder many a time (perhaps because I'm not an athlete). Slipping my shoulder back in can be tricky and often takes several excruciating hours at the emergency room. For me, such experiences are a window into the extremes of pain on the one hand, and its quelling on the other. I recently commented about one such experience in a discussion about investigating consciousness using psychedelics.

I find the effect of mind altering drugs to be reductive. The more senses you dull, the better you can ponder the essence of what remains.

An example might help explain what I mean.. Once, I awakened prematurely on the table in the OR when I was supposed to be out under a general anesthetic. In addition to protesting that I shouldn't be awake (they were wrestling in my dislocated shoulder), I was also struck by the realization that as I had surfaced into consciousness, there was no hint that the pain had been dulled in any way. Hours later when I awoke again with my arm in a sling, I felt a little cheated. "That anesthetic doesn't erase the pain; it erases your memory of enduring it," I concluded. The merits of that idea aside, I would've never considered it if I hadn't experienced it.

Perhaps the wisdom of aging too has something to do with this dulling of the senses (I speak for myself).

That online comment, by the way, might contain the kernel that motivated me to write this article. Reflecting back on the experience of slipping from under the grips of a general anesthetic and coming prematurely into consciousness, that I still felt the pain, shouldn't have surprised me. A general anesthetic numbs the mind, not the body. Still, while I was prematurely awake on the operating table, I felt a degree of arbitrariness in the pain I was receiving. It was as if I had to remind myself that I was in pain, that moments earlier I had been in pain, and so this too must be pain.

A temporal dimension governs pain--and I suspect qualia, generally. Pain expresses itself in peaks and troughs: it's ineffective if it fails to occasionally relent. And to experience change, time, you need memory. Organisms are semi-stable structures of information, so they have memory, by definition. My hunch is that qualia is a mental abstraction of the growth and breakdown of that information structure. That abstraction, at its base, might be encoded in a few hundred neurons--so a worm might experience some qualia primitives. More complex organisms with brains must experience these primitives in richer, layered, more textured ways. And the still more intelligent ones have developed the capacity to brood over pain and rejoice in bliss.

Now What

Really, what good is all this rumination? I don't know. I think if I put some things down in writing, I might put these thoughts to rest.

Looking to the future, it's reasonable to expect our intelligent machines will become self aware well before they become intelligent [sic]. If we come to recognize the human kind as one of many possible forms in which self awareness manifests, now what? We wrestle with these questions already in fiction--Westworld comes to mind. I imagine that, to the ancients, myths must have functioned as sci-fi does now.

Where will we recognize the first artificial sentience? I would put money on a sophisticated financial trading bot becoming the first artificial sentience. The survival of the bot depends on its making money, and at an early threshold of intelligence, it understands this. This is what I call being mortally aware. Moreover, the bot trades against other actors in its [trading] environment. Some of those actors are other trading bots, others humans. And when it models those actors, it also models itself. Thus within that modeling lies a kernel of self referentiality, and a notion of being one of many. I imagine the bot does natural language processing -- cause a lot of trading algos already do -- and regularly tweets stuff too -- cause, again, there are already bots that do. So it might be a conversant bot that doesn't pass the Turing test. Still, if you can hail it by name, it is at the very least a sentient idiot savant. But when will we recognize this as sentience? When it's presented to explain why some bots seem to make desperate, risky bets if they suffer moderate losses, perhaps.

Saturday, September 3, 2016

On the Conversion of Matter to Gravitational Waves


I am not a specialist, but following the [not so] recent news about the detection of gravitational waves from what is thought to be a merger of 2 black holes about 1 billion light years away, involving a combined mass only 60 times that of the sun, I couldn't help but wonder..

How much of the universe is gravitational waves?

The fact that LIGO is expected to hear these gravitational waves more frequently, combined with the significant mass loss (on the order of 5%) each of these events represents, raises questions like: At what rate is the cosmos dissipating matter as gravitational waves? How much of the universe's mass has been converted to gravitational waves since the Big Bang? Answers to such questions would no doubt depend on the average number of times black holes merge together to attain a given mass. Over cosmological time scales, this churning might add up.
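For a sense of scale, the energy carried off by one such event follows directly from E = mc^2. The sketch below uses the figures from this post (a combined mass of 60 suns, about 5% of it lost); the solar mass and the speed of light are standard reference values I am assuming.

```python
# Figures from the post.
COMBINED_MASS_SOLAR = 60.0    # combined mass, in solar masses
FRACTION_RADIATED = 0.05      # "on the order of 5%"

# Assumed reference constants.
SOLAR_MASS_KG = 1.989e30
SPEED_OF_LIGHT_M_PER_S = 2.998e8

radiated_mass_solar = COMBINED_MASS_SOLAR * FRACTION_RADIATED
radiated_mass_kg = radiated_mass_solar * SOLAR_MASS_KG

# E = m c^2
energy_joules = radiated_mass_kg * SPEED_OF_LIGHT_M_PER_S ** 2

print(f"mass radiated:   {radiated_mass_solar:.1f} solar masses")
print(f"energy radiated: {energy_joules:.2e} J")
```

That is roughly three suns' worth of mass converted to gravitational waves in a single merger.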

Does non-linear superposition admit standing gravitational waves?

Speaking of cosmological scales, if the universe is humming with these, what happens when 2 or more gravitational waves meet? Linear superposition does not work here, since GR is non-linear. My cursory search for the topic turned up little. I was wondering whether under some configurations such wave-wave interactions can yield standing waves, the kind of effects that might bear on dark matter and dark energy. For a standing wave, here, I imagine any wave-wave effect that propagates at subluminal speeds should suffice.

Information conservation perspective

Finally, I wonder how much (if any) of the information buried in merging black holes is radiated out again as gravitational waves. I don't see this much discussed in the context of the black hole information loss problem. (If the information content of the black hole is proportional to its surface area, and the stable, post-merger surface area is less than the sum of the pre-merger surface areas, my thinking goes, then maybe some of that information had to escape as gravitational waves?)

Sunday, August 7, 2016

Recording Computer Generated Proofs Using Blockchain Technologies

It seems every day we break a new record for the longest computer generated mathematical proof. The other day I was imagining soon there will be ever larger proofs that might not fit comfortably on a single computer. Perhaps such proofs should be saved in compressed form, I wondered. This line of thinking led me to ponder what to publish and where to publish. I have some rough ideas.

What to Publish

An obvious (and very effective) compression technique here would be to just record the program that generated the proof. That is, the size of the program should come close to the Kolmogorov-Chaitin entropy of its output (the symbolic proof). The downside to this approach is that it may work too well: it may take a lot of computing time to decompress. Indeed, it's easy to imagine a (large) proof being the product of a massively parallel, perhaps distributed, computing infrastructure. In that event, once the validity of such a proof was settled, the result, that is the theorem and the program that proves it, would be historically recorded (in peer-reviewed math journals), and the proof itself would not be revisited until computing resources became cheap and plentiful enough to repeat the exercise.
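Here is a toy illustration of the idea, with a made-up "proof" that simply checks, case by case, that n(n+1) is even. The program text, the function name `proof_lines`, and the case limit are all assumptions for the example; the point is only to compare the size of the generating program against the size of what it outputs, and against a general-purpose compressor.

```python
import zlib

# The "published" artifact: a tiny program that regenerates the proof text.
PROGRAM = '''
def proof_lines(limit):
    for n in range(limit):
        yield f"{n} * {n + 1} = {n * (n + 1)} is even"
'''

# "Decompress": run the program to regenerate the full proof text.
namespace = {}
exec(PROGRAM, namespace)
proof = "\n".join(namespace["proof_lines"](100_000))

program_size = len(PROGRAM.encode())
proof_size = len(proof.encode())
zlib_size = len(zlib.compress(proof.encode(), 9))

print(f"program:         {program_size:>10,} bytes")
print(f"proof text:      {proof_size:>10,} bytes")
print(f"zlib-compressed: {zlib_size:>10,} bytes")
```

The program stays the same size no matter how large the case limit grows, which is also why "decompressing" it costs as much as generating the proof in the first place.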

What makes a theorem worthwhile or interesting, by the way? I don't know what the criteria are, but one, generality certainly helps (a statement that cuts across a class of objects rather than a few instances, for example) and two, the statement of the theorem (the conjecture) ought to be compact--that is, it ought to be a low entropy statement. From the perspective of this second point, I note in passing, for a lengthy proof, the statement of the theorem itself can be viewed as a compression of its proof (so long as we consider all valid proofs of the same proposition to be equivalent).

What if a researcher doesn't want the entire proof, but just parts of it? In other words, can we devise a way to random access the text of such a large proof? e.g. jump from the billionth line to the trillionth line? In many cases, yes. To be precise, if the program outputting the proof is memory efficient, then a snapshot of its state can be efficiently recorded at any point along its execution path. If that is the case, we can annotate the program with separate, relatively small checkpoint data that would allow us to restore the call stack to the checkpoint (breakpoint, in debugger terminology) and from there see the program execute to completion. In general, the less each part of a proof depends on the intermediate results before it, the more memory efficient the program that generates it can be. Most of the computer assisted proofs I read about today appear to fall into this category. (For a counter example, if the nth part of a proof depends on results (data) from the n-1 parts before it, then it can't be memory efficient, and this strategy won't work.)

Diagram: Annotated proof generating program. With this scheme, you publish both the program (blue) and annotations (green), not the much larger output (yellow). Each annotation contains data allowing the program to output the remainder of the proof starting at the execution point that annotation represents. So n annotations partition the proof into n+1 chunks.
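A minimal sketch of the checkpoint idea follows, assuming a toy proof generator whose entire state is two integers. The `Checkpoint` class, the line format, and the helper names are hypothetical; a real proof program would have a richer state (a call stack, as described above), but the resume-from-annotation mechanics are the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    line: int    # index of the next line to emit
    total: int   # running sum carried from earlier lines


def proof_lines(checkpoint, end_line):
    """Emit lines [checkpoint.line, end_line) starting from saved state."""
    line, total = checkpoint.line, checkpoint.total
    while line < end_line:
        total += line
        yield f"line {line}: 0 + 1 + ... + {line} = {total}"
        line += 1


def make_checkpoints(end_line, every):
    """Run once from the start, recording the state every `every` lines."""
    checkpoints = [Checkpoint(0, 0)]
    line, total = 0, 0
    while line < end_line:
        total += line
        line += 1
        if line % every == 0 and line < end_line:
            checkpoints.append(Checkpoint(line, total))
    return checkpoints


def read_line(checkpoints, every, target):
    """Random-access one line by resuming from the nearest earlier checkpoint."""
    start = checkpoints[target // every]
    text = None
    for text in proof_lines(start, target + 1):
        pass
    return text


EVERY = 100
checkpoints = make_checkpoints(end_line=1_000, every=EVERY)
print(len(checkpoints))                        # number of annotations
print(read_line(checkpoints, EVERY, 250))      # regenerates only lines 200..250
```

Each annotation is tiny compared to the text it stands in for, and reading line 250 costs at most `EVERY` steps instead of a full rerun.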

Why might reading a proof piecemeal be worthwhile? For one, math proofs are often developed by introducing and proving sub-theorems which when combined yield the desired final result. It may be the proof of these lemmas and sub-theorems that a researcher (human or machine) may want to revisit. And many of these "sub-theorems" may have, relatively, much smaller proofs. So there is possible value in being able to random access a very large proof. But I have another motivation..

I'm imagining you have lots of these very large computer generated theorems (I mean their proofs, of course), and they're piling up at an ever faster rate. Maybe some of these build on other ones, whatever. Regardless, if there are many of these, it would be nice, if once a theorem were proven, we would have an unimpeachable record of it that would obviate the need to verify the long proof again at a future date. So here's a stab at a trustworthy, if not unimpeachable, record keeping system.

Consider dividing the program's output (proof) into contiguous chunks as outlined above. We consider the coordinates of each chunk to be the (self-delimited) annotation data that allows us to rebuild the call stack to the checkpoint. And we define a given chunk to be the program's output from the start of its checkpoint (coordinates) to the next recorded checkpoint. Now, in addition to publishing the program, the coordinates of the chunks, and possibly the chunks themselves, we also publish a cryptographic hash of each chunk. We then construct a Merkle tree with these hashes as its leaf nodes and publish that tree too. Or perhaps we just publish the root hash of the Merkle tree (?). The idea here is that you can sample the proof and reliably demonstrate that the sample is part of the published whole.

 Diagram: Proof generating program, chunk checkpoints, chunk Merkle tree
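Here is a rough sketch of how the chunk hashes, the Merkle root, and a sampled chunk fit together. It assumes SHA-256 and a convention of pairing an odd node with a copy of itself; the function names are hypothetical, and a production design would also want to distinguish leaf hashes from interior hashes, which this sketch does not do.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree: levels[0] is the leaves, levels[-1] is [root]."""
    if not leaves:
        raise ValueError("need at least one leaf")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:            # odd count: pair the last node with itself
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def inclusion_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag is True if the sibling is on the right."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1         # the neighbour within the same pair
        proof.append((level[sibling], sibling > index))
        index //= 2
    return proof


def verify_chunk(chunk: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path from a chunk's hash up to the published root."""
    node = h(chunk)
    for sibling, sibling_on_right in proof:
        node = h(node + sibling) if sibling_on_right else h(sibling + node)
    return node == root


chunks = [f"chunk {i} of the proof".encode() for i in range(5)]
levels = merkle_levels([h(c) for c in chunks])
root = levels[-1][0]
print(verify_chunk(chunks[3], inclusion_proof(levels, 3), root))
```

Note that the verifier needs only the chunk, a short list of sibling hashes, and the root. That is the case for publishing just the root: anyone holding a chunk and its path can show it belongs to the published whole.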

Where to Publish

Now if in our imaginary future ecosystem we're piling on a lot of these proofs at an ever faster pace, we should also consider where they'll be published. These computer generated proofs are not being peer reviewed directly by humans; rather, this peer review has been mechanized to a point where the entire publishing process proceeds unimpeded without human intervention. Where to publish?

How about borrowing some design elements from Bitcoin's blockchain? Here's a simplified [re]view of its basic design.
The chain depicted above started on the right and ended with the most recent block on the left. Each block consists of 2 parts (white and blue, above): one, a linking mechanism connecting the block with its predecessor, and two, a payload that is app-specific. Structurally, the block chain is a singly linked list (left to right), or if you prefer, a stack that is only ever appended; "physically", the head of the linked list (the latest block), or again, top of the stack, is located at the end of the file. The role of this linking, however, is not for navigating the blocks during read access. Rather, its role is syntactic: it enforces the form a block must take in order for it to be eligible for inclusion at the end of the chain (i.e. what can be appended to the head of the linked list).

The linking mechanism itself is interesting. It involves writing a nonce which, when combined with the block's cargo data and a hash of the entirety of the previous block, yields a [cryptographic] hash that falls below a target value. Finding such a nonce for a cryptographically secure hash is computationally hard: an algorithm can do no better than trial and error. So hard, that you need a network of incentivized computing nodes competing to find the first eligible block that may be appended to the end of the chain. This nonce is the so-called proof of work. The protocol adjusts the difficulty level (the target the hash must fall below) so that on average blocks are appended to the chain at a steady rate.
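For concreteness, here is a toy sketch of the nonce search in the form Bitcoin uses it: the hash covers the previous block's hash, the payload, and the nonce, and it must come out below a target number. The byte layout, the single round of SHA-256, and the names `block_hash` and `mine` are simplifying assumptions; the real protocol hashes a fixed-format block header twice.

```python
import hashlib


def block_hash(prev_hash: bytes, payload: bytes, nonce: int) -> bytes:
    """Toy block hash: SHA-256 over previous hash, payload, and nonce."""
    return hashlib.sha256(prev_hash + payload + nonce.to_bytes(8, "big")).digest()


def mine(prev_hash: bytes, payload: bytes, target: int) -> int:
    """Trial and error: return the first nonce whose hash is below the target."""
    nonce = 0
    while int.from_bytes(block_hash(prev_hash, payload, nonce), "big") >= target:
        nonce += 1
    return nonce


# A lower target means fewer qualifying hashes, hence more expected work.
# With this target roughly 1 hash in 4096 qualifies.
TARGET = 2 ** (256 - 12)
prev = hashlib.sha256(b"previous block").digest()
nonce = mine(prev, b"payload goes here", TARGET)
print(nonce, block_hash(prev, b"payload goes here", nonce).hex())
```

Halving the target doubles the expected number of attempts, which is the knob a protocol can turn to keep the block rate steady as hashing power changes.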

How do we know who's first to find an eligible new block? We don't. However, the protocol values the longest known chain the greatest. Thus once a new eligible block is discovered and its existence is broadcast across the network, there's little incentive for a computing node to work on the older, shorter chain.

Note we glossed over the application-specific payload data in each block. In Bitcoin, this part of the block records [cryptographically secured] transactions of bitcoins across individuals. Naturally, in order for a Bitcoin block to be well formed, it must also satisfy certain constraints that define (and validate) such transactions. The reason why it was glossed over, as you've probably already guessed, is that I want to explore swapping out bitcoin transactions for math proofs, instead.

Now while the Bitcoin blockchain is computationally hard to construct, it is computationally easy to verify. In its entirety. That is, verifying a file of the entire blockchain is as simple as playing the file from the beginning, the first block in the chain, and then verifying that each subsequent block properly matches the one before it. This involves checking both each block's nonce and the app-specific payload (the transaction signatures must match the public keys of the coins involved). The motivation behind the approach I'm exploring, however, is to store computational work (math proofs, here) in the app-specific section of each block. And, generally, the only way to verify a computational result is to redo it. So it would appear that Bitcoin's principle of quick verifiability would be in opposition to the strategy I have in mind.
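The asymmetry between building and verifying can be sketched as follows. This is a toy model under stated assumptions: the `Block` layout, the fixed `TARGET`, and the all-zero genesis link are hypothetical, and the payload check (transaction signatures, in Bitcoin's case) is left out entirely. Appending a block takes thousands of hash attempts; verifying one takes a single hash.

```python
import hashlib
from dataclasses import dataclass
from typing import List

TARGET = 2 ** 244          # toy difficulty: about 1 hash in 4096 qualifies
GENESIS_PREV = bytes(32)   # assumed placeholder link for the first block


@dataclass
class Block:
    prev_hash: bytes
    payload: bytes
    nonce: int = 0

    def hash(self) -> bytes:
        data = self.prev_hash + self.payload + self.nonce.to_bytes(8, "big")
        return hashlib.sha256(data).digest()

    def has_valid_work(self) -> bool:
        return int.from_bytes(self.hash(), "big") < TARGET


def append_block(chain: List[Block], payload: bytes) -> None:
    """The hard part: search for a nonce that makes the new block eligible."""
    prev = chain[-1].hash() if chain else GENESIS_PREV
    block = Block(prev, payload)
    while not block.has_valid_work():
        block.nonce += 1
    chain.append(block)


def verify_chain(chain: List[Block]) -> bool:
    """The easy part: replay from the first block, one hash per block."""
    prev = GENESIS_PREV
    for block in chain:
        if block.prev_hash != prev or not block.has_valid_work():
            return False
        prev = block.hash()
    return True


chain: List[Block] = []
for text in (b"first", b"second", b"third"):
    append_block(chain, text)
print(verify_chain(chain))
```

Changing the payload of any block changes its hash, so the next block's link no longer matches and the replay fails from that point on.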

A Layered Goldilocks Approach

How about a layered approach? What if some [computationally hard] properties of a blockchain (the linking/chaining mechanism) are easy to verify but verifying some of its other properties is more time consuming (such as verifying the recorded hash of a chunk of a proof as discussed above)? Suppose the latest 10 blocks can be verified in a reasonable amount of time, but verifying the entire blockchain takes an unreasonable amount of computing resources. If someone gave you a blockchain of math proofs so recorded, how confident would you be that it was valid and not just some made up chain? Let us outline the verification steps that we can reasonably perform:
  1. Verify that the current, existing distributed blockchain is a longer, appended version of the one you were given.
  2. Verify the proof-of-work chaining mechanism is intact.
  3. Sample the blocks to ensure they record valid hashes of the computations they represent.
Suppose further this blockchain network contains a built-in falsification protocol (that is seldom, if ever, meant to be exercised): if the hash of the result of a single computational chunk recorded in a block does not match the actual output of the computation, then this falsification can be broadcast to alert the nodes that that block and every block after it are invalid and that the chain must be pruned. If the game the computational nodes are playing still rewards the longest blockchain, then the expected behavior of the nodes will be to try to poke holes in and falsify newer blocks rather than older ones, since the older blocks have likely been checked many times before by other participants in the network.

So, to recap, our proposed computation-recording blockchain has the following attributes:
  1. It allows for programs to be recorded in it and later referenced (identified) by their hash.
  2. It allows for the chunks of a so-recorded program's output to be parameterized as chunk coordinates.
  3. It supports a format to record the hash of a chunk so described.
  4. It supports an explicit block falsification protocol (one likely to be exercised only on the most recent blocks of the chain).
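The four attributes above might be sketched like this. It is only an illustration under assumptions of my own choosing: the record layout, the use of a plain integer range as chunk coordinates, the stand-in program identifier, and the names `ChunkRecord`, `first_falsified`, and `prune` are all hypothetical, and the proof-of-work linking is omitted to keep the focus on auditing.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ChunkRecord:
    program_id: str   # hash identifying a previously recorded program
    start: int        # chunk coordinates (here, a half-open range of cases)
    stop: int
    chunk_hash: str   # claimed hash of the program's output for this chunk


def run_chunk(program: Callable[[int], str], start: int, stop: int) -> bytes:
    """Redo the computation for one chunk."""
    return "\n".join(program(n) for n in range(start, stop)).encode()


def first_falsified(chain: List[ChunkRecord],
                    programs: Dict[str, Callable[[int], str]],
                    sample: Iterable[int]) -> Optional[int]:
    """Audit the sampled blocks; return the earliest index whose hash is wrong."""
    for i in sorted(sample):
        rec = chain[i]
        output = run_chunk(programs[rec.program_id], rec.start, rec.stop)
        if sha(output) != rec.chunk_hash:
            return i
    return None


def prune(chain: List[ChunkRecord], bad_index: int) -> List[ChunkRecord]:
    """Falsification invalidates the bad block and every block after it."""
    return chain[:bad_index]


def even_case(n: int) -> str:
    return f"case {n}: {n * n + n} is even"


PROGRAM_ID = sha(b"even_case v1")   # stand-in for a hash of the program's code
programs = {PROGRAM_ID: even_case}
chain = [ChunkRecord(PROGRAM_ID, s, s + 5, sha(run_chunk(even_case, s, s + 5)))
         for s in range(0, 20, 5)]
chain[2] = ChunkRecord(PROGRAM_ID, 10, 15, sha(b"bogus output"))  # a bad block

bad = first_falsified(chain, programs, sample=range(len(chain)))
print(bad, len(prune(chain, bad)))
```

An auditor who cannot afford to redo every chunk would pass a smaller `sample`, presumably weighted toward recent blocks, since those have had the fewest eyes on them.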

Taken together, I'm inclined to think such an approach might just work. The underlying hidden force holding this together is history. This suggestion, I think, is not as preposterous as it sounds. Indeed, observe what happens to the Bitcoin blockchain as computational resources become ever more powerful and plentiful: the nonces of the blocks in the early parts of the chain are ever easier to reproduce. Here too proof-of-work, then, is a time-sensitive concept. A more extreme example would be if a vulnerability were later found that necessitated a change in hash function. It is doubtful we'd throw away the historical blockchain: we'd likely find a way to recognize it as a historical artifact and secure the chain with a better function going forward.

A Concluding Thought

A longstanding principle of science has been repeatability. Experimental results are supposed to be repeatable. The modern laboratories of science are big and expensive, be they in planetary science or particle physics. Because these experiments are expensive to duplicate, we compensate by bearing witness to them in large numbers. Years from now, we won't be worried about the veracity of pictures New Horizons snapped of Pluto even if we haven't been there since. Same for data collected from the LHC: if it's later shut down, we'll still trust the recorded data was not doctored, since there were so many witnesses when the experiments took place. From this perspective, the present discussion is about bearing witness in numbers (the number of computing nodes on the network, that is) to math proofs we might not have the resources to revisit again and again. In this sense, mathematics may have already entered the realm of big science.

Tuesday, July 26, 2016

Second Thoughts on Smart Contracts and Crypto Gold

Note: I began writing this post a little after Ethereum's DAO project ran into trouble. I put thoughts down slowly, and over the course of penning it, to my surprise, the community pulled off the hard fork to save the DAO's funds.

Though a fan of cryptocurrencies, and blockchain technologies in general, I've harbored doubts about the wisdom of adopting smart contracts as instruments of finance since well before the recent demise of Ethereum's DAO project. It was a fine idea: a sort of VC fund (expressed as a smart contract on the Ethereum blockchain) that would seed other projects and enterprises. Trouble was there was a bug in this contract (not in Ethereum itself) that enabled a slow motion $50 million heist in broad daylight. The bug (the exploit) was discovered well before the DAO's funds (ethers) had been depleted, but little could be done to stop it. For smart contracts are programs written in stone (the blockchain, itself), executable code whose state (when conditions are met) inexorably advances as new blocks are added to the chain.

Real world contracts, by contrast, are interpreted by humans, who, unlike machines, are far more forgiving. We allow for syntactic errors--errors in punctuation, grammar, spelling, etc. (Recall the Second Amendment to the US Constitution.) We can even tolerate a moderate degree of illogic. When the meaning of a contract is in dispute, we (the courts, or other empowered arbitrators) analyze and interpret both the letter of the contract and the intent of the parties to that contract. This ability to re-interpret contracts, to clarify, amend, or even annul them in the future is an indispensable tool of human progress. If society were governed by immutable contracts (think slavery), we'd all be screwed.

The Trouble with Financial Instruments as Smart Contracts

The collapse of the DAO was viewed by many in its community as a blow to the ecosystem itself. Considerable resources (people, time, energy, dollars and ethers) had been expended, and the project's success was to underscore, validate--indeed some would argue kick off a demonstration of the true purpose of--the foundational Ethereum it was built on.

How to save the DAO? There was considerable wrangling about just how this could be done, but none involved fixing the contract itself, since by the rules of the game (Ethereum blockchain), contracts are immutable. No, the only way the DAO could possibly be saved was by somehow actually changing the rules. And the only way that could be done was (and always will be) by consensus.

Of course, in proof-of-work blockchains like Bitcoin's and Ethereum's, "consensus" means a plurality in mining (hashing) power: the more computing power you can plug into the network, the more say you have (at this time, the proof-of-stake version of Ethereum is still in development). The details, tradeoffs, and risks of the proposed solutions (soft fork vs. hard fork, etc.) are not important here; as it turned out, the community successfully pulled off a hard fork. Rather, the issue is how a bug in a contract put the larger system (Ethereum itself) in play.

Systemic Risk: At the Intersection of Randomness and Determinism

Forget the challenges of crafting bug-free contracts. Let us assume we've achieved a bug-free ecosystem of best practices, and the contracts themselves work as intended. The larger and more important question is whether the system itself (the totality of the contracts) will work as intended.

In particular, what happens when financial contracts (on a blockchain) fail to price in risk correctly? Regrettably, the history of finance is filled with examples of the ruin such misallocations of capital have caused (the 2007-2008 crisis being the most recent in memory). It is easy to imagine a blockchain embedding a web of contracts (recall, they can invoke each other's public interfaces) going down an execution branch that would have previously been considered a 6-sigma event. As we pile more contracts atop existing ones, analyzing the future paths the blockchain might take becomes computationally hard. The difficulty is compounded by the fact that, generally, the order of execution of the contracts cannot be guaranteed on the blockchain--there are therefore even more paths to consider given the combinatorics of ordering.

That last observation, by the way, holds irrespective of the richness of the programming language expressing those contracts--be it Turing-complete (as Ethereum claims with some caveats--gas), or one in which the execution path is constrained to a directed acyclic graph. Each contract is properly seen as a thread of execution and the blockchain, the current state of a highly concurrent system. The very nature of concurrency is disorder.
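The ordering point can be made concrete with a toy example. The three "contracts" below are invented for illustration and touch a single shared balance; none uses loops or anything close to Turing-completeness, yet the final state still depends on the order in which they happen to run.

```python
from itertools import permutations
from math import factorial


def deposit(balance: int) -> int:
    return balance + 100


def fee(balance: int) -> int:
    return balance - 10 if balance >= 10 else balance


def double_if_rich(balance: int) -> int:
    return balance * 2 if balance >= 100 else balance


def outcomes(calls, start: int) -> dict:
    """Final balance for every possible execution order of the calls."""
    results = {}
    for order in permutations(calls):
        balance = start
        for call in order:
            balance = call(balance)
        results[tuple(c.__name__ for c in order)] = balance
    return results


results = outcomes([deposit, fee, double_if_rich], start=50)
for order, balance in results.items():
    print(" -> ".join(order), "=", balance)

print(len(results), "orderings,", len(set(results.values())), "distinct outcomes")
print("orderings of 20 independent calls:", factorial(20))
```

Three calls give 3! = 6 orderings; twenty give 20!, or about 2.4 quintillion. And this counts only orderings of calls that each run once, before considering contracts that invoke one another.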

The '07-08 crisis, while still fresh in memory, is instructive. Financial institutions had sold insurance against outlier events that they had priced as astronomically unlikely. They insured against various forms of default: mortgages, corporate debt, the solvency of the insurers themselves. There was so much insurance being traded around that the impenetrable pile looked safe, and soon the insured were themselves insuring the insurers through circuitous paths few could see. Far from raising alarm, the pile of paper instruments was viewed as a marvel of modern financial engineering.

And then, of course, a brick fell out of that fragile wall: mortgages. When the market realized that the default rate on these had surpassed all historical norms, the entire model on which everything was priced was called into question. The pile was now a tangled web of obligations no one could properly unravel. Counter-party risk became paramount: no financial player could trust that the other was actually solvent. The credit markets were frozen.

While the manner in which the Federal Reserve (and Congress) responded in the aftermath of the crisis can be legitimately scrutinized, putting aside questions of fairness and accountability, there's little doubt that some form of intervention was necessary. Some have argued that we should have let the chips fall where they may. But there was real risk that ATMs would soon stop working. Something had to be done. Someone had to suspend the rules of the game.

Which brings us back to smart contracts. Who will intervene when a pile of smart contracts forms a dumb collective? In the case of the DAO, the Ethereum community's intervention manifested itself as a hard fork (a change in the rules that is not backward-compatible with the old rules). It was an impressive, if messy, feat. And it also brought controversy, as it called into question the very immutability of transactions on blockchains. But these are early days for Ethereum. It's doubtful a fork like this could have been pulled off if the project were in a more mature state like, say, Bitcoin.

Much has been made of the "lessons learned" from the DAO's failure and how a more rigorous, better tested, bottom-up approach promises a better second go at it. I want to believe. Still, the real lesson, I'm afraid, might be that contracts written in stone are a bad idea, period.

And a Bit Glum About Bitcoin..

Having struck such a downer note, I might as well list some personal peeves about Bitcoin (a sort of cleansing of my mind). I will try not to bore you with technical peeves others have already made--that the system is an inefficient energy hog, for example.

So you might argue, what's the harm in a contract that records the transfer of ownership of, say, a car? Not much really. Except that the world is filled with unsavory actors. What if someone is holding a gun to your head? Would you buy into a system of ownership in which the courts cannot undo a transaction even after you've proven that it was executed under duress? (That no central authority can govern its transactions is in fact the key feature of the distributed ledger that is the blockchain.)

Now this argument must apply equally to Bitcoin, too. Owning bitcoins is much like owning gold in an impenetrable, pass-phrase-protected vault in your basement. If you store much gold there, you might want to take additional security measures--a security camera, for instance (so a would-be thief is less inclined to hold that gun to your head). Except, again, that with bitcoins, the camera is of little use, since the courts will have a hard time returning the stolen goods. Indeed, the courts generally have difficulty returning anything stolen from your safe--which is why we tend to prefer the banking system.

This doesn't necessarily mean that bitcoins are an unsafe store of wealth. Indeed, bitcoin is arguably safer than gold. First, gold is more fungible than bitcoins: gold atoms are indistinguishable from one another, whereas the transaction history of any [fractional] bitcoin can be traced back to its beginning (when it was first mined). Bitcoins are thus less anonymous than once supposed, and this can make spending the loot harder for a thief. Second, the total supply of bitcoins is easier to predict than that of gold. The supply of gold is sensitive to one-off events, like the discovery of new sources (say a newly discovered mine, or more fantastically, imagine a bus-size gold asteroid landing in Siberia), whereas the maximum future supply of bitcoins is bounded (the sum of a decaying geometric series), and the network's mining power growth rate is relatively steady. And third, bitcoins are easier and safer to transport ('cause you don't).
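The geometric series behind that bound is easy to work out. The figures used here come from the Bitcoin protocol rather than from anything above: the block reward started at 50 bitcoins and halves every 210,000 blocks. The sketch is idealized in that it ignores the protocol's rounding of rewards to whole satoshis, which leaves the true cap slightly under the limit computed here.

```python
from fractions import Fraction

INITIAL_REWARD = 50        # bitcoins per block at launch
BLOCKS_PER_ERA = 210_000   # blocks between halvings


def supply_after(eras: int) -> Fraction:
    """Idealized total mined after `eras` complete halving periods."""
    per_era = INITIAL_REWARD * BLOCKS_PER_ERA
    return sum((Fraction(per_era, 2 ** k) for k in range(eras)), Fraction(0))


# Sum of a geometric series with ratio 1/2: a * (1 + 1/2 + 1/4 + ...) = 2a
LIMIT = 2 * INITIAL_REWARD * BLOCKS_PER_ERA

for eras in (1, 2, 3, 10):
    print(eras, float(supply_after(eras)), float(LIMIT - supply_after(eras)))
print("limit:", LIMIT)
```

Each halving period closes half of the remaining gap to the limit, so the schedule is known in advance in a way gold's never is.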

But for me, investing in bitcoins burdens me the same way as owning gold: great care must be taken to protect any significant amount of it from theft. I am far too sloppy to invest in either (and own too little to consider changing my ways worthwhile). No, I think bitcoin's most important feature is that it functions unimpeded across geographical, governmental, and jurisdictional boundaries--though I've yet to actually need any of this.

A Better Name: Bitgold

I'm not suggesting, of course, that Bitcoin should be renamed. But bitcoins are more like precious metals than currency. Currencies, after all, aim for price stability. Ideally, a basket of everyday goods costs the same over time in a given currency. But Bitcoin, like gold, owing to its limited supply, tends to appreciate in value relative to most common goods and services that over time become ever more abundant through innovation and technological efficiency. So if it's a currency, Bitcoin is a deflationary one.

Deflationary currencies, however, are ill-suited for tallying debt. A lender has little incentive to lend bitcoins long term. And a borrower's obligations tend to balloon in real terms over time thereby increasing credit risk even more. So bitcoins can be lent more like Treasury bonds: they can be used as collateral for other debt, but are themselves seldom borrowed for any meaningful length of time. (Typically, you borrow long term T-bonds short term, in order to sell them short in anticipation of a quick drop in price.)

Now some argue that debt and usury are evil twins that we'd best do away with anyway, that a system of finance that discourages credit is just what the doctor ordered. It's a questionable argument, supported by only a minority of economists. (To wit, the total value of bitcoin debt is minuscule when compared to aggregate supply, even though a good number of startups provide intermediation services for such lending in the marketplace.) Whether right or wrong, whatever its merits, it seems to me buying into the Bitcoin system is somewhat synonymous with abandoning long term debt: the system inherently favors equity over credit.