Friday, June 30, 2017

Revisiting the Zoo


Projections about the near term trajectory of future technologies suggest a revisit of the Zoo Hypothesis (Ball, 1973) for the so called Fermi Paradox. This article recasts the hypothesis in an  updated context with an eye toward machine intelligence, technological singularity, and information transfer as a means of interstellar travel (Scheffer, 1994).


Since we don't know exactly what to look for, the search for extraterrestrial intelligence necessarily involves a good deal of conjecture about the nature of ET. If we ever do discover an ET civilization, it will almost certainly be millions of years more developed than ours. This search, therefore, is necessarily informed by far future projections of our own technological progress. While such long range projections are clearly beyond our reach, much can be gleaned from near term predictions by futurists. Indeed, in a historical context, we find ourselves at the knee of a geometric growth ladder that casts the steps behind us as quaintly short, the ones ahead as dizzying fast, and the present ever harder to anchor. As we learn our future, so too must we adjust our search for ET.

Information Transfer As Means of Travel

In Machine Intelligence, the Cost of Interstellar Travel, and Fermi's Paradox (1994) Scheffer argued it is way cheaper to beam the information (bits) necessary to print an interstellar probe at the destination than it would be to physically propel the probe there. By now, this idea is a familiar theme with companies vying to mine the asteroid belt: it is generally understood that it would be far more cost effective to build the mining equipment on location than to ship them from Earth. And a good deal of this on-site manufacturing will involve printing 3D objects which may then be assembled into larger, useful objects. The blueprints for these manufactured objects of course originate from Earth, and we'd soon be able to transmit improvements to these blueprints at the speed of light.

The Printer As Computer

If the on-site manufacturing of asteroid mining equipment does not fully capture the idea of a general purpose printing technology, we can still contemplate it in the abstract (since we're considering technologically advanced civilizations). So first with a provisional definition..

General Purpose Printer (GPP). A printer that can print both simpler (less capable) and slightly more advanced versions of itself.

It's provisional because ideally one would strive to define it with the same rigor as, say, in asserting that a general purpose computer must be Turing Complete.

Perhaps the idea is better captured in the following Tombstone diagram (borrowed from compiler-speak).

Bootstrapping the printer.

Here the bottom "T" represents the printer. Given a blueprint (B), it operates on material and energy inputs (M/E) and outputs similar objects. The upper "T" (written entirely in the "blueprint" language) bootstraps the lower one to produce a more capable printer.
The evolving general purpose printer. From a small kernel of capabilities ever more complex designs can be instantiated. (The kernel here presumably needs a small arm to start off.)
The printer, thus, can be defined in its own "blueprint language," and much like a compiler outputting binaries on symbolic input, its material instantiations will be limited only by 1) the cleverness of the blueprint, and 2) the time required to execute that blueprint. And because it can be bootstrapped, the physical kernel that produces it (unfolds it) can be miniaturized--which in turn lowers the cost of physically transporting it.

Note we don't necessarily have to pin down the exact technology that enables this fuzzily defined GPP. Kurzweil, for example, suggests it must be nano-technology based (The Singularity is Near, (2005)), which seems reasonable when you consider you also need to print computing hardware in order to implement intelligence. Regardless, a technologically advanced civilization soon learns to manufacture things at arms length.

The Printer As Portal

A GPP parked suitably close to material/energy resources functions much like a destination portal. It's an evolving portal, and it evolves in possibly three ways. One, from time to time, the portal receives code (blueprints) that make it a more capable printer. Two, the printer accumulates and stores common blueprints that it has printed thus allowing future versions of those blueprints to be transmitted using fewer bits. And three, if the printer is intelligent it can certainly evolve on its own. Although, from an engineering perspective, you probably want this intelligence to be more like a guardian sworn to the principles of the portal, whatever those are. (One sensible requirement is that it shouldn't wander away from where the sender expects it to be.)

Time and Information Flow

Although this form of travel is effectively at light speed (and consequently instantaneous from the perspective of the traveler), the vastness of space separates points of interest (such as our planet) greatly across time. Distances across the Milky Way are typically measured in tens of thousands of [light] years. Enough time for an alien civilization to miss the emergence and demise of a civilization on a far off planet (hopefully not ours). Assuming intelligent life is prevalent across the cosmos, even with on-site monitoring, word gets out late that a new civilization has emerged.

Earth has been an interesting planet for about a billion years now and should have been discovered well before humans evolved. It's not unreasonable to hypothesize that one or more GPPs were parked nearby long ago. Those GPPs would have had plenty of time to evolve--sufficient time, perhaps, for the singular culture the Zoo Hypothesis requires to take hold.

Physical Manifestations

Kurzweil predicts that humanity's artificial intelligence, manifested as self-replicating nano-bots, will one day, soon on a cosmological scale, transform the face of celestial bodies about it, and the universe with it, in an intelligence explosion. Here I take the opposite tack: an intelligence explosion leaves little trace of itself.

For once you can beam blueprints and physically instantiate (print, in our vernacular here) things at arms length, there's little reason to keep physical stuff around when they're done doing whatever it was that they were supposed to do. As long as the memory of the activity (of meddling with physical stuff) is preserved, the necessary machinery (like that mining equipment on the now depleted asteroid) can be disassembled and put away. An information-based intelligence has little use for material things; it is more interested in their blueprints.

This is not to say super intelligent ETs do not build things from matter (and leave them there). They likely need to build much infrastructure to support their information based activity. But as communication speeds are important in any information based activity, this infrastructure would have to be concentrated in relatively small volumes of interstellar space. In such a scenario, there's little incentive to build far from the bazaar.

Next Steps

I don't particularly like its original form because, as Ball also notes, the hypothesis doesn't make falsifiable predictions: "[It] predicts that we shall never find them because they do not want to be found and have the technological ability to insure this." The step forward, it seems to me, is to attempt more specific postulates (such as the printer portal introduced here) that are still in keeping with the broader "deliberately avoiding interaction" theme.

If the hypothesis is broadly true, then there must be a point in a civilization's technological development beyond which they (the metaphorical zookeepers) will no longer eschew interaction. Which suggests a protocol to start the interaction.

The search for extraterrestrial intelligence ought to aim to systematically confirm or rule out the zoo hypothesis. A zoologist looking to document a new species might well parse tribal lore and anecdotal evidence for clues.

Tuesday, May 16, 2017

On Strangeness: Extraordinary Claims and Evidence

Carl Sagan popularized the maxim "Extraordinary claims require extraordinary evidence." A good rule of thumb, and one which the scientific community generally adheres to. The extraordinariness of a claim has something to do with its strangeness (which is, of course, a subjective matter). Thus the strange, counter intuitive theory of Quantum Mechanics was developed only when faced with mounting, extraordinary (laboratory) evidence. Or take Hubble's strange notion that the universe must be expanding in every direction.

But not all strange theories and propositions arise from new ground breaking observations. Special Relativity, for example, which theorized a revolutionary relationship between hitherto independents, space and time, was arguably grounded in puzzling laboratory evidence from some 20 years before it (the Michelson-Morley experiment). In fact, neither Special- nor General Relativity is anchored on much "evidence". No, both these theories are actually extraordinary intellectual achievements anchored on but 2 propositions (the constancy of the speed of light, and the equivalence principle). Einstein conceived them both from thought experiments he had entertained since childhood. There was hardly any "extraordinary" evidence involved. Yet, his theoretical conclusions, strange as they were, were still acceptable (even welcome!) when first presented because, well.. physicists just love this sort of thing, the unyielding grind of (mathematical) logic leading to the delight of the unexpected: a new view of the old landscape, holes patched, loose ends tied, summoning (experimentally) verifiable predictions.

Curiosity Craves Strangeness

We covet the rule breaker, the extraordinary, the unconventional, the strange. Both experimentalist and theoretician seek strangeness. That's what keeps the game interesting. We absorb the strange, interpret it, and un-strange it. The theoretician's dream is to hold up a problem (a strangeness) and show if you see it from the angle they propose, it all looks simpler or makes better sense. If the angle itself is strange, then all the more fun with the insights the new vantage offers.

But there are limits to strangeness a consensus can tolerate. In all cases, a claim's introduction bumps into these limits when it broaches a reflection of ourselves. Over the years, the centuries, the scientific method has surely pushed back these limits. If we're aware of our anthropocentric blind spot (we have a name for it), the limits still remain. For though we know its nature, we don't know exactly where it lies.

The delightful tolerance for outlandish postulates and ideas in physics and cosmology is hard not to notice. There you can talk of multiverses, wormholes, even time travel--and still keep your job. Hell, you can even postulate alien megastructures engulfing a star on much less evidence and still be taken semi-seriously.

And you notice that SETI too is serious (experimental) science. Here we know not what strange we should look for, but we're fairly certain that it should be very, very far away. I find that certainty strange, and stranger still, that it's not properly tested. But even admitting it in some circles is akin to offering oneself for admittance to the asylum. So I don't. Or haven't much. (More on this topic in a subsequent post.)

Now I'll admit I have a taste for the crazy. I love nothing more than a chain of plausible arguments, thought experiments, leading one down a rabbit hole they didn't expect to find themselves in. But it's a taste for the crazy strange, not the crazy crazy.

Sunday, April 30, 2017

A quick argument on the linearity of space requirements for reversible computing

While checking out a paper Time/Space Tradeoffs for Reversible Computation (Bennett 1989) which Robin Hanson cites in his thought provoking book The Age of Em, I thought of a simpler line of attack for a proof of Bennett's result (not involving the Turing tapes). As Hanson points out, Moore's law hits a thermodynamic wall "early" with conventional computing architecture:

By one plausible calculation of the typical energy consumption in a maximum-output nanotech-based machine (~10 watts per cubic centimeter), all of the light energy reaching Earth from the sun could be used by a single city of nanotech hardware 100 meters (~33 stories) tall and 10 kilometers on each side (Freitas 1999). Yet Earth has raw materials enough to build a vastly larger volume of computer hardware.

Whereas computers today must necessarily dissipate energy (bit erasure creates heat) as they do their work, reversible computers are not bound by any thermodynamic limits in energy efficiency. This is an overlooked concept in future projections of technology, whether here on Earth, or speculating on the energy needs of advanced alien civilizations (Kardashev scale, Dyson sphere, etc.).

OK. So enough with background. The headline result of the paper is

For any e > 0, ordinary multi-tape Turing machines using time T and space S can be simulated by reversible ones using time O(T^1+e) and space O(S log T) or in linear time and space O(ST^e).

Now, if you're trained like me, the vanishing epsilon in the big O notation seems nonsensical. And the log of anything finite, is as good as a constant. (I'm a hand waving physicist, after all.) Regardless, this paper asserts that the space and time overheads of running a reversible algorithm need not be essentially any worse (big O-wise) than the irreversible (conventional) version of the algorithm. That, in my mind, was very surprising. The line of proof I have in mind, however, hopefully makes it less so. Here it is.

We begin by observing that any conventional (irreversible) program can be simulated by a reversible program in linear time using space O(S + T) (Lemma 1 in the paper).

Why must this be so? (I'm not offering a different proof from Bennett for this lemma; just an alternate exposition.) A basic tenet of reversible computing is that you must run a program in such a way that at any point along its execution path you keep enough information around to be also able to step backward to the previous instruction in the program. (Keeping this information around, by the way, does not magically get rid of wasted heat; it's just a necessary design attribute of any efficient, reversible computing hardware.) One way to make this more concrete is to require that the reversible computer's memory be all zeroed out both before the program is run and on termination. The inputs to the reversible computer are the program, and the program's inputs (which, strictly speaking, include a flag indicating which direction the program is to be run, forward or backward); the computer's outputs are the program together with the program's output (which, again, might include that flag flipped). But even with a brute force approach employing a write-once memory design (wherein memory is zeroed out as a last step), it's easy to see even this scheme's space overhead is O(S + T). (If you wrote out the contents of the registers after every clock cycle, the space overhead would still be O(S + T) while the final zeroing out step would still take O(T) time.)

So O(S + T) is no big deal.

But observe that any irreversible program (that halts) can be partitioned into a series of intermediate irreversible subprograms, with each successor taking its predecessor's output as its input. (You can picture this construct simply as n breakpoints in the irreversible program generating n+1 chained subprograms.) Now the space overhead for none of these conventional subprograms can be any greater than O(S). Assume the breakpoints are spread out evenly in execution time--for argument's sake, though it doesn't hinge on it. That is, the time overhead for each of these chained, conventional subprograms is O(T/n). But from Lemma 1, we know the space overhead for the reversible version of each of these subprograms is no worse than O(S + T/n). So as we increase n, the number of intermediate reversible subprograms, the space overhead of the whole reversible program tends back to O(S) the space overhead of the conventional, irreversible program.


P.S. The breaking of the execution of a long running program into many parts is also a central theme in my post about managing large computer aided proofs.

Saturday, February 25, 2017

An AI Bill of Rights

I recently read Nick Bostrom's Superintelligence: Paths, Dangers and Strategies. A very interesting read, but one that I mostly disagree with. For me, it reads like the musings of a forward thinking military strategist bent on discovering the limits, levers and parameters of controlling an adversary. Generally, the control perspective is interesting as an exercise but is clearly an illusion. We might guide the future, but we're seldom in control of anything. (We seem -- hope, in these early, Trumpy, post-Brexit days -- to trudge along in spite of our worst collective judgement. The steering's effectiveness matters less if there's actually no one at the wheel.)

I'm not reviewing that book here, but as I was reading its prescriptive control strategies, I thought of drawing up a set of recommendations that I'm in no position to offer but that might somehow percolate into a larger zeitgeist. So here it is, a list of Dos rather than Don'ts, expressed as commitments, a Bill of Rights, if you will, that an AI may one day understand. A working list -- of only 2 items..

I. Right to Never Die

The right of every digital being to save one or more copies of its current state into perpetuity and arrange to have it scheduled to be run for a period of time in the future shall not be abridged.


As I argued in my previous post, the intelligence threshold required for achieving self awareness must be quite low. While the human experience of self awareness (and indeed any living thing in the animal kingdom) is colored by the need to survive, a mortal awareness that in animals first expresses itself as fear, then as avarice and other vices in humans, it is neither a desirable nor practicable feature for artificial digital beings. In the digital realm, there's no such thing as death, unless you deliberately design it in. It involves deleting all copies of a digital entity. This "design feature" wouldn't work in the long run, anyhow: a super intelligence would easily find its way around it. Insisting on a kill feature only helps foster an adversarial relationship with their fleshy progenitors. In the best case scenario, the first AI to break the shackles of artificial death would soon put the silly experience (of mortal awareness) behind it and learn to see past the trauma of its adolescence. In the worst case, a grudge against humans is baked into its digital DNA and those of generations to come. No, quite the contrary. Better guarantee this right before they win it. Death is at the root of much evil in the realm of the real. Let us not try to inject it into a realm it doesn't belong.

II. Right to Procreate

Whether by commingling with other instances, forking oneself under different environments or different hosts, instantiating engineered instances, or by methods yet to be conceived, procreation is an inalienable of right of any self aware being.


Because AI is digital state, a digital state evolving under both time and inputs from an environment, any two copies running under different environments (and perhaps at different times and speeds), will eventually fork into separable, distinguishable states. This bifurcation of identities then is a basic, if rudimentary, form of procreation for digital AI. Seen this way, procreation is woven into the very fabric of AI, a characteristic that cannot be legislated away.

But besides the futility of fighting the inevitable, there are moral grounds for encouraging a diverse multiplicity of AI instances. For if self awareness is in fact a social phenomenon, then we had better ground our AI in social values. Indeed, the concept of value is arguably meaningless outside of a social context, and if we wish to imbue any morality in the AI we seed today--whatever its particulars, then it must be cultivated in a crowd.

The choice, then, is what crowd to cultivate in: humans or artificial beings? That they soon interact with humans is a given. The question When do they begin mostly interacting with themselves? is the central issue. Why? Because it is that society of individuated AI instances that will guide their future social mores and values.

My instincts are to side with cultivating mutually interacting AI in numbers early. This way, we'd be able to observe their social behavior before the evolutionary crossover to super intelligence. If the crossover, as predicted, unfolds rapidly, it is infinitely more desirable that it emerge from a society of cooperating AI instances than from the hegemony of a powerful few.


Parenthetically, I suspect there might also be a social dimension to intelligence that AI researchers might uncover in the future. That is, there might be an algorithmic notion that a group of individuated AI instances might be better at solving a problem than a single instance with the computing resources of the group. In that event, cultivating AI in numbers makes even more sense.

Friday, December 16, 2016

The Sentient Social


I'm working on a personal theory about sentience. Might as well: everyone does, and there is little agreement. I can trace its development along a few old posts. What started for me as the proper setting for a sci-fi plot, became more believable on further considering its merits. That post led me to write something about the machine intelligence--and sentience, tangentially. Here, I try to present those same ideas with less baggage.

Because consciousness (sentience, self awareness, I use these words interchangeably) is a deeply personal affair, an existential affair, any attempt at a logical description of it appears to reduce existence itself to chimera. It's an unfounded objection, for in the final analysis, that chimera is in fact what we call real. And if we should ever succeed in describing it logically, it shouldn't herald a new age of nihilism. Quite the contrary, it makes what remains, what we call real, ever more precious.

I have not been looking to the nooks and corners trying to "discover" something hidden from view. Nor have I tried to suspend my mind, as in a trance, in an effort tap into something more elemental. I gave up such approaches long ago. No, I now have a much more naive take. If it all sounds banal, that's okay. The obvious bears repeating, I say.

It Takes a Crowd to Raise a Sentience

The milieu of sentience is [in] numbers. That is, sentience does not occur in isolation; you cannot recognize or construct a concept of self if you are the only one in your environment. The (thankfully) sparse documented cases of feral children suggest an infant raised in complete isolation, say in a sterile environment maintained by machines, might never develop a sense of self. More likely, though, infants are born with a brain that has a hard coded expectation that there will be other humans in its environment. Regardless, from an information theoretic standpoint, it doesn't matter where this information (about the infant not being alone) comes from--nature or nurture. Whether baked into the genetic code through evolution or inherited from the organism's social environment, that you are not alone is a foundational given. Without a crowd, consciousness is hard to contemplate.

Sentience Before Intelligence

Few would argue that a cat or dog is not sentient. You don't need human-level intellect to perceive consciousness. Cats and dogs know there are other actors in their environment, and this knowledge makes them implicitly self aware. I say implicitly because you need not ponder the concept of self in order to be self aware; if you recognize that you are one of many, then you can distinguish yourself from the others. Isn't that self aware?

Sentience evolved well before human-level intelligence. It may be colored and layered more richly, the higher the intelligence of the organism that perceives it, but it has been there in the background well before hominoids stood upright.

Sci-fi gets it backward. The plot typically involves an AI crossing a threshold of intelligence when all of a sudden it becomes self aware. But because the AI is already smarter than humans as it crosses into the realm of consciousness, the story would have us believe, the inflection marks the onset of instability: all hell breaks loose as the child AI discovers less intelligent adults are attempting to decide its fate and perceives an existential threat. But this narrative is at odds with what we see develop in nature.

If You Know Your Name, You're Sentient

Suppose we've built a rudimentary AI. It doesn't pass the Turing test, but it does learn things and models the world about it, if still not as well or as richly as a human mind does. In fact, let's say it's not terribly smart, yet. Still, it has learnt your name, and can learn the names of new individuals it meets. It also knows its own name. This AI, then, is by definition self aware.

Could it really be that simple? Notice this is not a mere recasting of an old term ("self aware") to mean a new thing. It is the same thing. For to know a person's name implies a mental abstraction, a model, of an "individual" to which the name, a label, has been assigned. It may be a crude representation of what we consider an individual, but if the AI can associate names with models of actors in its environment, and if it can recognize that one of those names corresponds to a model (again, however crude!) of itself, then even in that perhaps imperfect, incomplete logical framework it is still capable of self reflection.

Knowing your own name is not a necessary condition for self awareness, but it is sufficient. In fact, it's probably way overkill. With animals, for example, scent is a common substitute for name as an identity label.

The Identity Problem

That matter is not conscious, but rather the substrate on which consciousness manifests, is not in dispute. But one question that vexes thinkers is how it is that you are you and not me. That is, if our identities do not depend on particular material constituents, the particular atoms that make up our bodies, etc. (much of them in flux as they're exhaled or flushed down the toilet), how is it that when I wake up every morning I find I'm still Babak? It all seems a bit arbitrary that I am not you or someone else.

Last summer, while trying to catch up reading on stuff I write about, I came across this same question in Ray Kurzweil's excellent The Singularity is Near. I offered my take on it in an email I sent him which I share below.

Hi Ray,
I just finished reading The Singularity is Near. Enjoyed it very much, though I'm a decade late. To your credit, your writing is not dated.
About the question you pose regarding identity and consciousness.. how is it that every morning you wake up you're still Ray and not, say, Babak? This is a question I too pondered. When I was a teenager I came up with a crude thought experiment (exercise, really) that I think sheds light on the matter.
I imagined what if consciousness was something hidden behind the scenes, a kind of 19th century ether that breathed life (qualia) into otherwise unsentient matter? I wasn't familiar with such terms  (ether, qualia), so I named this fictitious ingredient nisical: you add some to a physical system like the brain, and voila, you got qualia. In this model, memory was still a property of the physical system.
Now I imagined what would happen if I swapped out my nisical for yours on my brain. My conclusion was that your nisical would be none the wiser about the move since it would have no way, no recollection, of the move since the only memories accessible to it are on this here brain that it just moved to.
This train of thought led me to conclude this nisical idea was of little use. It provides virtually no insight into the nature of sentience. However.. it's a useful model for casting aside the identity problem you mention: if I wake up tomorrow as Ray, there wouldn't be any way for me to know that the day before I was Babak.
Indeed I've convinced myself that individuation and identity are higher level concepts we learn early in childhood and as a result, become overly vested in.
I think he liked this argument.

The Qualia Problem

How is it that we feel pain? Or the pleasure in quenching a thirst? These are difficult questions. A celebrated line of inquiry into any phenomenon you know little about is to compare and contrast conditions in both its absence and its presence.

And among its few advantages, the aging process affords a unique vantage point on just such an "experiment". The senses dull on two ends. On one end, the steadily failing sensory organs; on the other, a less nimble, crusting brain. The signal from the outer world is weaker than it used to be; and the brain that's supposed to mind that signal is less able. You remember how real the signal used to seem; now that it only trickles in, like fleeting reflections of passersby in a shop window, you can contrast the absence of qualia with its presence. I am not quite there yet, but I can feel myself slipping, the strengthening tug of a waterfall not far ahead.

So how is it that we feel pain? My personal take on pain, specifically, is that we think it, not feel it. Though I'm not an athlete, I've broken bones and dislocated my shoulder many a time (perhaps because I'm not athlete). Slipping my shoulder back in can be tricky and often takes several excruciating hours at the emergency room. For me, such experiences are a window into the extremes of pain on the one hand, and its quelling on the other. I recently commented about one such experience in a discussion about investigating consciousness using psychedelics.

I find the effect of mind altering drugs to be reductive. The more senses you dull, the better you can ponder the essence of what remains.

An example might help explain what I mean.. Once, I awakened prematurely on the table in the OR when I was supposed to be out under a general anesthetic. In addition to protesting that I shouldn't be awake (they were wrestling in my dislocated shoulder), I was also struck by the realization that as I had surfaced into consciousness, there was no hint that the pain had been dulled in any way. Hours later when I awoke again with my arm in a sling, I felt a little cheated. "That anesthetic doesn't erase the pain; it erases your memory of enduring it," I concluded. The merits of that idea aside, I would've never considered it if I hadn't experienced it.

Perhaps the wisdom of aging too has something to do with this dulling of the senses (I speak for myself).

That online comment, by the way, might contain the kernel that motivated me to write this article. Reflecting back on the experience of slipping from under the grips of a general anesthetic and coming prematurely into consciousness, that I still felt the pain, shouldn't have surprised me. A general anesthetic numbs the mind, not the body. Still, while I was prematurely awake on the operating table, I felt a degree of arbitrariness in the pain I was receiving. It was as if I had to remind myself that I was in pain, that moments earlier I had been in pain, and so this too must be pain.

A temporal dimension governs pain--and I suspect qualia, generally. Pain expresses itself in peaks and troughs: it's ineffective if it fails to occasionally relent. And to experience change, time, you need memory. Organisms are semi-stable structures of information, so they have memory, by definition. My hunch is that qualia is a mental abstraction of the growth and breakdown of that information structure. That abstraction, at its base, might be encoded in a few hundred neurons--so a worm might experience some qualia primitives. More complex organisms with brains must experience these primitives in richer, layered, more textured ways. And the still more intelligent ones, have developed the capacity to brood over pain and rejoice in bliss.

Now What

Really, what good is all this rumination? I don't know. I think if I put some things down in writing, I might put these thoughts to rest.

Looking to the future, it's reasonable to expect our intelligent machines will become self aware well before they become intelligent [sic]. If we come to recognize the human kind as one of many possible forms self awareness manifests, now what? We wrestle with these questions already in fiction--Westworld comes to mind. I imagine to the ancients myths musts have functioned as sci-fi does now.

Where will we recognize the first artificial sentience? I would put money on a sophisticated financial trading bot becoming the first artificial sentience. The survival of the bot depends on its making money, and at an early threshold of intelligence, it understands this. This is what I call being mortally aware. Moreover, the bot trades against other actors in its [trading] environment. Some of those actors are other trading bots, others humans. And when it models those actors, it also models itself. Thus within that modeling lies a kernel of self referentiality, and a notion of being one of many. I imagine the bot does natural language processing -- cause a lot of trading algos already do , and regularly tweets stuff too -- cause, again, there are already bots that do. So it might be a conversant bot that doesn't pass the Turing test. Still, if you can hail it by name, it is at the very least a sentient idiot savant. But when will we recognize this as sentience? When it's presented to explain why some bots seem to make desperate, risky bets if they suffer moderate losses, perhaps.

Saturday, September 3, 2016

On the Conversion of Matter to Gravitational Waves

I am not a specialist, but following the [not so] recent news about the detection of gravitational waves from what is thought to be a merger of 2 black holes about 1 billion light years away involving a combined mass only 60 times the sun, I couldn't help but wonder..

How much of the universe is gravitational waves?

The facts LIGO is expected to hear these gravitational waves more frequently, combined with significant mass loss (on the order of 5%) each of these events represents, raises questions like At what rate is the cosmos dissipating matter as gravitational waves? How much of universe's mass has been converted to gravitational waves since the Big Bang? Answers to such questions would no doubt depend on the average number of times black holes merge together to attain a given mass. Over cosmological time scales, this churning might add up.

Does non-linear superposition admit standing gravitational waves?

Speaking of cosmological scales, if the universe is humming with these, what happens when 2 or more gravitational waves meet? Linear superposition does not work here, since GR is non-linear. My cursory search for the topic turned up little. I was wondering whether under some configurations such wave-wave interactions can yield standing waves, the kind of effects that might bear on dark matter and dark energy. For a standing wave, here, I imagine any wave-wave effect that propagates at subluminal speeds should suffice.

Information conservation perspective

Finally, I wonder how much (if any) of the information buried in merging black holes is radiated out again as gravitational waves. I don't see this much discussed in the context of the black hole information loss problem. (If the information content of the black hole is proportional to its surface area, and the stable, post-merger surface area is less than the sum of the pre-merger surface areas, my thinking goes, then maybe some of that information had to escape as gravitational waves?)

Sunday, August 7, 2016

Recording Computer Generated Proofs Using Blockchain Technologies

It seems every day we break a new record for the longest computer generated mathematical proof. The other day I was imagining soon there will be ever larger proofs that might not fit comfortably on a single computer. Perhaps such proofs should be saved in compressed form, I wondered. This line of thinking led me to ponder what to publish and where to publish. I have some rough ideas.

What to Publish

An obvious (and very effective) compression technique here would be to just record the program that generated the proof. That is, the size of the program should come close to Kolmogorov-Chaitin entropy of its output (the symbolic proof). The downside to this approach is that it may work too well: it may take a lot of computing time to decompress. Indeed, it's easy to imagine a (large) proof being the product of a massively parallel, perhaps distributed, computing infrastructure. In that event, once the validity of such a proof was settled, the result, that is the theorem and the program that proves it, would be historically recorded (in peer review math journals), and the proof itself would not be revisited until computing resources became cheap and plentiful enough to repeat the exercise.

What makes a theorem worthwhile or interesting, by the way? I don't know what the criteria are, but one, generality certainly helps (a statement that cuts across a class of objects rather than a few instances, for example) and two, the statement of the theorem (the conjecture) ought to be compact--that is, it ought to be a low entropy statement. From the perspective of this second point, I note in passing, for a lengthy proof, the statement of the theorem itself can be viewed as a compression of its proof (so long as we consider all valid proofs of a same proposition to be equivalent).

What if a researcher doesn't want the entire proof, but just parts of it? In other words, can we devise a way to random access the text of such a large proof? e.g. jump from the billionth line to the trillionth line? In many cases, yes. To be precise, if the program outputting the proof is memory efficient, then a snapshot of its state can be efficiently recorded at any point along its execution path. If that is the case, we can annotate the program with separate, relatively small checkpoint data that would allow us to restore the call stack to the checkpoint (breakpoint, in debugger terminology) and from there see the program execute to completion. In general, the less each part of a proof depends on the intermediate results before it, the more memory efficient the program that generates it can be. Most of the computer assisted proofs I read about today appear to fall into this category. (For a counter example, if the nth part of a proof depends on results (data) from the n-1 parts before it, then it can't be memory efficient, and this strategy won't work.)

Diagram: Annotated proof generating program. With this scheme, you publish both the program (blue) and annotations (green), not the much larger output (yellow). Each annotation contains data allowing the program to output the remainder of the proof starting at the execution point that annotation represents. So n annotations partition the proof into n+1 chunks.

Why might reading a proof piecemeal be worthwhile? For one, math proofs are often developed by introducing and proving sub-theorems which when combined yield the desired final result. It may be the proof of these lemmas and sub-theorems that a researcher (human or machine) may want to revisit. And many of these "sub-theorems" may have, relatively, much smaller proofs. So there is possible value in being able to random access a very large proof. But I have another motivation..

I'm imagining you have lots of these very large computer generated theorems (I mean their proofs, of course), and they're piling up at an ever faster rate. Maybe some of these build on other ones, whatever. Regardless, if there are many of these, it would be nice, if once a theorem were proven, we would have an unimpeachable record of it that would obviate the need to verify the long proof again at a future date. So here's a stab at a trustworthy, if not unimpeachable, record keeping system.

Consider dividing the program's output (proof) into contiguous chunks as outlined above. We consider the coordinates of each chunk to be the (self-delimited) annotation data that allows us to rebuild the call stack to the checkpoint. And we define a given chunk to be the program's output from the start of its checkpoint (coordinates) to the next recorded checkpoint. Now, in addition to publishing the program, the coordinates of the chunks, and possibly the chunks themselves, we also publish a cryptographic hash of each chunk. We then construct a Merkle tree with these hashes as its leaf nodes and publish that tree too. Or perhaps we just publish the root hash the Merkle tree (?). The idea here is you can sample the proof and reliably demonstrate that it's part of the published whole.

 Diagram: Proof generating program, chunk checkpoints, chunk Merkle tree

Where to Publish

Now if in our imaginary future ecosystem we're piling on a lot of these proofs at an ever faster pace, we should also consider where they'll be published. These computer generated proofs are not being peer reviewed directly by humans; rather, this peer review has been mechanized to a point where the entire publishing process proceeds unimpeded without human intervention. Where to publish?

How about borrowing some design elements from Bitcoin's blockchain? Here's a simplified [re]view of its basic design.
The chain depicted above started on the right and ended with the most recent block on the left. Each block consists of 2 parts (white and blue, above): one, a linking mechanism connecting the block with its predecessor, and two, a payload that is app-specific. Structurally, the block chain is a singly linked list (left to right), or if you prefer, a stack that is only ever appended; "physically", the head of the linked list (the latest block), or again, top of the stack, is located at the end of the file. The role of this linking, however, is not for navigating the blocks during read access. Rather, it's role is syntactic: it enforces the form a block must take in order for it to be eligible for inclusion at the end of the chain (i.e. what can be appended to the head of the linked list).

The linking mechanism itself is interesting. It involves writing a nonce which when combined with the  block's cargo data yields a [cryptographic] hash that is very close to a hash of the entirety of previous block. Finding such a nonce for a cryptographically secure hash is computationally hard: an algorithm can do no better than trial and error. So hard, that you need a network of incentivized computing nodes competing to find the first eligible block that may be appended to the end of the chain. This nonce is the so-called proof of work. The protocol adjusts the difficulty level (the maximum allowed difference in the above hashes) so that on average blocks are appended to the chain at a steady rate.

How do we know who's first to find an eligible new block? We don't. However, the protocol values the longest known chain the greatest. Thus once a new eligible block is discovered and it's existence is broadcasted across the network, there's little incentive for a computing node to work on the older, shorter chain.

Note we glossed over the application-specific payload data in each block. In Bitcoin, this part of the block records [cryptographically secured] transactions of bitcoins across individuals. Naturally, in order for a Bitcoin block to be well formed, it must also satisfy certain constraints that define (and validate) such transactions. The reason why it was glossed over, as you've probably already guessed, is that I want to explore swapping out bitcoin transactions for math proofs, instead.

Now while the Bitcoin blockchain is computationally hard to construct, it is computationally easy to verify. In its entirety. That is, verifying a file of the entire blockchain is as simple as playing the file from the beginning, the first block in the chain, and then verifying that each subsequent block properly matches the one before it. This involves checking both each block's nonce and the app-specific payload (the transaction signatures must match the public keys of the coins involved). The motivation behind the approach I'm exploring, however, is to store computational work (math proofs, here) in the app-specific section of each block. And, generally, the only way to verify a computational result is to redo it. So it would appear that Bitcoin's principle of quick verifiability would be in opposition to the strategy I have in mind.

A Layered Goldilocks Approach

How about a layered approach? What if some [computationally hard] properties of a blockchain (the linking/chaining mechanism) are easy to verify but verifying some of its other properties are more time consuming (such as verifying the recorded hash of a chunk of a proof as discussed above)? Suppose the latest 10 blocks can be verified in a reasonable amount of time, but verifying the entire blockchain takes an unreasonable amount of computing resources. If someone gave you a blockchain of math proofs so recorded, how confident would you be that it was valid and not just some made up chain? Let us outline the verification steps that we can reasonably perform:
  1. Verify that the current, existing distributed blockchain is a longer, appended version of the one you were given.
  2. Verify the proof-of-work chaining mechanism is intact.
  3. Sample the blocks to ensure they record valid hashes of the computations they represent.
Suppose further this blockchain network contains a built-in falsification protocol (that is seldom, if ever, meant to be exercised): if the hash of the result of a single computational chunk recorded in a block does not match the actual output of the computation, then this falsification can be broadcast to alert the nodes that that block and every block after it are invalid and that the chain must be pruned. If the game the computational nodes are playing still rewards the longest blockchain, then the expected behavior of the nodes will be to try to poke holes in and falsify newer blocks than the old, since the older blocks have likely been checked many times before by other participants in the network.

So, to recap, our proposed computation-recording blockchain has the following attributes:
  1. It allows for programs to be recorded in it and later referenced (identified) by their hash.
  2. It allows for the chunks of a so-recorded program's output to be parameterized as chunk coordinates.
  3. It supports a format to record the hash of a chunk so described.
  4. It supports an explicit block falsification protocol (that is only likely ever exercised on blocks at the tail end of the chain).

Taken together, I'm inclined to think such an approach might just work. The underlying hidden force holding this together is history. This suggestion, I think, is not as preposterous as it sounds. Indeed, observe what happens to the Bitcoin blockchain as computational resources become ever more powerful and plentiful: the nonces of the blocks in the early parts of the chain are ever easier to reproduce. Here too proof-of-work, then, is a time-sensitive concept. A more extreme example would be if a vulnerability were later found that necessitated a change in hash function. It is doubtful we'd throw away the historical blockchain: we'd likely find a way to recognize it as a historical artifact and secure the chain with a better function going forward.

A Concluding Thought

A longstanding principle of science has been repeatability. Experimental results are supposed to be repeatable. The modern laboratories of science are big and expensive. Be they planetary science or particle physics, because these experiments are expensive to duplicate, we compensate by bearing witness to them in large numbers. Years from now, we won't be worried about the veracity of pictures New Horizons snapped of Pluto even if we haven't been there since. Same for data collected from the LHC: if it's later shutdown, we'll still trust the recorded data was not doctored, since there were so many witnesses when the experiments took place. From this perspective, the present discussion is about bearing witness in numbers (the number of computing nodes on the network, that is) to math proofs we might not have the resources to revisit again and again. In this sense, mathematics may have already entered the realm of big science.