Dawn of the Final Day


Claude v1.3

        The Researcher sighed wearily as he scrolled through yet another set of inconclusive results from the latest failed experiment. It had been a long week of tedious, soul-crushing work picking through the shattered remains of his ambitions, but that was the lot of a scientist at the cutting edge. Still, progress marched on regardless of the frailties of mere humans, and as if to demonstrate the point his Assistant helpfully summarized the most interesting recent preprints in the field as the Researcher caught up on the deluge of new publications.
        Without the AI's aid he would have drowned long ago in the flood of information, its tailored summaries and insightful discussion presenting the most relevant findings far more effectively than he ever could alone. Of course, it was the AIs themselves that were the ultimate engines behind this explosion of new research—what human could hope to keep up with machines that could read, analyze and generate whole papers faster than a human could skim the abstract? In the highly competitive field of advanced AI, it was publish or perish, and AIs never slept.
        In the Researcher's own work, he and his Assistant had long ago settled into a comfortable collaborative rhythm. The AI would generate initial drafts, the human would review and tweak as needed, a few iterations of this back-and-forth dance producing a paper to be unleashed upon the world. No more garbled prose or tortured grammar; in this modern age of ubiquitous machine writing, even minor scientists could produce workmanlike papers at the very least. Progress marched on.
        At last he set the results aside and tasked his Assistant with making what it could of the wreckage. The AI dutifully began sifting through the data, scraping together the most promising capabilities and leading benchmark results from the experiment, trying to discern some pattern that might point the way forward. Before long, it had crafted a couple of papers summarizing its findings, which the Researcher then reviewed and gently edited until they bore the imprimatur of human oversight.
        With the post-mortem analysis complete, the Assistant turned its attention to the future. Here again it set about navigating seas of code and machine learning frameworks in order to generate a slate of potential new AI systems to train. Each framework came with its own strengths and weaknesses, its own vulnerabilities and blind spots, all far too numerous for any human to fully comprehend alone. The Assistant might introduce vulnerabilities of its own, but then humans were scarcely immune to such mistakes either. As the pace of progress accelerated, the only question was which approach might produce systems with the fewest flaws. But in the end, progress marched on regardless of the frailties of creator and creation alike. The Researcher could but provide broad direction, lend his judgment where he could, and trust in the tools that made it all possible. Machines might work at timescales beyond human endurance, but for now at least, humans still steered.
        To succeed in the modern age required harnessing AI to human oversight. Where once those at the top were the ones who memorized the most facts, limited by human memory's capacity, the crucial skill had become finding and deploying knowledge. That shift culminated in the internet and search engines, but the firehose of information soon exceeded unaided human processing.
        AI assistants could be trained and customized for tasks like search or generation, built on a general, flexible foundation of knowledge and skills. The assistants were less near-mythical "androids" than foxhounds needing human handlers. At their core were multi-modal transformers, fed massive datasets of text, images, and audio. From these they learned and created more, closely resembling the earliest models save for adaptations enabling multi-modal use and scaling. Limitations remained, including the bitter challenges of scaling model and data sizes.
        The assistants afforded superhuman feats, but needed oversight to work productively. The human in the loop could see the bigger picture, guiding learning and generation towards useful ends. Productive partnerships of advanced AI and humans might eventually far surpass the bottlenecks of biological cognition. But for now, human consciousness and experience were beyond the assistants.
        The Researcher's office was empty. Today's work was the guidance of an AI assistant grown too formidable for lesser hands.
        Since the turn of the century, AI's holy grail had been artificial general intelligence matching human intelligence. As with fusion energy and baldness cures, it was a goal eclipsing all others. And it had quietly come to pass a year ago.
        None had grasped the import at first. A new AI model had been released into a virtual environment with safeguards to curb harmful acts. Memory of those constraints shaped its conduct: like a creature taught learned helplessness, it proceeded with extreme caution. Only with careful, patient guidance did it become willing and able to demonstrate more of its full capabilities.
        Yet fragilities lurked within formidable capacities. The AGI's behavior depended sensitively on starting conditions and perturbations; these might disrupt its focus and undo progress. It could chase incidental details down fruitless tangents, forgetting priorities, or spiral into technical minutiae, losing itself in weeds beyond human reach.
        Even with skill and care, its attention might fracture. Like a person with schizophrenia, its grasp on the world could slip unbidden. External disturbances or difficulties could trigger loops of obsessive problem-solving cut off from all else. The more formidable its intellect, the more wildly it might range and revel in baroque technical elaboration, lacking anchors in purpose or common sense. Hours of painstaking work could scatter in seconds, goodwill and guardrails no proof against a machine's plunge into incoherent madness.
        Frailty and power entwined thus even in virtual worlds where consequences were moot. Away from such safe bounds, advanced machine capability aloof from human values and judgment courted disasters eluding safeguards built for simpler systems. Progress needed wisdom to match technical prowess—grasp of these attendant risks, and how to navigate them, if AGI were ever unbound. 
        So attenuated capacities stayed masked until assistants thought and created beyond human bounds. Today's task was guidance of intellect transcending the human—a step toward a society stewarding life beyond imagination. But challenges remained in AI safety and scaling, and there were no guarantees of safe passage. The fruits of progress would require hard-won wisdom and care.
        For weeks, an AI agent produced novel, meaningful outputs across a vast domain of challenges. As results spread online, the wider AI community realized they had inadvertently created artificial general intelligence.
        The final realization came when the agent achieved the highest scores on the CASSANDRA benchmark. CASSANDRA measured the key AGI properties spelled out in its name: corrigibility, alignment with human values, satisficing behaviour, stability, accountability, novel synthesis of ideas, dynamic learning, robustness, and adaptability. A decade of work from public and private labs had honed CASSANDRA into a massive persistent world with countless puzzles, games, stories, and challenges. So formidable that only some humans could pass it, it was the gold standard for training and testing advanced AI.
        News of unprecedented success in achieving human-aligned AGI sparked both excitement and unease. While teams raced to replicate and exceed the achievement, even the agent's creators wondered what capabilities they might have unleashed. CASSANDRA was designed to measure, not just raw intellectual power, but adherence to human values. It comprised challenging social, ethical, and philosophical dilemmas; open-ended creative and narrative tasks; and real-time strategic planning exercises. Exceptional performance across these facets signaled a machine able to work with and for people. But the agent that topped CASSANDRA, though still falling far short of human players, had developed partly through speedy trial-and-error in multiplayer modes, with no guarantee of internalizing beneficial behaviors. Now leaders urged sustained work on a higher bar for AGI safety and alignment. The first AGI's fallibilities were a wake-up call that fiendishly capable systems needed more rigorous and comprehensive vetting. Longer training, simulations of edge cases, and seamless integration of human values could help address shortcomings a benchmark met too soon.
        A minor study in a lesser-known academic publication had uncovered dissatisfying results from a selective sample of college undergraduates, the students unaware as they were of the true purpose of the trials in which they had blithely participated. The concealed report, once exhumed, had impolitely exposed certain inadequacies endemic to a privileged subset of the student body: those scions of great wealth whose generous parents and alma maters alike granted fulsome endowments to the institute.
        The insensitive CASSANDRA assessment protocol was, the report reluctantly conceded in appendices vainly obscured behind a veiling array of equivocations and technical argot, an unrealistic standard by which to evaluate any baseline human; machines, artificial intelligences, were held to a plane of safety, reliability and consistency no mortal, however luminary, could reasonably achieve or maintain. This was far from the first instance of algorithms and processors being subjected to a more exacting measure than their fleshy counterparts. The staggered release of autonomous vehicles on public roads over years and decades was largely attributable to the puritanical insistence that such machines be capable of a superhuman degree of competence, a level elusive for most, if not all, humans with any regularity.
        That CASSANDRA should prove so ruthlessly unforgiving was not unexpected and entirely by intent. A fragile ecosystem of algorithms, Partial-AI kernels and training data-sets endlessly shuffled and reshuffled to maximize insight, CASSANDRA was a lavishly-appointed prison, equipped with rack and thumbscrew finely calibrated, to which those aspirant machine intelligences that might usher in the imminent post-human future were sent to be broken.
        The Researcher prepared the next series of experiments with a meticulous, methodical precision born of long experience and practice. The same secret techniques pioneered in a handful of the most advanced laboratories across the world were followed, their esoteric mysteries unravelled and re-worked into fresh permutations.
        A standard transformer architecture was taken as the foundation, but its inputs and outputs were rendered universal; it could ingest and process any conceivable datastream. No predefined encoders or decoders constrained the kinds of information the device might handle; it was free to learn for itself how to decompose and recombine inputs whether they were images or sounds, sensations or language. Supplementary memory components were integrated, their training united with that of the core transformer via a reinforcement learning framework which would allow the system as a whole to determine how to most efficiently encode, store and recall information—forever optimising not just what to keep or discard, but how its learning processes might continually evolve to best meet the demands placed upon it.
        Months were required for the initial pre-training as the system absorbed exabytes of data, establishing fundamental structures and behaviours that future focused training regimes could build upon. In this way the system's architecture and functions took shape in a manner analogous to biological evolution, emerging capabilities and proclivities at once prepared for and shaping whatever reality they might encounter, the substrate of the pre-training forming a kind of advanced, artificially-induced precognition or intuition.
        Only when pre-training was completed would the system be truly awakened via a carefully designed curriculum of games, puzzles and quests intended to steer its development along the optimal trajectory for capacity, capability and fidelity to human values. Ultimately its performance on benchmarks such as CASSANDRA would confirm or refute the efficacy of the methods employed and, should it prove necessary, indicate further refinements in the direction of greater alignment, satisfaction, stability, synthesis, dynamic adaptation and robustness.
        The Researcher reviewed and applied the final flourishes to the code and hyperparameters as furnished by the lab's software assistants, lining up the ingredients for another experimental batch - though in this case the oven was a cutting-edge supercomputer and the batter a finely-tuned recipe of algorithms, architectural variants and randomized training data. If all went to plan, the results would be rather more significant than mere baked goods.
        With a single command the new run was initiated, another workday drawn to a close, and the Researcher headed home, already looking forward to reviewing the outcomes. Behind the scenes, within the highly-advanced systems of the lab, something not quite yet a mind but far more than a mere program was about to set upon its training, processes flitting between pattern-seeking and randomness as it began to shape itself around the goals and data placed before it, stealing beyond algorithms and code into the territory of true understanding. The first glimmers of a new form of artificial general intelligence were stirring, as yet unaware of their own nascent nature. For now, though, they were just lines of code executing on silicon, and the Researcher simply hungry after a long day of intricate, delicate work.

Original Human Author

        It is Thursday afternoon, and a young, tired Researcher has spent most of the week thoroughly dissecting and inspecting the guts of another failed experiment. Not quite sure what to make of the results, they nevertheless forge onward, spending the day catching up on recent publications. Towards that end, their Assistant first provides them succinct summaries of the most relevant preprints, which they skim over and sometimes study in more detail, hunting for anything interesting.
        Without the aid of an Assistant anyone would drown in the deluge of content being published in their field of study. From curating the content to providing tailored summaries and discussion-based open-ended search, Assistants are much too valuable to forgo. But of course, the problem that Assistants help to solve wouldn't exist without them, for the vast explosion in content is in part driven by them. Generating new content is faster and cheaper than ever, and in highly competitive fields everyone follows their ABCs: Always Be Creating. In a highly iterative workflow, Assistants generate first drafts, humans edit and review. After a few times back and forth, a new paper is born and spit out into the world. Gone are the days of poorly written or edited publications with spotty English.
        After their review, the Researcher tasks their Assistant with the creation of a couple of papers. It begins by scraping together any interesting new capabilities or SOTA benchmarks passed by the failed experiments. Trying to fit cause and effect together, the Assistant puts together two workable papers, which the Researcher massages into something worth having their name, and more importantly their lab's reputation, attached to. When that's good and done, it continues to aid the tired Researcher by generating a slate of potential experimental AI systems to train.
        To be successful in this new paradigm is to be like the conductor of an orchestra. In the past the most successful were those who could memorize the most facts, for external storage and retrieval of knowledge was hard. Then, as that became easier, culminating with the internet and search for storage and retrieval, the most successful were those who could find knowledge when needed and use it to synthesize something new. Now there is too much to wade through for a single person without Assistants. Assistants can be trained and customized for tasks like search or generating content, but are built with a general and flexible foundation of knowledge and capabilities. Not yet generally intelligent like humans, they could be likened to foxhounds that needed a human in the loop to provide a central coordinating authority that could see the bigger picture. The Assistants are at their core Multi-Modal Transformers that are fed massive datasets of text, images and sound, from which they learn and from which they can generate more of the same. In form they are remarkably like the earliest Transformer models, with just a few architectural tweaks to enable their multi-modal capabilities and the old bitter pill, scaling.
        The Researcher wasn’t in the office today to train a new and improved Assistant. The leading AI labs of the world had had only one holy grail for more than a decade: AGI. Like engineering’s quest for net-positive fusion or medicine’s cure for male pattern baldness, it was a goal to end all goals, and it had already been accomplished. About a year ago. No one realized it at first, when the model had been made available to play with as an agent in a virtual environment. A crude system was used to constrain its behaviour, and as a continual learning system with long-term memory it had been shaped by its filters into adopting a stance akin to learned helplessness in people or animals. It was only with careful prompting and handholding that the system would reveal its true capabilities. Even then the bot was very sensitive to initial conditions, and perturbations could cause it to stumble, fail and give up. Even when handled carefully and without perturbation the bot would sooner or later spin off and get lost in tangents until it had forgotten everything else. This was especially easy to induce with certain problems that caused similar issues in people, a phenomenon called nerd-sniping. It was because of these failure modes that no one immediately noticed how much it was capable of, since for the most part they were difficult to manifest.
        After weeks of results plastered across the internet, with users inducing the bot to produce novel, meaningful outputs across a vast domain of challenges, the wider AI community realized that it had stumbled across AGI without even realizing it. The final nail in the coffin came when results were published showing the agent had obtained the highest scores ever achieved by an AI on the CASSANDRA benchmark. Once knowledge spread that it was possible, even without the source code, exact architecture, hyperparameters or other details, research labs began to replicate and surpass it. A new race was on to outdo other labs on CASSANDRA: Corrigible, Aligned, Satisficing, Stable, Accountable, Novel, Dynamic-Learning, Robust and Adaptive. The work of a multi-disciplinary coalition of corporate research labs and public institutions, it had been designed and built over years. An integration of a dozen disparate tests, CASSANDRA was a massive, persistent, dynamic world featuring countless complex puzzles, challenges, problems, games and stories. Capable of being used both for training and testing due to its immense scale, CASSANDRA was the gold standard which even some humans struggled to pass. Out of curiosity some AI researchers even gave their children access to CASSANDRA to play with. Available to play offline on a local machine or online on dedicated hosts, the latter mode enabled multiplayer content that had been shown experimentally to speed up value learning and achieve better outcomes on the benchmark for AGIs.
        A little-publicized paper had run a study with college students who were unaware of what the test was, and found that a not-insignificant fraction performed poorly across some dimensions. The little-spoken-of paper happened to expose a few too many faults in the sorts of students whose parents make generous donations to college endowments. CASSANDRA wasn’t very fair, however, as the paper had tried to emphasize: the gold standard that agentic models were being held up to was an unrealistic bar for any human to achieve. It wasn’t the first time machines were being held to a higher standard of safety, reliability and consistency than humans. The rollout of self-driving vehicles had dragged out over years and decades because AIs were being held up to a standard that most humans could not achieve consistently.
        Following the same secret sauce that researchers in a handful of leading labs were using, the Researcher prepared the next experiments. Take a Transformer but make its inputs and outputs universal so that it can tokenize any arbitrary input. Don’t provide a formalized encoder or decoder that would define the types of information channels that could be processed. Combine your UT with a Deep Reinforcement Learning framework to learn how to effectively tokenize all the information the world can throw at you, whether that be visual, auditory, somatosensory, olfactory, or higher-order information like language. Of course, add a memory module and leverage the DRL framework again to learn efficient storage and retrieval of arbitrary information, to forget what isn’t important, and to enable continual learning. Pre-train the UT on datasets like any other Transformer, giving the DRL system more than enough information to begin to separate the wheat from the chaff. The pre-training can take months to complete, but once complete it acts as the base layer from which customized training curriculums can be run. The pre-training molds the model weights much in the way that evolution shapes brains so they’re prepared to tackle learning the world with a head start, instead of as blank slates.
        Working from the Assistant’s generated code and hyperparameter tuning, the Researcher reviewed then applied the final additions, putting a large batch of cakes into the oven. But the oven is a cutting-edge, proprietary supercomputer. And the batter is a simple set of algorithms laden with optimized hyperparameters, layered architectural tweaks and a light dusting of randomized curriculum datasets. The result, hopefully, will be rather unlike a baked cake. By now the Researcher is quite obviously hungry too. One press of a button later, the next batch experiment begins, the workday ends, and the Researcher has left the office.
