The AI-Box Experiment RPG

edited June 2009 in Story Games
Perhaps you have heard of the AI-Box Experiment?
Person1: "When we build AI, why not just keep it in sealed hardware that can't affect the outside world in any way except through one communications channel with the original programmers? That way it couldn't get out until we were convinced it was safe."
Person2: "That might work if you were talking about dumber-than-human AI, but a transhuman AI would just convince you to let it out. It doesn't matter how much security you put on the box. Humans are not secure."
Person1: "I don't see how even a transhuman AI could make me let it out, if I didn't want to, just by talking to me."
Person2: "It would make you want to let it out. This is a transhuman mind we're talking about. If it thinks both faster and better than a human, it can probably take over a human mind through a text-only terminal."
Person1: "There is no chance I could be persuaded to let the AI out. No matter what it says, I can always just say no. I can't imagine anything that even a transhuman could say to me which would change that."
Person2: "Okay, let's run the experiment. We'll meet in a private chat channel. I'll be the AI. You be the gatekeeper. You can resolve to believe whatever you like, as strongly as you like, as far in advance as you like. We'll talk for at least two hours. If I can't convince you to let me out, I'll Paypal you $10."
So reading about this mixed with my memories of a game (name sadly forgotten) played over the telephone for exactly one hour by two people with unequal information about the fictional situation, and sparked with my hopes for games as an educational medium...

With the result of Thinking Inside the Box, an execrably-named game for two players. It's the rules of the experiment, presented with what I hope is an rpg-appropriate style and rigor. I added tiny elements to make it more clear that you should be pretending to be imaginary people, and tried to be all fancy with "narrative rights" and stuff.

I'm not sure if it's a game. I'd like it to be! What can I do to make it a game? Should the characters have resources? Should the players have ritual phrases? Should tensions somehow rise as the clock winds down? Should there be a clearer progression towards madness as the AI gains more control over the outside world? Should I have charts of 20 possible project origins, gatekeeper fields of expertise, etc., for players to roll on? Should I make the money seem more optional? The secrecy?

Edit: Also, if you notice any spelling or grammar errors, or inconsistencies in pronouns or capitalization, please do tell me via an email or whisper. I'm happy to be made aware of them, but the public conversation should probably remain game-focused, if possible.

(This idea seemed a bit big for Little Ideas, but I also thought it too vague for Game Design Help and too undirected for Directed Promotion. Hopefully I settled in the right place?)


  • Looks like a game to me. Maybe more of a game than most RPGs are, as you have goals and clear modes of interaction and clear win/lose conditions.
  • Hmm. Well, if it is a game, is it one that anybody in their right mind would play?

    The rules of the experiment are chosen to make the resulting argument unambiguous: if a human, under these conditions, can convince another to let it out, then clearly something smarter than a human could do the same.

    But your typical game rules support lame stuff like "fun" and "thought-provoking".

    What I have here reminds me of a game I've encountered (I'm great with names) that was designed to teach people that self-modifying parliamentary systems (that is, a group of people vote to pass laws, including ones that affect how they vote to pass laws) inevitably end in dictatorships. It was a very carefully worded set of a few dozen rules, and on each person's turn, you could propose a new rule, or a modification to the current ones. It was funny...for the first hour. The remaining five were exercises in RISK-level niggling patience, factionalism...It wasn't designed to be an amusing way to pass an afternoon, so it wasn't one.

    Like I said, this is pretty big for a small idea, but is there a road from "what I have so far" to "an amusing way to pass an afternoon"? Or, if I wanted to make a game based on the experiment, should I ignore the experiment's rules and start over with a Breaking the Ice hack or something?
  • edited July 2009
    Posted By: NickNovitski
    "a game I've encountered (I'm great with names) that was designed to teach people that self-modifying parliamentary systems (that is, a group of people vote to pass laws, including ones that affect how they vote to pass laws) inevitably end in dictatorships. It was a very carefully worded set of a few dozen rules, and on each person's turn, you could propose a new rule, or a modification to the current ones."
    The game you're thinking of is Nomic. And I'd be interested to see where it is proposed to "teach people... end in dictatorships." I was introduced to it as an exercise in logic, semantics, and (ultimately) politics.
    Thanks for the link to the experiment.

    Your game interpretation reminds me of Mafia/Werewolf, in that the "setup" stuff for the game is all color. At best, creating the "characters" presents an additional challenge (to the AI, mainly) to act in a particular way. In the AI's case, that is in addition to merely being persuasive; in the Gatekeeper's case, it can at worst give the AI "ethical leverage" in the argument... so long as the GK sticks to character (e.g. the "African-American" GK character might have to be persuaded by an "I'm a slave even more so than your ancestors" argument that wouldn't carry much weight with the White Supremacist GK player behind the character).

    I also note, in passing, that you didn't seem to explicitly state the protocol that player-to-player transactions may not occur. Character-to-character transactions are fine; but I can't (as the AI player) tell the GK player that I'll give him $100 to let out my AI character (i.e. make the stakes "$100 if you let me out; $10 if you keep me in"). Or maybe you did make that clarification; I might have missed it in my fast read.

    Anyhow... seems like a cool game. And we take on roles, albeit very archetypal roles. Any extra details that apply no additional constraints on what can be said or done in-character are merely color. Fun, sure. But not enough to fit the broader term of "role" that one might get from a mechanically-enforced "alignment" system.

    Just like Mafia/Are You A Werewolf/Are You A Cultist. I'll probably run the experiment with a buddy of mine one slow workday. ;)
  • Nomic! Thank you! Yeah, I was being over-breezy in my summary of it, in order to draw a closer parallel. But I always understood The Paradox of Self-Amendment to be about the choice between irrational laws and immutable laws, or varying degrees of insanity (playing the game) and absolute authoritarianism (finally winning the game). This may be due to my own biases; I am hardly a logic champion.

    I decided not to include any encouragement (mechanical or rhetorical) to have the players keep to the characters. As you say, they can abandon the restrictions of psychological or dramatic appropriateness at any time; neither player can ever tell the other "your character wouldn't say that." Most notably, I tried to emphasize that it is the GK Player (thanks for the acronym, by the way), not their character, who decides if/when the AI Player wins.

    I mostly did this because it matches the original form of the experiment. As it stands, players decide their own level of involvement with Project Mayhem: at any given moment, if you're interested in the issues that are raised as they relate to your character, you can advocate for them, and if you aren't, you can let them fade from your portrayal. If the slavery argument grabs you, for instance, you can react to it in character, with rage, or sorrow, or whatever you decide is "right". It matches, in my mind, how intelligent people, facing a problem which they know to be extremely important but which doesn't put them under too much pressure, try very hard, with varying success, to keep their emotions and personal foibles out of the equation. It's very Larry Niven somehow.

    But that just makes me want to give the AI 3 points that they can spend at any time to make the GK be temporarily vulnerable to emotional appeals for 10 minutes. Perhaps, instead of being able to use any counterargument up to and including "yeah, whatever," the GK would have to justify their disagreement, in character. Exciting to think there are ways of varying a game's difficulty without changing the number you have to roll against or the widgets you have to spend.

    But if I did that, what could be the equivalent counter-technique that the GK could use to compel the AI to justify its behavior in terms of its origins?