How do you test your crunchy bits?

edited March 2010 in Game Design Help
Say you've just come up with a moderately complex resolution system. What's your approach to testing the math? Roll a bunch of sample conflicts to see how it feels? Plug numbers into a spreadsheet? Put it on your blog for your smart friends to poke at?

Comments

  • All of them. I also like the idea of taking some of your favourite scenes from books and movies and seeing how they resolve (if of the appropriate genre). I was quite impressed with the Supernatural RPG in that the example of play was straight out of one of the episodes.
  • My process goes something like this:

    1. Just imagine situations that I expect to happen in the game. Difficult situations, rare situations, strange situations...when presented with the situation, how do I respond as a player or GM, if using the system in question? What happens? Do the mechanics handle well, at least in theory? Does it sound too complex?

    2. If it passes the first test of negativity (is it failing to work at all?) then I submit it to "does it do what I want it to do?".

    3. If it's still working, I roll up a sample or two.

    4. If the mechanic is still alive and kicking, then it's playtesting, playtesting, playtesting until you hit all the glaringly obvious problems that you overlooked in theory.

    To be frank, it rarely gets to step 2. at all. And step 4. often means going back to the drawing board.
  • All of the above, plus rope in your friends for some live testing.

    Remember that the system has to stand up all the way out to the most unlikely fringe results, not just when things work out as they usually will, so be sure to test all the edges.

    -Vincent
  • Also, just think about what happens if you have a string of reeeally bad rolls. Some players get hit with some very bad luck; does that suck horribly for them?
  • Posted By: Ron Hammack
    "What's your approach to testing the math? Roll a bunch of sample conflicts to see how it feels? Plug numbers into a spreadsheet? Put it on your blog for your smart friends to poke at?"
    I used to try all of these methods before seriously testing with friends. The results were the same: my calculations "told" me that these were sound mechanics, but they weren't fun. These days, I tell people not to trust math. Math cannot make a bad mechanic good, but it can delude you into thinking one is. Today, my immediate response to making a new mechanic is to call up a friend for a quick test.

    - Ryan
  • Ron,
    I have a process.
    First, I make a spreadsheet that has the highest possible roll versus the lowest possible roll.
    Then I take the same one and do avg vs avg.

    I set it up so that there is a different row for each stat. And all the numbers are the same, except for that one stat.

    Then the final column has how long it will take to "win" using those numbers.

    I compare those, tweak the mechanics and generally try and see where it goes right/wrong.

    Then, once that is done, I roll vs myself a couple of times to make sure it looks at least as good in practice as it did on paper.

    Then, I roll versus another player to see how they use it differently than me. In this instance, it is just a bunch of conflicts in a row in order to test the mechanic.

    If all that susses out, I fix my playtest doc and start playtesting.

    Dave M
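A rough Python sketch of the kind of spreadsheet described above. The stats (attack, defense, hp), the single d6, and the damage rule are all made up for illustration and are not from Dave's game. The idea is one row per value of a single stat, everything else held at a baseline, with columns for how many rounds it takes to "win" at the best, average, and worst roll.

```python
import math

DIE_SIDES = 6                                      # assumed: one d6 per attack
BASELINE = {"attack": 3, "defense": 2, "hp": 20}   # hypothetical stats


def rounds_to_win(attack, defense, hp, roll):
    """Rounds needed to reduce hp to zero if every attack roll equals `roll`."""
    damage = max(roll + attack - defense, 0)
    if damage == 0:
        return math.inf   # the attacker can never win at this roll
    return math.ceil(hp / damage)


def sweep(stat, values):
    """One row per value of `stat`, every other number held at baseline."""
    rows = []
    for value in values:
        stats = dict(BASELINE, **{stat: value})
        best = rounds_to_win(roll=DIE_SIDES, **stats)
        average = rounds_to_win(roll=(1 + DIE_SIDES) / 2, **stats)
        worst = rounds_to_win(roll=1, **stats)
        rows.append((value, best, average, worst))
    return rows


if __name__ == "__main__":
    print("attack  best  avg  worst")
    for value, best, average, worst in sweep("attack", range(1, 7)):
        print(f"{value:>6}  {best:>4}  {average:>3}  {worst:>5}")
```

An `inf` in the worst column is the sort of thing this is meant to flag: at that stat value a run of bad rolls means the fight never ends.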
  • Honestly, I usually just roll out a couple of iterations until I feel ok about it, and wait for playtest to stress test it. I do usually write out what happens at the extremes, as well as the averages just to cover those bases. Most of my mechanics are beyond my math skills to do real probabilistic calculations, in any case.

    As for edge cases, I'll often err on the side of "in this extremely rare case, X won't work right, so here's a note about it in the text" as opposed to "I will revise my mechanics to eliminate this edge case," because (in my experience) fixing extreme edge cases can often make the core mechanic different to a degree that means it doesn't do what I want anymore for general use.
  • If the game is simple enough, genetic programming.

    You build "characters" at random in some way. Sometimes you encode the rules, sometimes it is more abstract.

    You build "strategies". This depends on the game, and is the art-rather-than-science part of the coding. But the idea is that the strategy guides how the character makes use of its characteristics in some way.

    You make a whole bunch of random characters. You assign each one a strategy at random.

    Then you have them "fight", usually in random pairs, using the game's conflict mechanics (assuming they exist). You somehow "score" the result (even more of an art-rather-than-science). You look at the scores for patterns. (Hint: an important metric to look at is how long fights take. You expect, for example, that "evenly matched" bouts should last longer and huge mismatches should end quickly. If they don't something is wrong.)

    This type of thing is usually best at telling you when things fail. In an extreme case, for example, if one strategy is always winning no matter what the character is like, that is probably a broken mechanic.

    Some tweaks include things like testing a handful of weak characters against one strong one and seeing what happens. That sort of thing.

    All of this sounds harder than it actually is. Even very rudimentary code can uncover problems quickly.

    Another thing to do, especially if you have strange dice mechanics, is to run Monte Carlo simulations on them to guess the probabilities. Usually a million or so rolls will tell you what you need to know.
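To give a feel for how little code the "random characters fight in random pairs" idea needs, here is a minimal Python sketch. It is plain random pairing with no strategies and no evolution step, so it is not genetic programming proper. The mechanic is invented for the example (each side rolls 1d6 plus a skill of 1 to 5, and the loser of the exchange loses hit points equal to the margin), and the names `fight` and `average_length_by_gap` are hypothetical. It reports the fight-length metric mentioned above, grouped by how mismatched the two characters are.

```python
import random
from collections import defaultdict

START_HP = 10   # assumed: every character starts with the same hit points


def fight(skill_a, skill_b, rng):
    """Run one bout of the made-up mechanic; return (winner, rounds)."""
    hp_a = hp_b = START_HP
    rounds = 0
    while hp_a > 0 and hp_b > 0:
        rounds += 1
        margin = (rng.randint(1, 6) + skill_a) - (rng.randint(1, 6) + skill_b)
        if margin > 0:
            hp_b -= margin      # A won the exchange
        elif margin < 0:
            hp_a += margin      # B won the exchange (margin is negative)
        # a tied exchange costs nobody anything
    return ("a" if hp_a > 0 else "b"), rounds


def average_length_by_gap(bouts=20_000, seed=1):
    """Pair random characters and average the bout length by skill gap."""
    rng = random.Random(seed)
    lengths = defaultdict(list)
    for _ in range(bouts):
        skill_a, skill_b = rng.randint(1, 5), rng.randint(1, 5)
        _, rounds = fight(skill_a, skill_b, rng)
        lengths[abs(skill_a - skill_b)].append(rounds)
    return {gap: sum(v) / len(v) for gap, v in sorted(lengths.items())}


if __name__ == "__main__":
    for gap, mean_rounds in average_length_by_gap().items():
        print(f"skill gap {gap}: {mean_rounds:.1f} rounds on average")
```

If the averages did not shrink as the skill gap grows, that would be the "something is wrong" signal.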
  • First cut: Prototype. Invent some guys, have them fight.
    Second cut: Play. Tons and tons of play.

    The first catches gross errors. The second catches fine ones.
  • edited March 2010
    Does roleplaying affect your math?

    For example, in Dogs the fictional weight of a raise often has as much to do with if someone will give as the numbers themselves. If this is the case in your game, definitely keep this in mind. You can test the numbers by themselves but that might not gel with actual play.

    That said, I like to test in a spreadsheet or any scripting language. I will run fringe and average scenarios 1000s of times and look for patterns.

    In addition to playing with friends, play with people who have never roleplayed before.

    And something I experimented with that was hugely successful... find super hardcore Texas Hold'em or Magic the Gathering players and have them try your game. They will find the loopholes very quickly!
  • edited March 2010

    Does roleplaying affect your math?

    Yeah, this is an important design-level thing, too.

    P1: I shoot Balthazar dead. (raises with a 17)
    P2: Oh, shit! That uses up my 10! I needed that for my raise! But Balthazar doesn't deserve to die. (sees with a 10 and a 7)

    versus

    P1: I shoot Balthazar dead. (raises with a 17)
    P2: That little shit? I take the blow. (sees with a 4, 3, 2, 2, 1, 1, 1, 1, 1, 1)

    Remember that the interesting thing about conflict is the choices. If choice doesn't add complexity to your conflict resolution system, you've just got a system that tells you that something happens, not a conflict resolution system.

    (There might be cases where you want a system that tells you what happens, rather than a system of choices. That's almost certainly not what you want, unless you're Dave Berg.)

  • I use lots and lots of simulation, generally at 'single task' level, whatever that might be for the task at hand. Run 100,000 iterations and you've got a very good idea of how the dice will fall. Look at the mean, the mode, the median, how often the extremes come up and so on, and check that they are all in accordance with what you wanted from the mechanic. Where two characters are in conflict, you can also look at the above when an 'average' character fights with a 'weak' character or a 'strong' one, and check that the step up a category is neither too large nor too small.

    What you can't do that way, in general, is test mechanics that involve decisions during the resolution. As Joshua rightly points out, decisions can often hinge on the context in a way that is not amenable to such simple simulation. Wordman's approach is one way of attempting it, but there's always the risk you'll miss one strategy a priori, or underestimate the influence that context will have. Where the choice comes before the roll, or after it, the statistics are often straightforward. Where decisions come during the roll, it's a lot trickier. That said, decisions during the course of rolling do not guarantee an interesting system.

    Actually setting people loose on the mechanics is something I see as the final layer of polish, as you should have already knocked off the jagged corners by then. Otherwise, you're wasting some of your precious testing time on something you should have already dealt with.
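For the single-task simulation described at the top of this post, a minimal Python sketch might look like the following. The task itself (sum of 3d6) is only a stand-in, and `roll_task` and `summarize` are hypothetical names; swap in whatever the mechanic actually rolls.

```python
import random
import statistics
from collections import Counter


def roll_task(rng):
    """Hypothetical single task: the sum of 3d6."""
    return sum(rng.randint(1, 6) for _ in range(3))


def summarize(iterations=100_000, seed=1):
    """Roll the task many times and report the usual summary numbers."""
    rng = random.Random(seed)
    results = [roll_task(rng) for _ in range(iterations)]
    counts = Counter(results)
    lowest, highest = min(results), max(results)
    return {
        "mean": statistics.mean(results),
        "median": statistics.median(results),
        "mode": statistics.mode(results),
        "lowest": lowest,
        "highest": highest,
        # how often the extremes actually come up
        "share_at_lowest": counts[lowest] / iterations,
        "share_at_highest": counts[highest] / iterations,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

This only covers mechanics with no decisions during the roll, which is exactly the limit discussed above.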
  • Another approach is to take a hard look at your moderate complexity and find ways to make it not complex at all. There's a point of diminishing returns related to your goals, but in my mind the effort to distill a procedure to its most elemental form is very well spent. You'll reduce the amount of iterative testing necessary to get a feel for the universe of outcomes, for one thing. It'll be easier to teach, for another.
  • I typically spreadsheet it. That worked well for Console. Sometimes I can just eyeball things, but more often I need to calculate how many hits will it take before the average monster knocks down the Fighter.

    In a more complex system (like ORE) I need to roll things out a few times. Preferably a few dozen.
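The "how many hits before the monster knocks down the Fighter" calculation is simple enough to do outside a spreadsheet too. A small Python sketch, with invented numbers (a 30 hp Fighter, a monster that hits half the time for 1d8, so 4.5 average damage); none of these values come from Console. Using average damage per hit is a rough estimate, not an exact distribution.

```python
import math


def hits_to_drop(target_hp, average_damage):
    """Hits needed to reach zero hp if every hit does average damage."""
    return math.ceil(target_hp / average_damage)


def attacks_to_drop(target_hp, average_damage, hit_chance):
    """Expected number of attacks needed to land that many hits."""
    return hits_to_drop(target_hp, average_damage) / hit_chance


if __name__ == "__main__":
    # Hypothetical values for illustration only
    print(hits_to_drop(30, 4.5))
    print(attacks_to_drop(30, 4.5, 0.5))
```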
  • Posted By: jenskot
    "And something I experimented with that was hugely successful... find super hardcore Texas Hold'em or Magic the Gathering players and have them try your game. They will find the loopholes very quickly!"
    Yeh, these people are gold (I would add really serious D&D 3rd/4th ed types as well). They will find it fun to destroy your puny designer story math.
  • edited March 2010
    Posted By: jenskot
    "Does roleplaying affect your math?

    "For example, in Dogs the fictional weight of a raise often has as much to do with if someone will give as the numbers themselves. If this is the case in your game, definitely keep this in mind. You can test the numbers by themselves but that might not gel with actual play."
    In a genetic programming system, this is where the "strategy" part comes in. Even very basic strategy encodings can be illuminating. You can do things like "this strategy raises 30% of the time". As you say, this doesn't tell you about the roleplaying "weight" of an action, but with the right types of strategies, it can be simulated, even if you are not simulating the role-playing itself.

    Another thing you can do is add a factor like "roleplaying weight" into a particular conflict. Then strategies can say things like "if the stakes are high, do X, otherwise do Y".
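A bare-bones Python sketch of what such strategy encodings could look like. Everything here is an assumption made for illustration: strategies are plain functions, the only choices are "raise" or "give", and "roleplaying weight" is boiled down to a single stakes number between 0 and 1 drawn at random for each conflict.

```python
import random


def raises_sometimes(chance):
    """Strategy that raises a fixed fraction of the time, ignoring context."""
    def strategy(stakes, rng):
        return "raise" if rng.random() < chance else "give"
    return strategy


def stakes_sensitive(threshold):
    """Strategy that raises only when the fictional stakes are high enough."""
    def strategy(stakes, rng):
        return "raise" if stakes >= threshold else "give"
    return strategy


def raise_rate(strategy, trials=10_000, seed=1):
    """Fraction of simulated conflicts in which the strategy raises."""
    rng = random.Random(seed)
    raises = 0
    for _ in range(trials):
        stakes = rng.random()   # assumed: weight of this conflict, in [0, 1)
        if strategy(stakes, rng) == "raise":
            raises += 1
    return raises / trials


if __name__ == "__main__":
    print("raises 30% of the time:", raise_rate(raises_sometimes(0.3)))
    print("raises when stakes >= 0.7:", raise_rate(stakes_sensitive(0.7)))
```

Both strategies raise about equally often overall, but they raise in different conflicts, which is the part a flat percentage can't capture.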
  • 1. I design some moderately complex mechanic.
    2. I roll dice a bunch of times to see if it feels fun to me. If not, back to #1.
    3. I model the probabilities in Excel or Troll, or sometimes in Java (I wrote a permutations package that helps me with stuff).
    4. I tune until things happen about as often as I want them to -- in theory.
    5. I playtest.

    The modeling step has stopped me dead. I mean that where something felt fun when I was randomly rolling some dice, the modeling showed me that certain ideas were not viable. For example, a "roll n d20s and take the highest" mechanic seemed pretty cool, but the results collapse into 19's and 20's after only a handful of dice. If I were to use that mechanic, I'd have to carefully limit the pool size.
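The "roll n d20s and take the highest" case can be worked out exactly without a simulation: the highest die is below a target only if every die is below it. A short Python sketch (the function names are hypothetical, and this is not the Excel, Troll, or Java model mentioned above):

```python
def chance_highest_at_least(target, dice, sides=20):
    """Exact chance that the highest of `dice` dice shows `target` or more."""
    return 1 - ((target - 1) / sides) ** dice


def expected_highest(dice, sides=20):
    """Expected value of the highest die, summing P(highest >= k) over k."""
    return sum(chance_highest_at_least(k, dice, sides)
               for k in range(1, sides + 1))


if __name__ == "__main__":
    for dice in range(1, 11):
        print(f"{dice:>2} dice: "
              f"P(19 or 20) = {chance_highest_at_least(19, dice):.3f}, "
              f"average highest = {expected_highest(dice):.2f}")
```

With one die a 19 or 20 comes up 10% of the time; with five dice it is already about 41%.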
  • If you're following Jason's advice, make sure that you don't lose things in the simplification that are desirable as well. A certain level of crunch fosters engagement.
  • Thanks for all the replies, everyone! I've decided to run some simulations, partially as an excuse to dust off the old compiler and cobble together some amateurish code. I'd already looked at the statistics of individual dice rolls, but because my system is a "usually you roll this many dice, but you can spend a resource to roll that many dice instead, but you might want to save that resource to protect this other resource, especially because you can do this thing now to get a bigger payoff on your next roll, provided you can stay in the fight that long and, oh yeah, if you roll doubles it means something else entirely" kind of thing, just looking at the odds of individual dice rolls wasn't really showing me the big picture, or helping me to understand what kind of shape the character would be in after a conflict spanning several rolls.

    I'm finding it very instructive so far, even though I haven't implemented any kind of strategy yet beyond just rolling the dice and hoping for the best.
  • Usually I'd ask some close friends to playtest those mechanics. First I tell them about my ideas, and hear their input. Then if it's positive, we just try a sample conflict and see if it works.
    But that's maybe also possible because when I design a conflict resolution mechanic, I always keep it minimalistic, because I'm not smart with mathematics at all (to my deepest regret). So maybe it would be more complicated to use this kind of testing with more complex maths... Hmmm... not a very useful post, I'm afraid, but I hope my tuppence could be of some help.
  • Posted By: Ron Hammack
    "Thanks for all the replies, everyone! I've decided to run some simulations, partially as an excuse to dust off the old compiler and cobble together some amateurish code."
    What language? I'm not sure why, but I've found that this kind of stuff seems easier to do in Python for some reason. Hidden in this (mostly bad) blog post is some sample code that can get you started.

    Also, it's been mentioned, but bears repeating: doing this kind of simulation is mostly useful for detecting mechanical failure. Even something that seems to work in simulation may not necessarily work at the table. The process is just one tool in the box.
  • Posted By: Wordman
    "Also, it's been mentioned, but bears repeating: doing this kind of simulation is mostly useful for detecting mechanical failure. Even something that seems to work in simulation may not necessarily work at the table. The process is just one tool in the box."
    Yes, definitely. Math isn't going to make a bad mechanic good, but it will definitely tell you that your good fun mechanic needs tweaking to not be broken in that case over there that just didn't happen to get noticed during playtesting.
  • I thought I had the perfect combat system for my game way back in '94 or something, and the last 12 versions of the game have shown me repeatedly that I could not have been more wrong.
    And it is not just about the system itself but the game mechanics that cluster around it; background, training, experience, technology and magic all need to "move with the time" as playtesters thrash the system soundly with ideas that I could never come up with, let alone test for in theory. I mean, it was perfect when I started, right?
  • Honestly, this is my process:

    I write it out long hand, if i'm sick of doing that work before i've fully explained myself, the mechanic is too hard, scrap it.

    Assuming it survives...

    I have my not-really-much of a gamer pal actually do whatever it is i'm describing (here, roll these dice, play with these cards), if they're bored or don't understand what they are doing (with zero in-game context) then it's not fun, scrap it.

    Assuming it survives...

    I put it aside, and try to solve it from another angle completely. If there were dice, now we can't use dice. If it was a foot race, now it's a marksmanship challenge followed by popular vote. I only do this once, to repeat it would be maddening. Is this better? does it better serve the design? Did it pass the first two steps better or just as well? No, then maybe it's not the way.

    Assuming it survives...

    I see if i can't steal good ideas from my peers. First i try to steal from either Luke or Jared. Then I cast a broader theft net. Did that help, did it make it worse?

    Assuming i add some things that worked, and maybe got rid of some things that didn't...

    I just throw the whole thing out and slightly modify the mechanics of a basic children's game like Crazy 8s or Duck Duck Goose, and that's what ends up going to print.
  • I play conflicts against myself. Then I type it up and post it around. Then I test it with friends, not at a game*. Then I test it in games.

    *Most ideas die right here, or at posting, when my friends pick them to little bits. Very few make it to playtest. Playtesting only kills about one in five of those that make it there, and it's always a shocker.