Escaping Wargaming: How the Purpose of Rules Has Changed in the Tabletop World

This entry was posted in history, RPG, tabletop.

I want to start by saying that I have a strong fondness for wargames. I cut my teeth on board games and tactical wargames long before I got excited about tabletop RPGs. Wargames have rules, generally well-defined rules, and that’s a lot easier for a kid who’s a bit awkward around other people.

The origins of RPGs in wargaming are far before my time (I’m not that old!) but I’ve seen the echoes of how wargaming has shaped the industry right through to modern times. People make a series of unspoken assumptions about rules, and about what a game’s rules say about that game, and those assumptions have slowly shifted over the years. New, truly innovative games have chipped away at old ideas and prejudices as people began to accept that a game can have a short book with few rules and still produce a consistent and enjoyable experience.

In the ’80s and ’90s the RPG writer’s favorite example of why we have rules seemed to be a kid’s game of cops and robbers. “It’s like that,” they’d say, “but we have rules so that when one person says ‘I hit you!’ the other person can’t just say ‘No, you missed!’” The problem is, even kids don’t quite behave that way. Sure they fight and disagree, but kids have a great sense of narrative arcs and they absorb archetypal stories and characters like sponges. When I ran games for kids it was awesome how hard they worked to get things like “the anti-hero’s redemption through heroic sacrifice” to happen. They knew that was how the stories were supposed to go.

Beyond that, most of us aren’t children. The unspoken point behind the cops and robbers example is that children can be petulant and angry when they don’t get their own way. Assuming that you need ironclad rules to adjudicate every conflict without wiggle room means the players and GM you’re modeling are so immature they never got past that developmental stage. Do we really think gamers are incapable of looking beyond their immediate wants to the desires of other people or the needs of the group? Or that they’re totally incapable of separating their in-character persona from their out-of-character self?

None of this was important in wargames because players weren’t actors in the unfolding drama. You might want that cavalry unit to win a skirmish, but you weren’t personally invested in the survival of one of those little riders. You were focused on the larger battle and the tactical puzzle of how to beat your opponent. The rules were used to abstract away physical and psychological things on the battlefield, since recreating a battle at 1:1 scale with actual military gear and people killing each other isn’t practical.

The focus of many wargames has traditionally been on understanding and recreating historical events, so rules supporting realism were valued over a game being fast or abstractly fun. It’d be especially bad if a wargame was prone to unrealistic numerical imbalances of power, since the overt goal is generally to give players a fair chance to display their tactical prowess as if they were commanding real units. These values have persisted for a long time in the tabletop community and you can still find gamers who put a very high premium on rules being “realistic”, “technically correct”, or “accurate”… even to the exclusion of being fun at times.

But in a tabletop RPG the role each person takes is very different than in a wargame, and what it means to “win” is correspondingly different. What people think of as winning in RPGs varies about as much as what they think it means to win in real life. Designers have always tried to deal with this to some degree. We’ve seen cooperative tabletop games, adversarial tabletop games, and games that can be either depending on the players’ choices. This all sounds very flexible, but until the last ten years or so, many designers had lost sight of why rules existed in their games.

Highly tactical games, like Shadowrun or editions of D&D like 3.0 and 3.5, focus on creating an air-tight set of “realistic” tactical rules… giving the players something they hopefully can’t subvert or unbalance too badly. This is often a reaction to the power-gaming that occurred in earlier versions of the rules. Other than power gaming issues, designers didn’t generally talk about what kinds of gameplay those rules were encouraging or what kind of gameplay they wanted to encourage. If you wanted a good experience at the table, you had to find a good GM or be a good GM.

To me it felt a lot like wearing a sweater 5 sizes too big to a formal dance. Sure, I’m not nude, but I’m really not dressed for the occasion. If I go to the dance with my best friends we’ll probably enjoy it, but the sweater isn’t helping me to have fun, just keeping me from being arrested for public indecency.

For a while there was push-back against highly tactical, complex games. Some games, like Big Eyes, Small Mouth (BESM), began trying to present simpler, more compact rules that let groups ignore the “realistic” complexities and get on with their gaming. Many of these games were still very generic and a lot of people looked down on them because they didn’t have highly structured systems. I got some side-eyes when I chose to run a BESM game and I’m pretty sure that a few of my friends filed it away mentally as, “well she’s just doing it because she can’t handle the rules in a real game like D&D.”

Some of these games also hearkened back to the subset of the very early old school RPGs that had much simpler rules. A few of these early games, like Tunnels & Trolls, are still around and started getting considerably more attention during this period.

Slowly, game designers began to ask: what game are these rules creating? How can I create the experience I want with different rules? And what experiences do gamers want anyway? This led to games like My Life with Master and Steal Away Jordan that pushed players into more narrative responsibilities and situations that were possibly less familiar or comfortable.

Over time these indie games got more and more traction and larger companies started producing rules that were tuned to helping players create the experience the designers wanted, rather than modeling reality. Not all of these were “rules light” or moved in the direction you might think. D&D 4.0 was built on rules geared to create a tactical combat game that valued a balanced and fun player experience rather than pure simulation or realism; it was a big deviation from its predecessors and upset a lot of people because of it. Around this time we also got a flood of narrative storytelling games like Shock and Fiasco that shifted the responsibility of narration entirely over to players, removing the GM from the game.

There are still people who prefer Shadowrun to Fiasco or the Leverage RPG, but the general attitudes have shifted. Games aren’t looked down on just because they have fewer or less traditional rules and people are starting to understand that a game can create other kinds of experiences if it uses rules designed for purposes other than simulation.

There may still be lessons in wargaming that can help us to grow, but we’re beginning to build games based on the unique needs of tabletop play, instead of living in wargaming’s shadow.

How True Are Your d20s?

This entry was posted in dice.

Old black and white photograph of stacks of dice from TSR, Koplow, Armory, ‘oriental imports’, and GameScience. There are two stacks for each company. Each pair of stacks is uneven, except for the pair of GameScience Stacks

Photograph of Lou Zocchi, a large white man, balding with grey hair, wearing dark rimmed glasses and a t-shirt.

Lou Zocchi

Lou Zocchi is a man who cares deeply about dice. Zocchi’s well-practiced speech on dice quality is famous. It’s fairly entertaining if you have 20 minutes to spare. Part of his argument is the above photograph. Zocchi stacked twenty-sided dice from several companies. Each stack places the same numbers on the top and bottom. For example, one stack might have 1 placed on top of 20 repeatedly, while the next stack might have 9 placed on top of 12 repeatedly. Based on the height of the stacks, it appears that everyone’s dice are irregularly shaped. Everyone’s, except for Zocchi’s.

Did Zocchi pick and choose for best effect? If it was accurate at one point, is it still? I’m pretty sure that the photograph dates to the late 1980s or early 1990s. One of the companies, TSR, hasn’t existed since 1997. Is this still a fair comparison? Eva and I set out to find out.

Image of stacks of dice from Crystal Caste, Chessex, Koplow, and GameScience. The stacks are assembled from multiple photographs.


Photograph of two twenty-sided dice from GameScience, one opaque green and one translucent red. The 7 face is on top of both, and flashing is visible along one edge of the 7 face.

Flashing on GameScience dice

These are stacks of twenty-sided dice from Crystal Caste, Chessex, Koplow, and GameScience. There are 20 from each company, 10 opaque and 10 translucent or transparent. I lost one of the Chessex translucent dice, so there are only 9 in those stacks. We stacked each set of 10 dice three times, once stacking 20s and 1s, once stacking 12s and 9s, and once stacking 11s and 10s. For the GameScience dice, we also stacked the 14s and 7s, as the 7s have the worst of the flashing.

There is a clear problem with the Crystal Caste dice. We didn’t notice anything odd when we originally purchased them, but their elongated shape is obvious when you know to look for it. Chessex has similar, but less severe, irregularities. Koplow’s dice hold up well in this test, especially the translucent dice. GameScience’s dice are incredibly consistent… except for the 14/7 sides.

There is substance to Zocchi’s claims, although Koplow is a serious challenger. But the photographs are really a publicity stunt, colorful, not rigorous science. So we broke out a digital caliper and measured the distance between every opposite pair of faces for every one of the 79 dice.

Once we had the measurements, I analyzed them. For each individual die I calculated the largest difference between the widths of its pairs of opposite faces. Across all of the dice for a given manufacturer, I then calculated the minimum, average, and maximum of those differences. I also calculated the standard deviation across all of the face pairs across all of the dice for a manufacturer. If a company’s dice are uneven, it will show up here. I broke Crystal Caste into two groups because it became clear that their translucent and opaque dice are wildly different.
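The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the widths are made-up placeholder values rather than our measurements, and the use of the sample standard deviation (`statistics.stdev`) is a guess at what a spreadsheet would compute, not a confirmed detail of the analysis.

```python
from statistics import mean, stdev


def die_spread(widths):
    """Largest minus smallest face-pair width for a single die (inches)."""
    return max(widths) - min(widths)


def summarize(dice):
    """Summarize one manufacturer.

    dice: a list of dice, each die being a list of face-pair widths in inches.
    """
    spreads = [die_spread(widths) for widths in dice]
    all_widths = [w for widths in dice for w in widths]
    return {
        "min": min(spreads),
        "avg": mean(spreads),
        "max": max(spreads),
        # Assumption: sample standard deviation over every face pair
        # of every die from this manufacturer.
        "stddev": stdev(all_widths),
    }


# Made-up placeholder widths (NOT the measured data):
# two dice with three face pairs each, to keep the example short.
example = [
    [0.750, 0.752, 0.755],
    [0.748, 0.750, 0.758],
]
summary = summarize(example)
print(summary)
```

A real d20 would have ten widths per die, one for each pair of opposite faces, but the arithmetic is the same.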

Differences in paired face widths by manufacturer (inches)

Company            Min     Avg     Max     StdDev
Chessex            0.014   0.020   0.027   0.010
GameScience        0.002   0.005   0.009   0.003
Crystal Caste      0.016   0.026   0.044   0.022
…CC Opaque         0.016   0.017   0.021   0.006
…CC Translucent    0.028   0.035   0.044   0.012
Koplow             0.006   0.010   0.020   0.006

The numbers reinforce what is visible in the photograph. Crystal Caste’s translucent dice are the most irregular. Crystal Caste’s opaque and Chessex’s entire line are more uniform, but aren’t in the same class as Koplow and GameScience. Koplow’s dice are very uniform, but GameScience trumps everyone else.

We avoided the flashing on the GameScience dice as it was hard to get repeatable results measuring on the flashing. Based on the stacking test it’s clear that the flashing adds a significant amount of irregularity, so I recommend sanding off the flashing.

I was also interested in seeing how consistent a manufacturer’s dice are. Low consistency suggests uneven manufacturing and makes an entire line of dice suspect. For each pair of faces I calculated the standard deviation across all dice for a manufacturer, then identified the largest value for each manufacturer.
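As a rough sketch of that consistency check, assuming each die's widths are listed in the same face-pair order, the per-pair standard deviation is just a column-wise calculation. The function name and the widths below are hypothetical placeholders, not our data, and the sample standard deviation is again an assumption.

```python
from statistics import stdev


def pair_consistency(dice):
    """Return (per-pair standard deviations, the largest of them).

    dice: a list of dice; each die lists its face-pair widths in inches,
    in the same face-pair order as every other die.
    """
    # zip(*dice) regroups the widths by face pair instead of by die.
    per_pair = [stdev(column) for column in zip(*dice)]
    return per_pair, max(per_pair)


# Made-up placeholder widths (NOT the measured data):
# three dice, two face pairs each.
example = [
    [0.750, 0.760],
    [0.752, 0.760],
    [0.754, 0.760],
]
per_pair, worst = pair_consistency(example)
print(per_pair, worst)
```

Note that in this toy data the second face pair is wider than the first on every die, yet its standard deviation across dice is zero. A line of dice can be misshapen in exactly the same way every time and still score as highly consistent on this measure.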

Width standard deviation by manufacturer (inches)

Company            Max     1-20    2-19    3-18    4-17    5-16    6-15    7-14    8-13    9-12    10-11
Chessex            0.014   0.014   0.010   0.006   0.006   0.006   0.006   0.010   0.009   0.008   0.007
GameScience        0.003   0.003   0.003   0.003   0.003   0.002   0.003   0.003   0.003   0.002   0.003
Crystal Caste      0.036   0.036   0.028   0.014   0.013   0.015   0.016   0.029   0.028   0.014   0.013
…CC Opaque         0.002   0.002   0.001   0.001   0.001   0.001   0.002   0.001   0.001   0.002   0.002
…CC Translucent    0.006   0.006   0.003   0.003   0.005   0.004   0.003   0.004   0.006   0.002   0.004
Koplow             0.008   0.008   0.006   0.005   0.004   0.007   0.005   0.005   0.005   0.007   0.003

These results are more surprising. Chessex’s dice are the least consistent. Koplow’s are much more consistent. Crystal Caste’s translucent dice are surprisingly quite consistent. GameScience makes highly consistent dice. The biggest surprise is Crystal Caste’s opaque dice, which were the most consistent. This leads me to conclude that Crystal Caste’s misshapenness is not the result of inconsistent manufacturing processes, nor something as random as being tumbled as Zocchi claims. I believe Crystal Caste has malformed molds or master dice. When Eva bought the dice, she was told that the opaque dice, which we found to be more regular, were from the first few batches of a new set of molds. If Crystal Caste has moved to these new molds, it’s possible that their quality has risen to roughly the same level as Chessex’s.

Clearly the shape of a die impacts how it lands, but it’s hard to say how much it affects the randomness. Sure, GameScience dice seem better than the others, but are Koplow’s random enough for actual play? Determining this requires actually rolling the dice, a task Eva and I plan to undertake in the future. In the meantime, check out this fascinating article suggesting that GameScience’s dice are measurably more random than Chessex’s… except when it comes to rolling 14, the side opposite the flashing! There is also some good research showing a distressing bias toward 1 on Chessex and Games Workshop six-sided dice.

Added 2013-02-26: Matthew J. Neagley has done some more analysis of our measurements over at Gnome Stew.

For myself, I was impressed with Koplow’s dice, but they fail to arrange the numbers so that the sum of opposite sides equals the largest number on the die plus one. That won’t bother some people, but it drives me crazy. I foresee more GameScience dice in my future.
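For anyone who wants to check their own dice against that numbering convention, here is a minimal sketch. The function name is hypothetical; the example pairs are the ones we stacked for the photographs.

```python
def opposites_sum_correctly(opposite_pairs, sides=20):
    """True if every pair of opposite faces sums to sides + 1."""
    return all(a + b == sides + 1 for a, b in opposite_pairs)


# The pairings stacked for the photographs: each sums to 21.
conventional = [(20, 1), (12, 9), (11, 10)]
# The Koplow pairing we had to stack instead of 12/9.
koplow = [(12, 2)]

print(opposites_sum_correctly(conventional))
print(opposites_sum_correctly(koplow))
```

The first check passes and the second fails, since 12 + 2 is 14 rather than 21.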

Raw Data and Results

The raw data for the dice width measurements and calculated results are available as a Google Drive spreadsheet.


The Dice

Eva purchased the dice in 2009, directly from each company’s booth at Gen Con. She attempted to get an assortment of colors to avoid bias from a single batch of dice. Eva told each company about our plans to measure the dice and asked if there were particular dice they wanted us to use. They uniformly said to choose whichever dice she liked. The GameScience staff reminded her to stack them on the side with the flashing as well.

The dice are unmodified and have not been subjected to significant wear and tear since their purchase. We have not used them for any games. They spent most of their life sitting in plastic baggies in a storage tub on a shelf. We left the flashing on the GameScience dice.

The Photographs

Photograph of a framework built out of Legos designed to hold dice in two stacks.

I built a measurement framework to hold the dice out of precision engineered Danish scientific apparatus: Legos. The framework has uneven legs so that it leans backward, simplifying stacking dice. The framework has two “grooves” into which dice can be stacked.

I mounted the camera onto a tripod pointed into the corner of a built-in shelf. I pushed a blue Lego baseplate with a smooth center area into the corner. I placed the framework onto the smooth center area and pushed it into the corner where it was stopped by the Lego studs at the end. By pushing the blue plate into the corner of the shelf and the framework into the corner of the plate, I could ensure consistent placement between photographs.

I put sets of dice into each groove, 10 at a time (9 for the Chessex translucent), stacking so that a given pair of numbers was always on the tops and bottoms: 20/1, 12/9, 11/10, and for GameScience only 14/7. Koplow dice are numbered differently and have no 12/9 pairing, so I stacked 12/2 instead.

I stacked the dice so that the triangle of the face on top roughly aligned with the triangle on the face touching it, causing the dice to alternate in facing. I marked each stack with a label sitting at the bottom of the framework. (You can see the labels, a bit of pink, at the bottom of some of the stacks.) I repeated for each pair of sides. The same dice are used in each case, but the order of the dice was not preserved. I ordered the dice using the scientific principle of “whatever I happened to grab,” with the exception of striving to ensure that the topmost die was reasonably visible against the background.

I loaded the photographs into the GNU Image Manipulation Program, sliced the photographs into individual stacks, and grouped the stacks by manufacturer and translucency. I used the top and bottom edges of Legos at the top and bottom of the framework for alignment. No scaling was done. Because the grooves in the framework were very close, edges of the adjacent stack are visible in the final image.

The Measurements

Eva and I measured the dice using Wixey digital calipers, model WR100, accurate to a thousandth of an inch. In the case of the 7 face on the GameScience dice, the face with the worst flashing, we avoided the flashing as much as we could to ensure repeatable results.