How True Are Your d20s?


Old black and white photograph of stacks of dice from TSR, Koplow, Armory, ‘oriental imports’, and GameScience. There are two stacks for each company. Each pair of stacks is uneven, except for the pair of GameScience stacks.

Photograph of Lou Zocchi, a large white man, balding with grey hair, wearing dark rimmed glasses and a t-shirt.

Lou Zocchi

Lou Zocchi is a man who cares deeply about dice. Zocchi’s well-practiced speech on dice quality is famous. It’s fairly entertaining if you have 20 minutes to spare. Part of his argument is the above photograph. Zocchi stacked twenty-sided dice from several companies. Each stack places the same numbers on the top and bottom. For example, one stack might have 1 placed on top of 20 repeatedly, while the next stack might have 9 placed on top of 12 repeatedly. Based on the height of the stacks, it appears that everyone’s dice are irregularly shaped. Everyone’s, except for Zocchi’s.

Did Zocchi pick and choose for best effect? If it was accurate at one point, is it still? I’m pretty sure that the photograph dates to the late 1980s or early 1990s. One of the companies, TSR, hasn’t existed since 1997. Is this still a fair comparison? Eva and I set out to find out.

Image of stacks of dice from Crystal Caste, Chessex, Koplow, and GameScience. The stacks are assembled from multiple photographs.

(Click to see it in all of its obsessive glory. You can also see the monstrous 6,219×1,920 image.)

Photograph of two twenty-sided dice from GameScience, one opaque green and one translucent red. The 7 face is on top of both, and flashing is visible along one edge of the 7 face.

Flashing on GameScience dice

These are stacks of twenty-sided dice from Crystal Caste, Chessex, Koplow, and GameScience. There are 20 from each company, 10 opaque and 10 translucent or transparent. I lost one of the Chessex translucent dice, so there are only 9 in those stacks. We stacked each set of 10 dice three times, once stacking 20s and 1s, once stacking 12s and 9s, and once stacking 11s and 10s. For the GameScience dice, we also stacked the 14s and 7s, as the 7s have the worst of the flashing.

There is a clear problem with the Crystal Caste dice. We didn’t notice anything odd when we originally purchased them, but their elongated shape is obvious when you know to look for it. Chessex has similar, but less severe, irregularities. Koplow’s dice hold up well in this test, especially the translucent dice. GameScience’s dice are incredibly consistent… except for the 14/7 sides.

There is substance to Zocchi’s claims, although Koplow is a serious challenger. But the photographs are really a colorful publicity stunt, not rigorous science. So we broke out a digital caliper and measured the distance between every opposite pair of faces for every one of the 79 dice.

Once we had the measurements, I analyzed them. For each individual die I calculated the largest difference between the widths measured across its pairs of opposite faces. Across all of the dice for a given manufacturer, I then calculated the minimum, average, and maximum of those differences. I also calculated the standard deviation across all of the face pairs across all of the dice for a manufacturer. If a company’s dice are uneven, it will show up here. I broke Crystal Caste into two groups because it became clear that their translucent and opaque dice are wildly different.
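A minimal Python sketch of that calculation may make the steps easier to follow. It assumes "largest difference" means the widest face-pair width minus the narrowest on a single die, and it uses the sample standard deviation; the article does not say which form of standard deviation was used. The function names and the sample widths are hypothetical, not taken from the real measurements.

```python
from statistics import mean, stdev


def die_spread(widths):
    """Widest minus narrowest face-pair width for one die (inches)."""
    return max(widths) - min(widths)


def manufacturer_summary(dice):
    """Summarize one manufacturer.

    dice: list of dice, each a list of widths, one per opposite-face pair.
    """
    spreads = [die_spread(widths) for widths in dice]
    # Pool every width from every die for the overall standard deviation.
    all_widths = [w for widths in dice for w in widths]
    return {
        "min": min(spreads),
        "avg": mean(spreads),
        "max": max(spreads),
        "stddev": stdev(all_widths),  # assumption: sample standard deviation
    }


# Hypothetical widths for two dice with three face pairs each.
sample = [
    [0.750, 0.760, 0.755],
    [0.748, 0.752, 0.770],
]
summary = manufacturer_summary(sample)
print(summary)
```

A real d20 would have ten widths per die rather than three; the shorter lists only keep the example readable.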

Differences in paired face widths by manufacturer in inches

Company          Min    Avg    Max    StdDev
Chessex          0.014  0.020  0.027  0.010
GameScience      0.002  0.005  0.009  0.003
Crystal Caste    0.016  0.026  0.044  0.022
…CC Opaque       0.016  0.017  0.021  0.006
…CC Translucent  0.028  0.035  0.044  0.012
Koplow           0.006  0.010  0.020  0.006

The numbers reinforce what is visible in the photograph. Crystal Caste’s translucent dice are the most irregular. Crystal Caste’s opaque and Chessex’s entire line are more uniform, but aren’t in the same class as Koplow and GameScience. Koplow’s dice are very uniform, but GameScience trumps everyone else.

We avoided the flashing on the GameScience dice as it was hard to get repeatable results measuring on the flashing. Based on the stacking test it’s clear that the flashing adds a significant amount of irregularity, so I recommend sanding off the flashing.

I was also interested in seeing how consistent a manufacturer’s dice are. Low consistency suggests uneven manufacturing and makes an entire line of dice suspect. For each pair of faces I calculated the standard deviation across all dice for a manufacturer, then identified the largest value for each manufacturer.
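The consistency calculation can be sketched the same way. This is an illustration under assumptions, not the spreadsheet's actual formulas: it uses the sample standard deviation, the pair labels are shortened to three for readability, and the widths are invented.

```python
from statistics import stdev

# Shortened, hypothetical list of face pairs; a d20 has ten.
PAIRS = ["1-20", "2-19", "3-18"]


def pair_consistency(dice):
    """Standard deviation of each face pair's width across all dice.

    dice: list of dicts mapping a pair label to its width in inches.
    Returns (per_pair_stddev, label_of_least_consistent_pair).
    """
    per_pair = {
        pair: stdev([die[pair] for die in dice])  # assumption: sample stdev
        for pair in PAIRS
    }
    worst = max(per_pair, key=per_pair.get)
    return per_pair, worst


# Invented widths for three dice from one manufacturer.
sample = [
    {"1-20": 0.750, "2-19": 0.750, "3-18": 0.740},
    {"1-20": 0.752, "2-19": 0.750, "3-18": 0.750},
    {"1-20": 0.754, "2-19": 0.750, "3-18": 0.760},
]
per_pair, worst = pair_consistency(sample)
print(per_pair, worst)
```

The largest per-pair value plays the role of the "Max" column: a single badly varying pair is enough to make a manufacturer's line look inconsistent.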

Width standard deviation by manufacturer

Company          Max    1-20   2-19   3-18   4-17   5-16   6-15   7-14   8-13   9-12   10-11
Chessex          0.014  0.014  0.010  0.006  0.006  0.006  0.006  0.010  0.009  0.008  0.007
GameScience      0.003  0.003  0.003  0.003  0.003  0.002  0.003  0.003  0.003  0.002  0.003
Crystal Caste    0.036  0.036  0.028  0.014  0.013  0.015  0.016  0.029  0.028  0.014  0.013
…CC Opaque       0.002  0.002  0.001  0.001  0.001  0.001  0.002  0.001  0.001  0.002  0.002
…CC Translucent  0.006  0.006  0.003  0.003  0.005  0.004  0.003  0.004  0.006  0.002  0.004
Koplow           0.008  0.008  0.006  0.005  0.004  0.007  0.005  0.005  0.005  0.007  0.003

These results are more surprising. Chessex’s dice are the least consistent. Koplow’s are much more consistent. Crystal Caste’s translucent dice are surprisingly quite consistent. GameScience makes highly consistent dice. The biggest surprise is Crystal Caste’s opaque dice, which were the most consistent. This leads me to conclude that Crystal Caste’s misshapenness is not the result of inconsistent manufacturing processes, nor something as random as being tumbled as Zocchi claims. I believe Crystal Caste has malformed molds or master dice. When Eva bought the dice, she was told that the opaque dice, which we found to be more regular, were from the first few batches of a new set of molds. If Crystal Caste has moved to these new molds, it’s possible that their quality has risen to roughly the same level as Chessex’s.

Clearly the shape of a die impacts how it lands, but it’s hard to say how much it affects the randomness. Sure, GameScience dice seem better than the others, but are Koplow’s random enough for actual play? Determining this requires actually rolling the dice, a task Eva and I plan to undertake in the future. In the meantime, check out this fascinating article suggesting that GameScience’s dice are measurably more random than Chessex’s… except when it comes to rolling 14, the side opposite the flashing! There is also some good research showing a distressing bias toward 1 on Chessex and Games Workshop six-sided dice.

Added 2013-02-26: Matthew J. Neagley has done some more analysis of our measurements over at Gnome Stew.

For myself, I was impressed with Koplow’s dice, but they fail to arrange the numbers so that the sum of opposite sides equals the largest number on the die plus one. That won’t bother some people, but it drives me crazy. I foresee more GameScience dice in my future.
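The opposite-sides rule is easy to state as a check. The sketch below is only an illustration: the function name is made up, the standard layout is the 1-20, 2-19, … 10-11 pairing used in the tables above, and the only Koplow pairing included is the 12/2 one mentioned in the methodology, since the rest of Koplow's layout is not given here.

```python
def opposites_sum_correctly(opposite_pairs, sides):
    """True if every pair of opposite faces sums to sides + 1."""
    return all(a + b == sides + 1 for a, b in opposite_pairs)


# Conventional d20 layout: 1-20, 2-19, ... 10-11.
standard_d20 = [(n, 21 - n) for n in range(1, 11)]

# The single Koplow pairing mentioned in this article.
koplow_known_pairs = [(12, 2)]

print(opposites_sum_correctly(standard_d20, 20))
print(opposites_sum_correctly(koplow_known_pairs, 20))
```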

Raw Data and Results

The raw data for the dice width measurements and calculated results are available as a Google Drive spreadsheet.

Methodology

The Dice

Eva purchased the dice in 2009, directly from each company’s booth at Gen Con. She attempted to get an assortment of colors to avoid bias from a single batch of dice. Eva told each company about our plans to measure the dice and asked if there were particular dice they wanted us to use. They uniformly said to choose whichever dice she liked. The GameScience staff reminded her to stack them on the side with the flashing as well.

The dice are unmodified and have not been subjected to significant wear and tear since their purchase. We have not used them for any games. They spent most of their life sitting in plastic baggies in a storage tub on a shelf. We left the flashing on the GameScience dice.

The Photographs

Photograph of a framework built out of Legos designed to hold dice in two stacks.

I built a measurement framework to hold the dice out of precision engineered Danish scientific apparatus: Legos. The framework has uneven legs so that it leans backward, simplifying stacking dice. The framework has two “grooves” into which dice can be stacked.

I mounted the camera onto a tripod pointed into the corner of a built-in shelf. I pushed a blue Lego baseplate with a smooth center area into the corner. I placed the framework onto the smooth center area and pushed it into the corner where it was stopped by the Lego studs at the end. By pushing the blue plate into the corner of the shelf and the framework into the corner of the plate, I could ensure consistent placement between photographs.

I put sets of dice into each groove, 10 at a time (9 for the Chessex translucent), stacking so that a given pair of numbers was always on the tops and bottoms: 20/1, 12/9, 11/10, and for GameScience only 14/7. Koplow dice are numbered differently and have no 12/9 pairing, so I stacked 12/2 instead.

I stacked the dice so that the triangle of the face on top roughly aligned with the triangle on the face touching it, causing the dice to alternate in facing. I marked each stack with a label sitting at the bottom of the framework. (You can see the labels, a bit of pink, at the bottom of some of the stacks.) I repeated for each pair of sides. The same dice are used in each case, but the order of the dice was not preserved. I ordered the dice using the scientific principle of “whatever I happened to grab,” with the exception of striving to ensure that the topmost die was reasonably visible against the background.

I loaded the photographs into the GNU Image Manipulation Program, sliced the photographs into individual stacks, and grouped the stacks by manufacturer and translucency. I used the top and bottom edges of Legos at the top and bottom of the framework for alignment. No scaling was done. Because the grooves in the framework were very close, edges of the adjacent stack are visible in the final image.

The Measurements

Eva and I measured the dice using Wixey Digital calipers model wr100, accurate to a thousandth of an inch. In the case of the 7 face on the GameScience dice, the face with the worst flashing, we avoided the flashing as much as we could to ensure repeatable results.