Summary: I use “PR” to measure how strongly backgammon apps play. The higher the PR, the weaker the app. A PR of 2 is world champion level; a PR of 30 is beginner level. To find out how strong an app is, I use a superhumanly strong PC program called “Extreme Gammon 2”.
A little history: up until the early 90s, backgammon programs played a shitty game of backgammon. They were based on human-written rules and often completely failed to understand a position. Then, shortly after a scientific paper on “temporal difference learning”, the commercial PC program “Jellyfish” changed everything. It was based on a neural network (the size of the brain of a jellyfish, thus the name) and played against itself again and again, learning from its mistakes, until it could compete against the best human backgammon players.
It took a long time until machine learning could crack Go and chess as well (with AlphaGo and AlphaZero), but in backgammon, what followed were programs, and later apps, that outplay even the best human players.
But how do you measure whether one player outplays another? In the long run, all that matters is winning more games, as in chess. But backgammon involves a lot of luck, so the result of a few games says little about playing strength. I have beaten the strongest backgammon apps, whereas I would never beat any top chess or Go app.
Two terms come into play: equity and PR.
Equity
Very unexpectedly for a dice game, backgammon theory is full of probabilities and likelihoods and expected values. At any time in a game, you’re so-and-so likely to win the game. And if a single game is worth a point, you can calculate what fraction of a point you can expect to still win in a certain position.
Before the first roll, it’s 50/50, and it’s equally likely that you or your opponent wins a point.
When you’re way behind and your opponent is way ahead in the race, it’s close to 0/100: your opponent will almost certainly win the point, and you will lose one.
Your equity and the equity of your opponent always add up to 0.
In a normal game situation, it’s somewhere in between these extremes. To measure where you are, the term “equity” is used. Your equity says how many points (or what fraction of a point) you can expect to win. The details are not important for the rest of this article (but can help you get better at backgammon). The important bit is: there is a single number which says how well you stand in any given position. And this number can be used to measure playing strength much more precisely than actual wins.
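To make the two extremes concrete, here is a minimal sketch of the simplest possible case. It assumes a single game worth one point, with no gammons, backgammons or doubling cube, so equity is just the chance of winning a point minus the chance of losing one. The function name `cubeless_equity` is my own invention for this illustration.

```python
def cubeless_equity(p_win: float) -> float:
    """Expected points in a 1-point game with only plain wins and losses.

    Assumption: no gammons, no backgammons, no doubling cube.
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    # Win +1 point with probability p_win, lose 1 point otherwise.
    return p_win * 1.0 + (1.0 - p_win) * -1.0


print(cubeless_equity(0.5))  # 0.0  (before the first roll)
print(cubeless_equity(0.0))  # -1.0 (hopelessly behind)
```

In this simplified model, your opponent’s equity is the same number with the opposite sign, which is why the two always add up to 0.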
Suppose you’re in a position where your equity is 0.1. You’re a bit ahead. Now you make a move, and your equity drops to -0.1, while the best move, which you didn’t find, would have raised your equity to 0.2. Thus, your move cost you 0.3 points of equity. You made a -0.3 blunder.
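The same arithmetic as a tiny sketch. The function name `equity_loss` is only made up for this example. Note that the loss is measured against the best available move, so the starting equity of 0.1 does not enter the calculation.

```python
def equity_loss(best_equity: float, played_equity: float) -> float:
    """Equity given up by the move played, relative to the best available move."""
    return best_equity - played_equity


# The example from the text: the best move leads to 0.2, the move played to -0.1.
loss = equity_loss(best_equity=0.2, played_equity=-0.1)
print(round(loss, 3))  # 0.3
```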
You can use ”average equity loss” as a measurement of playing strength. A good player will lose, say, 0.01 points of equity per move on average, while a poor one will lose more than 0.05 points per move.
Now only two more things are needed for you to know what the PR of a human or app is.
- Forced moves don’t count. You’re not a better player if you can make 20 forced moves in a row. Only positions in which you can decide between different moves with different equity changes count.
- People don’t like numbers like 0.02. Therefore PR is your average equity loss per decision * 500, so an average loss of 0.02 becomes a PR of 10. This gives very nice values:
| PR | Explanation |
| --- | --- |
| 0 to 2.5 | World champ |
| 2.5 to 5 | World class |
| 5 to 7.5 | Expert |
| 7.5 to 12.5 | Advanced |
| 12.5 to 17.5 | Intermediate |
| 17.5 to 22.5 | Casual |
| 22.5 to 30 | Beginner |
| Above 30 | Distracted |
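Putting the two rules and the table together, a PR calculation could be sketched roughly like this. This is only an illustration of the formula described above, not how Extreme Gammon 2 works internally. The names `Decision`, `performance_rating` and `classify` are hypothetical, and I’m assuming that a value exactly on a boundary (say 2.5) falls into the stronger band, which the table leaves open.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    equity_loss: float    # best move's equity minus the played move's equity
    forced: bool = False  # True if there was no real choice between moves


PR_FACTOR = 500  # from the text: PR = average equity loss per decision * 500

# Upper bounds from the table above (boundary handling is an assumption).
PR_BANDS = [
    (2.5, "World champ"),
    (5.0, "World class"),
    (7.5, "Expert"),
    (12.5, "Advanced"),
    (17.5, "Intermediate"),
    (22.5, "Casual"),
    (30.0, "Beginner"),
]


def performance_rating(decisions: list[Decision]) -> float:
    """Average equity loss over unforced decisions, multiplied by 500."""
    unforced = [d for d in decisions if not d.forced]
    if not unforced:
        raise ValueError("no unforced decisions to rate")
    return PR_FACTOR * sum(d.equity_loss for d in unforced) / len(unforced)


def classify(pr: float) -> str:
    """Map a PR value to the label from the table."""
    for upper_bound, label in PR_BANDS:
        if pr <= upper_bound:
            return label
    return "Distracted"


# Two real decisions losing 0.02 each, plus one forced move that is ignored.
game = [Decision(0.02), Decision(0.02), Decision(0.0, forced=True)]
pr = performance_rating(game)
print(round(pr, 2), classify(pr))  # 10.0 Advanced
```

Without the `forced` flag, the forced move would count as a third, error-free decision and make the player look better than they are.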
Arguably the strongest human player, Masayuki Mochizuki (“Mochy”), currently ranks at PR 2.46. The best apps that you will see in my reviews rank at PRs between 0 and 2.
(I only rank at a lowly PR 12 myself. I celebrate matches where I play at 5, and in complicated back games and the like I sometimes find myself performing at something like 20.)
In my reviews I do my best to arrive at a good estimate of an app’s PR. If your PR is, say, 15, you won’t learn to improve your game from an app that plays at your level. You should at least go for an app with PR 10, or why not just go for an even stronger one?
How do I determine the strength of an app? I play several matches against each app, and then face one of three levels of happiness:
- Totally happy: If the app supports it, I export the matches to a file and feed them into the mighty “Extreme Gammon 2” (XG2), arguably the strongest PC program on this planet, analyzing on its highest “Roller++” setting. XG2 then determines the PR of the app, which is fairly reliable for all but the best apps.
- Not entirely unhappy: If an app doesn’t support exporting matches, I play the matches on my tablet and transcribe them into XG2 as I go (a VERY BORING job) to get the ratings.
- Unhappy: Sometimes you can’t move your roll. Some apps don’t let you click to acknowledge that but just play on. If I can’t move for several turns in a row, such an app will, for example, roll and move 10 times in 10 seconds. There’s no way to transcribe this, so I abort the match and note the PR the app had reached until then. Not very reliable.
So, I hope this was not too boring for you, and that it explained what I mean when I write in a review that an app performs only at PR 20.