Example 8: Bayesian Skill Rating
Andrew D. Gordon, Thomas A. Henzinger, Aditya V. Nori, and Sriram K. Rajamani. 2014. Probabilistic programming. In Proceedings of the on Future of Software Engineering (FOSE 2014). ACM, New York, NY, USA, 167-181. DOI=10.1145/2593882.2593900 doi.acm.org/10.1145/2593882.2593900
"Online gaming systems such as Microsoft’s Xbox Live rate relative skills of players playing online games so as to match players with comparable skills for game playing. The problem is to come up with an estimate of the skill of each player based on the outcome of the games each player has played so far. A Bayesian model for this has been proposed ." (Gordon et al., 2014)
"In the figure (left) we show how this model, called TrueSkill, can be expressed as a probabilistic program.
We consider 3 players, A, B and C, whose skills are given by variables skillA, skillB and skillC respectively, which are initialized by drawing from a Gaussian distribution with mean 100 and variance 10. Based on the outcomes of some number of played games (which is 3 games in this example), we condition the skills thus generated. The first game was played between A and B, and A won the game. This is modeled by assigning to the two random variables perfA1 and perfB1 denoting the performance of the two players in the first game, and constraining that perfA1 is greater than perfB1 using an observe statement. Note that the performance of a player (such as player A) is a function of her skill, but with additional Gaussian noise introduced in order to model the variation in performance we may see because of incompleteness in our model (such as the amount of sleep the player got the previous night). The second game was played between B and C, and B won the game. The third game was played between A and C, and A won the game.
Using this model, we want to calculate the joint probability distribution of these random variables, and use this to estimate the relative skills of the players. Note that each observe statement constrains performances in a game, and implicitly the skills of the players, since performances depend on skills. Such a rating can be used to give points, or match players having comparable skill for improved gaming experience. In this example, the skills skillA, skillB, and skillC that are obtained by inference (using the techniques in Section 6) are: skillA = Gaussian(105.7, 0.11), skillB = Gaussian(100.0, 0.11), skillC = Gaussian(94.3, 0.11). Since A won against both B and C, and B won against C, the resulting distribution agrees with our intuitive assessment of their relative skills.
The PROB-code snippet from Gordon et al. is translated by us to a functional CHURCH-program to clarify its semantics. We added gamma-priors for the various standard deviations. The generative model is contained in the CHURCH function "take-a-sample". The number of samples taken was set to 10.000 in this run. This number can be increased to get a better precision of estimates. The sampling method used is the simple-to-understand 'rejection sampling'. The screen-shot presented was generated by using the PlaySpace environment of WebCHURCH.