Creating The Right Knobs

I recently participated in Softwareskills’ Liar’s Dice competition, and since people have expressed interest to hear about how I managed to win, I decided to summarize the process and results.

soft

As I prize I got 500SEK at Teknikmagasinet (Swedish store), a USB memory and this nice piece of paper 🙂

Liar’s Dice and the Competition

Liar’s dice is a game where each player starts with six dice. One player starts by announcing a bet. A bet consists of a number and a face, for instance four fives. The next player can then either challenge the bet or raise the bet. You can read about the details at https://en.wikipedia.org/wiki/Liar%27s_dice. Softwareskills’ made a competition about writing the best AI. I noticed that it is not clear to everyone what an AI is in this context, and how to get started writing one. So let’s discuss that briefly.

Simplified View of an AI

An AI (actually Intelligent Agent [IA], but I’ll continue to call it AI) in its simplest form can be seen as a program that given a state responds with a valid action. A good AI must, in addition to producing valid actions, produce the best valid action given a state. What best means is the real challenge which typically requires mathematical analysis and creativity to figure out. In some cases you might calculate what the best action is, but it might not be easy given that you don’t know how your opponents behave. In the case of Liar’s dice you don’t know your opponents’ dice and they might behave differently depending on what dice they have. Let’s have a look at how I approached this challenge.

The Development of My AI

First off, I’m not developing AIs on a day to day basis. I had some experience from before, but not much. I started by drawing the “Simplified View of an AI” on a piece of paper, this made it easier for me to break the problem down into the main components:

  1. Generate all valid actions given a state
  2. Put a score on all valid actions
  3. Pick the valid action with the highest score

Let’s look at each problem separately, starting with how to generate valid actions.

Action Generation

To generate actions you need to create a function that given the current state and the rules of the game can produce all valid actions. This is pretty boring work that I wanted to get through as quickly as possible to move on to what could actually get me a good place on the top list. Softwareskills’ provided a model with a track that has 27 positions. A die on a position corresponds to a specific bet with the face of the die (some positions require the bet to be a star, where star = six). Now, either the bet can be challenged or raised. A raise can either be of a “higher” face on the same position or any face at a later position. So to generate valid actions I separated the problem into the following sub problems:

  1. Find the position of the current bet on the “track”
  2. Given the position and face:
    1. Get all bets with a higher face on the same position
    2. For all subsequent positions, generate bets with all possible faces

This allowed me to quite quickly generate all possible actions. Now, to value all the possible bets, provide them with a score, I needed to revisit some basic statistics.

Statistics

When I first saw the competition I thought it was just a matter of calculating the optimal action using statistics. I asked myself whether I was more likely to win if I challenged the previous bet or if I raised the bet, and if I raise, how much should I raise the bet. Based on these questions I created new questions which I could directly “solve”, namely:

  1. Given my known dice, and the number of unknown dice in game, what is the probability of the previous bet to be true?
  2. Given my known dice, and the number of unknown dice in game, what is the probability of my raised bet to be true?

Simple was my first thought before I actually got started. I was easy to come up with a solution, and it could even work decently. The only problem was that it was wrong. And to reach the top of the toplist the number of afforded mistakes is pretty low. It was not easy to figure out whether the statistics was properly calculated or not. When I was stuck at a score far from the best, I challenged my code by doing static analysis, which in this case meant verifying the mathematics. I realized, by reasoning, that it was wrong. This led me to reach for my old book from school (Introduction to Probability and Statistics: Principles and Applications for Engineering and the Computing Sciences by J. Susan Milton, Jesse Arnold) as well as to look at a couple of lectures at Khan Academy (https://www.khanacademy.org/math/probability). After a couple of hours studying combinations, permutations and the good old “number of favorable events / number of total outcomes” as well as experimenting with simple examples, I managed to get it right. I jumped to the top but not to the first place. Something was still missing.

Getting Desperate

I wanted to go for a clean solution based on simple mathematics. But analysing the games made it clear that my AI was often put into bad situations, where no action was particularly good, and that it challenged the bet too often. It was not really playing the game, it was still just following the rules but a bit more intelligently. I started to add a lot of knobs now. For instance weighted averages and factors and terms of various kinds. I created an massive amount of mathematical functions whose graphs looked approximately as I wanted. Many linear (y=kx+m) as well as some polynomials of a higher degree. I experimented by taking into account everything I could think of, such as if I have a larger or smaller part of the game than the previous player or next player, if the game just started, the previous bets of other players and many other things. The factors could influence the score of an action. But I didn’t know how much they should do so. So I created knobs, factors that I could adjust to make the impact higher or lower. Then I tried to find the optimal values for the knobs.

func

Function used for deciding how much to trust other opponent’s bets. Eg. the second bet is trusted to 100%. Note that this function was combined with other functions to calculate the final trust factor.

I basically created a big optimization problem which I only could try to solve empirically. I did this in iterations. I created a lot of knobs, and realized that I couldn’t grasp what I actually did. Then I removed all of them and tried to create a new set of knobs. Repeated this a couple of times until I finally overtook the throne, meaning I finally conquered first place on the toplist. Even if I wouldn’t have won, I feel that this kind of competition is very valuable. To share why I think so I will briefly mention some things I learned from participating.

Lessons Learned

When I started the competition I didn’t even know there was a prize. This was just a nice bonus. I entered the competition for the challenge. I was sure that I would learn something. Afterwards I think my biggest gain is refreshing statistics skills. It’s quite fascinating how easy it was to do so despite not having worked with it for a long time. I also learnt that it is important to carefully measure the results when you have knobs, or parameters, that you can adjust. With the risk of using too big words, I would say that it taught me to be a slightly better scientist by clearly showing the necessity of careful measurements when empirically looking for optimal parameters. From a software engineering point of view it made it even clearer that correctness of the core is very important, and just because something seems correct it might still have bugs in it. Proper testing of the basic units (function in this case) is very valuable. And just because your solution is based on mathematics it doesn’t mean it will be right or wrong. It might still be slightly wrong.

Furthermore, I learnt the value of keeping the code in a at least decent state at all time, to not hinder oneself from experimenting due to the effort being too high. And finally, I learnt the value of to kill your darlings over and over. Even if you are at second place, when you’re stuck, start over and just keep the generic core which lets you build working solutions quickly. Don’t stick to your good solution just because you feel that you invested so much time in it. Challenge it! You managed to come up with it once, and probably learnt from it, so the next time you might come up with something even better.