CodinGame: Building an AI to play (and win) at "Penguins"

This November, I came back to the US from visiting South Korea, and while I was self quarantining, I decided to get back into solving puzzles on CodinGame. A game called "Penguins" popped up as the weekly challenge, so I decided to give it a shot. I worked on building an AI for a few days, and made decent progress, but then took a break to compete in the Fall Challenge. After the Fall Challenge was over, I decided to come back to the Penguins game and try to get my AI to first place.

I finally achieved my goal on December 3rd! I did this using a variation of Monte Carlo Tree Search with a custom evaluation function for scoring the gamestate.

Leaderboard (December 2020)

KnightMoves, that's me!

What is Penguins?

Penguins is a game played on a hexagonal grid, and the goal is to move your penguins around the board and eat as many fish as possible. As you move around, the ice breaks, so you have to be careful not to get trapped or stranded. It's based on the boardgame "Hey, That's My Fish!"

Penguins: Midway through a 4-player game

The game can be played with 2, 3, or 4 players, and the AI has to be able to play against any of these numbers of opponents.

In the first stage of the game, players place their penguins, and then in the next stage they take turns moving them around. Whenever a penguin moves, it eats the fish on the ice block it's standing on (1, 2, or 3 fish), and then that ice block breaks and becomes water. Penguins can move any number of tiles in a straight line in any direction, but they can't move through water or other penguins.

Also, penguins can kill other penguins by bumping them off the ice to their untimely deaths. Or maybe they just swim around in the water. Either way, they're out of the game.

Full game played with 3 players.

Algorithm

I used a Tree Search algorithm based on Monte Carlo Tree Search (MCTS) with Upper Confidence Bounds Applied To Trees (UCT). MCTS works well in games where you don't have good heuristics for an evaluation function (like Go), but it is possible to write a good heuristic evaluation function for penguins, so I replaced the random playouts with scores from my evaluation function.

Notable Features of Search

The game has a fairly high branching factor (b=~50), so doing an exhaustive search with an algorithm like minimax, even to a shallow depth, is not feasible within the 50ms per turn time limit enforced by the game.

To make matters worse, for a 4 player game, you need to search to at least depth 4 to consider all of your opponents' responses for a single turn. That's 50^4 = ~6 million nodes. AND you can't even use alpha-beta pruning with a multiplayer version of minimax like the maxn algorithm. That's not promising.

For that reason, MCTS seemed like a good fit. It's a variable depth search. And because it explores the tree asymmetrically, promising moves should be evaluated more deeply than non-promising ones.

When scoring a node, usually scores for MCTS are either 1 or 0, representing a win or loss. In vanilla MCTS, getting a 1 or 0 is not a particularly strong signal, because that win is the result of a random playout. What if we're investigating a promising move, and 9 out of 10 of the opponent's responses appear to lead to a win, but one response leads to a loss. Is that loss just a random loss? Or is it a very strong counter-move? The only way to find out is to explore deeper.

When using an evaluation function, we actually do have an idea of whether the move is good or bad. If our penguin dies, we can send a strong negative signal back up the tree, (e.g. -10, or -100, or more), which tells the search algorithm more strongly to stop considering this branch.

For that reason, my code scales the scores in around a [-10, 10] range. I did not modify the basic UCT formula, so more tuning could be done to improve the balance of exploitation and exploration, but initial results are good.

Notable Features of Evaluation (Scoring)

In the beginning of the game, the penguins can move around fairly freely, but towards the end the game the board starts to break up into smaller "islands". This means penguins can get cut off and run out of moves.

Old version of my AI getting cut off really badly by LeRenard: https://www.codingame.com/replay/511095579 (frame 21 is the death blow)

My evaluation function assumes any penguins sharing an island will split up the fish there. For penguins that are on the border of 2 or more islands, they have to choose which island they will belong to for scoring purposes.

In the beginning of the game, there's just one giant island, so it's good to have some additional scoring heuristics, like how many reachable squares your penguins have, to guide their movement.

Another important feature of my scoring algorithm was a "Tempo Bonus". This removes score oscillations that result of one player having gone first, and therefore having more moves. I believe this is critical to the Tree Search operating effectively.

For example, if we're first player, and we have to decide where to place our first penguin, the algorithm will try every valid spot and give it a score. OMG! the evaluation function thinks every possible move is amazing! After we placed our first penguin, we have points and our opponent has none!

Then the tree search will explore the most promising move, but when it considers the opponent's responses, bummer! Most of the opponent responses lead to a game that's tied. This move sucked after all!

This leads to the search algorithm exploring wider instead of deeper, because it now wants to go back and explore the original possible moves, instead of looking deeper down the tree for the first promising move. All the initial moves were great! Our position was so much better before that darn opponent got their turn!

The Tempo Bonus takes this into account, which keeps the scores more stable over the course of turn. Even if it doesn't know exactly what the opponent will do yet, it knows they will do something.

3-Player and 4-Player Games

Once I got the Tree Search with my Island Evaluation Function working, my code started doing very well in 1v1 battles, but was performing poorly in 3-player and especially 4-player games.

I was originally trying to maximize my score with respect to the first place player, ignoring the others. This did not work well.

In one game, my penguin decided to make a risky move and go for some fish near the edge, because it considered that I was in last place, so I wouldn't be targeted. Of course the second place player immediately bumped my penguin into the water, even though it didn't improve their position against the first place player. My search algorithm hadn't considered that possibility.

The final version of the algorithm compares the scores of the current player against the scores of each opponent, giving the most weight to the difference between that player and the first place player.

I also adjusted my code to be more paranoid when it came to preserving my little penguins' lives. It considers it more likely that an opponent will kill my penguins than another random player's penguins. It also gives a stronger penalty when my penguin dies, compared to when another player's penguin dies.

Conclusion

Overall I'm happy with the way my code turned out, and I really enjoyed the challenge. (Thank you to LeRenard for making it!). There are still some improvements I could make in the future, but for now I think I will set my sights on improving my standing in Ultimate Tic Tac Toe, one of the Holy Grails of CodinGame competitions.

P.S.

Q. Why didn't I use a Neural Network for my evaluation function? Neural Networks solve everything!

A. CodinGame has a fairly conservative file size limit, so it's not really feasible to use a NN for their challenges.

Search This Blog

Useful Use of Cat