Thursday, 16 July 2015

Pacman agent using Minimax and AlphaBeta

Last week I worked on creating a Pacman agent using basic Artificial Intelligence techniques.

I started with watching the first 7 lectures of the MIT 6.034F Artificial Intelligence OCW. It included topics like DFS,BFS,Hill Climbing,Branch and Bound,Minimax ,Alpha-Beta Pruning and A* search.
Now, I wanted to put my newly learnt knowledge to use so I searched a bit online and tumbled upon  the Stanford's Pacman assignment which seemed really interesting.

In this assignment I was given the code for the pacman game and I had to implement the Pacman agent to win the game and score as much points as possible.
So, I implemented minimax algorithm with alpha-beta pruning and currentScore as the evaluation function which gave better results than the default reflex agent.This pacman  agent was able to win almost 5/10 times in mediumClassic with random ghosts and depth=5. As can be seen in the video below, with currentScore as evaluation function, the pacman gets stuck in places where it has no food around it and it cannot decide where to go as it has no sense of where the food/capsules are, So , although it manages to not die, it keeps getting stuck. A better evaluation function should solve this problem.

A better evaluation function would be to get a linear combination of minimum distance to food,no of food left,no of capsules left, minimum distance from ghosts etc.
But the coefficients of these parameters which indicate the weight of the parameters with respect to each other need to be decided. For my evaluation function I arrived at the coefficients with my intuition and a little bit of trial and error. Although i came upon this blog where someone had used linear regression to arrive at the correct value of the coefficients which I think is awesome and would like to explore some other time.
The new evaluation function gives much better results as can be seen in the video below. It does not get stuck in empty spaces or near uneaten food. With this evaluation function the pacman agent was able to win almost 8/10 times.
I won't post the code for what I wrote because this assignment is still used in many colleges and it would be unethical to post the solution online.(and because arriving at the solution yourself and seeing your pacman kick ass of those ghosts is fun too).
The pacman simulations are shown in the video below:

Future work(for me) for pacman:
1)ML(linear regression,SVM) to arrive at the coefficients.
2)Reinforcement learning.
3)Genetic algorithms.
4)Learning to Play Pac-Man: An Evolutionary, Rule-based Approach
5)Learning to Play Using Low-Complexity Rule-Based Policies:Illustrations through Ms. Pac-Man

No comments:

Post a Comment

Math of Intelligence : Logistic Regression

Logistic Regression Logistic Regression ¶ Some javascript to enable auto numberi...