# [artificial intelligence] project 2: multi-agent search

Keywords: AI

## Question 2 (5 points): Minimax

There are two kinds of nodes in a minimax game tree: MAX nodes and MIN nodes. Minimax is a pessimistic algorithm: it assumes the opponent plays optimally against us, so we choose the action that maximizes our worst-case (minimum) payoff. Here Pacman is the MAX agent and the ghosts are the MIN agents.
In Question 2 we solve the Pacman game with the minimax algorithm. First, initialize maxVal to negative infinity, then traverse all legal actions from Pacman's current position. The value of each successor is a MIN-node value; if it exceeds maxVal, update maxVal and record the action. After the traversal, return the best action.
This requires two helper functions: _getMin computes the value of a MIN node, and _getMax computes the value of a MAX node. Both work the same way, recursively evaluating the next agent's actions; the recursion terminates when the depth limit is reached or there are no legal moves.

```
        "*** YOUR CODE HERE ***"
        # Initialization
        maxVal = -float('inf')
        bestAction = None
        # From Pacman's current position, traverse all legal next moves
        for action in gameState.getLegalActions(0):
            value = self._getMin(gameState.generateSuccessor(0, action))
            if value is not None and value > maxVal:
                maxVal = value
                bestAction = action
        return bestAction

    def _getMax(self, gameState, depth=0, agentIndex=0):
        # Obtain all of Pacman's legal next moves
        legalActions = gameState.getLegalActions(agentIndex)
        # Termination conditions: depth limit reached or no legal moves
        if depth == self.depth or len(legalActions) == 0:
            return self.evaluationFunction(gameState)
        maxVal = -float('inf')
        for action in legalActions:
            value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, 1)
            if value is not None and value > maxVal:
                maxVal = value
        return maxVal

    def _getMin(self, gameState, depth=0, agentIndex=1):
        # Obtain the current ghost's legal next moves
        legalActions = gameState.getLegalActions(agentIndex)
        # Termination conditions
        if depth == self.depth or len(legalActions) == 0:
            return self.evaluationFunction(gameState)
        minVal = float('inf')
        # Traverse all legal moves
        for action in legalActions:
            # If this is the last ghost, the next turn is Pacman's,
            # so call the MAX function with depth + 1
            if agentIndex == gameState.getNumAgents() - 1:
                value = self._getMax(gameState.generateSuccessor(agentIndex, action), depth + 1, 0)
            else:
                value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, agentIndex + 1)
            if value is not None and value < minVal:
                minVal = value
        return minVal
```

## Question 3 (5 points): Alpha-Beta Pruning

Alpha-beta pruning is an optimization of the minimax algorithm. Plain minimax is exhaustive and must traverse every node; alpha-beta improves efficiency by pruning branches that cannot affect the result. In this algorithm, α is the best (largest) value the MAX player can guarantee so far along the current path, and β is the best (smallest) value the MIN player can guarantee. As the search proceeds, α and β gradually close in on each other; if at some node α > β, that node cannot produce the optimal solution, so its remaining children are pruned without being expanded.
The code is similar to the minimax code; the difference is that each value is additionally compared against α and β.

```
        "*** YOUR CODE HERE ***"
        # Expand from the root node and return the best action found
        return self._getMax(gameState)[1]

    def _getMax(self, gameState, depth=0, agentIndex=0, alpha=-float('inf'),
                beta=float('inf')):
        # Termination conditions
        legalActions = gameState.getLegalActions(agentIndex)
        if depth == self.depth or len(legalActions) == 0:
            return self.evaluationFunction(gameState), None
        # Traverse Pacman's possible next moves
        maxVal = None
        bestAction = None
        for action in legalActions:
            # The successors are MIN nodes, starting with ghost index 1
            value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, 1, alpha, beta)[0]
            if value is not None and (maxVal is None or value > maxVal):
                maxVal = value
                bestAction = action
            # Alpha-beta pruning: if v > beta, return v immediately
            if value is not None and value > beta:
                return value, action
            # Alpha-beta pruning: update alpha here
            if value is not None and value > alpha:
                alpha = value
        return maxVal, bestAction

    def _getMin(self, gameState, depth=0, agentIndex=1, alpha=-float('inf'),
                beta=float('inf')):
        # Termination conditions
        legalActions = gameState.getLegalActions(agentIndex)
        if depth == self.depth or len(legalActions) == 0:
            return self.evaluationFunction(gameState), None
        # Traverse the current ghost's possible next moves
        minVal = None
        bestAction = None
        for action in legalActions:
            if agentIndex >= gameState.getNumAgents() - 1:
                # Last ghost: the next turn is Pacman's. Unlike plain minimax,
                # the alpha and beta values are passed down the recursion
                value = self._getMax(gameState.generateSuccessor(agentIndex, action), depth + 1, 0, alpha, beta)[0]
            else:
                # Not the last ghost: continue with the next ghost, i.e. agentIndex + 1
                value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, agentIndex + 1, alpha, beta)[0]
            if value is not None and (minVal is None or value < minVal):
                minVal = value
                bestAction = action
            # Alpha-beta pruning: if v < alpha, return v immediately
            if value is not None and value < alpha:
                return value, action
            # Alpha-beta pruning: update beta here
            if value is not None and value < beta:
                beta = value
        return minVal, bestAction
```
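The effect of pruning can be demonstrated on the same toy tree used for minimax. The sketch below (our own illustration, not the project API) counts leaf evaluations: with the same strict cutoffs as the code above, it finds the same root value while visiting fewer leaves.

```python
# Toy alpha-beta sketch on a hand-built tree, counting leaf evaluations
# to show the savings from pruning. Uses the same strict cutoffs
# (v > beta at MAX, v < alpha at MIN) as the project code above.

def alphabeta(node, is_max, alpha=-float('inf'), beta=float('inf'), stats=None):
    if isinstance(node, (int, float)):
        if stats is not None:
            stats['leaves'] += 1  # count each leaf we actually evaluate
        return node
    if is_max:
        value = -float('inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, stats))
            if value > beta:       # MIN above will never allow this: prune
                return value
            alpha = max(alpha, value)
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta, stats))
            if value < alpha:      # MAX above will never allow this: prune
                return value
            beta = min(beta, value)
        return value

stats = {'leaves': 0}
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True, stats=stats))  # same answer as minimax: 3
print(stats['leaves'])                     # 7 of the 9 leaves evaluated
```

The second MIN child is abandoned after its first leaf (2 < α = 3), which is exactly the "will not produce the optimal solution" case described above.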

## Question 4 (5 points): Expectimax

In alpha-beta pruning we cut unnecessary search branches: the algorithm maintains a window (α, β), and during the downward search, as soon as a value exceeds the window bound we return it immediately without needing its exact value, cutting the rest of that subtree. How much is cut depends on node ordering: if the children of MAX nodes happened to be arranged in descending order of value and the children of MIN nodes in ascending order, the search would expand the fewest possible nodes, but this obviously cannot be known in advance. Both minimax and alpha-beta also assume the ghosts play optimally against us. In Expectimax we drop that assumption: we still search down to the depth limit, but MIN nodes become chance nodes that take the expected value of their children (treating each ghost action as equally likely) instead of the minimum, so promising states are not missed because of a worst-case assumption.

```
    def _getExpectation(self, gameState, depth=0, agentIndex=0):
        legalActions = gameState.getLegalActions(agentIndex)
        # If the search depth reaches the limit or there is no next move,
        # return the evaluation function value
        if depth == self.depth or len(legalActions) == 0:
            return self.evaluationFunction(gameState)
        # Initialize the total utility
        totalUtil = 0
        numActions = len(legalActions)
        # Go through all of the current ghost's possible next moves
        for action in legalActions:
            # As before, if this is the last ghost, the next step is Pacman's
            # MAX value, which is added to the total utility
            if agentIndex >= gameState.getNumAgents() - 1:
                totalUtil += self._getMax(gameState.generateSuccessor(agentIndex, action), depth + 1, 0)
            # Otherwise continue with the next ghost, adding its expectation
            # value to the total utility
            else:
                totalUtil += self._getExpectation(gameState.generateSuccessor(agentIndex, action), depth,
                                                  agentIndex + 1)
        # Finally, average the utility over all possible next moves and return
        return totalUtil / numActions
```
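On the same toy tree as before, replacing the MIN rule with an average changes the root decision. The sketch below (our own illustration, not the project API) shows the chance-node computation in isolation:

```python
# Toy expectimax sketch: chance nodes average over their children,
# modeling an opponent that picks each action uniformly at random.

def expectimax(node, is_max):
    if isinstance(node, (int, float)):
        return node
    values = [expectimax(child, not is_max) for child in node]
    if is_max:
        return max(values)
    # Chance node: expected value, each child equally likely
    return sum(values) / len(values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
# Chance values: 23/3, 4, 7 -> MAX picks 23/3, not the minimax answer 3
print(expectimax(tree, True))
```

Against a randomly-moving ghost the first branch is worth 23/3 ≈ 7.67 on average, even though its worst case is only 3; this is exactly why expectimax can prefer states that minimax would discard.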

## Question 5 (6 points): Evaluation Function

In this problem, we need to design a better evaluation function so that Pacman scores higher and plays more efficiently.
A good evaluation function should take into account both the positions of the food and the states of the ghosts, so that Pacman eats food quickly while avoiding collisions with ghosts. Manhattan distance is used for both.

```
    # Distance assessment of the nearest food
    distancesToFoodList = [util.manhattanDistance(newPos, foodPos) for foodPos in newFood.asList()]
    if len(distancesToFoodList) > 0:
        score += WEIGHT_FOOD / min(distancesToFoodList)
    else:
        score += WEIGHT_FOOD
```
```
    # Ghost distance assessment
    for ghost in newGhostStates:
        distance = manhattanDistance(newPos, ghost.getPosition())
        if distance > 0:
            if ghost.scaredTimer > 0:  # Scared ghost: being close adds points
                score += WEIGHT_SCARED_GHOST / distance
            else:  # Active ghost: being close subtracts points
                score += WEIGHT_GHOST / distance
        else:
            return -INF  # Pacman dies in this state
```
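To see how these terms combine, here is a self-contained sketch of the whole weighted evaluation. The weight constants and the helper names (`WEIGHT_FOOD`, `evaluate`, `manhattan_distance`) are our own assumptions for illustration, not values fixed by the project; only the reciprocal-distance structure mirrors the snippets above.

```python
# Self-contained sketch of the weighted evaluation. The weight values
# below are assumed for illustration, not prescribed by the project.
WEIGHT_FOOD = 10.0          # reward for being near food
WEIGHT_GHOST = -10.0        # penalty for being near an active ghost
WEIGHT_SCARED_GHOST = 100.0 # reward for being near a scared ghost
INF = float('inf')

def manhattan_distance(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def evaluate(pos, base_score, food_positions, ghosts):
    # ghosts: list of (position, scared_timer) pairs
    score = base_score
    # Nearest-food term: closer food means a larger 1/distance bonus
    food_dists = [manhattan_distance(pos, f) for f in food_positions]
    score += WEIGHT_FOOD / min(food_dists) if food_dists else WEIGHT_FOOD
    # Ghost terms: scared ghosts attract, active ghosts repel
    for gpos, scared in ghosts:
        d = manhattan_distance(pos, gpos)
        if d == 0:
            return -INF  # touching a ghost: worst possible state
        score += (WEIGHT_SCARED_GHOST if scared > 0 else WEIGHT_GHOST) / d
    return score

# Food one step away helps a lot; a ghost eight steps away hurts a little:
# 10/1 - 10/8 = 8.75
print(evaluate((1, 1), 0, [(1, 2)], [((5, 5), 0)]))
```

Because every term decays as 1/distance, nearby objects dominate the score, which is why the nearest food and the nearest active ghost largely determine Pacman's next move.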

Posted by Bigun on Thu, 16 Sep 2021 09:55:31 -0700