[Artificial Intelligence] Project 2: Multi-Agent Search

Keywords: AI

Question 2 (5 points): Minimax

First, note that the game tree built by the minimax algorithm contains two kinds of nodes: MAX nodes and MIN nodes. Minimax is a pessimistic algorithm: it assumes the opponents play optimally, so Pacman (the MAX player) chooses the action that maximizes the minimum payoff the ghosts (the MIN players) can force.
In Question 2 we solve the Pacman game with the minimax algorithm. Starting from Pacman's current position, initialize maxVal to negative infinity and traverse all legal actions; for each successor state compute the value of the corresponding MIN node and compare it with maxVal. If value > maxVal, update maxVal and record the action; after the traversal, return the best action found.
From this analysis, two helper functions are needed: _getMin, which computes the value of a MIN node, and _getMax, which computes the value of a MAX node. Both work the same way, recursively evaluating the next agent's actions; the recursion terminates when the depth limit is reached or there are no legal actions left.

`  "*** YOUR CODE HERE ***"
    #initialization
    maxVal = -float('inf')
    bestAction = None
    # From the pac man's original position, traverse all feasible next steps
    for action in gameState.getLegalActions(0):
        value = self._getMin(gameState.generateSuccessor(0, action))
        if value is not None and value > maxVal:
            maxVal = value
            bestAction = action
    # Finally, return to the best choice
    return bestAction
def _getMax(self, gameState, depth=0, agentIndex=0):
    # Get all legal actions for Pacman (agentIndex 0)
    legalActions = gameState.getLegalActions(agentIndex)
    # Termination conditions
    if depth == self.depth or len(legalActions) == 0:
        return self.evaluationFunction(gameState)
    maxVal = -float('inf')
    for action in legalActions:
        # Start with the first ghost
        value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, 1)
        if value is not None and value > maxVal:
            maxVal = value
    return maxVal

def _getMin(self, gameState, depth=0, agentIndex=1):
    # Get all legal actions for the current ghost
    legalActions = gameState.getLegalActions(agentIndex)
    # Termination conditions
    if depth == self.depth or len(legalActions) == 0:
        return self.evaluationFunction(gameState)
    minVal = float('inf')
    # Traverse all legal actions of the current ghost
    for action in legalActions:
        # If this is the last ghost, the next agent to move is Pacman, so call the MAX function with depth + 1
        if agentIndex == gameState.getNumAgents() - 1:
            value = self._getMax(gameState.generateSuccessor(agentIndex, action), depth + 1, 0)
        else:
            value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, agentIndex + 1)
        if value is not None and value < minVal:
            minVal = value
    return minVal

Question 3 (5 points): Alpha-Beta Pruning

The alpha-beta algorithm is an optimization of the minimax algorithm. Minimax is exhaustive and must traverse every node, whereas alpha-beta improves efficiency by pruning branches that cannot affect the result. In this algorithm, α is the best (largest) value the MAX player can guarantee so far, i.e. a lower bound on the achievable value, and β is the best (smallest) value the MIN player can guarantee so far, i.e. an upper bound. As the search proceeds, α and β close in on each other; if at some node α > β, that node cannot produce the optimal solution, so we stop expanding it, which is exactly the pruning step.
The code for the alpha-beta algorithm is similar to the minimax code; the difference is that after evaluating each child we compare the current value against α and β.

"*** YOUR CODE HERE ***"
    # Expand from the root node to find the MAX value
    return self._getMax(gameState)[1]
    
def _getMax(self, gameState, depth=0, agentIndex=0, alpha=-float('inf'),
            beta=float('inf')):
    # Termination conditions
    legalActions = gameState.getLegalActions(agentIndex)
    if depth == self.depth or len(legalActions) == 0:
        return self.evaluationFunction(gameState), None
    # Traverse Pacman's possible next actions
    maxVal = None
    bestAction = None
    for action in legalActions:
        # Evaluate the successor, starting with the first ghost (agentIndex 1)
        value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, 1, alpha, beta)[0]
        if value is not None and (maxVal is None or value > maxVal):
            maxVal = value
            bestAction = action
        # Alpha-beta pruning: if value > beta, the MIN ancestor will never allow this branch,
        # so return immediately without expanding further
        if value is not None and value > beta:
            return value, action
        # Otherwise, update alpha with the best value found so far
        if value is not None and value > alpha:
            alpha = value
    return maxVal, bestAction

def _getMin(self, gameState, depth=0, agentIndex=1, alpha=-float('inf'),
            beta=float('inf')):
    # Termination conditions
    legalActions = gameState.getLegalActions(agentIndex)
    if depth == self.depth or len(legalActions) == 0:
        return self.evaluationFunction(gameState), None
    # Traverse the next possible step of the current ghost
    minVal = None
    bestAction = None
    for action in legalActions:
        if agentIndex >= gameState.getNumAgents() - 1:
            # Last ghost: the next agent is Pacman, so call the MAX function with depth + 1;
            # unlike plain minimax, the current alpha and beta values are passed down
            value = self._getMax(gameState.generateSuccessor(agentIndex, action), depth + 1, 0, alpha, beta)[0]
        else:
            # Not the last ghost: continue with the next ghost, i.e. agentIndex + 1
            value = self._getMin(gameState.generateSuccessor(agentIndex, action), depth, agentIndex + 1, alpha, beta)[0]
        if value is not None and (minVal is None or value < minVal):
            minVal = value
            bestAction = action
        # Alpha-beta pruning: if value < alpha, the MAX ancestor will never choose this branch,
        # so return immediately without expanding further
        if value is not None and value < alpha:
            return value, action
        # Otherwise, update beta with the best (smallest) value found so far
        if value is not None and value < beta:
            beta = value
    return minVal, bestAction

Question 4 (5 points): Expectimax

In the α-β pruning algorithm we cut unnecessary search branches: during the downward search, as soon as a child's value exceeds the bound we return immediately, since its exact value no longer matters. How much gets pruned depends on move ordering; in the ideal case the children of MAX nodes are visited in descending order and the children of MIN nodes in ascending order, which searches the fewest nodes, but we cannot know this ordering in advance. Moreover, minimax and α-β both assume the ghosts play optimally, which is overly pessimistic for ghosts that move randomly. In expectimax we therefore keep the same depth limit but, instead of taking the minimum at ghost nodes, we treat them as chance nodes and use the expected (average) value of their children, assuming the ghosts choose among their legal actions uniformly at random, so that better outcomes are not missed.

    def _getExpectation(self, gameState, depth=0, agentIndex=0):
        legalActions = gameState.getLegalActions(agentIndex)
        # If the search depth exceeds the limit or there is no next step, the evaluation function value is returned
        if depth == self.depth or len(legalActions) == 0:
            return self.evaluationFunction(gameState)
        # Initialize the total utility
        totalUtil = 0
        numActions = len(legalActions)
        # Traverse all of the current ghost's legal actions
        for action in legalActions:
            # If this is the last ghost, the next agent is Pacman, so add its MAX value to the total utility
            if agentIndex >= gameState.getNumAgents() - 1:
                totalUtil += self._getMax(gameState.generateSuccessor(agentIndex, action), depth + 1, 0)
            # Otherwise, recurse on the next ghost (agentIndex + 1) and add its expectation to the total utility
            else:
                totalUtil += self._getExpectation(gameState.generateSuccessor(agentIndex, action), depth,
                                                  agentIndex + 1)
        # Finally, return the average utility over all legal actions (the ghosts are assumed to move uniformly at random)
        return totalUtil / float(numActions)
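
The snippet above only shows the expectation side. For completeness, here is a minimal sketch of the matching MAX side and a top-level getAction, assuming the same class layout and method names as in Question 2; it is an illustration of the structure, not the graded solution itself.

    def getAction(self, gameState):
        # At the root, Pacman picks the action whose expectimax value is largest
        bestAction, bestValue = None, -float('inf')
        for action in gameState.getLegalActions(0):
            value = self._getExpectation(gameState.generateSuccessor(0, action), 0, 1)
            if value > bestValue:
                bestValue, bestAction = value, action
        return bestAction

    def _getMax(self, gameState, depth=0, agentIndex=0):
        legalActions = gameState.getLegalActions(agentIndex)
        # Same termination conditions as before: depth limit reached or no legal actions
        if depth == self.depth or len(legalActions) == 0:
            return self.evaluationFunction(gameState)
        # Pacman still maximizes; only the ghost layers are replaced by expectations
        return max(self._getExpectation(gameState.generateSuccessor(agentIndex, action), depth, 1)
                   for action in legalActions)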

Question 5 (6 points): Evaluation Function

In this question we need to design a better evaluation function so that Pacman scores higher and plays more efficiently.
A good evaluation function should take into account both the positions of the food and the state of the ghosts, so that Pacman eats food quickly while avoiding collisions with ghosts. Manhattan distance is used here as the distance measure.

    # Distance assessment of the nearest food
    distancesToFoodList = [util.manhattanDistance(newPos, foodPos) for foodPos in newFood.asList()]
    if len(distancesToFoodList) > 0:
        score += WEIGHT_FOOD / min(distancesToFoodList)
    else:
        score += WEIGHT_FOOD
    # Ghost distance assessment
    for ghost in newGhostStates:
        distance = util.manhattanDistance(newPos, ghost.getPosition())
        if distance > 0:
            if ghost.scaredTimer > 0:  # Scared ghost: being close is good, so add points
                score += WEIGHT_SCARED_GHOST / distance
            else:  # Active ghost: being close is dangerous, so decrease the score (WEIGHT_GHOST is negative)
                score += WEIGHT_GHOST / distance
        else:
            return -INF  # A ghost is on the same square, so Pacman is dead here
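
The fragment above omits the function's setup and the weight constants. Below is a minimal sketch of how the pieces could fit together in betterEvaluationFunction, assuming it lives in multiAgents.py where util is already imported; the concrete weight values and the names WEIGHT_FOOD, WEIGHT_GHOST, WEIGHT_SCARED_GHOST and INF are illustrative assumptions, not values prescribed by the assignment.

def betterEvaluationFunction(currentGameState):
    # Illustrative weights; the exact values are an assumption and can be tuned
    INF = float('inf')
    WEIGHT_FOOD = 10.0           # reward for being close to food
    WEIGHT_GHOST = -10.0         # penalty for being close to an active ghost
    WEIGHT_SCARED_GHOST = 100.0  # reward for being close to a scared (edible) ghost

    newPos = currentGameState.getPacmanPosition()
    newFood = currentGameState.getFood()
    newGhostStates = currentGameState.getGhostStates()

    # Start from the game's own score so that eating food and winning still matter
    score = currentGameState.getScore()

    # Distance assessment of the nearest food
    distancesToFoodList = [util.manhattanDistance(newPos, foodPos) for foodPos in newFood.asList()]
    if len(distancesToFoodList) > 0:
        score += WEIGHT_FOOD / min(distancesToFoodList)
    else:
        score += WEIGHT_FOOD

    # Ghost distance assessment
    for ghost in newGhostStates:
        distance = util.manhattanDistance(newPos, ghost.getPosition())
        if distance > 0:
            if ghost.scaredTimer > 0:
                score += WEIGHT_SCARED_GHOST / distance
            else:
                score += WEIGHT_GHOST / distance
        else:
            return -INF  # a ghost occupies Pacman's square

    return score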
