In the age of AI, play "airplane battle" with your face: PaddleHub turns you into a face-control gamer in seconds

Keywords: Mobile, Python, pip

Still playing airplane games on your phone in the AI era? PaddleHub, from the PaddlePaddle deep learning platform, takes you through a very different way to play.

Ever since the birth of the very first video game, playing has relied on controllers and buttons. Whether it's PC games or console games on Xbox, PlayStation and the like, the controller and gamepad are indispensable.

Then in 2009 Microsoft unveiled the first-generation Kinect, which used human body detection as the game controller. It completely subverted the single, fixed way of operating games, set a precedent for hands-free play, and showed what human-computer interaction could really mean. Unfortunately, Microsoft abandoned the Kinect entirely in 2018.

In mainstream gaming culture, human-computer interaction is still inseparable from the controller. Even in the mobile era of phones and tablets, most popular games still rely on virtual keys and virtual joysticks.

In fact, a large part of the driving force behind human progress comes from our natural "laziness", so playing games with a simpler, freer form of interaction sounded very appealing to me. Although brain-computer interfaces and thought control are still a long way off, with the development of deep learning I believe interaction modes will change dramatically in the near future.

With that in mind, I set out to build a project that plays the airplane game with your face!

Demo

 

An ordinary computer camera captures the player's movement (mainly the head here) and converts it into game controls:

Turn head left: the plane flies left

Turn head right: the plane flies right

Raise your head: the plane flies up

Lower your head: the plane flies down

Open your mouth: drop a bomb!

The controls are simple and fun. Rumor has it they can even cure a programmer's cervical spondylosis.

And implementing all of this is very simple: the deep learning models packaged in PaddleHub are enough to detect the head's pose, which is then wired into the game controls!

No advanced AI concepts are required; even a complete beginner can do it easily!


Implementation method

 

There are three steps to building the face-controlled airplane game:

  • Use the face landmark localization model in PaddleHub to implement head motion monitoring.
  • Use Pygame to implement the main program of the airplane game (Pygame is the simplest, most approachable framework, commonly used for a first taste of Python).
  • Add the head motion monitoring module to the game.

 

Next, I'll walk through the code implementation in detail.

01 Install PaddleHub

 

  1. Install PaddlePaddle.
  2. Install PaddleHub.

pip install paddlehub
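
If PaddlePaddle itself is not installed yet, the CPU build can likewise be installed via pip (a standard Python 3 environment is assumed):

pip install paddlepaddle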

This game uses the face landmark localization model in PaddleHub. Once PaddleHub is installed, the model can be called directly.

For a detailed introduction to the model, please refer to:

https://www.paddlepaddle.org.cn/hubdetail?name=face_landmark_localization&en_category=KeyPointDetection
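
To quickly verify that the model works before wiring it into the game, a minimal sketch like the following (assuming "face.jpg" is any local photo containing a face) loads the module and prints the number of detected keypoints:

import cv2
import paddlehub as hub

# Load the face landmark localization model (68 facial keypoints)
module = hub.Module(name="face_landmark_localization")

# Read any local photo containing a face ("face.jpg" is just a placeholder path)
img = cv2.imread("face.jpg")

# keypoint_detection accepts a list of BGR ndarrays and returns,
# for each image, the keypoints of every detected face
result = module.keypoint_detection(images=[img])
landmarks = result[0]['data'][0]
print(len(landmarks))  # expected: 68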

02 Implement the main program of the game

 

Here I reuse the airplane shooting game I built with Pygame when I first learned Python. As for assets, there are plenty of images, airplane sprites and background music available online, all very easy to find (it is an entry-level project, after all).

pip install pygame

For the full code and assets, please refer to AI Studio:

https://aistudio.baidu.com/aistudio/projectdetail/405645

The project folder contains the images, music and fonts. Pygame's various modules and functions are used to define the parameters of the game content, such as when enemy planes appear, their direction and speed of motion, collision detection and other event monitoring, which won't be covered in detail here.

Then comes the most important part, the main game file, which defines how the whole game starts, how the main loop runs, how it is controlled, and how it ends.
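
For readers new to Pygame, a minimal, self-contained sketch of such a main loop might look like the following (the SPAWN_ENEMY timer event is only illustrative and not taken from the actual project; the real game file is in the AI Studio project linked above):

import pygame

pygame.init()
screen = pygame.display.set_mode((480, 700))
clock = pygame.time.Clock()

# Illustrative custom event: spawn an enemy plane every 1.5 seconds
SPAWN_ENEMY = pygame.USEREVENT + 1
pygame.time.set_timer(SPAWN_ENEMY, 1500)

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == SPAWN_ENEMY:
            pass  # create a new enemy sprite here
    # ... update sprites, check collisions, handle player input ...
    screen.fill((0, 0, 0))
    pygame.display.flip()
    clock.tick(60)  # cap the frame rate at 60 FPS

pygame.quit()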

In the original program, I used the space bar and the WASD/arrow keys to control the aircraft. The corresponding code segments are as follows:

if bomb_num and event.key == K_SPACE:
    bomb_sound_use.play()
    bomb_num -= 1
key_pressed = pygame.key.get_pressed()
if key_pressed[K_w] or key_pressed[K_UP]:
    myplane.move_up() # Plane up
elif key_pressed[K_s] or key_pressed[K_DOWN]:
    myplane.move_down() # Plane down
elif key_pressed[K_a] or key_pressed[K_LEFT]:
    myplane.move_left() # The plane flies to the left
elif key_pressed[K_d] or key_pressed[K_RIGHT]:
    myplane.move_right() # The plane flies to the right

 

03 Add the PaddleHub head motion monitoring module to the game

 

1. Add the face detection and head posture recognition classes. First, face detection locates the face in the video frame.

The first version of the program used ultra_light_fast_generic_face_detector_1mb_640. Although its accuracy is higher, it consumed too many resources when running alongside the game and slowed everything down. Following expert advice, the second version switched down to ultra_light_fast_generic_face_detector_1mb_320; its accuracy is in fact sufficient, while the game's smoothness improved greatly!

 

import paddlehub as hub


class MyFaceDetector(object):
    """
    Custom face detector
    """

    def __init__(self):
        self.module = hub.Module(name="ultra_light_fast_generic_face_detector_1mb_320")
        self.alpha = 0.75
        self.start_flag = 1

    def face_detection(self, images, use_gpu=False, visualization=False):
        # To run on GPU, pass use_gpu=True and set the CUDA_VISIBLE_DEVICES environment variable before running the code
        result = self.module.face_detection(images=images, use_gpu=use_gpu, visualization=visualization)
        if not result[0]['data']:
            return result

        face = result[0]['data'][0]
        if self.start_flag == 1:

            self.left_s = result[0]['data'][0]['left']
            self.right_s = result[0]['data'][0]['right']
            self.top_s = result[0]['data'][0]['top']
            self.bottom_s = result[0]['data'][0]['bottom']

            self.start_flag = 0
        else:
            # Weighted-average the previous and current frames' detection boxes to stabilize the face bounding box
            self.left_s = self.alpha * self.left_s + (1 - self.alpha) * face['left']
            self.right_s = self.alpha * self.right_s + (1 - self.alpha) * face['right']
            self.top_s = self.alpha * self.top_s + (1 - self.alpha) * face['top']
            self.bottom_s = self.alpha * self.bottom_s + (1 - self.alpha) * face['bottom']

        result[0]['data'][0]['left'] = self.left_s
        result[0]['data'][0]['right'] = self.right_s
        result[0]['data'][0]['top'] = self.top_s
        result[0]['data'][0]['bottom'] = self.bottom_s

        return result

 

Next, head posture recognition determines the head's motion state.

The first version of the program computed Euler angles to obtain the head's motion state, but the calculation is quite complex and hard to follow without the mathematical background. The second version greatly simplifies it: only seven of the 68 facial keypoints returned by face_landmark_localization are needed to achieve the desired effect. The algorithm is more concise and clear, and it runs very smoothly in practice!

import cv2
import logging
import numpy as np
import paddlehub as hub

# Plain standard-library logger; the original project may configure logging differently
logger = logging.getLogger(__name__)


class HeadPostEstimation():
    """
    Head posture recognition
    """

    def __init__(self, face_detector=None):
        self.module = hub.Module(name="face_landmark_localization", face_detector_module=face_detector)


    def get_face_landmark(self, image):
        """
        //Predicting the coordinates of 68 key points of human face
        images(ndarray): Pixel data of single picture
        """
        try:
            # To run on GPU, pass use_gpu=True and set the CUDA_VISIBLE_DEVICES environment variable before running the code
            res = self.module.keypoint_detection(images=[image], use_gpu=True)
            return True, res[0]['data'][0]
        except Exception as e:
            logger.error("Get face landmark localization failed! Exception: %s " % e)
            return False, None

    def get_lips_distance(self, face_landmark):
        """
        //From face_ landmark_ Check the distance between upper and lower lips in localization test results
        """

        lips_points = np.array([
            face_landmark[52], face_landmark[58]
        ], dtype='float')

        head_points = np.array([
            face_landmark[25], face_landmark[8]
        ], dtype='float')

        lips_distance = np.sum(np.square(lips_points[0] - lips_points[1]))
        head_distance = np.sum(np.square(head_points[0] - head_points[1]))
        relative_distance = lips_distance / head_distance
        return relative_distance

    def get_nose_distance(self, face_landmark):
        """
        Get the nose position from the face_landmark_localization results in order to judge head motion
        """

        nose_point = np.array([
            face_landmark[31]
        ], dtype='float')

        cheek_points = np.array([
            face_landmark[3], face_landmark[15]
        ], dtype='float')

        left_distance = np.sum(np.square(nose_point[0] - cheek_points[0]))
        right_distance = np.sum(np.square(nose_point[0] - cheek_points[1]))
        nose_position_h = left_distance/(left_distance+right_distance)

        nose_position_v = nose_point[0][1] - cheek_points[0][1]  # Relative height of the nose to the cheek anchor point, used to judge head up / head down

        return nose_position_h, nose_position_v


    def classify_pose(self, video):
        """
        video A generator for generating pictures
        """

        for index, img in enumerate(video(), start=1):
            self.img_size = img.shape

            success, face_landmark = self.get_face_landmark(img)

            if not success:
                logger.info("Get face landmark localization failed! Please check your image!")
                continue

            # Calculate lip distance
            lips_distance = self.get_lips_distance(face_landmark)

            # Calculate left and right nose position
            nose_position_h, nose_position_v = self.get_nose_distance(face_landmark)

            # Convert the frame to RGB for display
            img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

            # Show the prediction window when running locally (the AI Studio project cannot display a video window)
            # cv2.imshow('Pose Estimation', img_rgb)

            return nose_position_h, nose_position_v, lips_distance

Start the camera for head monitoring during game initialization.

    # Use the head to control the aircraft
    face_detector = MyFaceDetector()
    # Turn on the camera
    capture = cv2.VideoCapture(0)

    def generate_image():
        while True:
            # Read one frame of video data from the camera
            ret, frame_rgb = capture.read()
            # Press q to exit
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
            if frame_rgb is None:
                break
            frame_bgr = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2BGR)
            yield frame_bgr
        capture.release()
        cv2.destroyAllWindows()

    head_post = HeadPostEstimation(face_detector)

In the original game's main loop, the keyboard control logic is replaced by the output of the head motion monitoring.

# Obtain head motion data and control the aircraft
nose_position_h, nose_position_v, lips_distance = head_post.classify_pose(video=generate_image)
# print(nose_position_h, nose_position_v, lips_distance)  # Uncomment to inspect the values when tuning the thresholds
if nose_position_h < 0.22:
    myplane.move_left()  # Swapped for the mirrored camera image in the demo video; for normal use, change this to myplane.move_right()
elif nose_position_h > 0.48:
    myplane.move_right()  # Swapped for the mirrored camera image in the demo video; for normal use, change this to myplane.move_left()
elif nose_position_v < -40:
    myplane.move_up()
elif nose_position_v > -25:
    myplane.move_down()

# Open your mouth to drop a bomb; a lips_distance below 0.045 counts as mouth closed
if lips_distance < 0.045:
    flag = 1
if bomb_num and lips_distance > 0.055 and flag == 1:
    flag = 0
    bomb_sound_use.play()
    bomb_num -= 1

 

Everything is ready. Run it with one click and witness the miracle!

 

 

After downloading all the code and assets to your local machine, just run main.py with one click! (Your computer needs a camera!)

You can also add these code snippets to your own game program; I believe your creativity can produce even more extraordinary effects!

During testing, the gap between the first and second versions was very obvious. You can see the comparison here:

https://www.bilibili.com/video/BV1uZ4y147ur

Open discussion

Several problems came up during implementation that need further research and discussion:

  • Since the thresholds were tuned on my own face, I don't know whether the control accuracy suffers for other people's faces.
  • I originally wanted to make another version that overlays the face on the airplane, but I haven't figured out how to do that within the Pygame framework.
  • The camera view is a mirror image of the player's view, so for the demo video I swapped the left/right controls; for normal play they still need to be adjusted back (see the sketch below for one simple approach).
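
One simple option that is not used in the project above but may be worth trying: mirror each camera frame horizontally before detection, so the view matches what the player sees in a mirror and the left/right controls stay intuitive. A minimal sketch of the idea, reusing the structure of generate_image from earlier:

import cv2

capture = cv2.VideoCapture(0)

def generate_image():
    while True:
        ret, frame = capture.read()
        if not ret or frame is None:
            break
        # Flip around the vertical axis to get a mirror ("selfie") view
        frame = cv2.flip(frame, 1)
        yield frame
    capture.release()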

Future prospects

After the second version, I wanted to make a third one that uses PaddleHub's human skeleton (pose) detection module to control the aircraft with whole-body motion, but that module has no way to consume a real-time video stream directly, so the idea was shelved.

But the good news is, I hear that PaddleHub will further enrich each module's interfaces in the near future, which should make this possible. Anyone willing to give it a try together?
