iFLYTEK Face Keypoint Detection Competition - Check-in 1

Keywords: neural networks, PyTorch, deep learning

Recently I have been taking part in the Coggle 30 Days of ML check-in, and I signed up for its CV competition. The competition page is: https://challenge.xfyun.cn/topic/info?type=key-points-of-human-face&ch=dw-sq-1
Overview of the competition task:
- Face recognition is a biometric technology based on facial feature information. Finance and security are its two most widespread application areas.
- Face keypoint detection is a core step in face recognition. It has to locate the coordinates of specified positions on the face, such as the eyebrows, eyes, nose, mouth and face contour.
- Task: given a face image, locate four face keypoints. The task can be treated as a keypoint detection problem.
- Training set: 5000 face images with the keypoint annotations provided.
- Test set: about 2000 face images, for which participants must predict the keypoints.
First, read the data, fill in the missing keypoint values, and display an image from the array. Missing values are filled with the DataFrame's forward fill, where each gap takes the value from the previous row. The image is displayed with matplotlib. The code is as follows:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

train_df = pd.read_csv('./Face key point detection challenge_data set/train.csv')
train_img = np.load('./Face key point detection challenge_data set/train.npy/train.npy')
test_img = np.load('./Face key point detection challenge_data set/test.npy/test.npy')

print(train_df.head())
print(train_img.shape)

### Fill missing values with the DataFrame's forward fill
print(train_df.isnull().sum())
train_df.ffill(inplace=True)  # equivalent to fillna(method='ffill'), which newer pandas deprecates
print(train_df.isnull().sum())

### Display one image (the images are stacked along the last axis)
plt.imshow(train_img[:, :, 1], cmap='gray')
plt.show()
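
Forward fill copies the last valid value downward, so a gap in the very first row has nothing to copy from and stays missing. The following minimal sketch illustrates this on a made-up two-column frame (the column names and values are hypothetical, not taken from the competition data) and shows one possible remedy, a backward fill afterwards:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for keypoint columns
df = pd.DataFrame({
    "x": [np.nan, 30.0, np.nan, 32.0],
    "y": [40.0, np.nan, np.nan, 43.0],
})

filled = df.ffill()           # each gap takes the value from the row above
print(filled.isnull().sum())  # "x" still has one gap: row 0 has no predecessor

filled = filled.bfill()       # fill any leading gaps from the row below
print(filled)
```

Whether a backward fill is appropriate for the real annotations depends on how many leading values are missing, so it is worth checking the `isnull().sum()` output before relying on it.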

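The model combines convolutional layers with fully connected layers and is implemented in PyTorch. It defines four convolutional layers and four fully connected layers. Note that forward() applies only the first three convolutional layers: conv4 is defined but never called, and the input size of fc1 (7744 = 64 x 11 x 11) is what three stride-2 convolutions would produce from a 96 x 96 single-channel image. The code is as follows: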

import torch
import torch.nn as nn
import torch.nn.functional as F

class MLP(nn.Module):

    def __init__(self, output_dims):
        super(MLP, self).__init__()
        #self.input_dims = input_dims
        self.output_dims = output_dims
        self.conv1 = nn.Conv2d(
                        in_channels=1, out_channels=16, 
                        kernel_size=3,stride=2)
        self.conv2 = nn.Conv2d(
                        in_channels=16, out_channels=32, 
                        kernel_size=3, stride=2)
        self.conv3 = nn.Conv2d(
                        in_channels=32, out_channels=64,
                        kernel_size=3, stride=2)
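        # conv4 is defined here but is not called in forward()
        self.conv4 = nn.Conv2d(
                        in_channels=64, out_channels=64,
                        kernel_size=3, stride=2)
        # 7744 = 64 channels * 11 * 11, the flattened output of conv3
        self.fc1 = nn.Linear(7744, 1600)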
        self.fc2 = nn.Linear(1600, 800)
        self.fc3 = nn.Linear(800, 100)
        self.fc4 = nn.Linear(100, self.output_dims)

    def forward(self, X):
        X = F.relu(self.conv1(X))
        X = F.relu(self.conv2(X))
        X = F.relu(self.conv3(X))
        X = X.reshape(X.shape[0],-1)
        X = F.relu(self.fc1(X))
        X = F.relu(self.fc2(X))
        X = F.relu(self.fc3(X))
        out = self.fc4(X)
        return out
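
The hard-coded 7744 in fc1 can be checked with the standard output-size formula for an unpadded convolution, floor((size - kernel) / stride) + 1. The sketch below is plain Python and assumes 96 x 96 input images, which the post does not state explicitly; `conv_out` is a helper introduced here for illustration only:

```python
def conv_out(size, kernel=3, stride=2, padding=0):
    """Output length of one spatial dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

side = 96  # assumed input height/width; not stated in the post
sizes = []
for _ in range(3):  # conv1, conv2, conv3 as applied in forward()
    side = conv_out(side)
    sizes.append(side)

flat = 64 * side * side  # 64 output channels from conv3
print(sizes, flat)       # [47, 23, 11] 7744

# If conv4 were applied as well, the flattened size would shrink further
side4 = conv_out(side)
print(64 * side4 * side4)  # 1600
```

Under that assumption the numbers agree with the model as written: three convolutions give 7744 features, so adding conv4 to forward() would also require changing fc1's input size.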

Next, lightly preprocess the data and feed it to the model. Preprocessing consists of splitting the labeled data into a training set and a held-out validation set, and converting the arrays to tensors:

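import numpy as np
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset, DataLoader, TensorDataset

# Move the sample axis first: (H, W, N) -> (N, H, W); hold out 10% for validation
Xtrain, Xtest, ytrain, ytest = train_test_split(
                    train_img.transpose(2, 0, 1),
                    train_df.values.astype(np.float32),
                    test_size=0.1)

def in_out_creat(inputData, outputData):
    # Add a channel axis: (N, H, W) -> (N, 1, H, W), as nn.Conv2d expects
    inputData = torch.FloatTensor(inputData).unsqueeze(1)
    outputData = torch.FloatTensor(outputData)
    # `para` is the author's parameter object; its definition is not shown in the post
    return DataLoader(TensorDataset(inputData, outputData),
            batch_size=para.batch_size, shuffle=True)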
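
The two shape changes in this step can be checked on a dummy array. The sketch below uses NumPy only and assumes 96 x 96 images stacked along the last axis (the sizes and the count of 5 are illustrative); `np.expand_dims` plays the role that `unsqueeze(1)` plays for tensors:

```python
import numpy as np

# Dummy stack of 5 images, shape (H, W, N); sizes are assumptions
imgs = np.zeros((96, 96, 5), dtype=np.float32)

samples_first = imgs.transpose(2, 0, 1)               # (N, H, W)
with_channel = np.expand_dims(samples_first, axis=1)  # (N, 1, H, W)

print(samples_first.shape)  # (5, 96, 96)
print(with_channel.shape)   # (5, 1, 96, 96)
```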

Finally, train the model. The training parameters are stored in a class, and training is wrapped in a function. A cosine annealing learning-rate schedule is used:

from torch.optim.lr_scheduler import CosineAnnealingLR

def train_model(model, trainLoader, vaildLoader, params):
    train_loss, vaild_loss = [], []
    loss_func = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=params.lr)
    scheduler = CosineAnnealingLR(optimizer, T_max=10)
    for i in range(params.epochs):
        # Training pass
        model.train()
        for ibatch, (X, y) in enumerate(trainLoader):
            optimizer.zero_grad()
            out = model(X)
            loss = loss_func(out, y)  # argument order: (prediction, target)
            loss.backward()
            optimizer.step()
            train_loss.append(loss.item())

        # Validation pass: gradients are not needed
        model.eval()
        with torch.no_grad():
            for iv, (X, y) in enumerate(vaildLoader):
                out = model(X)
                v_loss = loss_func(out, y)
                vaild_loss.append(v_loss.item())

        # Advance the cosine schedule once per epoch
        scheduler.step()
        if i % 2 == 0:
            # Losses of the last batch of this epoch
            print("train loss: {}, vaild loss: {}".format(
                    loss.item(), v_loss.item()))
    return train_loss, vaild_loss

trainLoader = in_out_creat(Xtrain, ytrain)
vaildLoader = in_out_creat(Xtest, ytest)
model = MLP(output_dims=8)

train_model(model, trainLoader, vaildLoader, para)
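
For reference, cosine annealing follows a closed-form curve: with base rate η_max, floor η_min and half-period T_max, the rate at epoch t is η_min + (η_max - η_min)(1 + cos(πt / T_max)) / 2. The sketch below evaluates that formula in plain Python for T_max = 10 as in the code above. The base rate of 1e-3 is an assumed value, since the post does not show `params.lr`, and η_min = 0 matches the scheduler's default; `cosine_lr` is a helper written for this illustration, not part of PyTorch.

```python
import math

def cosine_lr(epoch, base_lr=1e-3, t_max=10, eta_min=0.0):
    """Closed-form cosine annealing value at a given epoch."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2

# Sample the curve every 5 epochs over one full period (2 * t_max)
for epoch in range(0, 21, 5):
    print(epoch, round(cosine_lr(epoch), 6))
```

The curve reaches its floor at epoch T_max and climbs back afterwards, so if the number of epochs exceeds 10 the learning rate would rise again; setting T_max to the total number of epochs is a common alternative.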

Training output (losses are printed every two epochs and decrease overall):

train loss: 113.74918365478516, vaild loss: 121.13554382324219
train loss: 123.68692779541016, vaild loss: 129.83938598632812
train loss: 68.00350189208984, vaild loss: 43.04329299926758
train loss: 46.17871856689453, vaild loss: 75.8116226196289
train loss: 25.16992950439453, vaild loss: 23.86362648010254
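
Because the loss is a mean squared error over the coordinate values, its square root gives a rough typical error per coordinate in the units of the labels (presumably pixels, although the post does not say). A small sketch using the validation losses printed above:

```python
import math

# Validation losses copied from the log above (one entry per printed line)
valid_mse = [121.13554382324219, 129.83938598632812, 43.04329299926758,
             75.8116226196289, 23.86362648010254]

# Root of the mean squared error, rounded for readability
valid_rmse = [round(math.sqrt(mse), 2) for mse in valid_mse]
print(valid_rmse)
```

Each printed value comes from a single batch, so these figures are noisy; averaging the loss over the whole validation set would give a steadier estimate.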

Posted by e_00_3 on Wed, 29 Sep 2021 17:53:47 -0700
