Getting started with PyTorch (Professor Li Hongyi's Spring 2021 homework)

Keywords: AI, neural networks, PyTorch, deep learning

The code and dataset in this article come from Professor Li Hongyi's HW1.

Data Set Address

Reference code address

This article introduces PyTorch through Professor Li's first assignment and the reference code provided with it. It is an introductory text and does not go into specific network design.

Training a model on data comes down to two main steps: reading the data and training the model. We follow these two steps to get started with PyTorch.

Reading the data

1. Use Dataset and DataLoader to read the data

This is the usage I saw in the reference code, and it is probably the more recommended one. (The reading code below has been simplified by removing some special data processing.)

import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class COVID19Dataset(Dataset):
    ''' Dataset for loading and preprocessing the COVID19 dataset '''
    def __init__(self, path):
        # Read the required data from path (using pandas)
        df = pd.read_csv(path)
        # Convert the data to the tensor format PyTorch requires
        data = torch.tensor(df.values, dtype=torch.float)
        # The first column is the ID, which carries no information, so drop it
        data = data[:, 1:]
        # You can take all feature columns here, or keep only the useful ones
        feats = list(range(93))
        # The last column is the target value
        self.target = data[:, -1]
        self.data = data[:, feats]

    def __getitem__(self, index):
        # Required magic method: returns one sample when the model is trained
        return self.data[index], self.target[index]

    def __len__(self):
        # Returns the number of samples, which DataLoader uses later
        return len(self.data)


# Then use DataLoader to shuffle the data, read it in batches, etc.
path = './ml2021spring-hw1/covid.train.csv'
ds = COVID19Dataset(path)
batch_size = 100
# For simplicity, both loaders read the same dataset here
train_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)
dev_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)
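
The listing above feeds the same dataset to both loaders. One way to get separate training and validation sets is `torch.utils.data.random_split`. The following is only a minimal sketch under stated assumptions: the random tensors stand in for the real CSV, and the 80/20 ratio, the seed, and the names `full_ds`, `train_loader`, and `dev_loader` are illustrative choices that do not come from the reference code.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Synthetic stand-in for the COVID19 data: 1000 samples, 93 features, 1 target
features = torch.randn(1000, 93)
targets = torch.randn(1000)
full_ds = TensorDataset(features, targets)

# Hold out 20% of the samples for validation (the ratio is an assumption)
n_dev = int(len(full_ds) * 0.2)
n_train = len(full_ds) - n_dev
train_subset, dev_subset = random_split(
    full_ds, [n_train, n_dev], generator=torch.Generator().manual_seed(0)
)

# Shuffle only the training data; validation order does not matter
train_loader = DataLoader(train_subset, batch_size=100, shuffle=True)
dev_loader = DataLoader(dev_subset, batch_size=100, shuffle=False)

# Each iteration yields one batch of (features, targets)
x, y = next(iter(train_loader))
print(x.shape, y.shape)
```

Any `Dataset` subclass, including `COVID19Dataset`, could be passed to `random_split` in place of the `TensorDataset` used here.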

2. A more direct, manual approach

Reference Article
Alternatively, you can shuffle the data and read it batch by batch yourself, without using Dataset and DataLoader.

import pandas as pd
import torch

path = './ml2021spring-hw1/covid.train.csv'
df = pd.read_csv(path)
dataset_tensor = torch.tensor(df.values, dtype=torch.float)
# The first column is the ID, so drop it; the next 93 columns are the features
dataset_tensor = dataset_tensor[:, 1:]
feats = list(range(93))
# Divide into training set (60%), validation set (20%), and test set (20%)
random_indices = torch.randperm(dataset_tensor.shape[0])
n = len(random_indices)
training_indices = random_indices[:int(n * 0.6)]
validating_indices = random_indices[int(n * 0.6):int(n * 0.8)]
testing_indices = random_indices[int(n * 0.8):]
# Select rows by index, then the feature columns (x) and the last column (y)
training_set_x = dataset_tensor[training_indices][:, feats]
training_set_y = dataset_tensor[training_indices][:, -1:]
validating_set_x = dataset_tensor[validating_indices][:, feats]
validating_set_y = dataset_tensor[validating_indices][:, -1:]
testing_set_x = dataset_tensor[testing_indices][:, feats]
testing_set_y = dataset_tensor[testing_indices][:, -1:]
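
The listing above splits the data but does not show the batch-by-batch reading itself. A minimal sketch of how that could look follows. The helper `iterate_batches` is a hypothetical name, not part of PyTorch or the reference code, and the random tensors are stand-ins for one of the splits.

```python
import torch

# Synthetic stand-in for one of the splits above: 250 samples, 93 features
set_x = torch.randn(250, 93)
set_y = torch.randn(250, 1)
batch_size = 100


def iterate_batches(x, y, batch_size):
    """Yield shuffled (x, y) mini-batches; the last batch may be smaller."""
    order = torch.randperm(x.shape[0])  # new random order on every call
    for start in range(0, x.shape[0], batch_size):
        idx = order[start:start + batch_size]
        yield x[idx], y[idx]


batch_sizes = [bx.shape[0] for bx, by in iterate_batches(set_x, set_y, batch_size)]
print(batch_sizes)  # [100, 100, 50]
```

Calling `iterate_batches` once per epoch gives a fresh shuffle each time, which is roughly what `DataLoader(..., shuffle=True)` does for you.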

Training the model

Training the model is more involved, so I won't go into detail here and will just summarize the steps.

  1. When training a model, the same dataset is passed through multiple times (each pass is an epoch), and each epoch is divided into batches.
  2. There are several modes to be aware of when training a model. Training mode (train) is the one used during training: gradients are computed and parameters are updated while the model is in this mode. The remaining modes can, for now, all be treated as non-training modes, in which there is generally no need to compute gradients or update parameters. In PyTorch, model.train() and model.eval() switch the behavior of layers such as Dropout and BatchNorm, while torch.no_grad() is what actually turns off gradient tracking.
  3. Select an appropriate loss function and a corresponding optimizer. The optimizer updates the parameters according to the loss computed for each training batch.
# model, optimizer, e_poch, and mini_loss are defined in the full code below
for poch in range(e_poch):
    model.train()  # Training mode
    for x, y in train_ds:
        optimizer.zero_grad()  # Reset the gradients to zero
        pred = model(x)
        mse_loss = model.cal_loss(pred, y.squeeze(-1))
        mse_loss.backward()  # Compute the gradients
        optimizer.step()  # Update the parameters

    model.eval()  # Evaluation mode
    total_loss = 0
    for x, y in dev_ds:
        with torch.no_grad():  # No gradients are needed for validation
            pred = model(x)
            mse_loss = model.cal_loss(pred, y.squeeze(-1))
            total_loss += mse_loss.item()
    total_loss = total_loss / len(dev_ds)  # Average loss per batch

    if total_loss < mini_loss:
        mini_loss = total_loss
        print("epoch %d: found a better model, MSE loss is %.4f" % (poch, mini_loss))
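
To make the difference between the modes in step 2 concrete, here is a small sketch on a single Dropout layer. It is only an illustration: the layer, the input of ones, and the variable names are assumptions chosen for the demo and are unrelated to the homework model.

```python
import torch
import torch.nn as nn

layer = nn.Dropout(p=0.5)
x = torch.ones(1000)

# Training mode: each element is zeroed with probability p,
# and the survivors are scaled by 1 / (1 - p) = 2
layer.train()
out_train = layer(x)

# Evaluation mode: Dropout does nothing and returns the input unchanged
layer.eval()
out_eval = layer(x)

# torch.no_grad() is a separate switch: it disables gradient tracking
w = torch.ones(3, requires_grad=True)
with torch.no_grad():
    y = (w * 2).sum()

print(out_train.unique())       # values drawn from {0., 2.}
print(torch.equal(out_eval, x))  # True
print(y.requires_grad)           # False
```

This is why the validation loop calls both `model.eval()` and `torch.no_grad()`: the first changes how layers behave, the second avoids building the computation graph.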

My own simplified version of the full code

# PyTorch
from torch.utils.data import Dataset, DataLoader

import pandas as pd

import torch
import torch.nn as nn

# The model is copied from the reference code; model building is not covered in this article
class NeuralNet(nn.Module):
    ''' A simple fully-connected deep neural network '''

    def __init__(self, input_dim):
        super(NeuralNet, self).__init__()

        # Define your neural network here
        # TODO: How to modify this model to achieve better performance?
        self.net = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.BatchNorm1d(32),  # Use BN to accelerate model training
            nn.Dropout(p=0.2),  # Use Dropout to reduce overfitting; note that it should not come before BN
            nn.LeakyReLU(),  # Replaced the activation function
            nn.Linear(32, 1)
        )

        # Mean squared error loss
        self.criterion = nn.MSELoss(reduction='mean')
        # self.criterion = nn.SmoothL1Loss(size_average=True)

    def forward(self, x):
        ''' Given input of size (batch_size x input_dim), compute output of the network '''
        return self.net(x).squeeze(1)

    def cal_loss(self, pred, target):
        ''' Calculate loss '''
        regularization_loss = 0
        for param in self.parameters():
            # TODO: you may implement L1/L2 regularization here
            # Use an L2 regularization term (the commented line is the L1 variant)
            # regularization_loss += torch.sum(abs(param))
            regularization_loss += torch.sum(param ** 2)
        return self.criterion(pred, target) + 0.00075 * regularization_loss


class COVID19Dataset(Dataset):
    def __init__(self, path):
        df = pd.read_csv(path)
        self.data = torch.tensor(df.values, dtype=torch.float)
        self.target = self.data[:, -1:]
        self.data = self.data[:, 1:-1]

    def __getitem__(self, index):
        return self.data[index], self.target[index]

    def __len__(self):
        return len(self.data)


train_path = './ml2021spring-hw1/covid.train.csv'

ds = COVID19Dataset(train_path)
batch_size = 100
train_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)
dev_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)

e_poch = 10000

mini_loss = 1000
early_stop = 500
stop = 0  # Number of epochs since the last improvement

model = NeuralNet(93)
optimizer = torch.optim.Adam(model.parameters())

for poch in range(e_poch):
    model.train()
    for x, y in train_ds:
        optimizer.zero_grad()
        pred = model(x)
        mse_loss = model.cal_loss(pred, y.squeeze(-1))
        mse_loss.backward()  # TODO
        optimizer.step()

    model.eval()
    total_loss = 0
    for x, y in dev_ds:
        with torch.no_grad():
            pred = model(x)
            mse_loss = model.cal_loss(pred, y.squeeze(-1))
            total_loss += mse_loss.item()
    total_loss = total_loss / len(dev_ds)

    if total_loss < mini_loss:
        stop = 0
        mini_loss = total_loss
        print("epoch %d: found a better model, MSE loss is %.4f" % (poch, mini_loss))
    else:
        stop += 1

    if stop > early_stop:
        break

Posted by benson on Sun, 17 Oct 2021 11:26:02 -0700