The code and dataset in this article come from HW1 of Professor Li Hongyi's (Hung-yi Lee's) machine learning course.
This article introduces PyTorch through Professor Li's first assignment and the reference code provided with it. It is an introductory text and does not cover specific network design.
Training a model on data involves two main steps: reading the data and training the model. We follow those two steps to get started with PyTorch.
Reading the data
1. Use Dataset and DataLoader to read the data
This is the approach used in the reference code, and it is probably the recommended one. (The code below is simplified and leaves out some dataset-specific processing.)
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class COVID19Dataset(Dataset):
    ''' Dataset for loading and preprocessing the COVID19 dataset '''
    def __init__(self, path):
        # Read the required data from path (using pandas)
        df = pd.read_csv(path)
        # Convert the data to the tensor format that PyTorch expects
        data = torch.tensor(df.values, dtype=torch.float)
        # The first column is the ID, which carries no information, so drop it
        data = data[:, 1:]
        # You can take all feature columns here, or keep only the useful ones
        feats = list(range(93))
        self.target = data[:, -1]
        self.data = data[:, feats]

    def __getitem__(self, index):
        # Magic method that must be implemented; returns one (features, target) sample
        return self.data[index], self.target[index]

    def __len__(self):
        # Returns the number of samples; DataLoader relies on it
        return len(self.data)


# Build the dataset, then let DataLoader shuffle it and serve it in batches
path = './ml2021spring-hw1/covid.train.csv'
ds = COVID19Dataset(path)
batch_size = 100
train_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)
# Simplification: validation reuses the same dataset here;
# in practice, hold out a separate split for it
dev_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)
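The simplified code above feeds the same dataset to both loaders. As a minimal sketch of how a separate validation split could be held out, the example below uses `torch.utils.data.random_split`. The synthetic tensors, the 80/20 ratio, and the names `full_ds`, `train_loader`, and `dev_loader` are assumptions made for illustration; they are not part of the reference code.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Synthetic stand-in for the COVID-19 data: 1000 samples, 93 features, 1 target.
torch.manual_seed(0)
features = torch.randn(1000, 93)
targets = torch.randn(1000)
full_ds = TensorDataset(features, targets)

# Hold out 20% of the samples for validation (the ratio is an assumption).
n_dev = int(len(full_ds) * 0.2)
n_train = len(full_ds) - n_dev
train_subset, dev_subset = random_split(full_ds, [n_train, n_dev])

train_loader = DataLoader(train_subset, batch_size=100, shuffle=True)
# Shuffling is not needed for validation.
dev_loader = DataLoader(dev_subset, batch_size=100, shuffle=False)

# Each iteration yields one batch of features and targets.
x, y = next(iter(train_loader))
print(x.shape, y.shape)  # torch.Size([100, 93]) torch.Size([100])
```

The same idea should carry over to `COVID19Dataset`, since `random_split` only needs a dataset that implements `__getitem__` and `__len__`.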
2. A more direct, manual approach
Alternatively, you can skip Dataset and DataLoader, shuffle the data yourself, and read it batch by batch.
import pandas as pd
import torch

path = './ml2021spring-hw1/covid.train.csv'
df = pd.read_csv(path)
dataset_tensor = torch.tensor(df.values, dtype=torch.float)
# Drop the ID column (column 0); the last remaining column is the target
dataset_tensor = dataset_tensor[:, 1:]
# Feature columns to use (all 93 here, as in the Dataset version)
feats = list(range(93))

# Split into training (60%), validation (20%), and test (20%) sets
n = dataset_tensor.shape[0]
random_indices = torch.randperm(n)
training_indices = random_indices[:int(n * 0.6)]
validating_indices = random_indices[int(n * 0.6):int(n * 0.8)]
testing_indices = random_indices[int(n * 0.8):]

training_set_x = dataset_tensor[training_indices][:, feats]
training_set_y = dataset_tensor[training_indices][:, -1:]
validating_set_x = dataset_tensor[validating_indices][:, feats]
validating_set_y = dataset_tensor[validating_indices][:, -1:]
testing_set_x = dataset_tensor[testing_indices][:, feats]
testing_set_y = dataset_tensor[testing_indices][:, -1:]
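The split above produces plain tensors, so reading them batch by batch has to be done by hand. The following is a minimal sketch of one way to do that; the helper `iterate_batches` and the synthetic tensors `train_x` and `train_y` are hypothetical names introduced for this example only.

```python
import torch

torch.manual_seed(0)
# Synthetic stand-ins for a training split: 250 samples, 93 features.
train_x = torch.randn(250, 93)
train_y = torch.randn(250, 1)
batch_size = 100


def iterate_batches(x, y, batch_size):
    """Yield shuffled (x, y) mini-batches; the last batch may be smaller."""
    order = torch.randperm(x.shape[0])
    for start in range(0, x.shape[0], batch_size):
        idx = order[start:start + batch_size]
        yield x[idx], y[idx]


batch_sizes = [bx.shape[0] for bx, by in iterate_batches(train_x, train_y, batch_size)]
print(batch_sizes)  # [100, 100, 50]
```

Calling `iterate_batches` again at the start of each epoch draws a fresh permutation, which roughly mirrors what `shuffle=True` does in a DataLoader.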
Training the model
Training is more involved, so I won't go into every detail here and will just summarize the steps.
- When training a model, the same dataset is passed through multiple times (epochs), and each pass is divided into batches.
- There are two modes to be aware of. Training mode (model.train()) is used while training, when gradients are computed and parameters are updated. Everything else can be treated as non-training (evaluation) mode for now (model.eval()), where gradients and parameter updates are generally not needed. The mode also changes how layers such as Dropout and BatchNorm behave, while gradient tracking itself is switched off with torch.no_grad().
- Choose a suitable loss function and an optimizer. For each training batch, the optimizer updates the parameters using the gradients of the loss.
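# Assumes model, optimizer, train_ds, dev_ds, e_poch and mini_loss are already defined
for poch in range(e_poch):
    model.train()  # Training mode
    for x, y in train_ds:
        optimizer.zero_grad()  # Reset the gradients to zero
        pred = model(x)
        mse_loss = model.cal_loss(pred, y.squeeze(-1))
        mse_loss.backward()  # Backpropagate to compute the gradients
        optimizer.step()  # Update the parameters

    model.eval()  # Evaluation mode
    total_loss = 0
    for x, y in dev_ds:
        with torch.no_grad():  # No gradients are needed for validation
            pred = model(x)
            mse_loss = model.cal_loss(pred, y.squeeze(-1))
        total_loss += mse_loss.item()
    total_loss = total_loss / len(dev_ds)  # Average loss per batch
    if total_loss < mini_loss:
        mini_loss = total_loss
        print("poch %d: found a better model, MSE loss is %.4f\n" % (poch, mini_loss))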
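To make the difference between the two modes concrete, here is a small sketch that is independent of the assignment code. It shows that `train()` and `eval()` change the behavior of a Dropout layer, and that gradient tracking is governed by `torch.no_grad()` rather than by `eval()`. The layer sizes and variable names are arbitrary choices for the illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Part 1: train()/eval() change how layers such as Dropout behave.
dropout = nn.Dropout(p=0.5)
ones = torch.ones(1000)

dropout.train()
train_out = dropout(ones)  # each element is zeroed or scaled by 1 / (1 - p) = 2
dropout.eval()
eval_out = dropout(ones)   # Dropout is the identity in evaluation mode

print(torch.equal(eval_out, ones))  # True

# Part 2: gradient tracking is controlled by torch.no_grad(), not by eval().
linear = nn.Linear(4, 1)
inputs = torch.randn(2, 4)
linear.eval()
tracked = linear(inputs).requires_grad        # still tracked in eval mode
with torch.no_grad():
    untracked = linear(inputs).requires_grad  # not tracked inside no_grad()
print(tracked, untracked)  # True False
```

This is why the validation loop calls both `model.eval()` and `torch.no_grad()`: the first fixes layer behavior, the second avoids building the autograd graph.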
My own simplified version of the full code
# PyTorch
from torch.utils.data import Dataset, DataLoader
import pandas as pd
import torch
import torch.nn as nn


# The model is copied from the reference code; building models is not covered in this article
class NeuralNet(nn.Module):
    ''' A simple fully-connected deep neural network '''
    def __init__(self, input_dim):
        super(NeuralNet, self).__init__()
        # Define your neural network here
        # TODO: How to modify this model to achieve better performance?
        self.net = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.BatchNorm1d(32),  # Use BN to speed up training
            nn.Dropout(p=0.2),   # Use Dropout to reduce over-fitting; note that it should not be placed before BN
            nn.LeakyReLU(),      # Replacement activation function
            nn.Linear(32, 1)
        )
        # Mean squared error loss
        self.criterion = nn.MSELoss(reduction='mean')
        # self.criterion = nn.SmoothL1Loss(size_average=True)

    def forward(self, x):
        ''' Given input of size (batch_size x input_dim), compute output of the network '''
        return self.net(x).squeeze(1)

    def cal_loss(self, pred, target):
        ''' Calculate loss '''
        regularization_loss = 0
        for param in self.parameters():
            # TODO: you may implement L1/L2 regularization here
            # Use an L2 regularization term
            # regularization_loss += torch.sum(abs(param))
            regularization_loss += torch.sum(param ** 2)
        return self.criterion(pred, target) + 0.00075 * regularization_loss


class COVID19Dataset(Dataset):
    def __init__(self, path):
        df = pd.read_csv(path)
        self.data = torch.tensor(df.values, dtype=torch.float)
        self.target = self.data[:, -1:]  # Last column is the target
        self.data = self.data[:, 1:-1]   # Drop the ID column and the target

    def __getitem__(self, index):
        return self.data[index], self.target[index]

    def __len__(self):
        return len(self.data)


train_path = './ml2021spring-hw1/covid.train.csv'
ds = COVID19Dataset(train_path)
batch_size = 100
train_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)
dev_ds = DataLoader(ds, batch_size=batch_size, shuffle=True)

e_poch = 10000
mini_loss = 1000
early_stop = 500
stop = 0  # Epochs since the last improvement
model = NeuralNet(93)
optimizer = torch.optim.Adam(model.parameters())

for poch in range(e_poch):
    model.train()
    for x, y in train_ds:
        optimizer.zero_grad()
        pred = model(x)
        mse_loss = model.cal_loss(pred, y.squeeze(-1))
        mse_loss.backward()
        # TODO
        optimizer.step()

    model.eval()
    total_loss = 0
    for x, y in dev_ds:
        with torch.no_grad():
            pred = model(x)
            mse_loss = model.cal_loss(pred, y.squeeze(-1))
        total_loss += mse_loss.item()
    total_loss = total_loss / len(dev_ds)

    if total_loss < mini_loss:
        stop = 0
        mini_loss = total_loss
    else:
        stop += 1
        if stop > early_stop:
            # Stop once the validation loss has not improved for early_stop epochs
            break
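The loop above tracks the best validation loss but does not keep the weights that produced it. As a minimal sketch, assuming you want to restore the best model after early stopping, the example below keeps a copy of `state_dict()` whenever the validation loss improves. It uses a tiny linear model on synthetic data; the names `best_state`, `patience`, and `stale`, and all numeric settings, are hypothetical and not taken from the reference code.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
# Synthetic regression data: 200 samples, 3 features.
x = torch.randn(200, 3)
y = x @ torch.tensor([1.0, -2.0, 0.5]) + 0.1 * torch.randn(200)
train_x, train_y, dev_x, dev_y = x[:160], y[:160], x[160:], y[160:]

model = nn.Linear(3, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
criterion = nn.MSELoss()

best_loss = float("inf")
best_state = None
patience, stale = 20, 0  # hypothetical early-stopping settings

for epoch in range(500):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(train_x).squeeze(1), train_y)
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        dev_loss = criterion(model(dev_x).squeeze(1), dev_y).item()

    if dev_loss < best_loss:
        best_loss, stale = dev_loss, 0
        # Deep copy, so later updates do not overwrite the saved weights.
        best_state = copy.deepcopy(model.state_dict())
    else:
        stale += 1
        if stale > patience:
            break

# Restore the weights with the lowest validation loss seen so far.
model.load_state_dict(best_state)
```

The deep copy matters because `state_dict()` returns references to the live parameter tensors, which the optimizer keeps modifying in place.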