Chapter II - Preliminary Application of PyTorch

Keywords: neural networks, PyTorch, deep learning, NLP

Preliminary application of PyTorch

Build a neural network using PyTorch

  • Typical workflow for building and training a neural network:
    • Define a neural network with learnable parameters
    • Iterate over the training dataset
    • Pass the input data through the network
    • Compute the loss
    • Back-propagate the gradients to the network parameters
    • Update the network weights according to an update rule
  • The following defines a neural network implemented in PyTorch:
# -*- coding: utf-8 -*-
"""
Created on Tue Oct 19 15:50:59 2021

@author: Lancibe
"""

import torch
import torch.nn as nn
import torch.nn.functional as F

# Define network class
class Net(nn.Module):
    # Define initialization function
    def __init__(self):
        super(Net, self).__init__()
        # First convolution layer: input channel dimension 1, output channel dimension 6, kernel size 3 * 3
        self.conv1 = nn.Conv2d(1, 6, 3)
        # Second convolution layer: input channel dimension 6, output channel dimension 16, kernel size 3 * 3
        self.conv2 = nn.Conv2d(6, 16, 3)
        # Layer 3 fully connected network
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        
    def forward(self, x):
        # Apply max pooling over a (2, 2) window
        # Each convolution layer is followed by an activation and a pooling layer
        x = F.max_pool2d(F.relu(self.conv1(x)), (2,2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # After the convolution layers the tensor enters the fully connected layers; its shape must be flattened first
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
        
    # Compute the number of flattened features of tensor x (all dimensions except the batch dimension)
    def num_flat_features(self, x):
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features
    
net = Net()
print(net)
  • Output results:
Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
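  • A minimal sketch (reusing the net defined above, for a 32 * 32 input as used below) that traces the feature map sizes and shows where in_features=576 comes from: 32 * 32 -> 30 * 30 after the 3 * 3 convolution -> 15 * 15 after pooling -> 13 * 13 after the second convolution -> 6 * 6 after pooling, so the flattened size is 16 * 6 * 6 = 576. The shape comments below are expectations derived from those layer sizes.
# Trace the shapes through the convolution/pooling part of the network
with torch.no_grad():
    x = torch.randn(1, 1, 32, 32)              # dummy single-channel 32 * 32 image
    x = F.max_pool2d(F.relu(net.conv1(x)), 2)  # expected shape: [1, 6, 15, 15]
    print(x.shape)
    x = F.max_pool2d(F.relu(net.conv2(x)), 2)  # expected shape: [1, 16, 6, 6]
    print(x.shape)
    print(net.num_flat_features(x))            # expected: 576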
  • All trainable parameters in the model can be obtained through net.parameters().
params = list(net.parameters())
print(len(params))
print(params[0].size())

10
torch.Size([6, 1, 3, 3])
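  • The list has 10 entries because each of the five layers (conv1, conv2, fc1, fc2, fc3) contributes a weight tensor and a bias tensor. A small sketch that lists them by name:
for name, param in net.named_parameters():
    print(name, param.size())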
  • Suppose the image size is 32 * 32
input = torch.randn(1,1,32,32)
out = net(input)
print(out)


tensor([[-0.1169, -0.1627,  0.0504, -0.0820, -0.0311, -0.0599,  0.0003, -0.0024,
          0.0026,  0.0187]], grad_fn=<AddmmBackward>)
  • After the output tensor is obtained, the gradient zeroing and back propagation operations can be performed
net.zero_grad()
out.backward(torch.randn(1,10))
  • A neural network built with torch.nn only supports mini-batch inputs, not single samples.
  • For example, nn.Conv2d expects a 4D tensor (nSamples, nChannels, Height, Width). If the input is a single sample, call input.unsqueeze(0) to expand the 3D tensor to 4D (see the sketch below).
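  • A minimal sketch of that expansion (the tensor here is a made-up single sample matching the network above):
single_sample = torch.randn(1, 32, 32)   # (nChannels, Height, Width), no batch dimension
batched = single_sample.unsqueeze(0)     # shape becomes (1, 1, 32, 32)
print(batched.size())
out = net(batched)                       # now acceptable to the network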

Loss function

  • The input of a loss function is a pair (output, target), from which it computes a value that measures how far output is from target. Output is the value computed by the neural network, and target is the target value.
  • torch.nn provides several different loss functions. For example, nn.MSELoss measures the gap between the input and the target by computing the mean squared error.
  • The following is a usage example
input = torch.randn(1,1,32,32)
out = net(input)

target = torch.randn(10)
# Change the shape of target
target = target.view(1,-1)
criterion = nn.MSELoss()
loss = criterion(out, target)
print(loss)

tensor(1.1196, grad_fn=<MseLossBackward>)
  • The forward computation from the input to the loss forms a chain of operations, the computational graph:
input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
	-> view -> linear -> relu -> linear -> relu -> linear
    -> MSELoss
    -> loss
  • When loss.backward() is called, the whole computational graph is differentiated with respect to the loss; all tensors with requires_grad=True take part in the gradient computation, and their gradients are accumulated into their .grad attribute.
print(loss.grad_fn) # MSELoss
print(loss.grad_fn.next_functions[0][0]) # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0]) # ReLU

<MseLossBackward object at 0x000001CD8A7F9888>
<AddmmBackward object at 0x000001CD8A7F9688>
<AccumulateGrad object at 0x000001CD8A7F9688>

Back propagation

  • Back propagation is very important, but with PyTorch it is very simple: the entire operation is a single call to loss.backward().
  • Before performing back propagation, the gradients must be cleared first, otherwise gradients will accumulate across different batches.
  • Example:
# In PyTorch, zero the gradients first
net.zero_grad()
print('before backward')
print(net.conv1.bias.grad)
loss.backward()
print('after backward')
print(net.conv1.bias.grad)


before backward
None
after backward
tensor([ 0.0067, -0.0037,  0.0111, -0.0024, -0.0077,  0.0114])

Update network parameters

  • The simplest algorithm for updating parameters is SGD (stochastic gradient descent)
  • The specific update formula is: weight = weight - learning_rate * gradient
  • The following implements SGD in plain Python code
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
  • The following is the officially recommended approach:
# Import the package of optimizer. optim contains several common optimization algorithms, such as SGD, Adam, etc
import torch.optim as optim

# Creating optimizer objects through optim
optimizer = optim.SGD(net.parameters(), lr = 0.01)

# The optimizer performs a gradient zeroing operation
optimizer.zero_grad()

output = net(input)
loss = criterion(output, target)

# Back propagation of loss value
loss.backward()
# The update of parameters is performed through a line of standard code
optimizer.step()

Build a classifier using PyTorch

Classifier task and data introduction

  • Construct a neural network classifier that takes input images and predicts which class each one belongs to.
  • The CIFAR10 dataset is used
    • Each image in the dataset has size 3 * 32 * 32, where the leading 3 is the number of color channels.

Steps of training classifier

  1. Download the CIFAR10 dataset using torchvision
  2. Define a convolutional neural network
  3. Define a loss function
  4. Train the model on the training set
  5. Test the model on the test set

Downloading datasets using torchvision

import torch
import torchvision
import torchvision.transforms as transforms # Adjust and convert the images
  • Download the dataset and transform the images. Because the output of torchvision datasets is in PILImage format with values in [0, 1], we convert it to tensors normalized to the standard range [-1, 1] (a quick numeric check follows the code below).
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
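  • A quick numeric check of that mapping (the tiny tensor below is only an illustration): Normalize computes (x - mean) / std per channel, so 0 becomes (0 - 0.5) / 0.5 = -1 and 1 becomes (1 - 0.5) / 0.5 = 1.
example = torch.tensor([[[0.0, 0.5, 1.0]]])        # fake 1-channel, 1 x 3 "image" with values in [0, 1]
normalize = transforms.Normalize((0.5,), (0.5,))
print(normalize(example))                          # tensor([[[-1.,  0.,  1.]]])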
  • Download the dataset. The parameters are: the root directory, whether this is the training split, whether downloading is allowed, and the transform to apply
trainset = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=transform)
  • The next step is important: wrap the downloaded dataset in a data loader. The parameters are: the dataset, the batch size (how many samples are fetched at a time), whether to shuffle, and the number of worker processes
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=4, shuffle=True, num_workers=2)
  • The following is the test dataset; it is not shuffled because shuffling is not needed for testing.
testset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True, transform=transform)

testloader = torch.utils.data.DataLoader(
    testset, batch_size=4, shuffle=False, num_workers=2)
  • Finally, specify the class labels
classes = ('plane', 'car', 'bird', 'cat', 'deer', 
           'dog', 'frog', 'horse', 'ship', 'truck')
  • If a certificate verification error occurs during the download, certificate verification can be disabled globally:
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
  • When running under Windows, you may see a Broken pipe error. This is caused by how DataLoader worker processes are handled on Windows; set the num_workers parameter of DataLoader to 0, as in the sketch below.
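  • A hedged sketch of that workaround, choosing num_workers based on the platform (the platform check is an illustrative addition, not part of the original code); note that when num_workers > 0 on Windows, the DataLoader iteration should also run under an if __name__ == '__main__': guard:
import sys

# Fall back to single-process loading on Windows to avoid the Broken pipe error
num_workers = 0 if sys.platform.startswith('win') else 2

trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=4, shuffle=True, num_workers=num_workers)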
  • Display the images:
import numpy as np
import matplotlib.pyplot as plt

# Build a function to display pictures
def imshow(img):
    img = img / 2 + 0.5 # unnormalize from [-1, 1] back to [0, 1]
    npimg = img.numpy() # matplotlib needs a numpy array, so convert the tensor first
    plt.imshow(np.transpose(npimg, (1,2,0)))
    plt.show()
    
# Read one batch of images from the data iterator
dataiter = iter(trainloader)
images, labels = next(dataiter)

# Show pictures
imshow(torchvision.utils.make_grid(images))
# print label
print(" ".join('%5s' % classes[labels[j]] for j in range(4)))


  cat  ship  ship plane

Define convolutional neural network

  • The only difference from the network defined earlier is that the input here has 3 channels
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define two convolution layers
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Define pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # Define three fully connected layers
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10) # Of these sizes, only the hidden dimensions 120 and 84 can be changed freely
        
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # Transform the shape of x to fit the fully connected layers
        x = x.view(-1, 16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
    
net = Net()
print(net)


Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
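  • A minimal sketch (reusing the net just defined) that traces the feature map sizes and shows where 16 * 5 * 5 = 400 comes from: 32 -> 28 after the 5 * 5 convolution -> 14 after pooling -> 10 after the second convolution -> 5 after pooling. The shape comments are expectations based on those sizes.
with torch.no_grad():
    x = torch.randn(1, 3, 32, 32)        # dummy CIFAR10-sized input
    x = net.pool(F.relu(net.conv1(x)))   # expected shape: [1, 6, 14, 14]
    print(x.shape)
    x = net.pool(F.relu(net.conv2(x)))   # expected shape: [1, 16, 5, 5]
    print(x.shape)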

Define loss function

  • The cross-entropy loss function and the stochastic gradient descent optimizer are used
import torch.optim as optim

# Define the loss function and use the cross entropy loss function
criterion = nn.CrossEntropyLoss()
# Define an optimizer that uses stochastic gradient descent
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

Training model on training set

  • Optimization algorithms based on gradient descent need many rounds of iterative training
# Training model
# Traverse the whole dataset for two epochs
for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(trainloader , 0):
        # data contains input image tensor inputs and label tensor labels
        inputs, labels = data
        
        # First, zero the optimizer gradient
        optimizer.zero_grad()
        
        # The input image tensor enters the network to obtain the output tensor outputs
        outputs = net(inputs)
        
        # The loss value is calculated by image output outputs and label labels
        loss = criterion(outputs, labels)
        
        # Back propagation + parameter update, standard process of standard code
        loss.backward()
        optimizer.step()
        
        # Print rounds and loss values
        running_loss += loss.item()
        if (i + 1) % 2000 == 0:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss/2000))
            running_loss = 0.0
        
print('Finished Training')


[1,  2000] loss: 2.218
[1,  4000] loss: 1.939
[1,  6000] loss: 1.717
[1,  8000] loss: 1.580
[1, 10000] loss: 1.530
[1, 12000] loss: 1.488
[2,  2000] loss: 1.391
[2,  4000] loss: 1.405
[2,  6000] loss: 1.347
[2,  8000] loss: 1.359
[2, 10000] loss: 1.328
[2, 12000] loss: 1.297
Finished Training
  • Save model
# Save model
PATH = './cifar_net.pth'
# Save the state Dictionary of the model
torch.save(net.state_dict(), PATH)

Test model on test set

  • The first step is to show some pictures in the test set
  • When running only the test phase, remember to comment out the training and model-saving code from the previous step; the model is not trained in the test phase, so saving it here would overwrite the trained weights with a freshly initialized model that has learned nothing (a flag-based alternative is sketched after the next code block).
# test model 
dataiter = iter(testloader)
images, labels = next(dataiter)

# Print original picture
imshow(torchvision.utils.make_grid(images))
# Print real labels
print('GroundTruth: ', " ".join('%5s' % classes[labels[j]] for j in range(4)))
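  • A sketch of that flag-based alternative (TRAIN is a made-up name and this is not part of the original tutorial): wrap the training and saving code in a conditional so a test-only run never overwrites the trained weights.
TRAIN = False   # set to True to retrain and overwrite the saved state dictionary

if TRAIN:
    for epoch in range(2):
        for inputs, labels in trainloader:
            optimizer.zero_grad()
            loss = criterion(net(inputs), labels)
            loss.backward()
            optimizer.step()
    torch.save(net.state_dict(), PATH)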
  • The second step is to load the model and predict the test image
# Instantiate class object
net = Net()
# Load the state Dictionary of the model saved in the training phase
net.load_state_dict(torch.load(PATH))

# Use the model to predict the picture
outputs = net(images)

# There are ten categories in total; the category with the highest score computed by the model is taken as the prediction (greedy selection)
_, predicted = torch.max(outputs, 1)

# Print the results of the forecast label
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))
  • It is normal to find some mismatches between the predictions and the actual labels. One reason may be that the model is small and has only been trained for two epochs.

  • You can also look at the performance of the model on the whole test set:

correct = 0
total = 0
with torch.no_grad(): # Disable gradient tracking; the model is only used for inference here
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        
print('Accuracy of the network on the 10000 test images: %d%%' % (
    100 * correct / total))



Accuracy of the network on the 10000 test images: 52%
  • 52% is a perfectly normal result and shows that the model has learned something. At the same time, be vigilant: an accuracy of around 10% means the model has learned nothing, because 10% is what pure random guessing over ten classes would achieve.

  • This number is an aggregate over all classes. To see in more detail which categories the model handles better, we can compute the accuracy for each class separately

# Statistics for different categories
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d%%' %
          (classes[i], 100*class_correct[i] / class_total[i]))
    
    
Accuracy of plane : 39%
Accuracy of   car : 58%
Accuracy of  bird : 30%
Accuracy of   cat : 13%
Accuracy of  deer : 36%
Accuracy of   dog : 67%
Accuracy of  frog : 69%
Accuracy of horse : 53%
Accuracy of  ship : 74%
Accuracy of truck : 78% 

Training model on GPU

  • To take full advantage of PyTorch tensors and speed up model training, we can move the training process to the GPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Then transfer the model to the GPU
net.to(device)

# Finally, during training and testing, move the image and label tensors to the GPU at each step
inputs, labels = data[0].to(device), data[1].to(device)

Summary

# -*- coding: utf-8 -*-
"""
Created on Tue Oct 19 20:18:46 2021

@author: Lancibe
"""

import ssl
ssl._create_default_https_context = ssl._create_unverified_context

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=transform)


trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=4, shuffle=True, num_workers=0)

testset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True, transform=transform)


testloader = torch.utils.data.DataLoader(
    testset, batch_size=4, shuffle=False, num_workers=0)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 
           'dog', 'frog', 'horse', 'ship', 'truck')

import numpy as np
import matplotlib.pyplot as plt

# Build a function to display pictures
def imshow(img):
    img = img / 2 + 0.5 # unnormalize from [-1, 1] back to [0, 1]
    npimg = img.numpy() # matplotlib needs a numpy array, so convert the tensor first
    plt.imshow(np.transpose(npimg, (1,2,0)))
    plt.show()
    
# Read a picture from the data iterator
#dataiter = iter(trainloader)
#images, labels = dataiter.next()

# Show pictures
#imshow(torchvision.utils.make_grid(images))
# print label
#print(" ".join('%5s' % classes[labels[j]] for j in range(4)))


import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define two convolution layers
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Define pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # Define three fully connected layers
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10) # Of these sizes, only the hidden dimensions 120 and 84 can be changed freely
        
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # Transform the shape of x to fit the fully connected layers
        x = x.view(-1, 16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

device = torch.device('cuda' if torch.cuda.is_available() else "cpu")
net = Net()
net.to(device)
#print(net)

import torch.optim as optim

# Define the loss function and use the cross entropy loss function
criterion = nn.CrossEntropyLoss()
# Define an optimizer that uses stochastic gradient descent
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)




# Training model
# Traverse the whole dataset for two epochs
for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(trainloader , 0):
        # data contains input image tensor inputs and label tensor labels
        inputs, labels = data[0].to(device), data[1].to(device)
        
        # First, zero the optimizer gradient
        optimizer.zero_grad()
        
        # The input image tensor enters the network to obtain the output tensor outputs
        outputs = net(inputs)
        
        # The loss value is calculated by image output outputs and label labels
        loss = criterion(outputs, labels)
        
        # Back propagation + parameter update, standard process of standard code
        loss.backward()
        optimizer.step()
        
        # Print rounds and loss values
        running_loss += loss.item()
        if (i + 1) % 2000 == 0:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss/2000))
            running_loss = 0.0
        
print('Finished Training')


# Save model
PATH = './cifar_net.pth'
# Save the state Dictionary of the model
torch.save(net.state_dict(), PATH)


# test model 
dataiter = iter(testloader)
images, labels = next(dataiter)

# Print original picture
#imshow(torchvision.utils.make_grid(images))
# Print real labels
#print('GroundTruth: ', " ".join('%5s' % classes[labels[j]] for j in range(4)))

# Instantiate class object
#net = Net()
# Load the state Dictionary of the model saved in the training phase
net.load_state_dict(torch.load(PATH))

# Use the model to predict the picture
#outputs = net(images)

# There are ten categories in total, and the category with the highest probability calculated by the model is used as the prediction category (greed)
#_, predicted = torch.max(outputs, 1)

# Print the results of the forecast label
#print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))

'''
correct = 0
total = 0
with torch.no_grad(): # Indicates that the code block only reads the model and does not change it
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        
print('Accuracy of the network on the 10000 test images: %d%%' % (
    100 * correct / total))
''' 


# Statistics for different categories
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d%%' %
          (classes[i], 100*class_correct[i] / class_total[i]))
