Preliminary application of PyTorch
Build a neural network using PyTorch
 Typical process of building a neural network:
 Define a neural network with learnable parameters
 Iterate over the training data set
 Pass the input data through the neural network
 Calculate the loss value
 Back-propagate the gradients of the network parameters
 Update the network weights according to a given rule
 The following code defines a neural network in PyTorch:
# -*- coding: utf-8 -*-
"""
Created on Tue Oct 19 15:50:59 2021

@author: Lancibe
"""

import torch
import torch.nn as nn
import torch.nn.functional as F


# Define the network class
class Net(nn.Module):
    # Define the initialization function
    def __init__(self):
        super(Net, self).__init__()
        # First convolution layer: 1 input channel, 6 output channels, 3 * 3 kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        # Second convolution layer: 6 input channels, 16 output channels, 3 * 3 kernel
        self.conv2 = nn.Conv2d(6, 16, 3)
        # Three fully connected layers
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Each convolution layer is followed by an activation layer and a pooling layer
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # Before entering the fully connected layers, the tensor is flattened;
        # -1 keeps the batch dimension whatever the batch size is
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    # Count the features of one sample (all dimensions except the batch dimension)
    def num_flat_features(self, x):
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)
 Output results:
Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
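The 576 input features of fc1 can be checked by hand. The following is a minimal sketch, assuming a 32 * 32 input, convolutions with stride 1 and no padding, and pooling that rounds down. The helper names conv_out and pool_out are illustrative and are not PyTorch functions.

```python
# Illustrative helpers (not part of PyTorch) that track the spatial size.

def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window):
    """Output height/width of a max pooling layer whose stride equals its window."""
    return size // window


size = 32                                # assumed input height and width
size = pool_out(conv_out(size, 3), 2)    # conv1 (3 * 3), then 2 * 2 pooling: 32 -> 30 -> 15
size = pool_out(conv_out(size, 3), 2)    # conv2 (3 * 3), then 2 * 2 pooling: 15 -> 13 -> 6
flat_features = 16 * size * size         # conv2 has 16 output channels
print(size, flat_features)               # 6 576
```

If the input size or a kernel size changes, the first argument of fc1 has to change accordingly.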
 All trainable parameters in the model can be obtained through net.parameters().
params = list(net.parameters())
print(len(params))
print(params[0].size())
 Output results:
10
torch.Size([6, 1, 3, 3])
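The 10 tensors are the weight and the bias of each of the five layers. As a rough sketch, the number of scalar parameters can be derived from the layer shapes alone. The helper names conv_params and linear_params are hypothetical and only serve this illustration.

```python
# Hypothetical helpers: count weights plus biases from the layer shapes.

def conv_params(in_channels, out_channels, kernel):
    return out_channels * in_channels * kernel * kernel + out_channels


def linear_params(in_features, out_features):
    return in_features * out_features + out_features


counts = {
    "conv1": conv_params(1, 6, 3),        # 6 * 1 * 3 * 3 + 6 = 60
    "conv2": conv_params(6, 16, 3),       # 16 * 6 * 3 * 3 + 16 = 880
    "fc1": linear_params(576, 120),       # 576 * 120 + 120 = 69240
    "fc2": linear_params(120, 84),        # 120 * 84 + 84 = 10164
    "fc3": linear_params(84, 10),         # 84 * 10 + 10 = 850
}
total = sum(counts.values())
print(total)                              # 81194
```

With the network object at hand, sum(p.numel() for p in net.parameters()) should give the same total.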
 Suppose the input image size is 32 * 32:
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)
 Output results:
tensor([[0.1169, 0.1627, 0.0504, 0.0820, 0.0311, 0.0599, 0.0003, 0.0024, 0.0026,
         0.0187]], grad_fn=<AddmmBackward>)
 After the output tensor is obtained, the gradients can be zeroed and back propagation can be performed:
net.zero_grad()
out.backward(torch.randn(1, 10))
 A neural network built with torch.nn expects mini-batches as input, not single samples.
 For example, nn.Conv2d expects a 4D Tensor of shape (nSamples, nChannels, Height, Width). If the input is a single sample, call input.unsqueeze(0) to expand the 3D Tensor to 4D.
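A short sketch of adding the batch dimension, assuming a single-channel 32 * 32 sample and a standalone convolution layer with the same settings as conv1:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 6, 3)            # 1 input channel, 6 output channels, 3 * 3 kernel

single = torch.randn(1, 32, 32)      # one sample: (nChannels, Height, Width)
batched = single.unsqueeze(0)        # insert a batch dimension at position 0

out = conv(batched)
print(batched.shape)                 # torch.Size([1, 1, 32, 32])
print(out.shape)                     # torch.Size([1, 6, 30, 30])
```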
Loss function
 The loss function takes a pair (output, target) as input and computes a value that measures the gap between them. output is the value computed by the neural network, and target is the desired value.
 torch.nn provides several loss functions. For example, nn.MSELoss measures the gap between the input and the target by computing the mean squared error.
 The following is a usage example
input = torch.randn(1, 1, 32, 32)
out = net(input)
target = torch.randn(10)
# Reshape target to (1, 10) so that it matches the shape of out
target = target.view(1, -1)
criterion = nn.MSELoss()

loss = criterion(out, target)
print(loss)
 Output results:
tensor(1.1196, grad_fn=<MseLossBackward>)
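To make the mean squared error concrete, the sketch below compares nn.MSELoss with a hand-written version. The tensors are random stand-ins for a network output and a target of shape (1, 10), not values from the network above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
output = torch.randn(1, 10)          # stand-in for a network output
target = torch.randn(1, 10)          # stand-in for the target values

loss_builtin = nn.MSELoss()(output, target)
# Mean of the squared element-wise differences
loss_manual = ((output - target) ** 2).mean()

print(loss_builtin.item(), loss_manual.item())
```

Both values agree up to floating point rounding, since the default reduction of nn.MSELoss is the mean.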
 Following loss backwards through its grad_fn attributes gives a chain of operations called the computational graph:
input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d -> view -> linear -> relu -> linear -> relu -> linear -> MSELoss -> loss
 When loss.backward() is called, the whole graph is differentiated with respect to loss. Every tensor with requires_grad=True takes part in the computation and accumulates its gradient in its grad attribute.
print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear (shown as Addmm)
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # first input of the linear layer
 Output results:
<MseLossBackward object at 0x000001CD8A7F9888>
<AddmmBackward object at 0x000001CD8A7F9688>
<AccumulateGrad object at 0x000001CD8A7F9688>
Back propagation
 Back propagation is very important, but with PyTorch it is very simple: the whole operation is a single call to loss.backward().
 Before performing back propagation, the gradients must be cleared first. Otherwise they accumulate across different batches of data.
 Example:
# In PyTorch, the gradients are zeroed first
net.zero_grad()

print('before backward')
print(net.conv1.bias.grad)

loss.backward()

print('after backward')
print(net.conv1.bias.grad)
 Output results:
before backward
None
after backward
tensor([ 0.0067, 0.0037, 0.0111, 0.0024, 0.0077, 0.0114])
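The accumulation behaviour is easy to see on a tiny example. This is a sketch with a hand-made two-element tensor rather than the network above; for the function sum(w * w) the gradient is 2 * w.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * w).sum().backward()
first = w.grad.clone()            # gradient 2 * w -> [2., 4.]

(w * w).sum().backward()          # no zeroing in between
accumulated = w.grad.clone()      # the new gradient is added -> [4., 8.]

w.grad.zero_()                    # clear the stored gradient
(w * w).sum().backward()
after_zeroing = w.grad.clone()    # back to [2., 4.]

print(first, accumulated, after_zeroing)
```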
Update network parameters
 The simplest algorithm for updating parameters is SGD (stochastic gradient descent).
 The update rule is: weight = weight - learning_rate * gradient
 The following implements SGD in plain Python code:
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
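To see what the rule weight = weight - learning_rate * gradient does over several steps, here is a toy sketch without PyTorch. The function f(w) = (w - 3) ** 2 and the learning rate 0.1 are assumptions chosen only for illustration.

```python
def grad(w):
    # Derivative of the toy function f(w) = (w - 3) ** 2
    return 2.0 * (w - 3.0)


weight = 0.0
learning_rate = 0.1
for step in range(50):
    # One gradient descent step
    weight = weight - learning_rate * grad(weight)

print(weight)   # close to 3.0, the minimum of f
```

Each step shrinks the distance to the minimum by a constant factor, which is why the learning rate has to be small enough for the iteration to converge.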
 The following is the officially recommended code:
# Import the optimizer package. optim contains several common optimization algorithms, such as SGD and Adam
import torch.optim as optim

# Create an optimizer object through optim
optimizer = optim.SGD(net.parameters(), lr=0.01)

# Zero the gradients through the optimizer
optimizer.zero_grad()

output = net(input)
loss = criterion(output, target)

# Back-propagate the loss value
loss.backward()
# The parameters are updated with one standard line of code
optimizer.step()
Build a classifier using PyTorch
Classifier task and data introduction
 Build a neural network classifier that takes an input image and assigns it to one of several classes.
 The CIFAR10 dataset is used.
 Each picture in the dataset has size 3 * 32 * 32, where the leading 3 is the number of color channels.
Steps of training a classifier
 Download the CIFAR10 dataset using torchvision
 Define a convolutional neural network
 Define a loss function
 Train the model on the training set
 Test the model on the test set
Downloading datasets using torchvision
import torch
import torchvision
import torchvision.transforms as transforms  # Adjust and convert the pictures
 Download the dataset and transform the pictures. torchvision datasets output pictures in PILImage format with values in [0, 1], so we convert them to tensors normalized to the range [-1, 1].
transform = transforms.Compose( [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
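Normalize with mean 0.5 and standard deviation 0.5 computes (x - 0.5) / 0.5 for every channel. The plain-Python sketch below shows this arithmetic on a few sample values; the function names normalize and unnormalize are illustrative only.

```python
def normalize(x, mean=0.5, std=0.5):
    # Same arithmetic as Normalize applies to each channel value
    return (x - mean) / std


def unnormalize(y, mean=0.5, std=0.5):
    # Inverse mapping, needed before displaying a normalized picture
    return y * std + mean


values = [0.0, 0.25, 0.5, 1.0]
normalized = [normalize(v) for v in values]
restored = [unnormalize(v) for v in normalized]

print(normalized)   # [-1.0, -0.5, 0.0, 1.0]
print(restored)     # [0.0, 0.25, 0.5, 1.0]
```

The inverse y * 0.5 + 0.5 is the same operation as img / 2 + 0.5, which is what a display helper has to apply before plotting a normalized picture.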
 Download the dataset. The parameters are: the root directory, whether to load the training set, whether downloading is allowed, and the transform to apply.
trainset = torchvision.datasets.CIFAR10( root='./data', train=True, download=True, transform=transform)
 The next step is very important: wrap the downloaded dataset in a data iterator. The parameters are: the dataset, the number of samples per batch, whether to shuffle, and the number of worker processes used for loading.
trainloader = torch.utils.data.DataLoader( trainset, batch_size=4, shuffle=True, num_workers=2)
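To see what the data iterator yields without downloading anything, the sketch below wraps random tensors in a TensorDataset. The 10 fake pictures are an assumption that stands in for CIFAR10, and num_workers=0 keeps the loading in the main process.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 fake pictures with the CIFAR10 shape and 10 fake labels
images = torch.randn(10, 3, 32, 32)
labels = torch.arange(10)
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

# Each iteration yields one batch of pictures and the matching labels
batch_shapes = [tuple(x.shape) for x, y in loader]
print(batch_shapes)   # [(4, 3, 32, 32), (4, 3, 32, 32), (2, 3, 32, 32)]
```

The last batch is smaller because 10 is not a multiple of 4. CIFAR10 has 50000 training pictures, so with batch_size=4 every batch is full.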
 The test dataset is loaded in the same way. It is not shuffled, because the order of the samples does not matter for evaluation.
testset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(
    testset, batch_size=4, shuffle=False, num_workers=2)
 Finally, specify the label
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
 If a certificate verification error occurs during downloading, verification can be disabled globally as a workaround. This turns off HTTPS certificate checks for the whole process, so use it only for this download:
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
 When running under Windows, you may get a Broken pipe error. It is caused by a problem with the worker processes of the DataLoader under Windows. Setting num_workers to 0 in DataLoader avoids it.
 Display pictures:
import numpy as np
import matplotlib.pyplot as plt


# Build a function to display pictures
def imshow(img):
    img = img / 2 + 0.5  # undo the normalization
    npimg = img.numpy()  # matplotlib needs NumPy data
    # Reorder from (channels, height, width) to (height, width, channels)
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


# Read one batch of pictures from the data iterator
dataiter = iter(trainloader)
images, labels = next(dataiter)

# Show the pictures
imshow(torchvision.utils.make_grid(images))
# Print the labels
print(" ".join('%5s' % classes[labels[j]] for j in range(4)))
 Output results:
cat ship ship plane
Define convolutional neural network
 The structure is the same as before. The main difference is that the input here has 3 channels.
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define two convolution layers
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Define the pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # Define three fully connected layers
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        # Only the sizes 120 and 84 can be chosen freely

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # Flatten x for the fully connected layers; -1 keeps the batch dimension
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
print(net)
 Output results:
Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
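One detail worth checking when flattening before the fully connected layers is that the batch dimension survives. The sketch below uses a random tensor shaped like a batch of 4 feature maps after the second pooling layer; it is an illustration, not output of the network above.

```python
import torch

# A batch of 4 samples with 16 channels of size 5 * 5
features = torch.randn(4, 16, 5, 5)

# -1 lets PyTorch infer the batch size from the number of elements
flat = features.view(-1, 16 * 5 * 5)
print(flat.shape)                         # torch.Size([4, 400])

try:
    # 4 * 400 = 1600 values do not fit into the shape (1, 400)
    features.view(1, 16 * 5 * 5)
    fixed_batch_ok = True
except RuntimeError:
    fixed_batch_ok = False
print(fixed_batch_ok)                     # False
```

A hard-coded batch size of 1 in view therefore only works when the DataLoader delivers one picture at a time.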
Define loss function
 The cross entropy loss function and a stochastic gradient descent optimizer are used.
import torch.optim as optim

# Define the loss function: cross entropy loss
criterion = nn.CrossEntropyLoss()

# Define the optimizer: stochastic gradient descent with momentum
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
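nn.CrossEntropyLoss takes the raw network outputs (logits) and integer class labels. The sketch below, with random logits for a batch of 4 and made-up labels, checks it against a hand-written version: log-softmax followed by the mean negative log-probability of the correct class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)               # stand-in for network outputs, 10 classes
labels = torch.tensor([3, 8, 8, 0])       # made-up class indices

loss_builtin = nn.CrossEntropyLoss()(logits, labels)

# Manual version: pick the log-probability of the correct class for each sample
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(4), labels].mean()

print(loss_builtin.item(), loss_manual.item())
```

Because the softmax is applied inside the loss, the network itself does not need a softmax layer at the end.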
Training model on training set
 Optimization algorithms based on gradient descent need many rounds of iterative training.
# Train the model
# Loop over the whole dataset for two epochs
for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # data contains the input image tensor inputs and the label tensor labels
        inputs, labels = data

        # First, zero the gradients through the optimizer
        optimizer.zero_grad()

        # Feed the input images into the network to obtain the output tensor outputs
        outputs = net(inputs)

        # Compute the loss from the outputs and the labels
        loss = criterion(outputs, labels)

        # Back propagation + parameter update, the standard two lines
        loss.backward()
        optimizer.step()

        # Print the epoch, the batch number and the average loss
        running_loss += loss.item()
        if (i + 1) % 2000 == 0:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
 Output results:
[1, 2000] loss: 2.218
[1, 4000] loss: 1.939
[1, 6000] loss: 1.717
[1, 8000] loss: 1.580
[1, 10000] loss: 1.530
[1, 12000] loss: 1.488
[2, 2000] loss: 1.391
[2, 4000] loss: 1.405
[2, 6000] loss: 1.347
[2, 8000] loss: 1.359
[2, 10000] loss: 1.328
[2, 12000] loss: 1.297
Finished Training
 Save the model:
# Save the model
PATH = './cifar_net.pth'
# Save the state dictionary of the model
torch.save(net.state_dict(), PATH)
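As a small round-trip check of saving and loading a state dictionary, the sketch below uses a single linear layer instead of the full network and writes to a temporary directory. The file name demo_net.pth is hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo_net.pth")
torch.save(model.state_dict(), path)

# A fresh model has different random weights until the state is loaded
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))

x = torch.randn(3, 4)
same_output = torch.equal(model(x), restored(x))
print(same_output)   # True
```

Only the parameters are stored, so the class definition of the model must be available when the file is loaded.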
Test model on test set
 The first step is to show some pictures from the test set.
 In the test phase, remember to comment out the training code and the code that saves the state dictionary, because the model is not trained in this phase. If an untrained model were saved again, the saved file would hold a model that has learned nothing.
# Test the model
dataiter = iter(testloader)
images, labels = next(dataiter)

# Show the original pictures
imshow(torchvision.utils.make_grid(images))
# Print the true labels
print('GroundTruth: ', " ".join('%5s' % classes[labels[j]] for j in range(4)))
 The second step is to load the model and predict the test pictures.
# Instantiate the model class
net = Net()
# Load the state dictionary saved in the training phase
net.load_state_dict(torch.load(PATH))

# Use the model to predict the pictures
outputs = net(images)

# There are ten classes in total. The class with the highest score is taken as the prediction (greedy choice)
_, predicted = torch.max(outputs, 1)

# Print the predicted labels
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))
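torch.max with a dimension argument returns both the maximum values and their indices, and the indices are the predicted classes. A small sketch with hand-written scores for 2 samples and 3 classes (not real network outputs):

```python
import torch

# Made-up scores: 2 samples, 3 classes
outputs = torch.tensor([[0.1, 0.2, 2.5],
                        [1.7, 0.3, 0.4]])

# Maximum along dimension 1 (the class dimension)
values, predicted = torch.max(outputs, 1)
print(predicted)          # tensor([2, 0])

# Softmax turns scores into probabilities without changing the winner
probs = torch.softmax(outputs, dim=1)
same_winner = torch.equal(probs.argmax(dim=1), predicted)
print(same_winner)        # True
```

The network outputs are raw scores rather than probabilities, but since softmax preserves the ordering, taking the maximum of the scores gives the same prediction.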

It is normal for some predictions to differ from the true labels. One likely reason is that the model was trained for only a small number of epochs.

You can also check the performance of the model on the whole test set:
correct = 0
total = 0
# No gradients are tracked in this block, because the model is only evaluated, not changed
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d%%' % (
    100 * correct / total))
 Output results:
Accuracy of the network on the 10000 test images: 52%

52% is a reasonable result, indicating that the model has learned something. At the same time, stay vigilant: with 10 classes, an accuracy of about 10% would mean that the model has learned nothing, since 10% is what pure guessing achieves.

This figure is an overall one. To see in more detail which classes the model performs better on, we can calculate the accuracy for each class separately:
# Statistics for the individual classes
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d%%' %
          (classes[i], 100 * class_correct[i] / class_total[i]))
 Output results:
Accuracy of plane : 39%
Accuracy of car : 58%
Accuracy of bird : 30%
Accuracy of cat : 13%
Accuracy of deer : 36%
Accuracy of dog : 67%
Accuracy of frog : 69%
Accuracy of horse : 53%
Accuracy of ship : 74%
Accuracy of truck : 78%
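The bookkeeping behind the overall and per-class accuracy can be sketched without a model. The labels and predictions below are made up for three classes and do not come from the test run above.

```python
from collections import Counter

# Made-up ground truth and predictions for 10 samples and 3 classes
labels = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
predicted = [0, 1, 1, 1, 0, 2, 2, 2, 0, 2]

class_total = Counter(labels)
class_correct = Counter(l for l, p in zip(labels, predicted) if l == p)

overall = sum(class_correct.values()) / len(labels)
per_class = {c: class_correct[c] / class_total[c] for c in sorted(class_total)}

print(overall)      # 0.7
print(per_class)    # class 0: 1 of 2, class 1: 2 of 3, class 2: 4 of 5
```

The overall accuracy can hide large differences between classes, which is why the per-class figures are worth printing as well.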
Training model on GPU
 To make real use of PyTorch tensors and accelerate model training, we can move the training process to the GPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Then move the model to the GPU
net.to(device)

# Finally, during training and testing, move the picture and label tensors to the GPU at each step
inputs, labels = data[0].to(device), data[1].to(device)
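The same pattern can be tried on a toy model. The sketch below assumes nothing about the hardware: it falls back to the CPU when no GPU is available, and the linear layer is only a stand-in for the network.

```python
import torch
import torch.nn as nn

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(8, 2).to(device)        # move the parameters to the device
inputs = torch.randn(4, 8).to(device)     # move the data to the same device

outputs = model(inputs)
print(outputs.device)                      # cuda:0 on a GPU machine, otherwise cpu
```

The model and its inputs have to live on the same device; mixing a GPU model with CPU tensors raises an error.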
Summary
# -*- coding: utf-8 -*-
"""
Created on Tue Oct 19 20:18:46 2021

@author: Lancibe
"""

import ssl
ssl._create_default_https_context = ssl._create_unverified_context

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=4, shuffle=True, num_workers=0)

testset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(
    testset, batch_size=4, shuffle=False, num_workers=0)

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

import numpy as np
import matplotlib.pyplot as plt


# Build a function to display pictures
def imshow(img):
    img = img / 2 + 0.5  # undo the normalization
    npimg = img.numpy()  # matplotlib needs NumPy data
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


# Read one batch of pictures from the data iterator
#dataiter = iter(trainloader)
#images, labels = next(dataiter)

# Show the pictures
#imshow(torchvision.utils.make_grid(images))
# Print the labels
#print(" ".join('%5s' % classes[labels[j]] for j in range(4)))

import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define two convolution layers
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Define the pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # Define three fully connected layers
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        # Only the sizes 120 and 84 can be chosen freely

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # Flatten x for the fully connected layers; -1 keeps the batch dimension
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

net = Net()
net.to(device)
#print(net)

import torch.optim as optim

# Define the loss function: cross entropy loss
criterion = nn.CrossEntropyLoss()

# Define the optimizer: stochastic gradient descent with momentum
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Train the model
# Loop over the whole dataset for two epochs
for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # data contains the input image tensor inputs and the label tensor labels
        inputs, labels = data[0].to(device), data[1].to(device)

        # First, zero the gradients through the optimizer
        optimizer.zero_grad()

        # Feed the input images into the network to obtain the output tensor outputs
        outputs = net(inputs)

        # Compute the loss from the outputs and the labels
        loss = criterion(outputs, labels)

        # Back propagation + parameter update, the standard two lines
        loss.backward()
        optimizer.step()

        # Print the epoch, the batch number and the average loss
        running_loss += loss.item()
        if (i + 1) % 2000 == 0:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

# Save the model
PATH = './cifar_net.pth'
# Save the state dictionary of the model
torch.save(net.state_dict(), PATH)

# Test the model
dataiter = iter(testloader)
images, labels = next(dataiter)

# Show the original pictures
#imshow(torchvision.utils.make_grid(images))
# Print the true labels
#print('GroundTruth: ', " ".join('%5s' % classes[labels[j]] for j in range(4)))

# Instantiate the model class
#net = Net()
# Load the state dictionary saved in the training phase
net.load_state_dict(torch.load(PATH))

# Use the model to predict the pictures
#outputs = net(images)

# There are ten classes in total. The class with the highest score is taken as the prediction (greedy choice)
#_, predicted = torch.max(outputs, 1)

# Print the predicted labels
#print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))

'''
correct = 0
total = 0
# No gradients are tracked in this block, because the model is only evaluated, not changed
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d%%' % (
    100 * correct / total))
'''

# Statistics for the individual classes
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d%%' %
          (classes[i], 100 * class_correct[i] / class_total[i]))