This chapter trains a simple CNN on the CIFAR-10 dataset:
- Train a simple CNN network on the CIFAR-10 dataset.
- Save the trained model and test it.
- Train with a GPU.
CIFAR dataset
The CIFAR dataset comes in two versions, CIFAR-10 and CIFAR-100: CIFAR-10 has 10 categories and CIFAR-100 has 100 categories.
CIFAR-10
Features: 32x32 color images; 10 categories; 60,000 images in total: 50,000 training samples + 10,000 test samples; 6,000 images per category (10 x 6,000 = 60,000).
10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck;
Tip: you don't need to download it manually; it can be downloaded automatically through the dataset API in torchvision.
Experimental process
Prepare dataset
This step is very convenient in PyTorch, which has common datasets prepared for us; we only need to import them.
The datasets are in the torchvision.datasets package:
```python
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
```
torchvision.datasets.CIFAR10 is a class; you manipulate the dataset by instantiating an object of this class.
Parameters:
- root ---- the path where the dataset is saved after downloading
- train ---- whether to load the training set or the test set
- download ---- whether the dataset should be downloaded automatically
- transform ---- the transform applied to each image; you generally convert the original images with ToTensor() and Normalize()
Then, wrap the dataset with the DataLoader class so it is easy to read and use: mini-batch reading, shuffling, multi-threaded loading, and so on.
```python
# --------------------Prepare dataset------------------
# Dataset, DataLoader
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)

trainloader = DataLoader(dataset=trainset, batch_size=4, shuffle=True, num_workers=4)
testloader = DataLoader(dataset=testset, batch_size=4, shuffle=True, num_workers=4)
```
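The visualization snippet below assumes an `imshow` helper and a `classes` tuple that the original post does not show. A minimal sketch of both (the un-normalization mirrors the Normalize(mean=0.5, std=0.5) used above):

```python
# Assumed helpers, not shown in the original post
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

def imshow(img):
    img = img / 2 + 0.5                          # undo Normalize(mean=0.5, std=0.5)
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))   # CxHxW -> HxWxC for matplotlib
    plt.show()
```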
```python
# Show some training images
dataiter = iter(trainloader)
images, labels = next(dataiter)   # dataiter.next() in older PyTorch versions
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
```
Define the CNN network
For simplicity, the LeNet network is used, with the input channels of the first convolution layer changed to 3, because CIFAR-10 images are 3-channel color images.
```python
# Define a simple network
# LeNet-5
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool1(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # reshape tensor
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
```
Set up the optimization and iteration scheme, then train the network
Training a CNN is essentially the problem of minimizing an objective function (the loss function). In mathematics, optimization methods for general convex functions include gradient descent, Newton's method, and so on (there are also heuristic searches such as genetic algorithms). For training neural networks, the most commonly used method is stochastic gradient descent (SGD).
- Define the loss function and optimization method
Cross-entropy loss function
SGD (stochastic gradient descent) with a momentum term is used for optimization
```python
# Define the loss function and optimization method
# Cross-entropy loss, SGD with momentum
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
```
- Iterative optimization and training
Iter ----- one iteration is one forward + backward pass of a single mini-batch
Epoch ----- one pass over all the training data is called an epoch
With 50,000 training images and batch_size=4, one epoch here is 12,500 iterations; the training below runs for 20 epochs.
```python
# Train the network
# Iterate over epochs
for epoch in range(20):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)  # calculate loss
        loss.backward()                    # backpropagate
        optimizer.step()                   # update parameters

        # print statistics
        running_loss += loss.item()        # tensor.item() gets the Python value of a tensor
        if i % 2000 == 1999:
            # print the average loss every 2000 iterations
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
```
Save model
```python
# --------Save model-----------
torch.save(net, './model/model_cfair10_2.pth')  # save the whole model; the file is relatively large
# torch.save(net.state_dict(), './model/model_cfair10.pth')  # save only the parameters (state_dict)
```
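The two save methods are loaded differently. A minimal sketch, reusing the file names from above:

```python
# Load a model saved as a whole object (requires the Net class to be importable)
net = torch.load('./model/model_cfair10_2.pth')

# Load a model saved as a state_dict (the generally recommended way)
net = Net()
net.load_state_dict(torch.load('./model/model_cfair10.pth'))
net.eval()  # switch to evaluation mode before inference
```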
Test the model
```python
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
```
CIFAR-10 contains a total of 10 categories:
```python
CIFAR10_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                 'dog', 'frog', 'horse', 'ship', 'truck']
```
Load an image (RGB). It must belong to one of the categories above; otherwise it cannot be recognized correctly.
```python
# load an image
image = Image.open('/xxxx/image/dog.jpg')
```
Apply the same transforms to the image:
```python
transform = transforms.Compose(
    [transforms.Resize((32, 32)),
     transforms.ToTensor(),
     transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))])

image_transformed = transform(image)
print(image_transformed.size())
```
Points needing attention
The input of the CNN is a 4D tensor (NxCxHxW), so the transformed image needs to be expanded to 4D.
tensor.unsqueeze(0) adds a dimension at position 0, so the input tensor becomes 1x3x32x32.
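A quick shape check (a standalone sketch, not part of the original script):

```python
x = torch.randn(3, 32, 32)       # CxHxW, like a single transformed image
x4d = x.unsqueeze(0)             # prepend the batch dimension N
print(x.shape, '->', x4d.shape)  # torch.Size([3, 32, 32]) -> torch.Size([1, 3, 32, 32])
```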
The complete test script:

```python
transform = transforms.Compose(
    [transforms.Resize((32, 32)),
     transforms.ToTensor(),
     transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))])

image_transformed = transform(image)
print(image_transformed.size())


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool1(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # reshape tensor
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = torch.load('./model/model_cfair10.pth')
# print(net)

image_transformed = image_transformed.unsqueeze(0)  # 3x32x32 -> 1x3x32x32
output = net(image_transformed)
predict_value, predict_idx = torch.max(output, 1)   # return the max value and its index along dim 1

plt.figure()
plt.imshow(np.array(image))
plt.title(CIFAR10_names[predict_idx])
plt.axis('off')
plt.show()
```
----------------------------------------------------
GPU training and model learning rate adjustment
Questions:
- After training 20 epochs on the CPU, the loss dropped to about 0.6. Another 20 epochs were then run on top of the previous training, and the loss stayed between 0.5 and 0.6.
- Training on the CPU is really slow: running 20 epochs took more than an hour (I don't remember the exact time), which is quite long.
Using the GPU to train the model
- Computer configuration: a 1080 GPU
First, you need to install the GPU version of PyTorch; the specific installation steps are on the PyTorch official website. Training with a GPU requires some minor adjustments to the code.
**Step 1:** in the code, first use PyTorch's built-in function to check whether a GPU is available:
```python
is_support = torch.cuda.is_available()
if is_support:
    device = torch.device('cuda:0')
    # device = torch.device('cuda:1')
else:
    device = torch.device('cpu')
```
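An equivalent one-liner that is common in PyTorch code, shown here purely as a stylistic alternative:

```python
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
```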
**Step 2:** move the computation from the CPU to the GPU
```python
net = Net()
net.to(device)  # needed for GPU mode

# Train the network
# Iterate over epochs
for epoch in range(20):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs and move them to the GPU
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)  # calculate loss
        loss.backward()                    # backpropagate
        optimizer.step()                   # update parameters

        # print statistics
        running_loss += loss.item()        # tensor.item() gets the Python value of a tensor
        if i % 2000 == 1999:
            # print the average loss every 2000 iterations
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
```
Run it and you will find that iteration is much faster: 20 epochs finish in about 10 minutes.
Learning rate adjustment
- An important hyperparameter of stochastic gradient descent (SGD) is the learning rate (lr)
The code above uses a fixed learning rate lr=0.001. At the beginning of training the learning rate can be larger, so convergence is fast; as the number of iterations grows, the learning rate should be reduced to prevent the loss from oscillating.
For simplicity, I adjusted the learning rate to lr=0.0001 and iterated another 20 epochs on top of the previous model. The loss clearly dropped further, to around 0.3, 0.2, and 0.1.
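Instead of editing lr by hand, PyTorch's torch.optim.lr_scheduler can decay it on a schedule. A minimal sketch (the step_size and gamma values are illustrative, not from the original post):

```python
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)  # lr *= 0.1 every 10 epochs

for epoch in range(20):
    # ... one training epoch as above ...
    scheduler.step()  # decay the learning rate at the end of each epoch
```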
Although GPU training was used and the loss dropped further after reducing lr to 0.0001 (that is, the training-set loss), in testing a horse was still identified as a deer and a bird as a cat. To train a good model, other strategies are needed, including using other network architectures.
Evaluate the model performance on the whole test set
- Calculate Acc
```python
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

CIFAR10_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                 'dog', 'frog', 'horse', 'ship', 'truck']

# --------------Test dataset------------------------------
transform = transforms.Compose(
    [transforms.Resize((32, 32)),
     transforms.ToTensor(),
     transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))])

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=4)

# -----------------Network model-------------------------------
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool1(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # reshape tensor
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = torch.load('./model/model_cfair10_20.pth', map_location='cpu')

# ------------Test on the entire test set-----------------------
correct = 0
total = 0
count = 0

with torch.no_grad():
    for sample_batch in testloader:
        images = sample_batch[0]
        labels = sample_batch[1]

        # forward
        out = net(images)
        _, pred = torch.max(out, 1)
        correct += (pred == labels).sum().item()
        total += labels.size(0)
        print('batch:{}'.format(count + 1))
        count += 1

# accuracy
accuracy = float(correct) / total
print('Acc = {:.5f}'.format(accuracy))
```
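Since the single-image tests above showed class confusions (horse vs. deer, bird vs. cat), a per-class breakdown can be more informative than overall accuracy. A sketch, not part of the original post, reusing net, testloader, and CIFAR10_names from the script above:

```python
class_correct = [0] * 10
class_total = [0] * 10

with torch.no_grad():
    for images, labels in testloader:
        _, pred = torch.max(net(images), 1)
        for label, p in zip(labels, pred):
            class_total[label] += 1
            class_correct[label] += int(p == label)

for i, name in enumerate(CIFAR10_names):
    print('Acc of %10s : %.3f' % (name, class_correct[i] / class_total[i]))
```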
Link: https://www.jianshu.com/p/e704a6f6e8d3
Source: Jianshu