Computing the cross_entropy loss function in PyTorch (with Python code)

Keywords: Python Pytorch torch

1. Call

First, the cross-entropy loss function in torch has the following signature:

torch.nn.functional.cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

It is usually written as:

import torch.nn.functional as F
F.cross_entropy(input, target)

2. Parameter description

  • input (Tensor) – shape (N, C), where C is the number of classes; for a 2D loss the shape is (N, C, H, W); in general, for a K-dimensional loss with K ≥ 1, the shape is (N, C, d1, d2, ..., dK).

  • target (Tensor) – shape (N), where each value satisfies 0 ≤ target[i] ≤ C-1; for a K-dimensional loss with K ≥ 1, the shape is (N, d1, d2, ..., dK).

  • weight (Tensor, optional) – a manual rescaling weight for each class. If given, it must be a tensor of size C.

  • size_average (bool, optional) – deprecated. By default, the losses are averaged over every loss element in the batch. Note that for some losses there are multiple elements per sample. If size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True

  • ignore_index (int, optional) – a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over the non-ignored targets. Default: -100

  • reduce (bool, optional) – deprecated. By default, the losses are averaged or summed over the observations in each minibatch, depending on size_average. When reduce is False, a loss per batch element is returned instead and size_average is ignored. Default: True

  • reduction (string, optional) – the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction is applied; 'mean': the sum of the output is divided by the number of elements in the output; 'sum': the output is summed. Note: size_average and reduce are being deprecated, and in the meantime specifying either of them overrides reduction. Default: 'mean'
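To make the reduction, ignore_index, and weight parameters more concrete, here is a minimal sketch. The tensor values, the seed, and the variable names are illustrative assumptions, not taken from the official documentation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only to make the sketch repeatable
logits = torch.randn(4, 3)            # N = 4 samples, C = 3 classes
target = torch.tensor([0, 2, 1, 2])   # class indices in [0, C-1]

# reduction='none' keeps one loss value per sample, shape (N,)
per_sample = F.cross_entropy(logits, target, reduction='none')
total = F.cross_entropy(logits, target, reduction='sum')    # sum of per-sample losses
mean = F.cross_entropy(logits, target, reduction='mean')    # default behaviour

# ignore_index: samples whose target equals this value are skipped,
# and 'mean' then averages over the remaining samples only
masked_target = torch.tensor([0, -100, 1, 2])
masked_mean = F.cross_entropy(logits, masked_target, ignore_index=-100)

# weight: one rescaling factor per class; with 'mean', the weighted losses
# are divided by the sum of the weights of the targets
class_weight = torch.tensor([1.0, 2.0, 0.5])
weighted_mean = F.cross_entropy(logits, target, weight=class_weight)

print(per_sample, total, mean, masked_mean, weighted_mean, sep='\n')
```

In this sketch, total equals per_sample.sum(), mean equals per_sample.mean(), and masked_mean is the average of the three samples that were not ignored.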

3. Examples


import torch
import torch.nn.functional as F
input = torch.randn(3, 5, requires_grad=True)  # logits: 3 samples, 5 classes
target = torch.randint(5, (3,), dtype=torch.int64)  # one class index per sample
loss = F.cross_entropy(input, target)
print(input, target, loss, sep='\n')

The values of input, target, and loss (they differ from run to run, since the data is random):

tensor([[-0.6314,  0.6876,  0.8655, -1.8212,  0.0963],
        [-0.5437,  0.2778, -0.1662, -0.0784, -0.6565],
        [-0.1164,  0.3882,  0.2487, -0.5318,  0.3943]], requires_grad=True)
tensor([1, 0, 0])
tensor(1.6557, grad_fn=<NllLossBackward>)
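The example above covers the (N, C) case. For the (N, C, H, W) case from the parameter description, such as a per-pixel classification loss, a minimal sketch might look like the following. The sizes are chosen arbitrarily for illustration.

```python
import torch
import torch.nn.functional as F

N, C, H, W = 2, 5, 4, 4                # illustrative sizes
logits = torch.randn(N, C, H, W)       # one score per class for every pixel
target = torch.randint(C, (N, H, W))   # one class index per pixel, no C dimension

# One loss value per pixel, shape (N, H, W)
loss_map = F.cross_entropy(logits, target, reduction='none')
# Default 'mean' averages over all N*H*W pixels and returns a scalar
loss = F.cross_entropy(logits, target)

# Equivalent 2D form: move C to the last axis and flatten everything else
flat_logits = logits.permute(0, 2, 3, 1).reshape(-1, C)
flat_loss = F.cross_entropy(flat_logits, target.reshape(-1))

print(loss_map.shape, loss, flat_loss, sep='\n')
```

The class dimension stays at position 1 of the input, and the target simply drops that dimension.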

4. Notes

In the PyTorch version used here, torch.nn.functional.cross_entropy is implemented in Python as follows (newer releases may differ):

def cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100,
                  reduce=None, reduction='mean'):
    if size_average is not None or reduce is not None:
        reduction = _Reduction.legacy_get_string(size_average, reduce)
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
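The one-line implementation above can be checked numerically. The following sketch compares the built-in call with the explicit log_softmax + nll_loss composition and with a hand-written version. The seed and tensor values are arbitrary assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed for repeatability
input = torch.randn(3, 5)           # 3 samples, 5 classes
target = torch.tensor([1, 0, 4])    # one class index per sample

# Built-in function
builtin = F.cross_entropy(input, target)

# Same computation in two explicit steps
log_probs = F.log_softmax(input, dim=1)
two_step = F.nll_loss(log_probs, target)

# By hand: pick the log-probability of the target class in each row,
# negate it, and average over the batch
picked = log_probs[torch.arange(input.size(0)), target]
manual = -picked.mean()

print(builtin, two_step, manual, sep='\n')
```

All three values should agree up to floating-point rounding.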

Note 1: the input tensor does not need to go through softmax first. The raw output (logits) of the final fc (fully connected) layer can be passed directly to cross_entropy, because cross_entropy applies log_softmax to its input internally.
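A small sketch of what goes wrong when softmax is applied before the call. The logits here are hand-picked values standing in for the output of a final fully connected layer; they are an assumption made for illustration.

```python
import torch
import torch.nn.functional as F

# Hand-picked logits standing in for the output of a final fc layer;
# each row strongly favors the correct class.
logits = torch.tensor([[10.0, 0.0, 0.0],
                       [0.0, 10.0, 0.0]])
target = torch.tensor([0, 1])

# Correct: pass the raw logits, cross_entropy applies log_softmax itself
correct = F.cross_entropy(logits, target)

# Mistake: softmax is applied here and then again inside cross_entropy,
# which squeezes the scores into [0, 1] and keeps the loss far from 0
double = F.cross_entropy(F.softmax(logits, dim=1), target)

print(correct.item(), double.item())
```

With these inputs the first loss is close to 0, while the second stays above 0.5 even though the predictions are confidently correct.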

Note 2: there is no need to one-hot encode the labels, because nll_loss picks out the log-probability at the target index, which has the same effect as multiplying by a one-hot vector. The labels must be class indices starting at 0: if the classes are numbered [1, 2, 3], remap them to [0, 1, 2].
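The sketch below shows both points: shifting labels that start at 1 down to start at 0, and checking that the index-based loss matches an explicit one-hot computation. The label values and the seed are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed for repeatability
logits = torch.randn(4, 3)                 # 4 samples, 3 classes

raw_labels = torch.tensor([1, 3, 2, 1])    # labels numbered 1..3 in the data
target = raw_labels - 1                    # shift to 0..2 as cross_entropy expects

# Index-based loss: no one-hot encoding needed
index_loss = F.cross_entropy(logits, target)

# Explicit one-hot version of the same computation
one_hot = F.one_hot(target, num_classes=3).float()
one_hot_loss = -(one_hot * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(index_loss, one_hot_loss, sep='\n')
```

Both values should agree up to floating-point rounding.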

For reference, see the official documentation: torch.nn.functional (PyTorch master documentation)


Posted by wilhud on Wed, 01 Dec 2021 00:39:19 -0800