NLP learning notes
Introduction to PyTorch
Basic PyTorch operations
Tensors
```python
from __future__ import print_function
import torch

# Create an uninitialized matrix
x = torch.empty(5, 3)
print(x)

# Create a randomly initialized matrix
x = torch.rand(5, 3)
print(x)  # prints a tensor of shape (5, 3)
```
- Notice that the five-row, three-column matrix created with the empty method is not all zeros: empty does not zero out the allocated memory, so it keeps whatever values were already there. The rand method creates an initialized matrix whose values are drawn uniformly from [0, 1); for the standard Gaussian (standard normal) distribution, use randn instead. A quick check is sketched below.
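A minimal sketch checking the two distributions empirically (the sample size 10000 is arbitrary):

```python
import torch

u = torch.rand(10000)   # uniform samples on [0, 1)
n = torch.randn(10000)  # standard normal samples
print(u.min().item(), u.max().item())   # stays inside [0, 1)
print(n.mean().item(), n.std().item())  # roughly 0 and 1
```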
```python
# Create an all-zero matrix and specify the element type as long
x = torch.zeros(5, 3, dtype=torch.long)

# Create a tensor directly from data
x = torch.tensor([2.5, 3.5])

# Create a new tensor with the same size from an existing tensor
x = x.new_ones(5, 3, dtype=torch.double)    # new_* methods take in sizes
y = torch.randn_like(x, dtype=torch.float)  # randn_like keeps the size of x and fills it with random values

# Get the tensor size with the size() method
print(x.size())  # torch.Size([5, 3]); the return value behaves like a tuple
```
Basic operations
- Addition operation
```python
x = torch.rand(5, 3)
y = torch.rand(5, 3)

# Addition; the two forms give the same result
print(x + y)
print(torch.add(x, y))

# Store the addition result in a pre-allocated tensor
result = torch.empty(5, 3)
torch.add(x, y, out=result)
print(result)

# In-place addition
y.add_(x)  # similar to y += x
print(y)
```
- Extract specific rows and columns
```python
print(x[:, 1])   # all rows, column index 1 (the second column)
print(x[:, :2])  # all rows, the first two columns
```
- Change tensor shape
```python
x = torch.randn(4, 4)
# view() must keep the total number of elements unchanged
y = x.view(16)
# -1 tells view() to infer that dimension automatically
z = x.view(-1, 8)  # -1 resolves to 2 here (16 / 8)

# When a tensor holds exactly one element, item() extracts it as a Python number
x = torch.randn(1)
print(x.item())
```
Converting between Torch Tensors and NumPy arrays
- They share the underlying memory space, so changing the value of one will change the other.
```python
# Torch tensor to NumPy array with the numpy() method
a = torch.ones(5)
b = a.numpy()

# Changing one changes the other
a.add_(1)
print(a)
print(b)

# NumPy array to Torch tensor
import numpy as np

a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)
```
- All tensors on the CPU, except CharTensor, support converting to NumPy and back.
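A minimal sketch of the round trip for a couple of CPU dtypes (the dtype choices here are just for illustration):

```python
import torch

for dtype in (torch.float64, torch.int64):
    t = torch.ones(3, dtype=dtype)
    n = t.numpy()               # torch -> NumPy, shares memory
    back = torch.from_numpy(n)  # NumPy -> torch, shares memory
    print(t.dtype, n.dtype, back.dtype)
```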
```python
# CUDA tensors: the to() method moves a tensor to any device
if torch.cuda.is_available():
    # Define a device object pointing at the GPU
    device = torch.device("cuda")
    # Create a tensor directly on the GPU
    y = torch.ones_like(x, device=device)
    # Move a CPU tensor to the GPU
    x = x.to(device)
    # Operations require all operands to be on the same device
    z = x + y  # z is created on the GPU automatically
    print(z)
    # Move z back to the CPU and change the element type at the same time
    print(z.to("cpu", torch.double))
```
autograd in PyTorch
- In the PyTorch framework, the autograd package is central to all neural networks: it provides automatic differentiation for all operations on Tensors.
- torch.Tensor is the core class of the package. If its attribute requires_grad is set to True, all operations on the tensor are tracked. When back propagation is needed, calling backward() computes all the gradients automatically (a sketch follows the examples below); the gradients on this tensor are accumulated into its grad attribute.
- You can call detach() to stop a tensor from tracking history in the computation graph. You can also wrap code in a with torch.no_grad(): block so that gradients are no longer tracked.
- The Function class is as important as the Tensor class: every tensor has a grad_fn attribute that references the Function that created it.
- If a tensor was created directly by the user, its grad_fn attribute is None.
- About Tensor:
```python
import torch

x1 = torch.ones(3, 3)
print(x1)
x = torch.ones(2, 2, requires_grad=True)
print(x)
```
- Output:
```
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
```
```python
# Operate on a tensor with requires_grad=True
y = x + 2
print(y)
# tensor([[3., 3.],
#         [3., 3.]], grad_fn=<AddBackward0>)

print(x.grad_fn)  # None: x was created directly by the user
print(y.grad_fn)  # <AddBackward0 object at 0x000001CDF6C18848>
```
```python
# Perform more complex operations
z = y * y * 3
out = z.mean()
print(z)    # tensor([[27., 27.],
            #         [27., 27.]], grad_fn=<MulBackward0>)
print(out)  # tensor(27., grad_fn=<MeanBackward0>)
```
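backward() is mentioned above but never called; continuing from the code above, a minimal sketch. Since out = (3/4) * Σ (x_i + 2)^2, each gradient is ∂out/∂x_i = 1.5 * (x_i + 2) = 4.5:

```python
# Back-propagate from the scalar out; gradients accumulate into x.grad
out.backward()
print(x.grad)
# tensor([[4.5000, 4.5000],
#         [4.5000, 4.5000]])
```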
- The requires_grad_() method changes a tensor's requires_grad attribute in place; if it is not set explicitly, it defaults to False. A minimal sketch follows below.
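```python
a = torch.randn(2, 2)
print(a.requires_grad)  # False by default
a.requires_grad_(True)  # changes the attribute in place
print(a.requires_grad)  # True
```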
- For automatic differentiation you can set requires_grad=True on a tensor, or you can stop gradient tracking for a restricted code block.
```python
print(x.requires_grad)             # True
print((x ** 2).requires_grad)      # True
with torch.no_grad():
    print((x ** 2).requires_grad)  # False

# detach() creates a tensor with the same content that does not require gradients
print(x.requires_grad)  # True
y = x.detach()
print(y.requires_grad)  # False
print(x.eq(y).all())    # tensor(True)
```