Cross entropy loss in PyTorch measures how well a classification model's predicted class scores match the true label of each sample. The loss value is used during training to judge, and improve, how well the model fits a classification problem. nn.CrossEntropyLoss is a popular training loss in machine learning because it works well for training neural networks on classification problems.
Purpose of using Cross Entropy Loss function
The purpose of cross entropy loss is to measure the difference between the probabilities predicted by a neural network and the actual probabilities of the target variable in a classification problem. It is a popular loss function in machine learning, particularly in deep learning, because it suits models with multiple classes and penalizes incorrect predictions effectively.
Training minimizes this difference by adjusting the model parameters. The lower the cross entropy loss, the better the model is performing on the given classification task, so the goal during training is to drive the loss down, which in turn tends to improve the accuracy of the model.
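To make the penalty concrete, here is a minimal sketch in plain Python (no PyTorch) of the cross entropy between a one-hot target and a predicted distribution. The function name `cross_entropy` and the three example distributions are made up for illustration.

```python
import math

def cross_entropy(target_probs, predicted_probs):
    """Cross entropy H(p, q) = -sum_i p_i * log(q_i), in nats."""
    return -sum(p * math.log(q)
                for p, q in zip(target_probs, predicted_probs)
                if p > 0)  # terms with p_i == 0 contribute nothing

# The true class is class 0 (one-hot target).
target = [1.0, 0.0, 0.0]

confident_right = [0.90, 0.05, 0.05]  # most mass on the true class
unsure = [0.40, 0.30, 0.30]           # mass spread over all classes
confident_wrong = [0.05, 0.90, 0.05]  # most mass on a wrong class

loss_right = cross_entropy(target, confident_right)
loss_unsure = cross_entropy(target, unsure)
loss_wrong = cross_entropy(target, confident_wrong)

print(f"confident and right: {loss_right:.4f}")
print(f"unsure:              {loss_unsure:.4f}")
print(f"confident and wrong: {loss_wrong:.4f}")
```

The loss grows as probability mass moves away from the true class, and it grows fastest when the model is confidently wrong.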
Cross entropy loss
The CrossEntropyLoss() class from the torch.nn module computes the cross entropy loss between the input (the model's predicted scores, or logits) and the target (the actual classes).
Cross entropy loss is designed for classification problems. In particular, multiclass classification problems can be trained very effectively using CrossEntropyLoss().
The following example calculates the cross entropy of dummy variables. The loss is never negative and has no upper bound: 0 corresponds to a perfect model whose predictions match the actual values, so the goal is to bring the loss as close to 0 as possible.
    import torch
    from torch import nn

    loss_fn = nn.CrossEntropyLoss()

    # A dummy tensor, printed only to show what a tensor looks like
    input = torch.ones(5)
    print("Input =", input)

    # Predicted scores (logits) for one sample over 4 classes
    input = torch.tensor([[0, 1, 2, 3]], dtype=torch.float)
    print("Input (predicted logits) =", input)

    # Actual class index of that sample; class 3 is the value that
    # reproduces the loss printed below
    y = torch.tensor([3], dtype=torch.long)
    print("Target (actual class) =", y)

    crosseloss = loss_fn(input, y)
    print("Cross Entropy Loss:", crosseloss)

Output:

    Input = tensor([1., 1., 1., 1., 1.])
    Input (predicted logits) = tensor([[0., 1., 2., 3.]])
    Target (actual class) = tensor([3])
    Cross Entropy Loss: tensor(0.4402)
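The loss of 0.4402 printed above can be reproduced by hand. The sketch below uses only the standard library and assumes the target is class index 3, which is the index that yields this value for the logits [0, 1, 2, 3]; the helper name `cross_entropy_from_logits` is hypothetical.

```python
import math

def cross_entropy_from_logits(logits, target_index):
    """-log(softmax(logits)[target_index]), computed via log-sum-exp."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]

logits = [0.0, 1.0, 2.0, 3.0]

loss_class_3 = cross_entropy_from_logits(logits, target_index=3)
loss_class_0 = cross_entropy_from_logits(logits, target_index=0)

print(f"target = class 3: {loss_class_3:.4f}")
print(f"target = class 0: {loss_class_0:.4f}")
```

Class 3 has the largest logit, so choosing it as the target gives a small loss. Choosing class 0, which has the smallest logit, gives a loss above 3, which also shows that the loss is not limited to the range 0 to 1.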
Cross entropy loss PyTorch backward
Here is a demonstration of cross entropy loss with PyTorch's backward() in Python.
- Calling backward() on the cross entropy loss computes the gradient of the loss with respect to every tensor that has requires_grad=True.
- Softmax is not a loss but a function. It turns a vector of scores into a probability for each element.
- backward() only computes gradients; it does not update any values.
- In this example, softmax() is used to build a target made of class probabilities, and backward() then determines the gradient of the loss with respect to the input.
- An optimizer can use these gradients to adjust the parameters, so that the loss improves in the next training iteration.
    # Target given as class probabilities
    # (this form of target needs a reasonably recent PyTorch release)
    import torch
    import torch.nn as nn

    # Input scores (logits) for 3 classes; requires_grad=True tracks gradients
    input = torch.ones(3, requires_grad=True)
    print("Input =", input)

    # Random target probabilities that sum to 1
    pred = torch.randn(3).softmax(dim=0)
    print("predicted size:", pred.size())

    lss = nn.CrossEntropyLoss()
    out = lss(input, pred)
    out.backward()  # fills input.grad with d(loss)/d(input)

    print("Input:", input)
    print("predicted:", pred)
    print("Cross Entropy Loss:", out)
    print("Input grads:", input.grad)

Output (the target is random, so the predicted values and gradients vary from run to run):

    Input = tensor([1., 1., 1.], requires_grad=True)
    predicted size: torch.Size([3])
    Input: tensor([1., 1., 1.], requires_grad=True)
    predicted: tensor([0.6004, 0.3339, 0.0657])
    Cross Entropy Loss: tensor(1.0986, grad_fn=<DivBackward1>)
    Input grads: tensor([-0.2670, -0.0006,  0.2676])
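The gradients printed above follow a simple rule: when the target probabilities sum to 1, the gradient of the loss with respect to the input scores is softmax(input) minus the target. The sketch below checks this in plain Python against a finite-difference estimate and then takes one gradient descent step. The helper names and the learning rate of 0.5 are assumptions for illustration, and the target is copied from the printed (rounded) output.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_cross_entropy(logits, target_probs):
    """-sum_i t_i * log(softmax(logits)_i)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs))

logits = [1.0, 1.0, 1.0]
target = [0.6004, 0.3339, 0.0657]  # rounded values from the run above

# Analytic gradient: softmax(logits) - target (valid when target sums to 1).
analytic = [p - t for p, t in zip(softmax(logits), target)]

# Central finite differences as an independent check.
eps = 1e-6
numeric = []
for i in range(len(logits)):
    up = list(logits)
    down = list(logits)
    up[i] += eps
    down[i] -= eps
    numeric.append((soft_target_cross_entropy(up, target)
                    - soft_target_cross_entropy(down, target)) / (2 * eps))

# One gradient descent step with a hypothetical learning rate.
learning_rate = 0.5
updated = [z - learning_rate * g for z, g in zip(logits, analytic)]
loss_before = soft_target_cross_entropy(logits, target)
loss_after = soft_target_cross_entropy(updated, target)

print("analytic gradient:", [round(g, 4) for g in analytic])
print("numeric gradient: ", [round(g, 4) for g in numeric])
print(f"loss before step: {loss_before:.4f}")
print(f"loss after step:  {loss_after:.4f}")
```

The loss before the step equals log(3), about 1.0986, because three equal scores give a uniform softmax. Moving the scores against the gradient lowers the loss, which is what an optimizer does after backward() in a training loop.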
Use backward() function without softmax() function
    import torch
    import torch.nn as nn

    loss = nn.CrossEntropyLoss()

    # Describe the input variable
    input = torch.ones(3, requires_grad=True)

    # torch.empty() returns uninitialized memory, so the target holds
    # arbitrary values that are not normalized probabilities
    y = torch.empty(3)

    out = loss(input, y)
    out.backward()

    print("Input:", input)
    print("predicted:", y)
    print("Cross Entropy Loss:", out)
    print("Input grads:", input.grad)

Output (the target values are arbitrary, so the numbers differ from run to run):

    Input: tensor([1., 1., 1.], requires_grad=True)
    predicted: tensor([0.0000e+00, 1.8980e+01, 7.7549e-14])
    Cross Entropy Loss: tensor(20.8519, grad_fn=<DivBackward1>)
    Input grads: tensor([  6.3267, -12.6535,   6.3267])
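The large loss above comes from the target not being normalized. With three equal input scores, the log-softmax of every class is -log(3), so the loss is simply log(3) times the sum of the target values. The plain-Python sketch below retraces the printed number under that reasoning, using the rounded target values from the output.

```python
import math

# Target values as printed above (rounded by the tensor display).
target = [0.0, 18.980, 7.7549e-14]

# For input scores [1, 1, 1], log-softmax is -log(3) for every class.
log_softmax_uniform = -math.log(3)

# Loss with the raw, unnormalized target.
raw_loss = -sum(t * log_softmax_uniform for t in target)

# Loss after rescaling the target so that it sums to 1.
total = sum(target)
normalized = [t / total for t in target]
normalized_loss = -sum(t * log_softmax_uniform for t in normalized)

print(f"unnormalized target: {raw_loss:.2f}")
print(f"normalized target:   {normalized_loss:.4f}")
```

Once the target sums to 1, the loss drops back to log(3), about 1.0986, the same value as in the softmax example. This is why a probability target should be normalized before it is passed to the loss.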
Logits with binary Cross entropy loss
The following example demonstrates binary cross entropy loss computed from logits in PyTorch.
- In cross entropy loss, logits are the raw, unnormalized outputs of the last layer of the network. The logit function is the inverse of the logistic sigmoid function.
- Define a dummy target and a dummy prediction to test the loss function.
- Use the BCEWithLogitsLoss(pos_weight=pos_wgt) class from the torch.nn module, and construct it with a positive-class weight.
- Assign the loss object to a variable, then call it with the dummy prediction and target as arguments.
- PyTorch provides both nn.BCELoss and nn.BCEWithLogitsLoss. The first expects probabilities that have already been normalized with a sigmoid, while the second takes raw, unnormalized logits.
- Binary cross entropy applies to problems with two classes.
    import torch

    # Batch size = 2, with 3 independent binary labels per sample
    target = torch.ones([2, 3], dtype=torch.float32)
    # A prediction (logit) for every label
    pred = torch.full([2, 3], 1.0)
    # All positive-class weights are equal to 1
    pos_wgt = torch.ones([3])

    c = torch.nn.BCEWithLogitsLoss(pos_weight=pos_wgt)

    print("Target: \n", target)
    print("predicted: \n", pred)
    print("Positive weights: \n", pos_wgt)

    # Every element contributes -log(sigmoid(1.0))
    x = c(pred, target)
    print("Cross Entropy Loss: \n", x)

Output:

    Target: 
     tensor([[1., 1., 1.],
            [1., 1., 1.]])
    predicted: 
     tensor([[1., 1., 1.],
            [1., 1., 1.]])
    Positive weights: 
     tensor([1., 1., 1.])
    Cross Entropy Loss: 
     tensor(0.3133)
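The value 0.3133 is -log(sigmoid(1.0)). The sketch below is a plain-Python illustration of the arithmetic, not PyTorch's implementation: it computes binary cross entropy once from a probability (the kind of input nn.BCELoss expects) and once from a raw logit (the kind nn.BCEWithLogitsLoss expects), and shows that the logit function undoes the sigmoid. All helper names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Inverse of the sigmoid: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def softplus(z):
    """log(1 + exp(z)), written so that large |z| does not overflow."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def bce_from_probability(p, y):
    """Binary cross entropy on a probability p for a label y in {0, 1}."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def bce_from_logit(x, y, pos_weight=1.0):
    """Binary cross entropy on a raw logit x, with a positive-class weight."""
    # softplus(-x) = -log(sigmoid(x)) and softplus(x) = -log(1 - sigmoid(x))
    return pos_weight * y * softplus(-x) + (1.0 - y) * softplus(x)

x, y = 1.0, 1.0
from_logit = bce_from_logit(x, y)
from_prob = bce_from_probability(sigmoid(x), y)
weighted = bce_from_logit(x, y, pos_weight=2.0)

print(f"from logit:       {from_logit:.4f}")
print(f"from probability: {from_prob:.4f}")
print(f"pos_weight = 2:   {weighted:.4f}")
print(f"logit(sigmoid(1.0)) = {logit(sigmoid(x)):.4f}")
```

Both routes give the same loss. Working from the logit avoids taking the log of a probability that has been rounded to 0 or 1, which is a common reason to prefer the logit form. A pos_weight of 2 doubles the loss of positive examples only.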
Cross entropy loss PyTorch weight
The example below illustrates the weight parameter of cross entropy loss in PyTorch. CrossEntropyLoss(weight=...) from the torch.nn module takes a tensor holding one manually chosen weight per class, which rescales the loss contribution of each class.

    import torch
    from torch import nn

    # One weight per class, assigned manually
    class_weights = torch.tensor([1.3, 2.0])
    loss_fn = nn.CrossEntropyLoss(weight=class_weights)

    # Input variable: logits for 2 samples over 2 classes
    input_var = torch.tensor([[1.3, 2.0], [1.3, 2.0]])
    # Target variable: both samples belong to class 1 (batch size = 2)
    target_var = torch.tensor([1, 1])

    out = loss_fn(input_var, target_var)
    print(out)
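With the default mean reduction, class weights enter as a weighted average: each sample's loss is multiplied by the weight of its target class, and the sum is divided by the sum of those weights. The plain-Python sketch below follows that formula with the same logits and weights as above; the function names and the extra mixed-target case are assumptions added for illustration.

```python
import math

def log_softmax(logits):
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def weighted_cross_entropy(batch_logits, targets, class_weights):
    """Weighted mean: sum_i w[y_i] * loss_i / sum_i w[y_i]."""
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, y in zip(batch_logits, targets):
        sample_loss = -log_softmax(logits)[y]
        weighted_sum += class_weights[y] * sample_loss
        weight_total += class_weights[y]
    return weighted_sum / weight_total

class_weights = [1.3, 2.0]
batch_logits = [[1.3, 2.0], [1.3, 2.0]]

# Both samples in class 1, as in the example above.
same_class = weighted_cross_entropy(batch_logits, [1, 1], class_weights)

# Hypothetical variation: one sample per class, with and without weights.
mixed_weighted = weighted_cross_entropy(batch_logits, [0, 1], class_weights)
mixed_unweighted = weighted_cross_entropy(batch_logits, [0, 1], [1.0, 1.0])

print(f"targets [1, 1], weighted:   {same_class:.4f}")
print(f"targets [0, 1], weighted:   {mixed_weighted:.4f}")
print(f"targets [0, 1], unweighted: {mixed_unweighted:.4f}")
```

When every sample has the same target class, the weight cancels out of the average, so the weighted and unweighted losses agree. The weights only change the result when the batch contains more than one class.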
Cross-Entropy Loss implemented on classification model
A cost function measures the discrepancy between predicted and expected values. In classification models, the cross entropy cost measures how well the predicted probabilities, which lie between 0 and 1, match the actual labels.
A classification model with a low cross entropy loss has found a good decision boundary for the given weights and biases.
    # Import libraries
    import numpy as np
    import matplotlib.pyplot as plt

    def func_sigmoid(score):
        """Map a score to a probability between 0 and 1."""
        return 1 / (1 + np.exp(-score))

    def error(line_parameter, points, y):
        """Mean binary cross entropy of the points for the given line."""
        m = points.shape[0]  # number of sample points
        p = func_sigmoid(points @ line_parameter)
        # Cross entropy loss formula implementation
        cross_entropy = -(1 / m) * (np.log(p).T @ y + np.log(1 - p).T @ (1 - y))
        return cross_entropy

    # Number of sample points per group
    sample_points = 3
    # Fix the random seed so the sample points are reproducible
    np.random.seed(0)
    bias = np.ones(sample_points)
    t_regression = np.array([np.random.normal(5, 2, sample_points),
                             np.random.normal(5, 2, sample_points),
                             bias]).T
    b_regression = np.array([np.random.normal(3, 2, sample_points),
                             np.random.normal(3, 2, sample_points),
                             bias]).T
    # Stack both groups vertically into one (6, 3) array
    allsamplepoints = np.vstack((t_regression, b_regression))

    # Weights and bias of the decision line
    w1 = -0.3
    w2 = -0.35
    b = 2.5
    line = np.array([[w1, w2, b]]).T  # column vector of shape (3, 1)
    print('line_parameter : \n', line)
    print('all_points : \n', allsamplepoints)

    # Linear combination of every point with the line parameters
    linear = allsamplepoints @ line
    print('linear combination : \n', linear)
    # Probability of each point being in the positive region
    positive_likelihood = func_sigmoid(linear)
    print("probabilities", positive_likelihood)

    # Labels as a (6, 1) column: 0 for the first group, 1 for the second
    y = np.array([np.zeros(sample_points),
                  np.ones(sample_points)]).reshape(sample_points * 2, 1)

    # Print the cross entropy loss
    print('Cross Entropy Loss:', error(line, allsamplepoints, y))

    # Plot the two groups of sample points
    _, axis = plt.subplots(figsize=(4, 4))
    axis.scatter(t_regression[:, 0], t_regression[:, 1], c='gray')
    axis.scatter(b_regression[:, 0], b_regression[:, 1], c='r')
    # Display the 2D graph
    plt.show()
    line_parameter : 
     [[-0.3 ]
     [-0.35]
     [ 2.5 ]]
    all_points : 
     [[8.52810469 9.4817864  1.        ]
     [5.80031442 8.73511598 1.        ]
     [6.95747597 3.04544424 1.        ]
     [4.90017684 3.821197   1.        ]
     [2.69728558 3.28808714 1.        ]
     [2.7935623  5.90854701 1.        ]]
    linear combination : 
     [[-3.37705665]
     [-2.29738492]
     [-0.65314827]
     [-0.307472  ]
     [ 0.53998383]
     [-0.40606014]]
    probabilities [[0.03302025]
     [0.09133977]
     [0.34228043]
     [0.42373191]
     [0.63180866]
     [0.3998572 ]]
    Cross Entropy Loss: [[0.46380153]]
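The final loss can be retraced from the printed probabilities alone. The sketch below, in plain Python, assumes the labels are 0 for the first three points and 1 for the last three (the order in which the two groups are stacked) and lists how much each point contributes. The names `probabilities`, `labels` and `point_loss` are chosen for illustration.

```python
import math

# Probabilities printed above, one per sample point.
probabilities = [0.03302025, 0.09133977, 0.34228043,
                 0.42373191, 0.63180866, 0.3998572]
# Assumed labels: first group 0, second group 1.
labels = [0, 0, 0, 1, 1, 1]

def point_loss(p, y):
    """Cross entropy contribution of one point with label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

contributions = [point_loss(p, y) for p, y in zip(probabilities, labels)]
mean_loss = sum(contributions) / len(contributions)

for p, y, c in zip(probabilities, labels, contributions):
    print(f"label={y}  p={p:.4f}  loss={c:.4f}")
print(f"mean cross entropy: {mean_loss:.4f}")
```

The points labeled 1 whose probability is below 0.5 contribute the most, which indicates that the chosen line places them on the wrong side of the decision boundary.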
This article demonstrates how the CrossEntropyLoss function used for classification is implemented in PyTorch, including its usage with softmax, weights, and the backward() function. The following topics are covered:
- The purpose of using the Cross Entropy Loss function
- The Cross Entropy Loss function itself
- Implementing the Cross Entropy Loss function with PyTorch’s backward() function
- Logits with binary Cross Entropy Loss
- Using weights with the Cross Entropy Loss function in PyTorch
- Implementation of Cross-Entropy Loss on a classification model.