class torch.nn.SmoothL1Loss(size_average=None, reduce=None, reduction: str = 'mean', beta: float = 1.0)
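Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise. It is less sensitive to outliers than MSELoss and in some cases prevents exploding gradients (e.g. see the Fast R-CNN paper by Ross Girshick). Also known as the Huber loss (the two coincide exactly when beta is 1):
$$\text{loss}(x, y) = \frac{1}{n} \sum_{i} z_{i}$$
where $z_{i}$ is given by:
$$z_{i} = \begin{cases} 0.5 (x_{i} - y_{i})^{2} / \text{beta}, & \text{if } |x_{i} - y_{i}| < \text{beta} \\ |x_{i} - y_{i}| - 0.5 \cdot \text{beta}, & \text{otherwise} \end{cases}$$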
$x$ and $y$ can have arbitrary shapes with a total of $n$ elements each; the sum operation still operates over all the elements and divides by $n$.
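The definition above (a squared term below beta, an L1 term otherwise, averaged over all $n$ elements) can be sketched in plain Python. This is only an illustrative reference for flat lists of floats, not the library's implementation; the function name `smooth_l1_reference` and its `reduction` handling are assumptions made for this example.

```python
def smooth_l1_reference(x, y, beta=1.0, reduction="mean"):
    """Hypothetical plain-Python sketch of the smooth L1 formula."""
    if beta < 0:
        raise ValueError("beta must be non-negative")
    losses = []
    for xi, yi in zip(x, y):
        diff = abs(xi - yi)
        if diff < beta:
            # Quadratic branch: small errors are squared and scaled by 1/beta.
            losses.append(0.5 * diff * diff / beta)
        else:
            # Linear branch: large errors grow like L1, shifted by 0.5 * beta.
            losses.append(diff - 0.5 * beta)
    if reduction == "none":
        return losses
    total = sum(losses)
    # 'mean' divides by the number of elements n; 'sum' does not.
    return total / len(losses) if reduction == "mean" else total


x = [0.0, 0.5, 3.0]
y = [0.0, 0.0, 0.0]
per_element = smooth_l1_reference(x, y, beta=1.0, reduction="none")
mean_loss = smooth_l1_reference(x, y, beta=1.0, reduction="mean")
print(per_element)  # [0.0, 0.125, 2.5]
print(mean_loss)    # 0.875
```

The error of 0.5 falls in the quadratic region, while the error of 3.0 is treated linearly, which is why a single large outlier contributes 2.5 rather than the 9.0 a squared error would give.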
beta is an optional parameter that defaults to 1.
Note: When beta is set to 0, this is equivalent to L1Loss. Passing a negative value for beta will result in an exception.
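As a quick check of the note above, the following sketch compares SmoothL1Loss with beta set to 0 against L1Loss on random inputs. The tensor shapes and the seed are arbitrary choices for illustration, and the negative-beta case is not exercised here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(4, 3)
y = torch.randn(4, 3)

# With beta=0 the quadratic region is empty, so every element
# falls into the L1 branch of the definition.
smooth_l1_beta0 = nn.SmoothL1Loss(beta=0.0)(x, y)
l1 = nn.L1Loss()(x, y)

same_as_l1 = torch.allclose(smooth_l1_beta0, l1)
print(same_as_l1)
```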
The division by $n$ can be avoided by setting reduction = 'sum'.
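A minimal sketch of the difference between the two reductions, assuming a small hand-picked input: with reduction = 'sum' the result should equal the 'mean' result multiplied by the number of elements $n$.

```python
import torch
import torch.nn as nn

x = torch.tensor([[0.0, 0.5], [3.0, -2.0]])
y = torch.zeros(2, 2)

mean_loss = nn.SmoothL1Loss(reduction="mean")(x, y)  # divides by n
sum_loss = nn.SmoothL1Loss(reduction="sum")(x, y)    # no division by n
n = x.numel()

# Per-element values with beta=1 are 0, 0.125, 2.5 and 1.5.
print(sum_loss.item())         # 4.125
print(mean_loss.item() * n)    # matches the sum up to floating-point error
```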
Parameters:
size_average (bool, optional): Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True
reduce (bool, optional): Deprecated (see reduction). By default, the losses are averaged or summed over observations for each minibatch depending on size_average. When reduce is False, returns a loss per batch element instead and ignores size_average. Default: True
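reduction (string, optional): Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'
beta (float, optional): Specifies the threshold at which to change between L1 and L2 loss. This value defaults to 1.0.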
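Shape:
Input: $(N, *)$ where $*$ means any number of additional dimensions
Target: $(N, *)$, same shape as the input
Output: scalar. If reduction is 'none', then $(N, *)$, same shape as the input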
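The shape behaviour can be illustrated with a short sketch. The input shape (8, 3, 5) is an arbitrary choice for this example; any shape of the form $(N, *)$ should behave the same way.

```python
import torch
import torch.nn as nn

x = torch.zeros(8, 3, 5)  # input of shape (N, *), here N = 8
y = torch.ones(8, 3, 5)   # target with the same shape as the input

elementwise = nn.SmoothL1Loss(reduction="none")(x, y)
reduced = nn.SmoothL1Loss(reduction="mean")(x, y)

print(elementwise.shape)  # torch.Size([8, 3, 5]), same shape as the input
print(reduced.shape)      # torch.Size([]), a scalar
```

Every element here has an absolute error of exactly 1, which is not below the default beta of 1, so each per-element loss is 1 - 0.5 = 0.5.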
© 2019 Torch Contributors
Licensed under the 3-clause BSD License.
https://pytorch.org/docs/1.7.0/generated/torch.nn.SmoothL1Loss.html