MultiBernoulliNegativeLogProbLoss
Inherits From: DistributionNegativeLogProbLoss, NaturalParamsNegativeLogProbLoss
Defined in tensorflow/contrib/kfac/python/ops/loss_functions.py.
Negative log-probability loss for multiple Bernoulli distributions parameterized by logits.
Represents N independent Bernoulli distributions, where N = len(logits). Its Fisher information matrix is given by,
F = diag(p * (1 - p))
p = sigmoid(logits)
As F is diagonal with positive entries, its factor B is,
B = diag(sqrt(p * (1-p)))
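As a quick numeric illustration of the two formulas above, the following sketch computes the diagonals of F and B in plain Python for a single example. The helper names (`sigmoid`, `fisher_diag`, `fisher_factor_diag`) are made up for this illustration and are not part of the kfac API.

```python
import math

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def fisher_diag(logits):
    """Diagonal of F = diag(p * (1 - p)) with p = sigmoid(logits)."""
    return [sigmoid(z) * (1.0 - sigmoid(z)) for z in logits]

def fisher_factor_diag(logits):
    """Diagonal of B = diag(sqrt(p * (1 - p)))."""
    return [math.sqrt(f) for f in fisher_diag(logits)]

logits = [-2.0, 0.0, 3.0]
f_diag = fisher_diag(logits)
b_diag = fisher_factor_diag(logits)
# For a diagonal B, B * B^T reduces to squaring each diagonal entry.
reconstructed = [b * b for b in b_diag]
```

Squaring the entries of `b_diag` recovers `f_diag`, which is the B * B^T = F relationship in the diagonal case.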
dist: The underlying tf.distributions.Distribution.
fisher_factor_inner_shape: The shape of the tensor returned by multiply_fisher_factor.
fisher_factor_inner_static_shape: Static version of fisher_factor_inner_shape.
hessian_factor_inner_shape: The shape of the tensor returned by multiply_hessian_factor.
hessian_factor_inner_static_shape: Static version of hessian_factor_inner_shape.
inputs: The inputs to the loss function (excluding the targets).
params: Parameters to the underlying distribution.
targets: The targets being predicted by the model. Either None or a Tensor of appropriate shape for calling self._evaluate() on.
__init__
__init__(
logits,
targets=None,
seed=None
)
Initialize self. See help(type(self)) for accurate signature.
evaluate
evaluate()
Evaluate the loss function on the targets.
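The sketch below illustrates what such an evaluation amounts to mathematically, assuming the loss is the negative Bernoulli log-probability summed over all entries, as the class description suggests. It is plain Python rather than the TensorFlow implementation, and `neg_log_prob_sketch` is a hypothetical helper name.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def neg_log_prob_sketch(logits, targets):
    """Sum of -log p(target | logit) over [batch, N] nested lists.

    Uses log p(t | z) = t * z - softplus(z) for t in {0, 1}.
    """
    total = 0.0
    for logit_row, target_row in zip(logits, targets):
        for z, t in zip(logit_row, target_row):
            total += softplus(z) - t * z
    return total

logits = [[0.0, 0.0], [4.0, -4.0]]   # batch of 2 examples, N = 2
targets = [[1.0, 0.0], [1.0, 0.0]]
loss = neg_log_prob_sketch(logits, targets)
```

Entries with a logit of 0 each contribute log(2), while confident and correct predictions (the second row) contribute only a small amount.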
evaluate_on_sample
evaluate_on_sample(seed=None)
Evaluates the log probability on a random sample.
seed: int or None. Random seed for this draw from the distribution.
Returns: Log probability of sampled targets, summed across examples.
multiply_fisher
multiply_fisher(vector)
Right-multiply a vector by the Fisher.
vector: The vector to multiply. Must be the same shape(s) as the 'inputs' property.
Returns: The vector right-multiplied by the Fisher. Will be of the same shape(s) as the 'inputs' property.
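Because F is diagonal for this loss, multiplying by it is an elementwise scaling by p * (1 - p). The following is a minimal plain-Python sketch over a batch, assuming inputs of shape [batch, N] stored as nested lists; `multiply_fisher_sketch` is a hypothetical name, not the library method.

```python
import math

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def multiply_fisher_sketch(logits, vector):
    """Multiply `vector` by F = diag(p * (1 - p)), row by row.

    Both arguments are [batch, N] nested lists; the result has the same shape.
    """
    out = []
    for logit_row, vec_row in zip(logits, vector):
        probs = [sigmoid(z) for z in logit_row]
        out.append([p * (1.0 - p) * v for p, v in zip(probs, vec_row)])
    return out

logits = [[0.0, 1.5], [-1.0, 2.0]]   # batch of 2 examples, N = 2
vector = [[1.0, 2.0], [3.0, 4.0]]
fv = multiply_fisher_sketch(logits, vector)
```

The output keeps the shape of the input vector, matching the shape contract stated above.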
multiply_fisher_factor
multiply_fisher_factor(vector)
Right-multiply a vector by a factor B of the Fisher.
Here the 'Fisher' is the Fisher information matrix (i.e. expected outer product of gradients) with respect to the parameters of the underlying probability distribution (whose log-prob defines the loss). Typically this will be block-diagonal across different cases in the batch, since the distribution is usually (but not always) conditionally iid across different cases.
Note that B can be any matrix satisfying B * B^T = F where F is the Fisher, but will agree with the one used in the other methods of this class.
vector: The vector to multiply. Must be of the shape given by the 'fisher_factor_inner_shape' property.
Returns: The vector right-multiplied by B. Will be of the same shape(s) as the 'inputs' property.
multiply_fisher_factor_replicated_one_hot
multiply_fisher_factor_replicated_one_hot(index)
Right-multiply a replicated-one-hot vector by a factor B of the Fisher.
Here the 'Fisher' is the Fisher information matrix (i.e. expected outer product of gradients) with respect to the parameters of the underlying probability distribution (whose log-prob defines the loss). Typically this will be block-diagonal across different cases in the batch, since the distribution is usually (but not always) conditionally iid across different cases.
A 'replicated-one-hot' vector means a tensor which, for each slice along the batch dimension (assumed to be dimension 0), is 1.0 in the entry corresponding to the given index and 0 elsewhere.
Note that B can be any matrix satisfying B * B^T = F where F is the Fisher, but will agree with the one used in the other methods of this class.
index: A tuple representing the index of the entry in each slice that is 1.0. Note that len(index) must be equal to the number of elements of the 'fisher_factor_inner_shape' tensor minus one.
Returns: The vector right-multiplied by B. Will be of the same shape(s) as the 'inputs' property.
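For the diagonal factor B = diag(sqrt(p * (1 - p))) from the class description, applying B to a replicated-one-hot vector leaves a single nonzero entry per batch row. A rough plain-Python sketch, assuming [batch, N] inputs so that `index` is a 1-tuple; the function name is hypothetical:

```python
import math

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def fisher_factor_replicated_one_hot_sketch(logits, index):
    """Apply B = diag(sqrt(p * (1 - p))) to a replicated-one-hot vector.

    `logits` is a [batch, N] nested list and `index` is a 1-tuple naming
    the entry that is 1.0 in every batch row.
    """
    (i,) = index
    out = []
    for row in logits:
        p = sigmoid(row[i])
        result_row = [0.0] * len(row)
        # Only entry i survives because B is diagonal.
        result_row[i] = math.sqrt(p * (1.0 - p))
        out.append(result_row)
    return out

logits = [[0.0, 1.0], [2.0, 0.0]]
result = fisher_factor_replicated_one_hot_sketch(logits, (1,))
```

Working from the index avoids materializing the one-hot tensor, which is presumably the point of having a dedicated method.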
multiply_fisher_factor_transpose
multiply_fisher_factor_transpose(vector)
Right-multiply a vector by the transpose of a factor B of the Fisher.
Here the 'Fisher' is the Fisher information matrix (i.e. expected outer product of gradients) with respect to the parameters of the underlying probability distribution (whose log-prob defines the loss). Typically this will be block-diagonal across different cases in the batch, since the distribution is usually (but not always) conditionally iid across different cases.
Note that B can be any matrix satisfying B * B^T = F where F is the Fisher, but will agree with the one used in the other methods of this class.
vector: The vector to multiply. Must be the same shape(s) as the 'inputs' property.
Returns: The vector right-multiplied by B^T. Will be of the shape given by the 'fisher_factor_inner_shape' property.
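The relationship B * B^T = F can be checked numerically: applying B^T and then B to a vector should match multiplying by F directly. The sketch below does this for a single example with the diagonal B from the class description; all function names are illustrative assumptions.

```python
import math

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def factor_diag(logits):
    """Diagonal of B = diag(sqrt(p * (1 - p))) for one example."""
    return [math.sqrt(sigmoid(z) * (1.0 - sigmoid(z))) for z in logits]

def multiply_factor(logits, vector):
    """Compute B v for the diagonal factor B."""
    return [b * v for b, v in zip(factor_diag(logits), vector)]

def multiply_factor_transpose(logits, vector):
    """Compute B^T v; a diagonal B equals its own transpose."""
    return multiply_factor(logits, vector)

logits = [-1.0, 0.0, 2.0]
vector = [1.0, -2.0, 0.5]
via_factors = multiply_factor(logits, multiply_factor_transpose(logits, vector))
direct = [sigmoid(z) * (1.0 - sigmoid(z)) * v for z, v in zip(logits, vector)]
```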
multiply_hessian
multiply_hessian(vector)
Right-multiply a vector by the Hessian.
Here the 'Hessian' is the Hessian matrix (i.e. matrix of 2nd-derivatives) of the loss function with respect to its inputs.
vector: The vector to multiply. Must be the same shape(s) as the 'inputs' property.
Returns: The vector right-multiplied by the Hessian. Will be of the same shape(s) as the 'inputs' property.
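For a Bernoulli distribution parameterized by logits, the second derivative of the per-entry loss softplus(z) - t * z with respect to z is p * (1 - p) regardless of the target t, so the Hessian has the same diagonal form as the Fisher in the class description. The sketch below checks this with a central finite difference; it illustrates the math only and says nothing about how the library computes the product.

```python
import math

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def loss(z, t):
    """Negative log-probability of target t under a Bernoulli with logit z."""
    return softplus(z) - t * z

def second_derivative(f, z, h=1e-4):
    """Central finite-difference estimate of f''(z)."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)

z, t = 0.7, 1.0
numeric = second_derivative(lambda x: loss(x, t), z)
analytic = sigmoid(z) * (1.0 - sigmoid(z))
```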
multiply_hessian_factor
multiply_hessian_factor(vector)
Right-multiply a vector by a factor B of the Hessian.
Here the 'Hessian' is the Hessian matrix (i.e. matrix of 2nd-derivatives) of the loss function with respect to its inputs. Typically this will be block-diagonal across different cases in the batch, since the loss function is typically summed across cases.
Note that B can be any matrix satisfying B * B^T = H where H is the Hessian, but will agree with the one used in the other methods of this class.
vector: The vector to multiply. Must be of the shape given by the 'hessian_factor_inner_shape' property.
Returns: The vector right-multiplied by B. Will be of the same shape(s) as the 'inputs' property.
multiply_hessian_factor_replicated_one_hot
multiply_hessian_factor_replicated_one_hot(index)
Right-multiply a replicated-one-hot vector by a factor B of the Hessian.
Here the 'Hessian' is the Hessian matrix (i.e. matrix of 2nd-derivatives) of the loss function with respect to its inputs. Typically this will be block-diagonal across different cases in the batch, since the loss function is typically summed across cases.
A 'replicated-one-hot' vector means a tensor which, for each slice along the batch dimension (assumed to be dimension 0), is 1.0 in the entry corresponding to the given index and 0 elsewhere.
Note that B can be any matrix satisfying B * B^T = H where H is the Hessian, but will agree with the one used in the other methods of this class.
index: A tuple representing the index of the entry in each slice that is 1.0. Note that len(index) must be equal to the number of elements of the 'hessian_factor_inner_shape' tensor minus one.
Returns: The vector right-multiplied by B. Will be of the same shape(s) as the 'inputs' property.
multiply_hessian_factor_transpose
multiply_hessian_factor_transpose(vector)
Right-multiply a vector by the transpose of a factor B of the Hessian.
Here the 'Hessian' is the Hessian matrix (i.e. matrix of 2nd-derivatives) of the loss function with respect to its inputs. Typically this will be block-diagonal across different cases in the batch, since the loss function is typically summed across cases.
Note that B can be any matrix satisfying B * B^T = H where H is the Hessian, but will agree with the one used in the other methods of this class.
vector: The vector to multiply. Must be the same shape(s) as the 'inputs' property.
Returns: The vector right-multiplied by B^T. Will be of the shape given by the 'hessian_factor_inner_shape' property.
sample
sample(seed)
Sample 'targets' from the underlying distribution.
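Conceptually, sampling targets means drawing each entry independently as 1 with probability p = sigmoid(logit) and 0 otherwise. The sketch below mimics that with Python's standard `random` module; it is an assumption-laden stand-in for the TensorFlow distribution's sampler, and `sample_targets_sketch` is a hypothetical name.

```python
import math
import random

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def sample_targets_sketch(logits, seed=None):
    """Draw independent Bernoulli targets for a [batch, N] nested list of logits."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return [
        [1.0 if rng.random() < sigmoid(z) else 0.0 for z in row]
        for row in logits
    ]

logits = [[0.0, 1.5, -0.5], [2.0, -2.0, 0.0]]
first_draw = sample_targets_sketch(logits, seed=123)
second_draw = sample_targets_sketch(logits, seed=123)
```

Passing the same seed twice yields the same targets, which mirrors the role of the `seed` argument above.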
© 2018 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/api_docs/python/tf/contrib/kfac/loss_functions/MultiBernoulliNegativeLogProbLoss