FullyConnectedDiagonalFB
Inherits From: FisherBlock
Defined in tensorflow/contrib/kfac/python/ops/fisher_blocks.py
FisherBlock for fully-connected (dense) layers using a diagonal approx.
Estimates the diagonal entries of the Fisher information matrix for a fully connected layer. Unlike NaiveDiagonalFB, this uses the low-variance "sum of squares" estimator.
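Let 'params' be a vector parameterizing a model and 'i' an arbitrary index into it. We are interested in Fisher(params)[i, i]. This is,
Fisher(params)[i, i] = E[ v(x, y, params) v(x, y, params)^T ][i, i] = E[ v(x, y, params)[i]^2 ]
where v(x, y, params) = (d / d params) log p(y | x, params) and the expectation is taken over examples (x, y).
Consider a fully connected layer in this model with (unshared) weight matrix 'w'. For an example 'x' that produces layer inputs 'a' and output preactivations 's',
v(x, y, w) = vec( a (d log p(y | x, params) / d s)^T )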
This FisherBlock tracks Fisher(params)[i, i] for all indices 'i' corresponding to the layer's parameters 'w'.
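The "sum of squares" estimator described above can be sketched in NumPy. This is a minimal illustration, not the library's implementation: the function name `diagonal_fisher_sum_of_squares` and the arrays `a` and `g` are hypothetical, and `g` is assumed to hold the per-example gradients with respect to the preactivations 's'.

```python
import numpy as np


def diagonal_fisher_sum_of_squares(a, g):
    """Averages the squared per-example gradients vec(a g^T)^2 over a batch.

    a: [batch_size, input_size] layer inputs.
    g: [batch_size, output_size] gradients w.r.t. the preactivations s.
    Returns an [input_size, output_size] array, one entry per weight in w.
    """
    # (a_i * g_j)^2 == a_i^2 * g_j^2, so the batch of squared outer products
    # collapses into one matrix product of the elementwise-squared arrays.
    return (a ** 2).T @ (g ** 2) / a.shape[0]


rng = np.random.default_rng(0)
a = rng.normal(size=(64, 3))  # hypothetical layer inputs
g = rng.normal(size=(64, 2))  # hypothetical preactivation gradients

# Reference computation: square each example's outer product, then average.
reference = np.mean([np.outer(a_n, g_n) ** 2 for a_n, g_n in zip(a, g)], axis=0)
fisher_diag = diagonal_fisher_sum_of_squares(a, g)
```

Because each example's contribution is squared before averaging, the result has the same shape as 'w' and never requires forming the full Fisher matrix.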
num_registered_towers
__init__
__init__( layer_collection, has_bias=False )
Creates a FullyConnectedDiagonalFB block.
layer_collection
: The collection of all layers in the K-FAC approximate Fisher information matrix to which this FisherBlock belongs.
has_bias
: Whether the component Kronecker factors have an additive bias. (Default: False)
instantiate_factors
instantiate_factors( grads_list, damping )
Creates and registers the component factors of this Fisher block.
grads_list
: A list of gradients (each a Tensor or tuple of Tensors) with respect to the tensors returned by tensors_to_compute_grads() that are to be used to estimate the block.
damping
: The damping factor (float or Tensor).
multiply
multiply(vector)
Multiplies the vector by the (damped) block.
vector
: The vector (a Tensor or tuple of Tensors) to be multiplied.
Returns the vector left-multiplied by the (damped) block.
multiply_inverse
multiply_inverse(vector)
Multiplies the vector by the (damped) inverse of the block.
vector
: The vector (a Tensor or tuple of Tensors) to be multiplied.
Returns the vector left-multiplied by the (damped) inverse of the block.
multiply_matpower
multiply_matpower( vector, exp )
Multiplies the vector by the (damped) matrix-power of the block.
vector
: A Tensor or 2-tuple of Tensors. If self._has_bias is False, a Tensor of shape [input_size, output_size] corresponding to the layer's weights. Otherwise, a 2-tuple of the former and a Tensor of shape [output_size] corresponding to the layer's bias.
exp
: A scalar representing the power to which the block is raised before it is multiplied by the vector.
Returns the vector left-multiplied by the (damped) matrix-power of the block.
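For a diagonal block, multiplying by a matrix power reduces to an elementwise operation. The NumPy sketch below illustrates the idea under the assumption that damping is added to every diagonal entry before the power is taken. The helper `multiply_matpower_diagonal` is hypothetical and is not the library's code.

```python
import numpy as np


def multiply_matpower_diagonal(fisher_diag, vector, exp, damping):
    """Applies (diag(fisher_diag) + damping * I)^exp to `vector` elementwise.

    fisher_diag and vector share the layer's weight shape
    [input_size, output_size]; exp and damping are scalars.
    Assumes damping is simply added to each diagonal entry.
    """
    return vector * (fisher_diag + damping) ** exp


fisher_diag = np.array([[4.0, 1.0], [0.0, 9.0]])  # hypothetical diagonal estimates
vector = np.ones_like(fisher_diag)
damping = 1.0

# exp=1 mirrors multiply(); exp=-1 mirrors multiply_inverse().
product = multiply_matpower_diagonal(fisher_diag, vector, 1, damping)
inverse_product = multiply_matpower_diagonal(fisher_diag, vector, -1, damping)
round_trip = multiply_matpower_diagonal(fisher_diag, inverse_product, 1, damping)
```

Note that the damping keeps the inverse well defined even where an estimated diagonal entry is zero, as in the third entry of `fisher_diag` here.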
register_additional_tower
register_additional_tower( inputs, outputs )
register_inverse
register_inverse()
Registers a matrix inverse to be computed by the block.
register_matpower
register_matpower(exp)
Registers a matrix power to be computed by the block.
exp
: A float representing the power to raise the block by.
tensors_to_compute_grads
tensors_to_compute_grads()
Tensors to compute derivative of loss with respect to.
© 2018 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/api_docs/python/tf/contrib/kfac/fisher_blocks/FullyConnectedDiagonalFB