LayerCollection
Defined in tensorflow/contrib/kfac/python/ops/layer_collection.py.
Registry of information about layers and losses.
Note that you need to create a new one of these for each MatrixEstimator or KfacOptimizer.
fisher_blocks: a LayersParamsDict (subclass of OrderedDict) mapping layer parameters (Tensors or tuples of Tensors) to FisherBlock instances.
fisher_factors: an OrderedDict mapping tuples to FisherFactor instances.
losses: a list of LossFunction objects. The loss to be optimized is their sum.
loss_colocation_ops: ops to colocate loss function evaluations with. These will typically be the inputs to the losses.
default_conv2d_approximation
default_conv2d_multi_approximation
default_embedding_approximation
default_embedding_multi_approximation
default_fully_connected_approximation
default_fully_connected_multi_approximation
default_generic_approximation
graph
linked_parameters
Groups of parameters with an optionally specified approximation.
Linked parameters can be added using define_linked_parameters. If an approximation is specified, then this approximation will be used when registering a layer with exactly these parameters, unless an approximation is specified when calling the registration function.
A dict mapping tuples of parameters to an optional string.
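The precedence described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the function `resolve_approximation`, the `linked` dict, and the parameter names `w1`, `b1`, `w2`, `b2` are all hypothetical.

```python
def resolve_approximation(params, linked, default, approx=None):
    """Pick the approximation for a layer registered with `params`.

    Order of precedence, following the description of linked_parameters:
      1. `approx` passed to the registration call,
      2. the approximation of a linked group with exactly these parameters,
      3. the collection-wide default for the layer type.
    """
    if approx is not None:
        return approx
    # None if the group is not linked, or linked without an approximation.
    linked_approx = linked.get(tuple(params))
    if linked_approx is not None:
        return linked_approx
    return default


# Hypothetical linked groups: tuples of parameters -> optional string.
linked = {("w1", "b1"): "diagonal", ("w2", "b2"): None}

choice_linked = resolve_approximation(("w1", "b1"), linked, default="kron")
choice_default = resolve_approximation(("w2", "b2"), linked, default="kron")
choice_explicit = resolve_approximation(
    ("w1", "b1"), linked, default="kron", approx="kron")
```

Here the first call picks up the group's own approximation, the second falls back to the default because the group was linked without one, and the third is overridden by the explicit argument.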
losses
Tuple of LossFunction objects registered with this LayerCollection.
registered_variables
A tuple of all of the variables currently registered.
subgraph
towers_by_loss
Tuple across losses of LossFunction objects registered to each tower.
__init__
__init__(
graph=None,
name='LayerCollection'
)
Initialize self. See help(type(self)) for accurate signature.
as_default
as_default(
*args,
**kwds
)
Sets this LayerCollection as the default.
check_registration
check_registration(variables)
Checks that all variable uses have been registered properly.
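The three error conditions documented for this method can be sketched as a small standalone check. This is only an illustration: the function takes plain dicts of use counts (`registered_uses`, `actual_uses`), which are hypothetical stand-ins for what the LayerCollection and its subgraph track internally.

```python
def check_registration(variables, registered_uses, actual_uses):
    """Raise ValueError if registrations and actual uses disagree.

    variables:        list of variable names expected to be registered.
    registered_uses:  dict, name -> number of uses recorded at registration.
    actual_uses:      dict, name -> number of uses found in the subgraph.
    """
    expected = set(variables)
    registered = set(registered_uses)

    extra = registered - expected
    if extra:
        raise ValueError("Registered but not in the list: %s" % sorted(extra))

    missing = expected - registered
    if missing:
        raise ValueError("In the list but not registered: %s" % sorted(missing))

    for name in variables:
        recorded = registered_uses[name]
        actual = actual_uses.get(name, 0)
        if recorded != actual:
            raise ValueError(
                "%s registered with %d uses but used %d times"
                % (name, recorded, actual))


# A variable shared across two time-steps, registered with two uses: passes.
check_registration(["w"], {"w": 2}, {"w": 2})
```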
Args:
  variables: List of variables.
Raises:
  ValueError: If any registered variables are not included in the list.
  ValueError: If any variable in the list is not registered.
  ValueError: If any variable in the list is registered with the wrong number of "uses" in the subgraph recorded (vs the number of times that variable is actually used in the subgraph).
create_subgraph
create_subgraph()
define_linked_parameters
define_linked_parameters(
params,
approximation=None
)
Identify a set of parameters that should be grouped together.
During automatic graph scanning, any matches containing variables that have been identified as part of a linked group will be filtered out unless the match parameters are exactly equal to the ones specified in the linked group.
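The filtering rule above can be sketched as follows. This is a hedged illustration only: `filter_matches` is a hypothetical helper, matches and groups are represented as tuples of parameter names, and "exactly equal" is assumed here to mean "the same set of parameters".

```python
def filter_matches(matches, linked_groups):
    """Drop matches that touch a linked group without equalling it.

    matches:        list of tuples of parameter names found by graph scanning.
    linked_groups:  list of tuples of parameter names declared as linked.
    """
    groups = [frozenset(group) for group in linked_groups]
    linked_vars = frozenset().union(*groups)

    kept = []
    for match in matches:
        match_set = frozenset(match)
        touches_linked = bool(match_set & linked_vars)
        if touches_linked and match_set not in groups:
            continue  # partial overlap with a linked group: filtered out
        kept.append(match)
    return kept


# "w" and "b" are linked, so a match on "w" alone is discarded.
kept = filter_matches(
    matches=[("w", "b"), ("w",), ("u",)],
    linked_groups=[("w", "b")],
)
```

The match `("u",)` survives because it involves no linked variable, and `("w", "b")` survives because it equals the linked group.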
Args:
  params: A variable, or a tuple or list of variables. The variables to be linked.
  approximation: Optional string specifying the type of approximation to use for these variables. If unspecified, this layer collection's default approximation for the layer type will be used.
Raises:
  ValueError: If the parameters were already registered in a layer or identified as part of an incompatible group.
eval_losses
eval_losses()
Return evaluated losses (colocated with inputs to losses).
eval_losses_on_samples
eval_losses_on_samples()
Return losses evaluated on samples (colocated with inputs to losses).
get_blocks
get_blocks()
get_factors
get_factors()
make_or_get_factor
make_or_get_factor(
cls,
args
)
Insert cls(args) into self.fisher_factors if not already present.
Wraps constructor in tf.variable_scope() to ensure variables constructed in cls.__init__ are placed under this LayerCollection's scope.
Args:
  cls: Class that implements FisherFactor.
  args: Tuple of arguments to pass into cls's constructor. Must be hashable.
Returns:
  Instance of cls found in self.fisher_factors.
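The make-or-get pattern described for this method can be sketched in plain Python. This is a minimal sketch, not the library's code: `FactorRegistry` and `DummyFactor` are hypothetical, the key is assumed to be the pair `(cls, args)`, the arguments are assumed to be unpacked into the constructor, and the tf.variable_scope() wrapping is omitted.

```python
from collections import OrderedDict


class DummyFactor(object):
    """Hypothetical stand-in for a FisherFactor subclass."""

    def __init__(self, shape, name):
        self.shape = shape
        self.name = name


class FactorRegistry(object):
    """Holds one factor instance per (class, constructor arguments) pair."""

    def __init__(self):
        self.fisher_factors = OrderedDict()

    def make_or_get_factor(self, cls, args):
        # `args` must be hashable because it is part of the dict key.
        key = (cls, args)
        if key not in self.fisher_factors:
            self.fisher_factors[key] = cls(*args)
        return self.fisher_factors[key]


registry = FactorRegistry()
first = registry.make_or_get_factor(DummyFactor, ((3, 3), "input_cov"))
second = registry.make_or_get_factor(DummyFactor, ((3, 3), "input_cov"))
other = registry.make_or_get_factor(DummyFactor, ((5, 5), "output_cov"))
```

The second call returns the object created by the first, so blocks that request the same factor share its state instead of each constructing their own.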
register_block
register_block(
layer_key,
fisher_block,
reuse=VARIABLE_SCOPE
)
Validates and registers the layer_key associated with the fisher_block.
Args:
  layer_key: A variable or tuple of variables. The key to check for in existing registrations and to register if valid.
  fisher_block: The associated FisherBlock.
  reuse: Method to use for inserting new FisherBlocks. One of True, False, or VARIABLE_SCOPE.
Raises:
  ValueError: If layer_key was already registered and reuse is False, if layer_key was registered with a different block type, or if layer_key shares any variables with but is not equal to a previously registered key.
  KeyError: If reuse is True but layer_key was not previously registered.
Returns:
  The FisherBlock registered under layer_key. If layer_key was already registered, this will be the previously registered FisherBlock.
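The reuse rules above can be illustrated with a small standalone registry. This is a sketch under assumptions rather than the library's implementation: `BlockRegistry`, `KronBlock`, and `DiagonalBlock` are hypothetical, keys are normalized to tuples of names, and the `scope_reuse` argument stands in for the value of tf.get_variable_scope().reuse.

```python
VARIABLE_SCOPE = "VARIABLE_SCOPE"


class KronBlock(object):
    """Hypothetical block type."""


class DiagonalBlock(object):
    """Hypothetical block type."""


class BlockRegistry(object):
    def __init__(self):
        self.fisher_blocks = {}

    def register_block(self, layer_key, fisher_block,
                       reuse=VARIABLE_SCOPE, scope_reuse=False):
        if reuse == VARIABLE_SCOPE:
            reuse = scope_reuse  # stand-in for the variable scope's setting
        key = layer_key if isinstance(layer_key, tuple) else (layer_key,)

        if key in self.fisher_blocks:
            if reuse is False:
                raise ValueError("%r is already registered" % (key,))
            existing = self.fisher_blocks[key]
            if type(existing) is not type(fisher_block):
                raise ValueError("%r has a different block type" % (key,))
            return existing  # the previously registered block wins

        if reuse is True:
            raise KeyError("%r was not previously registered" % (key,))
        for other in self.fisher_blocks:
            if set(other) & set(key):
                raise ValueError("%r overlaps with %r" % (key, other))
        self.fisher_blocks[key] = fisher_block
        return fisher_block


registry = BlockRegistry()
block = registry.register_block(("w", "b"), KronBlock(), reuse=False)
same = registry.register_block(("w", "b"), KronBlock(), reuse=True)
```

With `reuse=True` the second call returns the block created by the first, which is what lets additional towers feed data into one shared block.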
register_categorical_predictive_distribution
register_categorical_predictive_distribution(
logits,
seed=None,
targets=None,
name=None,
reuse=VARIABLE_SCOPE
)
Registers a categorical predictive distribution.
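The difference between a loss evaluated on given targets and one evaluated on samples can be sketched in plain Python. This is an illustration only, assuming that the sampled loss draws its targets from the model's own predictive distribution; the helpers `softmax`, `categorical_nll`, and `sample_target` are hypothetical and not part of the library.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_nll(logits, target):
    """Negative log-likelihood of class index `target` under the logits."""
    return -math.log(softmax(logits)[target])


def sample_target(logits, rng):
    """Draw a class index from the distribution defined by the logits."""
    threshold = rng.random()
    cumulative = 0.0
    probs = softmax(logits)
    for index, prob in enumerate(probs):
        cumulative += prob
        if threshold < cumulative:
            return index
    return len(probs) - 1  # guard against floating-point round-off


logits = [2.0, 0.5, -1.0]

# Loss on a supplied target: needs the `targets` argument.
loss_on_target = categorical_nll(logits, target=1)

# Loss on a sample: needs only the logits and (optionally) a seed.
sampled = sample_target(logits, random.Random(0))
loss_on_sample = categorical_nll(logits, sampled)
```

Fixing the seed makes the sampled target reproducible, which is why the seed argument is described as a debugging aid.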
Args:
  logits: The logits of the distribution (i.e. its parameters).
  seed: The seed for the RNG (for debugging). (Default: None)
  targets: (OPTIONAL) The targets for the loss function. Only required if one wants to call total_loss() instead of total_sampled_loss(). total_loss() is required, for example, to estimate the "empirical Fisher" (instead of the true Fisher). (Default: None)
  name: (OPTIONAL) str or None. Unique name for this loss function. If None, a new name is generated. (Default: None)
  reuse: bool or str. If True, this adds logits as an additional mini-batch/tower of inputs to the loss-function/predictive distribution (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
register_conv2d
register_conv2d(
params,
strides,
padding,
inputs,
outputs,
data_format=None,
dilations=None,
approx=None,
reuse=VARIABLE_SCOPE
)
Registers a call to tf.nn.conv2d().
Args:
  params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [kernel_height, kernel_width, in_channels, out_channels]. Bias should have shape [out_channels].
  strides: List of 4 ints. Strides for convolution kernel.
  padding: string. See tf.nn.conv2d for valid values.
  inputs: Tensor of shape [batch_size, height, width, in_channels]. Inputs to layer.
  outputs: Tensor of shape [batch_size, height, width, out_channels]. Output produced by layer.
  data_format: str or None. Format of data.
  dilations: List of 4 ints. Dilations along each dimension.
  approx: str or None. If not None, must be one of "kron" or "diagonal". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_conv2d_multi
register_conv2d_multi(
params,
strides,
padding,
inputs,
outputs,
num_uses=None,
data_format=None,
dilations=None,
approx=None,
reuse=VARIABLE_SCOPE
)
Registers convolutional layers with shared parameters.
Args:
  params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [kernel_height, kernel_width, in_channels, out_channels]. Bias should have shape [out_channels].
  strides: 1-D Tensor of length 4. Strides for convolution kernel.
  padding: string. See tf.nn.conv2d for valid values.
  inputs: A list of Tensors, each of shape [batch_size, height, width, in_channels]. Inputs to layer. The list indexes each use in the graph (which might correspond to a "time-step" in an RNN). OR, can be a single Tensor of shape [num_uses * batch_size, height, width, in_channels], which is a reshaped version of a Tensor of shape [num_uses, batch_size, height, width, in_channels].
  outputs: A list of Tensors, each of shape [batch_size, height, width, out_channels]. Output produced by layer. The list indexes each use in the graph (which might correspond to a "time-step" in an RNN). Needs to correspond with the order used in inputs. OR, can be a single Tensor of shape [num_uses * batch_size, height, width, out_channels], which is a reshaped version of a Tensor of shape [num_uses, batch_size, height, width, out_channels].
  num_uses: int or None. The number of uses/time-steps in the graph where the layer appears. Only needed if both inputs and outputs are given in the single Tensor format. (Default: None)
  data_format: str or None. Format of data.
  dilations: List of 4 ints. Dilations along each dimension.
  approx: str or None. If not None, must be "kron_indep". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Note that the word "use" here has a completely different meaning to "use in the graph" as it pertains to the inputs, outputs, and num_uses arguments.) (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_convolution
register_convolution(
params,
inputs,
outputs,
padding,
strides=None,
dilation_rate=None,
data_format=None,
approx=None,
reuse=VARIABLE_SCOPE
)
Register a call to tf.nn.convolution().
Args:
  params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [..filter_spatial_size.., in_channels, out_channels]. Bias should have shape [out_channels].
  inputs: Tensor of shape [batch_size, ..input_spatial_size.., in_channels]. Inputs to layer.
  outputs: Tensor of shape [batch_size, ..output_spatial_size.., out_channels]. Output produced by layer.
  padding: string. See tf.nn.conv2d for valid values.
  strides: List of ints of length len(..input_spatial_size..). Strides for convolution kernel in spatial dimensions.
  dilation_rate: List of ints of length len(..input_spatial_size..). Dilations along spatial dimension.
  data_format: str or None. Format of data.
  approx: str or None. If not None, must be one of "kron" or "diagonal". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_depthwise_conv2d
register_depthwise_conv2d(
params,
inputs,
outputs,
strides,
padding,
rate=None,
data_format=None,
approx=None,
reuse=VARIABLE_SCOPE
)
Register a call to tf.nn.depthwise_conv2d().
Args:
  params: 4-D Tensor of shape [filter_height, filter_width, in_channels, channel_multiplier]. Convolutional filter.
  inputs: Tensor of shape [batch_size, input_height, input_width, in_channels]. Inputs to layer.
  outputs: Tensor of shape [batch_size, output_height, output_width, in_channels * channel_multiplier]. Output produced by depthwise conv2d.
  strides: List of ints of length 4. Strides along all dimensions.
  padding: string. See tf.nn.conv2d for valid values.
  rate: None or List of ints of length 2. Dilation rates in spatial dimensions.
  data_format: str or None. Format of data.
  approx: str or None. If not None, must be "diagonal". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_embedding
register_embedding(
params,
inputs,
outputs,
approx=None,
reuse=VARIABLE_SCOPE
)
Registers an embedding layer.
Args:
  params: Embedding matrix of shape [vocab_size, embedding_size].
  inputs: Tensor of shape [batch_size, input_size] and dtype int32. Indices into embedding matrix.
  outputs: Tensor of shape [batch_size, embedding_size]. Outputs produced by layer.
  approx: str or None. If not None, must be "kron". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_embedding_multi
register_embedding_multi(
params,
inputs,
outputs,
num_uses=None,
approx=None,
reuse=VARIABLE_SCOPE
)
Registers embedding layers with shared parameters.
Args:
  params: Embedding matrix of shape [vocab_size, embedding_size].
  inputs: A list of Tensors, each of shape [batch_size, input_size] and dtype int32. Indices into embedding matrix. The list indexes each use in the graph (which might correspond to a "time-step" in an RNN). OR, can be a single Tensor of shape [num_uses * batch_size, input_size], which is a reshaped version of a Tensor of shape [num_uses, batch_size, input_size].
  outputs: A list of Tensors, each of shape [batch_size, embedding_size]. Outputs produced by layer. The list indexes each use in the graph (which might correspond to a "time-step" in an RNN). Needs to correspond with the order used in inputs. OR, can be a single Tensor of shape [num_uses * batch_size, embedding_size], which is a reshaped version of a Tensor of shape [num_uses, batch_size, embedding_size].
  num_uses: int or None. The number of uses/time-steps in the graph where the layer appears. Only needed if both inputs and outputs are given in the single Tensor format. (Default: None)
  approx: str or None. If not None, must be "kron_indep". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Note that the word "use" here has a completely different meaning to "use in the graph" as it pertains to the inputs, outputs, and num_uses arguments.) (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_fully_connected
register_fully_connected(
params,
inputs,
outputs,
approx=None,
reuse=VARIABLE_SCOPE
)
Registers a fully connected layer.
Args:
  params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [input_size, output_size]. Bias should have shape [output_size].
  inputs: Tensor of shape [batch_size, input_size]. Inputs to layer.
  outputs: Tensor of shape [batch_size, output_size]. Outputs produced by layer.
  approx: str or None. If not None, must be one of "kron" or "diagonal". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_fully_connected_multi
register_fully_connected_multi(
params,
inputs,
outputs,
num_uses=None,
approx=None,
reuse=VARIABLE_SCOPE
)
Register fully connected layers with shared parameters.
This can handle general fully-connected layers with shared parameters, but has specialized approximations to deal with the case where there is a meaningful linear order to the share instances (such as in an RNN).
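The two accepted layouts for inputs and outputs (a list with one Tensor per use, or a single Tensor with the uses folded into the batch dimension) can be related with a short NumPy sketch. NumPy arrays stand in for Tensors here, and the sizes are arbitrary values chosen for illustration.

```python
import numpy as np

num_uses, batch_size, input_size = 3, 2, 4

# List format: one [batch_size, input_size] array per use (e.g. per time-step).
# Each array is filled with its use index so the layout is easy to inspect.
per_use = [
    np.full((batch_size, input_size), use, dtype=np.float32)
    for use in range(num_uses)
]

# Single-array format: stack to [num_uses, batch_size, input_size], then fold
# the first two dimensions into one of size num_uses * batch_size.
stacked = np.stack(per_use)
single = stacked.reshape(num_uses * batch_size, input_size)

# Knowing num_uses is what makes the fold reversible.
recovered = single.reshape(num_uses, batch_size, input_size)
```

Rows `0:batch_size` of `single` belong to the first use, the next `batch_size` rows to the second, and so on. Without `num_uses` the single array cannot be split back into per-use batches, which is why that argument is needed when both inputs and outputs use the single-Tensor format.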
Args:
  params: Tensor or 2-tuple of Tensors corresponding to weight and bias of this layer. Weight matrix should have shape [input_size, output_size]. Bias should have shape [output_size].
  inputs: A list of Tensors, each of shape [batch_size, input_size]. Inputs to layer. The list indexes each use in the graph (which might correspond to a "time-step" in an RNN). OR, can be a single Tensor of shape [num_uses * batch_size, input_size], which is a reshaped version of a Tensor of shape [num_uses, batch_size, input_size].
  outputs: A list of Tensors, the same length as inputs, each of shape [batch_size, output_size]. Outputs produced by layer. The list indexes each use in the graph (which might correspond to a "time-step" in an RNN). Needs to correspond with the order used in inputs. OR, can be a single Tensor of shape [num_uses * batch_size, output_size], which is a reshaped version of a Tensor of shape [num_uses, batch_size, output_size].
  num_uses: int or None. The number of uses/time-steps in the graph where the layer appears. Only needed if both inputs and outputs are given in the single Tensor format. (Default: None)
  approx: str or None. If not None, must be one of "kron_indep", "kron_series_1" or "kron_series_2". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Note that the word "use" here has a completely different meaning to "use in the graph" as it pertains to the inputs, outputs, and num_uses arguments.) (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
register_generic
register_generic(
params,
batch_size,
approx=None,
reuse=VARIABLE_SCOPE
)
Registers a generic layer.
Args:
  params: Tensor or tuple of Tensors corresponding to the parameters.
  batch_size: 0-D Tensor. Size of the minibatch (for this tower).
  approx: str or None. If not None, must be one of "full" or "diagonal". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds batch_size to the total mini-batch size used when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
register_loss_function
register_loss_function(
loss,
colocation_op,
base_name,
name=None,
reuse=VARIABLE_SCOPE
)
Registers a LossFunction object.
Args:
  loss: The LossFunction object.
  colocation_op: The op to colocate the loss function's computations with.
  base_name: The name to derive a new unique name from if the name argument is None.
  name: (OPTIONAL) str or None. Unique name for this loss function. If None, a new name is generated. (Default: None)
  reuse: (OPTIONAL) bool or str. If True, adds loss as an additional tower for the existing loss function.
Raises:
  ValueError: If reuse == True and name == None.
  ValueError: If reuse == True and seed != None.
  KeyError: If reuse == True and no existing LossFunction with name found.
  KeyError: If reuse == False and existing LossFunction with name found.
register_multi_bernoulli_predictive_distribution
register_multi_bernoulli_predictive_distribution(
logits,
seed=None,
targets=None,
name=None,
reuse=VARIABLE_SCOPE
)
Registers a multi-Bernoulli predictive distribution.
Args:
  logits: The logits of the distribution (i.e. its parameters).
  seed: The seed for the RNG (for debugging). (Default: None)
  targets: (OPTIONAL) The targets for the loss function. Only required if one wants to call total_loss() instead of total_sampled_loss(). total_loss() is required, for example, to estimate the "empirical Fisher" (instead of the true Fisher). (Default: None)
  name: (OPTIONAL) str or None. Unique name for this loss function. If None, a new name is generated. (Default: None)
  reuse: bool or str. If True, this adds logits as an additional mini-batch/tower of inputs to the loss-function/predictive distribution (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
register_normal_predictive_distribution
register_normal_predictive_distribution(
mean,
var=0.5,
seed=None,
targets=None,
name=None,
reuse=VARIABLE_SCOPE
)
Registers a normal predictive distribution.
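The relationship between the var argument and the scaling of a squared error loss follows from the normal log-density: for scalar variance, the negative log-likelihood is (target - prediction)^2 / (2 * var) plus a term that does not depend on the prediction. A small numeric check, using a hypothetical helper `data_term`:

```python
def data_term(target, prediction, var):
    """Prediction-dependent part of the normal negative log-likelihood."""
    return (target - prediction) ** 2 / (2.0 * var)


target, prediction = 3.0, 1.0
squared_error = (target - prediction) ** 2

# var=0.5 reproduces the plain squared error (target - prediction)^2.
plain = data_term(target, prediction, var=0.5)

# var=1.0 reproduces the halved form 0.5 * (target - prediction)^2.
halved = data_term(target, prediction, var=1.0)
```

So the choice of var should match whichever scaling the training loss actually uses.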
Args:
  mean: The mean vector defining the distribution.
  var: The variance (must be a scalar). Note that the default value of 0.5 corresponds to a standard squared error loss (target - prediction)^2. If your squared error loss is of the form 0.5*(target - prediction)^2 you should use var=1.0. (Default: 0.5)
  seed: The seed for the RNG (for debugging). (Default: None)
  targets: (OPTIONAL) The targets for the loss function. Only required if one wants to call total_loss() instead of total_sampled_loss(). total_loss() is required, for example, to estimate the "empirical Fisher" (instead of the true Fisher). (Default: None)
  name: (OPTIONAL) str or None. Unique name for this loss function. If None, a new name is generated. (Default: None)
  reuse: bool or str. If True, this adds mean and var as an additional mini-batch/tower of inputs to the loss-function/predictive distribution (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
register_separable_conv2d
register_separable_conv2d(
depthwise_params,
pointwise_params,
inputs,
depthwise_outputs,
pointwise_outputs,
strides,
padding,
rate=None,
data_format=None,
approx=None,
reuse=VARIABLE_SCOPE
)
Register a call to tf.nn.separable_conv2d().
Note: This requires access to intermediate outputs between depthwise and pointwise convolutions.
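The shapes that the depthwise and pointwise stages must agree on can be worked out with a little arithmetic. The sketch below is an illustration only: the helper names are hypothetical, and it assumes NHWC layout, no dilation (rate=None), and the standard "SAME"/"VALID" output-size rules used by TensorFlow convolutions.

```python
import math


def conv_output_size(input_size, filter_size, stride, padding):
    """Spatial output size of an undilated convolution along one dimension."""
    if padding == "SAME":
        return int(math.ceil(input_size / float(stride)))
    if padding == "VALID":
        return int(math.ceil((input_size - filter_size + 1) / float(stride)))
    raise ValueError("Unknown padding: %r" % (padding,))


def separable_conv2d_shapes(batch_size, in_height, in_width, in_channels,
                            filter_height, filter_width, channel_multiplier,
                            out_channels, strides, padding):
    """Shapes of the tensors involved in a separable convolution (NHWC)."""
    out_height = conv_output_size(in_height, filter_height, strides[1], padding)
    out_width = conv_output_size(in_width, filter_width, strides[2], padding)
    depth = in_channels * channel_multiplier  # channels between the two stages
    return {
        "depthwise_params": (filter_height, filter_width,
                             in_channels, channel_multiplier),
        "pointwise_params": (1, 1, depth, out_channels),
        "depthwise_outputs": (batch_size, out_height, out_width, depth),
        "pointwise_outputs": (batch_size, out_height, out_width, out_channels),
    }


shapes = separable_conv2d_shapes(
    batch_size=8, in_height=32, in_width=32, in_channels=3,
    filter_height=3, filter_width=3, channel_multiplier=2,
    out_channels=16, strides=[1, 2, 2, 1], padding="SAME")
```

The intermediate tensor of shape `depthwise_outputs` is the one that has to be exposed, since it serves as the output of the depthwise stage and the input of the pointwise stage.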
Args:
  depthwise_params: 4-D Tensor of shape [filter_height, filter_width, in_channels, channel_multiplier]. Filter for depthwise conv2d.
  pointwise_params: 4-D Tensor of shape [1, 1, in_channels * channel_multiplier, out_channels]. Filter for pointwise conv2d.
  inputs: Tensor of shape [batch_size, input_height, input_width, in_channels]. Inputs to layer.
  depthwise_outputs: Tensor of shape [batch_size, output_height, output_width, in_channels * channel_multiplier]. Output produced by depthwise conv2d.
  pointwise_outputs: Tensor of shape [batch_size, output_height, output_width, out_channels]. Output produced by pointwise conv2d.
  strides: List of ints of length 4. Strides for depthwise conv2d kernel in all dimensions.
  padding: string. See tf.nn.conv2d for valid values.
  rate: None or List of ints of length 2. Dilation rate of depthwise conv2d kernel in spatial dimensions.
  data_format: str or None. Format of data.
  approx: str or None. If not None, must be one of "kron" or "diagonal". The Fisher approximation to use. If None, the default value is used. (Default: None)
  reuse: bool or str. If True, this adds inputs and outputs as an additional mini-batch/tower of data to use when estimating the Fisher block for this layer (which must have already been registered). If "VARIABLE_SCOPE", use tf.get_variable_scope().reuse. (Default: "VARIABLE_SCOPE")
Raises:
  ValueError: For improper value to approx.
  KeyError: If reuse == True but no FisherBlock found for params.
  ValueError: If reuse == True and FisherBlock found but of the wrong type.
set_default_conv2d_approximation
set_default_conv2d_approximation(value)
set_default_embedding_approximation
set_default_embedding_approximation(value)
set_default_fully_connected_approximation
set_default_fully_connected_approximation(value)
set_default_fully_connected_multi_approximation
set_default_fully_connected_multi_approximation(value)
set_default_generic_approximation
set_default_generic_approximation(value)
total_loss
total_loss()
total_sampled_loss
total_sampled_loss()
© 2018 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/api_docs/python/tf/contrib/kfac/layer_collection/LayerCollection