Defined in tensorflow/contrib/quantize/python/quantize_graph.py.

Rewrites a training input_graph in place for simulated quantization.

Variables added by the rewrite get added to the global variables collection.

This function has additional experimental options not (yet) available to create_training_graph. The resulting behavior may be undefined.

The graph has fake quantization ops inserted to simulate the error introduced by quantization. Since the graph is transformed in place, the expected behavior of previously held references to nodes and tensors may change.
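To make "simulating the error introduced by quantization" concrete, the following is a minimal pure-Python sketch of what a fake quantization op does numerically: it snaps a float to one of 2^num_bits levels in a fixed range and maps it straight back to float. The function name fake_quantize and the fixed [-1, 1] range are assumptions for illustration. The ops inserted by the rewrite learn or track their ranges during training and may adjust them so that zero is exactly representable, which this sketch does not do.

```python
def fake_quantize(x, min_val, max_val, num_bits=8):
    """Quantize x to 2**num_bits levels in [min_val, max_val], then dequantize.

    Illustrative only: the range is fixed and is not adjusted around zero.
    """
    levels = 2 ** num_bits - 1
    scale = (max_val - min_val) / levels          # width of one quantization step
    clamped = min(max(x, min_val), max_val)       # values outside the range saturate
    q = round((clamped - min_val) / scale)        # integer level in [0, levels]
    return min_val + q * scale                    # back to float, now carrying rounding error


weights = [-0.73, -0.101, 0.0, 0.256, 0.9]
for w in weights:
    w_q = fake_quantize(w, -1.0, 1.0, num_bits=8)
    print(f"{w:+.4f} -> {w_q:+.6f} (error {w_q - w:+.6f})")
```

For in-range values the error is at most half a quantization step, so lowering num_bits widens the step and increases the error the model has to learn to tolerate.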

The default value of quant_delay is suitable for fine-tuning an already trained floating-point model (recommended). To train a quantized model from scratch, set quant_delay to the number of steps it takes the floating-point model to converge. Quantization is activated at that point and effectively fine-tunes the model. If quant_delay is not provided when training from scratch, training can often fail.
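The effect of quant_delay can be pictured as a step-based switch. The sketch below is a simplified model of that behavior, not the library's implementation: quantization_active is a hypothetical helper, the comparison against the training step is an assumption, and FLOAT_CONVERGENCE_STEPS is a made-up number that would have to be measured for a real model.

```python
def quantization_active(global_step, quant_delay):
    """Return True once simulated quantization should apply at this step.

    Hypothetical helper: models quant_delay as a simple step threshold.
    """
    return global_step >= quant_delay


# Fine-tuning a trained float model: a delay of 0 quantizes from the first step.
print(quantization_active(0, quant_delay=0))

# Training from scratch: hold quantization back until the float model has
# roughly converged, then let the remaining steps fine-tune under quantization.
FLOAT_CONVERGENCE_STEPS = 200000  # hypothetical value, measure it for your model

for step in (0, 199999, 200000, 250000):
    print(step, quantization_active(step, quant_delay=FLOAT_CONVERGENCE_STEPS))
```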


Args:
  • input_graph: The tf.Graph to be transformed. If None, the default graph is used.
  • weight_bits: Number of bits to use for quantizing weights.
  • activation_bits: Number of bits to use for quantizing activations.
  • quant_delay: Number of steps after which weights and activations are quantized during training.
  • freeze_bn_delay: Number of steps after which moving mean and variance are frozen and used instead of batch statistics during training. freeze_bn_delay should be greater than quant_delay and should correspond to the point at which training has almost converged.
  • scope: The scope to be transformed. If it's not None, only the ops which are in this scope will be transformed.
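A call using the arguments above might look like the following sketch. It assumes TensorFlow 1.x, where tf.contrib is available, and that this page documents tf.contrib.quantize.experimental_create_training_graph. The names build_model_fn, check_delays, and build_quantized_training_graph are hypothetical, as are the learning rate and optimizer choice. TensorFlow is imported inside the function so that the helper definitions load even where TensorFlow 1.x is not installed.

```python
def check_delays(quant_delay, freeze_bn_delay):
    """Hypothetical helper: enforce the documented ordering of the two delays."""
    if freeze_bn_delay is not None and freeze_bn_delay <= quant_delay:
        raise ValueError("freeze_bn_delay should be greater than quant_delay")


def build_quantized_training_graph(build_model_fn, quant_delay=0,
                                   freeze_bn_delay=None, scope=None):
    """Build a model in a fresh graph and rewrite it for simulated quantization.

    build_model_fn is a hypothetical callable that builds the forward pass in
    the current default graph and returns a scalar loss tensor.
    """
    import tensorflow as tf  # assumes TensorFlow 1.x (tf.contrib must exist)

    check_delays(quant_delay, freeze_bn_delay)

    graph = tf.Graph()
    with graph.as_default():
        loss = build_model_fn()

        # Rewrite the forward pass in place; tensors obtained before this
        # call may no longer behave as they did.
        tf.contrib.quantize.experimental_create_training_graph(
            input_graph=graph,
            weight_bits=8,
            activation_bits=8,
            quant_delay=quant_delay,
            freeze_bn_delay=freeze_bn_delay,
            scope=scope,
        )

        # Create the optimizer after the rewrite so gradients flow through
        # the inserted fake quantization ops.
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
        train_op = optimizer.minimize(loss)
    return graph, train_op
```

Passing a scope limits the rewrite to ops under that name scope, which can be useful when only part of a larger graph should be quantized.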


Raises:
  • ValueError: If elements contains an element that isn't a tf.Tensor or tf.Operation.

© 2018 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.