Optimization parameters for Adam with TPU embeddings.
    tf.tpu.experimental.AdamParameters(
        learning_rate, beta1=0.9, beta2=0.999, epsilon=1e-08, lazy_adam=True,
        sum_inside_sqrt=True, use_gradient_accumulation=True,
        clip_weight_min=None, clip_weight_max=None
    )
Pass this to tf.estimator.tpu.experimental.EmbeddingConfigSpec via the optimization_parameters argument to set the optimizer and its parameters. See the documentation for tf.estimator.tpu.experimental.EmbeddingConfigSpec for more details.
    estimator = tf.estimator.tpu.TPUEstimator(
        ...
        embedding_config_spec=tf.estimator.tpu.experimental.EmbeddingConfigSpec(
            ...
            optimization_parameters=tf.tpu.experimental.AdamParameters(0.1),
            ...))
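The call above passes only the learning rate positionally. As a minimal sketch, the same object can be built with every argument from the signature spelled out by keyword. The values are illustrative only, the two clipping bounds are hypothetical choices, and the guarded import is an assumption so that the snippet still runs where TensorFlow 1.15 is not installed.

```python
# Keyword arguments mirroring the documented signature; the values are
# illustrative, not recommendations.
adam_kwargs = dict(
    learning_rate=0.1,
    beta1=0.9,
    beta2=0.999,
    epsilon=1e-08,
    lazy_adam=True,
    sum_inside_sqrt=True,
    use_gradient_accumulation=True,
    clip_weight_min=-1.0,  # hypothetical bound chosen for illustration
    clip_weight_max=1.0,   # hypothetical bound chosen for illustration
)

try:
    import tensorflow as tf

    # Available under this name in TensorFlow 1.15.
    optimization_parameters = tf.tpu.experimental.AdamParameters(**adam_kwargs)
except (ImportError, AttributeError):
    # TensorFlow 1.15 is not available here; keep only the plain arguments.
    optimization_parameters = None
```

The resulting optimization_parameters object would then be passed to EmbeddingConfigSpec in place of AdamParameters(0.1).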
Args | |
---|---|
learning_rate | A floating-point value. The learning rate. |
beta1 | A float value. The exponential decay rate for the 1st moment estimates. |
beta2 | A float value. The exponential decay rate for the 2nd moment estimates. |
epsilon | A small constant for numerical stability. |
lazy_adam | If True, use lazy Adam instead of Adam. Lazy Adam trains faster. See optimization_parameters.proto for details. |
sum_inside_sqrt | If True, improves training speed. See optimization_parameters.proto for details. |
use_gradient_accumulation | Setting this to False makes the embedding gradient calculation less accurate but faster. See optimization_parameters.proto for details. |
clip_weight_min | The minimum value to clip by; None means -infinity. |
clip_weight_max | The maximum value to clip by; None means +infinity. |
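To make the arguments above concrete, here is a plain-Python sketch of a textbook Adam step on a toy embedding table. It is not the TPU implementation. The helper names adam_step and train_step are hypothetical, and the table is reduced to one scalar weight per row. Two points are assumptions taken from the comments in optimization_parameters.proto and the lazy Adam description rather than from this page: sum_inside_sqrt is modeled as replacing the denominator sqrt(v) + epsilon with sqrt(v + epsilon**2), and lazy_adam is modeled as updating only the rows that received a gradient in the current batch.

```python
import math


def adam_step(w, m, v, grad, step, learning_rate, beta1=0.9, beta2=0.999,
              epsilon=1e-8, sum_inside_sqrt=True,
              clip_weight_min=None, clip_weight_max=None):
    """One textbook Adam update for a single scalar weight (illustrative)."""
    m = beta1 * m + (1.0 - beta1) * grad         # 1st moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad  # 2nd moment estimate
    # Bias-corrected step size; step counts from 1.
    alpha = learning_rate * math.sqrt(1.0 - beta2 ** step) / (1.0 - beta1 ** step)
    if sum_inside_sqrt:
        denom = math.sqrt(v + epsilon ** 2)      # assumption: epsilon enters squared
    else:
        denom = math.sqrt(v) + epsilon
    w = w - alpha * m / denom
    # None means no bound on that side.
    if clip_weight_min is not None:
        w = max(w, clip_weight_min)
    if clip_weight_max is not None:
        w = min(w, clip_weight_max)
    return w, m, v


def train_step(table, moments, grads, step, lazy_adam=True, **adam_args):
    """Update a toy table {row_id: weight} in place.

    grads maps only the row ids looked up in this batch to their gradients.
    """
    rows = list(grads) if lazy_adam else list(table)
    for row in rows:
        m, v = moments[row]
        g = grads.get(row, 0.0)  # rows not in the batch see a zero gradient
        table[row], m, v = adam_step(table[row], m, v, g, step, **adam_args)
        moments[row] = (m, v)
```

In this sketch, a row that was looked up in step 1 but not in step 2 keeps its weight in step 2 when lazy_adam=True. With lazy_adam=False it still moves, because its stored first moment is nonzero. The lazy variant therefore touches fewer rows per step.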
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/tpu/experimental/AdamParameters