TPUConfig
Defined in tensorflow/contrib/tpu/python/tpu/tpu_config.py.

TPU-related configuration required by TPUEstimator.
Args:

iterations_per_loop: The number of train steps running in the TPU system before returning to the CPU host for each Session.run. This means the global step is increased iterations_per_loop times in one Session.run. It is recommended to set this to the number of global steps between checkpoints.

num_shards: (Deprecated, ignored by TPUEstimator). The number of model replicas in the system. For the non-model-parallelism case, this number equals the total number of TPU cores. For model parallelism, the total number of TPU cores equals product(computation_shape) * num_shards.

computation_shape: Defaults to None, which disables model parallelism. A list of size 3 describing the shape of a model replica's block of cores. This is required by model parallelism, which partitions the model across multiple cores. For example, [2, 2, 1] means the model is partitioned across 4 cores, spanning two cores in both the x and y coordinates. Please refer to tf.contrib.tpu.Topology for the geometry of a TPU mesh.

per_host_input_for_training: If True, PER_HOST_V1, or PER_HOST_V2, input_fn is invoked per host rather than per core. With the per-host input pipeline configuration, input_fn is invoked once on each host; with the per-core input pipeline configuration, it is invoked once for each core. Given a global batch size train_batch_size in the TPUEstimator constructor, the batch size for each shard is train_batch_size // #hosts in the True or PER_HOST_V1 mode. In PER_HOST_V2 mode, it is train_batch_size // #cores. With the per-core input pipeline configuration, the shard batch size is also train_batch_size // #cores. Note: per_host_input_for_training==PER_SHARD_V1 only supports mode.TRAIN.

tpu_job_name: The name of the TPU job. Typically, this name is auto-inferred within TPUEstimator; however, when using ClusterSpec propagation in more esoteric cluster configurations, you may need to specify the job name as a string.

initial_infeed_sleep_secs: The number of seconds the infeed thread should wait before enqueueing the first batch. This helps avoid timeouts for models that require a long compilation time.
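The per-shard batch-size arithmetic described above can be sketched in plain Python. This is an illustration only: `shard_batch_size` is a hypothetical helper, not part of the TensorFlow API; only the mode names come from the documentation.

```python
def shard_batch_size(train_batch_size, mode, num_hosts, num_cores):
    """Per-shard batch size under the documented input pipeline modes.

    Hypothetical helper (not part of TensorFlow): mirrors the rules
    stated above for per_host_input_for_training.
    """
    if mode in (True, "PER_HOST_V1"):
        # Per-host pipeline (True / PER_HOST_V1): split across hosts.
        return train_batch_size // num_hosts
    # PER_HOST_V2 and the per-core pipeline: split across cores.
    return train_batch_size // num_cores

# Example: a global batch of 1024 on 2 hosts with 16 total cores.
print(shard_batch_size(1024, "PER_HOST_V1", num_hosts=2, num_cores=16))  # 512
print(shard_batch_size(1024, "PER_HOST_V2", num_hosts=2, num_cores=16))  # 64
```

For instance, with `train_batch_size=1024`, 2 hosts, and 16 cores, each shard receives 512 examples in PER_HOST_V1 mode but only 64 in PER_HOST_V2 or per-core mode.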
Raises:

ValueError: If computation_shape is invalid.
computation_shape
Alias for field number 2
initial_infeed_sleep_secs
Alias for field number 5
iterations_per_loop
Alias for field number 0
num_shards
Alias for field number 1
per_host_input_for_training
Alias for field number 3
tpu_job_name
Alias for field number 4
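These aliases reflect that TPUConfig is a namedtuple, so each field can be read by position as well as by name. A minimal sketch using a local stand-in (an assumption for illustration: the stand-in below mirrors the documented field order, but it is not the real tf.contrib.tpu.TPUConfig class):

```python
import collections

# Local stand-in with the documented field order (indices 0..5).
# Not the real tf.contrib.tpu.TPUConfig -- just an illustration of
# what "alias for field number N" means for a namedtuple.
TPUConfigSketch = collections.namedtuple(
    "TPUConfigSketch",
    ["iterations_per_loop", "num_shards", "computation_shape",
     "per_host_input_for_training", "tpu_job_name",
     "initial_infeed_sleep_secs"])

cfg = TPUConfigSketch(2, None, None, True, None, None)
assert cfg[0] == cfg.iterations_per_loop           # field number 0
assert cfg[3] == cfg.per_host_input_for_training   # field number 3
```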
__new__
@staticmethod
__new__(cls, iterations_per_loop=2, num_shards=None, computation_shape=None, per_host_input_for_training=True, tpu_job_name=None, initial_infeed_sleep_secs=None)

Create new instance of TPUConfig(iterations_per_loop, num_shards, computation_shape, per_host_input_for_training, tpu_job_name, initial_infeed_sleep_secs).
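Because TPUConfig is a namedtuple, construction follows ordinary namedtuple semantics: any argument omitted from the call takes the default shown in the signature above. A hedged sketch with a local stand-in (in a real TF 1.x environment this would be tf.contrib.tpu.TPUConfig(iterations_per_loop=100); the class below is only an illustration):

```python
import collections

# Stand-in reproducing the documented signature defaults; not the real
# tf.contrib.tpu.TPUConfig class.
TPUConfigSketch = collections.namedtuple(
    "TPUConfigSketch",
    ["iterations_per_loop", "num_shards", "computation_shape",
     "per_host_input_for_training", "tpu_job_name",
     "initial_infeed_sleep_secs"])
# Defaults from the __new__ signature documented above.
TPUConfigSketch.__new__.__defaults__ = (2, None, None, True, None, None)

# Override only iterations_per_loop; every other field keeps its default.
cfg = TPUConfigSketch(iterations_per_loop=100)
print(cfg.iterations_per_loop)  # 100
```

Setting iterations_per_loop well above the default of 2 is a common choice, since the documentation recommends matching it to the number of global steps between checkpoints.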
© 2018 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/api_docs/python/tf/contrib/tpu/TPUConfig