NCCL all-reduce implementation of CrossDeviceOps.
Inherits From: CrossDeviceOps
tf.distribute.NcclAllReduce(
    num_packs=1
)
It uses NVIDIA NCCL for all-reduce. For the batch API, tensors will be repacked or aggregated for more efficient cross-device transport.
For reduces that are not all-reduce, it falls back to tf.distribute.ReductionToOneDevice.
Here is how you can use NcclAllReduce in tf.distribute.MirroredStrategy:

strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.NcclAllReduce())
| Args | |
|---|---|
| num_packs | a non-negative integer. The number of packs to split values into. If zero, no packing will be done. | 
| Raises | |
|---|---|
| ValueError | if num_packs is negative. | 
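
Below is a minimal sketch of running an all-reduce through a strategy configured with NcclAllReduce. The two-GPU device list is an assumption; adjust it to the GPUs available on your machine (NCCL requires GPU devices):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],  # assumed device topology
    cross_device_ops=tf.distribute.NcclAllReduce(num_packs=1))

def replica_fn():
    # Each replica contributes its 0-based replica id as a float.
    ctx = tf.distribute.get_replica_context()
    return tf.cast(ctx.replica_id_in_sync_group, tf.float32)

per_replica = strategy.run(replica_fn)
# Sum across replicas; with two replicas this yields 0.0 + 1.0 = 1.0.
total = strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=None)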
batch_reduce
batch_reduce(
    reduce_op, value_destination_pairs, options=None
)
 Reduce values to destinations in batches.
See tf.distribute.StrategyExtended.batch_reduce_to. This can only be called in the cross-replica context.
| Args | |
|---|---|
| reduce_op | a tf.distribute.ReduceOp specifying how values should be combined. | 
| value_destination_pairs | a sequence of (value, destinations) pairs. See tf.distribute.CrossDeviceOps.reduce for descriptions. | 
| options | a tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. | 
| Returns | |
|---|---|
| A list of tf.Tensor or tf.distribute.DistributedValues, one per pair in value_destination_pairs. | 
| Raises | |
|---|---|
| ValueError | if value_destination_pairs is not an iterable of tuples of tf.distribute.DistributedValues and destinations. | 
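
As a hedged sketch, the batched path is typically exercised through tf.distribute.StrategyExtended.batch_reduce_to, which dispatches to the strategy's cross-device ops. The device names below are assumptions; pairing each value with itself as its destination makes each reduce an all-reduce:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.NcclAllReduce())

def per_replica_value(scale):
    # Build a value whose components differ across replicas.
    def fn():
        ctx = tf.distribute.get_replica_context()
        return scale * tf.cast(ctx.replica_id_in_sync_group + 1, tf.float32)
    return strategy.run(fn)

v1 = per_replica_value(1.0)   # 1.0 and 2.0 on the two replicas
v2 = per_replica_value(10.0)  # 10.0 and 20.0 on the two replicas
# One batched call reduces both values; tensors may be repacked or
# aggregated for more efficient transport.
results = strategy.extended.batch_reduce_to(
    tf.distribute.ReduceOp.SUM, [(v1, v1), (v2, v2)])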
broadcast
broadcast(
    tensor, destinations
)
 Broadcast tensor to destinations.
This can only be called in the cross-replica context.
| Args | |
|---|---|
| tensor | a tf.Tensor-like object. The value to broadcast. | 
| destinations | a tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to broadcast to. Note that if it's a tf.Variable, the value is broadcast to the devices of that variable; this method doesn't update the variable. | 
| Returns | |
|---|---|
| A tf.Tensor or tf.distribute.DistributedValues. | 
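
A minimal sketch of broadcast, assuming two GPUs: a mirrored variable created under the strategy's scope supplies the destination devices, and only a distributed copy of the tensor is returned (the variable itself is not updated):

import tensorflow as tf

nccl_ops = tf.distribute.NcclAllReduce()
strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"], cross_device_ops=nccl_ops)

with strategy.scope():
    # A mirrored variable whose components live on both GPUs.
    var = tf.Variable([0.0, 0.0, 0.0])

tensor = tf.constant([1.0, 2.0, 3.0])
# Copy `tensor` to the devices of `var`; `var` itself is left unchanged.
mirrored = nccl_ops.broadcast(tensor, destinations=var)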
reduce
reduce(
    reduce_op, per_replica_value, destinations, options=None
)
 Reduce per_replica_value to destinations.
See tf.distribute.StrategyExtended.reduce_to. This can only be called in the cross-replica context.
| Args | |
|---|---|
| reduce_op | a tf.distribute.ReduceOp specifying how values should be combined. | 
| per_replica_value | a tf.distribute.DistributedValues, or a tf.Tensor-like object. | 
| destinations | a tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to reduce to. To perform an all-reduce, pass the same to per_replica_value and destinations. Note that if it's a tf.Variable, the value is reduced to the devices of that variable; this method doesn't update the variable. | 
| options | a tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. | 
| Returns | |
|---|---|
| A tf.Tensor or tf.distribute.DistributedValues. | 
| Raises | |
|---|---|
| ValueError | if per_replica_value can't be converted to a tf.distribute.DistributedValues, or if destinations is not a string, tf.Variable, or tf.distribute.DistributedValues. | 
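
A hedged sketch of a single reduce through tf.distribute.StrategyExtended.reduce_to, which this method backs. Passing the same per-replica value as both the value and the destinations turns the reduce into an all-reduce; the device names are assumptions:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.NcclAllReduce())

def fn():
    ctx = tf.distribute.get_replica_context()
    return tf.cast(ctx.replica_id_in_sync_group + 1, tf.float32)

per_replica = strategy.run(fn)  # 1.0 and 2.0 on the two replicas
# Cross-replica context: sum across devices, leaving the result mirrored
# on the same devices (an all-reduce summing to 3.0 on each GPU).
summed = strategy.extended.reduce_to(
    tf.distribute.ReduceOp.SUM, per_replica, destinations=per_replica)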
© 2022 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 4.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.9/api_docs/python/tf/distribute/NcclAllReduce