Events

Created On: May 04, 2021 | Last Updated On: Jun 10, 2024

This module contains event-processing mechanisms that are integrated with the standard Python logging module.

Example of usage:

from torch.distributed.elastic import events

event = events.Event(
    name="test_event", source=events.EventSource.WORKER, metadata={...}
)
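# A logging.Handler has no info() method, so record the event through
# record(), which logs it to the given destination.
events.record(event, destination="console")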

API Methods

torch.distributed.elastic.events.record(event, destination='null')

torch.distributed.elastic.events.construct_and_record_rdzv_event(run_id, message, node_state, name='', hostname='', pid=None, master_endpoint='', local_id=None, rank=None)

Initialize a rendezvous event object and record its operations.

Parameters
  • run_id (str) – The run id of the rendezvous.
  • message (str) – The message describing the event.
  • node_state (NodeState) – The state of the node (INIT, RUNNING, SUCCEEDED, FAILED).
  • name (str) – Event name (e.g. the current action being performed).
  • hostname (str) – Hostname of the node.
  • pid (Optional[int]) – The process id of the node.
  • master_endpoint (str) – The master endpoint for the rendezvous store, if known.
  • local_id (Optional[int]) – The local_id of the node, if defined in dynamic_rendezvous.py.
  • rank (Optional[int]) – The rank of the node, if known.
Returns

None

Return type

None

Example

>>> # See DynamicRendezvousHandler class
>>> def _record(
...     self,
...     message: str,
...     node_state: NodeState = NodeState.RUNNING,
...     rank: Optional[int] = None,
... ) -> None:
...     construct_and_record_rdzv_event(
...         name=f"{self.__class__.__name__}.{get_method_name()}",
...         run_id=self._settings.run_id,
...         message=message,
...         node_state=node_state,
...         hostname=self._this_node.addr,
...         pid=self._this_node.pid,
...         local_id=self._this_node.local_id,
...         rank=rank,
...     )

torch.distributed.elastic.events.get_logging_handler(destination='null')

Return type

Handler
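
To illustrate how a destination name can map to a standard logging handler, here is a minimal stdlib-only sketch. The registry `_HANDLERS`, the helpers `get_handler` and `record_payload`, and the logger name prefix are hypothetical stand-ins for illustration, not the library's implementation. It assumes only that a `console` destination writes to a stream and that a `null` destination discards records.

```python
import io
import json
import logging

# Hypothetical registry: destination name -> logging.Handler instance.
_stream = io.StringIO()  # stand-in for the console so output can be inspected
_HANDLERS = {
    "console": logging.StreamHandler(_stream),
    "null": logging.NullHandler(),
}
_LOGGERS = {}


def get_handler(destination="null"):
    """Return the handler registered for a destination (KeyError if unknown)."""
    return _HANDLERS[destination]


def record_payload(payload, destination="null"):
    """Serialize a dict payload to JSON and log it via the destination's logger."""
    if destination not in _LOGGERS:
        logger = logging.getLogger(f"sketch-events-{destination}")
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep event records out of the root logger
        logger.addHandler(get_handler(destination))
        _LOGGERS[destination] = logger
    _LOGGERS[destination].info(json.dumps(payload))


record_payload({"name": "test_event", "source": "WORKER"}, destination="console")
record_payload({"name": "dropped_event"}, destination="null")
logged = _stream.getvalue()
```

A handler only receives records through a logger, which is why the sketch attaches each handler to a dedicated logger before calling `info()`.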

Event Objects

class torch.distributed.elastic.events.api.Event(name, source, timestamp=0, metadata=<factory>)

Represents a generic event that occurs during torchelastic job execution.

The event can be any kind of meaningful action.

Parameters
  • name (str) – event name.
  • source (EventSource) – the event producer, e.g. agent or worker
  • timestamp (int) – timestamp, in milliseconds, of when the event occurred.
  • metadata (dict[str, Union[str, int, float, bool, NoneType]]) – additional data that is associated with the event.
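
The `metadata=<factory>` default in the signature indicates a per-instance default rather than one shared object, and `timestamp` is expressed in milliseconds. The sketch below mirrors that shape with a stand-in dataclass. `SketchEvent`, `SketchSource`, and the `serialize`/`deserialize` helpers are hypothetical names used for illustration only; they are not the library's classes, and the JSON round trip is an assumption about how such an event could be passed to a logger.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Dict, Optional, Union


class SketchSource(str, Enum):
    """Hypothetical stand-in for the event producer identifiers."""

    AGENT = "AGENT"
    WORKER = "WORKER"


@dataclass
class SketchEvent:
    name: str
    source: SketchSource
    timestamp: int = 0  # milliseconds since the epoch
    # default_factory gives every instance its own empty dict
    metadata: Dict[str, Optional[Union[str, int, float, bool]]] = field(
        default_factory=dict
    )

    def serialize(self) -> str:
        """Encode the event as a JSON string."""
        return json.dumps(asdict(self))

    @staticmethod
    def deserialize(data: str) -> "SketchEvent":
        """Rebuild an event from its JSON string."""
        raw = json.loads(data)
        raw["source"] = SketchSource(raw["source"])
        return SketchEvent(**raw)


event = SketchEvent(
    name="test_event",
    source=SketchSource.WORKER,
    timestamp=int(time.time() * 1000),
    metadata={"attempt": 1, "healthy": True, "note": None},
)
restored = SketchEvent.deserialize(event.serialize())
```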

class torch.distributed.elastic.events.api.EventSource(value)

Known identifiers of the event producers.

torch.distributed.elastic.events.api.EventMetadataValue

alias of Optional[Union[str, int, float, bool]]
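
In practice the alias means each metadata value is a string, number, boolean, or `None`; nested containers such as lists or dicts are not covered. A small sketch of a check that follows from this, where `is_valid_metadata` is a hypothetical helper and not part of the module:

```python
from typing import Dict


def is_valid_metadata(metadata: Dict[object, object]) -> bool:
    """Hypothetical check: str keys and scalar-or-None values only."""
    allowed = (str, int, float, bool, type(None))
    return all(
        isinstance(key, str) and isinstance(value, allowed)
        for key, value in metadata.items()
    )


ok = is_valid_metadata({"rank": 0, "loss": 0.25, "host": "node-1", "error": None})
nested = is_valid_metadata({"ranks": [0, 1]})  # a list value falls outside the alias
```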

© 2025, PyTorch Contributors
PyTorch has a BSD-style license, as found in the LICENSE file.
https://docs.pytorch.org/docs/2.9/elastic/events.html