The dispatchWorkgroups() method of the GPUComputePassEncoder interface dispatches a specific grid of workgroups to perform the work being done by the current GPUComputePipeline (i.e., set via GPUComputePassEncoder.setPipeline()).
dispatchWorkgroups(workgroupCountX)
dispatchWorkgroups(workgroupCountX, workgroupCountY)
dispatchWorkgroups(workgroupCountX, workgroupCountY, workgroupCountZ)
workgroupCountX
    The X dimension of the grid of workgroups to dispatch.
workgroupCountY (optional)
    The Y dimension of the grid of workgroups to dispatch. If omitted, workgroupCountY defaults to 1.
workgroupCountZ (optional)
    The Z dimension of the grid of workgroups to dispatch. If omitted, workgroupCountZ defaults to 1.
Note: The X, Y, and Z dimension values passed to dispatchWorkgroups() and GPUComputePassEncoder.dispatchWorkgroupsIndirect() are the number of workgroups to dispatch for each dimension, not the number of shader invocations to perform across each dimension. This matches the behavior of modern native GPU APIs, but differs from the behavior of OpenCL. For example, if a GPUShaderModule defines an entry point with @workgroup_size(4, 4), and work is dispatched to it with the call passEncoder.dispatchWorkgroups(8, 8);, the entry point is invoked 1024 times in total: a 4 x 4 workgroup is dispatched 8 times along both the X and Y axes, giving 4 * 4 * 8 * 8 = 1024 invocations.
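The arithmetic in the note can be sketched as a small helper. The function name totalInvocations and its array parameters are hypothetical, introduced only for illustration; they are not part of the WebGPU API.

```javascript
// Hypothetical helper (not part of WebGPU): multiplies the number of
// invocations in one workgroup by the number of workgroups dispatched.
function totalInvocations(workgroupSize, workgroupCounts) {
  // Omitted dimensions default to 1, as they do for @workgroup_size
  // and for dispatchWorkgroups().
  const [sizeX, sizeY = 1, sizeZ = 1] = workgroupSize;
  const [countX, countY = 1, countZ = 1] = workgroupCounts;
  return sizeX * sizeY * sizeZ * countX * countY * countZ;
}

// @workgroup_size(4, 4) dispatched with dispatchWorkgroups(8, 8)
console.log(totalInvocations([4, 4], [8, 8])); // 1024
```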
The following criteria must be met when calling dispatchWorkgroups(), otherwise a GPUValidationError is generated and the GPUComputePassEncoder becomes invalid:
- workgroupCountX, workgroupCountY, and workgroupCountZ are all less than or equal to the GPUDevice's maxComputeWorkgroupsPerDimension limit.
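Because a failed check invalidates the whole pass encoder, it can be convenient to compare the counts against the limit before recording the dispatch. The following is a minimal sketch, not a required pattern: the helper name fitsDispatchLimit is hypothetical, and it assumes the limit is read from an object shaped like device.limits. The value 65535 below is only an example stand-in so the sketch runs outside a browser.

```javascript
// Hypothetical pre-check (not part of WebGPU). `limits` is assumed to be
// shaped like GPUDevice.limits, exposing maxComputeWorkgroupsPerDimension.
function fitsDispatchLimit(limits, x, y = 1, z = 1) {
  const max = limits.maxComputeWorkgroupsPerDimension;
  // Every dimension must be less than or equal to the limit.
  return [x, y, z].every((count) => count <= max);
}

// With a real device, usage might look like:
//   if (fitsDispatchLimit(device.limits, countX)) {
//     passEncoder.dispatchWorkgroups(countX);
//   }
// Stand-in limits object with an example value (assumption for this sketch).
const exampleLimits = { maxComputeWorkgroupsPerDimension: 65535 };
console.log(fitsDispatchLimit(exampleLimits, 16)); // true
console.log(fitsDispatchLimit(exampleLimits, 70000)); // false
```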
In our basic compute demo, several commands are recorded via a GPUCommandEncoder. Most of these commands originate from the GPUComputePassEncoder created via beginComputePass().
At the start of the code, we set a global buffer size of 1000. Also, note that the workgroup size in the shader is set to 64.
const BUFFER_SIZE = 1000;
const shader = `
@group(0) @binding(0)
var<storage, read_write> output: array<f32>;
@compute @workgroup_size(64)
...
`;
Later in the code, the dispatchWorkgroups() workgroupCountX parameter is set based on the global buffer size and the shader workgroup size.
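// Create a command encoder and begin a compute pass
const commandEncoder = device.createCommandEncoder();
const passEncoder = commandEncoder.beginComputePass();
passEncoder.setPipeline(computePipeline);
passEncoder.setBindGroup(0, bindGroup);
// Round up so that a partially filled final workgroup is still dispatched
passEncoder.dispatchWorkgroups(Math.ceil(BUFFER_SIZE / 64));
passEncoder.end();

// Copy the output buffer into the staging buffer
commandEncoder.copyBufferToBuffer(
  output,
  0, // Source offset
  stagingBuffer,
  0, // Destination offset
  BUFFER_SIZE,
);

// Submit the recorded commands to the GPU queue for execution
device.queue.submit([commandEncoder.finish()]);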
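The workgroup count calculation can also be factored into a small function. This is a sketch only: workgroupCountFor and WORKGROUP_SIZE are hypothetical names that do not appear in the demo, and the values are repeated here so the snippet runs on its own.

```javascript
// Hypothetical helper (not part of WebGPU or the demo): the number of
// workgroups needed so that every item is covered by at least one invocation.
function workgroupCountFor(itemCount, workgroupSize) {
  return Math.ceil(itemCount / workgroupSize);
}

const BUFFER_SIZE = 1000;
const WORKGROUP_SIZE = 64; // Assumed to match @workgroup_size(64) in the shader

const countX = workgroupCountFor(BUFFER_SIZE, WORKGROUP_SIZE);
console.log(countX); // 16
console.log(countX * WORKGROUP_SIZE); // 1024 invocations in total
```

With these values, 16 workgroups produce 1024 invocations, 24 more than the 1000 requested, so a shader dispatched this way generally needs to guard against out-of-range indices.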