33. Ray Tracing
Unlike draw commands, which use rasterization, ray tracing is a rendering method that generates an image by tracing the path of rays which have a single origin and using shaders to determine the final colour of an image plane.
Ray tracing uses a separate rendering pipeline from both the graphics and compute pipelines (see Ray tracing Pipeline). It has a unique set of programmable and fixed function stages.
33.1. Ray Tracing Commands
Ray tracing commands provoke work in the ray tracing pipeline. Ray tracing commands are recorded into a command buffer and when executed by a queue will produce work that executes according to the currently bound ray tracing pipeline. A ray tracing pipeline must be bound to a command buffer before any ray tracing commands are recorded in that command buffer.
Each ray tracing call operates on a set of shader stages that are specific
to the ray tracing pipeline as well as a set of
VkAccelerationStructureNV objects, which describe the scene geometry
in an implementation-specific way.
The relationship between the ray tracing pipeline object and the
acceleration structures is passed into the ray tracing command in a
VkBuffer object known as a shader binding table.
During execution, control alternates between scheduling and other operations. The scheduling functionality is implementation-specific and is responsible for workload execution. The shader stages are programmable. Traversal, which refers to the process of traversing acceleration structures to find potential intersections of rays with geometry, is fixed function.
The programmable portions of the pipeline are exposed in a single-ray programming model. Each GPU thread handles one ray at a time. Memory operations can be synchronized using standard memory barriers. However, communication and synchronization between threads is not allowed. In particular, the use of compute pipeline synchronization functions is not supported in the ray tracing pipeline.
To dispatch a ray tracing call use:
void vkCmdTraceRaysNV(
VkCommandBuffer commandBuffer,
VkBuffer raygenShaderBindingTableBuffer,
VkDeviceSize raygenShaderBindingOffset,
VkBuffer missShaderBindingTableBuffer,
VkDeviceSize missShaderBindingOffset,
VkDeviceSize missShaderBindingStride,
VkBuffer hitShaderBindingTableBuffer,
VkDeviceSize hitShaderBindingOffset,
VkDeviceSize hitShaderBindingStride,
VkBuffer callableShaderBindingTableBuffer,
VkDeviceSize callableShaderBindingOffset,
VkDeviceSize callableShaderBindingStride,
uint32_t width,
uint32_t height,
uint32_t depth);
- commandBuffer is the command buffer into which the command will be recorded.
- raygenShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the ray generation shader stage.
- raygenShaderBindingOffset is the offset in bytes (relative to raygenShaderBindingTableBuffer) of the ray generation shader being used for the trace.
- missShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the miss shader stage.
- missShaderBindingOffset is the offset in bytes (relative to missShaderBindingTableBuffer) of the miss shader being used for the trace.
- missShaderBindingStride is the size in bytes of each shader binding table record in missShaderBindingTableBuffer.
- hitShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the hit shader stages.
- hitShaderBindingOffset is the offset in bytes (relative to hitShaderBindingTableBuffer) of the hit shader group being used for the trace.
- hitShaderBindingStride is the size in bytes of each shader binding table record in hitShaderBindingTableBuffer.
- callableShaderBindingTableBuffer is the buffer object that holds the shader binding table data for the callable shader stage.
- callableShaderBindingOffset is the offset in bytes (relative to callableShaderBindingTableBuffer) of the callable shader being used for the trace.
- callableShaderBindingStride is the size in bytes of each shader binding table record in callableShaderBindingTableBuffer.
- width is the width of the ray trace query dimensions.
- height is the height of the ray trace query dimensions.
- depth is the depth of the ray trace query dimensions.
When the command is executed, a ray generation group of width
× height × depth rays is assembled.
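As a small illustration of the dispatch size, the following sketch computes the number of rays in one ray generation group. The function name trace_ray_count is hypothetical and not part of the API; the only point it makes is that the product of three 32-bit dimensions can exceed 32 bits, so the operands are widened before multiplying.

```c
#include <stdint.h>

/* Number of rays assembled by one trace command: width * height * depth.
 * The operands are widened to 64 bits before multiplying so that common
 * launch sizes do not wrap around in 32-bit arithmetic.
 * (Hypothetical helper, not a Vulkan entry point.) */
static uint64_t trace_ray_count(uint32_t width, uint32_t height, uint32_t depth)
{
    return (uint64_t)width * (uint64_t)height * (uint64_t)depth;
}
```

For a two-dimensional image, an application would typically pass the image extent as width and height and set depth to 1.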
33.2. Shader Binding Table
A shader binding table is a resource which establishes the relationship between the ray tracing pipeline and the acceleration structures that were built for the ray tracing query. It indicates the shaders that operate on each geometry in an acceleration structure. In addition, it contains the resources accessed by each shader, including indices of textures and constants. The application allocates and manages shader binding tables as VkBuffer objects.
Each entry in the shader binding table consists of
shaderGroupHandleSize bytes of data as queried by
vkGetRayTracingShaderGroupHandlesNV to refer to the shader that it
invokes.
The remainder of the data specified by the stride is application-visible
data that can be referenced by a shaderRecordNV block in the shader.
The shader binding tables to use in a ray tracing query are passed to vkCmdTraceRaysNV. Shader binding tables are read-only in shaders that are executing on the ray tracing pipeline.
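The record layout described above (a shader group handle followed by application data, padded out to the stride) can be sketched as a stride calculation. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and rounding the stride up to a multiple of shaderGroupHandleSize is an assumption that should be checked against the valid usage rules of vkCmdTraceRaysNV and the limits reported by the implementation.

```c
#include <stdint.h>

/* Round value up to the next multiple of `multiple` (which must be non-zero). */
static uint64_t round_up(uint64_t value, uint64_t multiple)
{
    return (value + multiple - 1) / multiple * multiple;
}

/* Stride in bytes of one shader binding table record: the shader group
 * handle followed by the application data visible through shaderRecordNV.
 * Assumption: the stride is padded to a multiple of the handle size. */
static uint64_t sbt_record_stride(uint64_t shaderGroupHandleSize,
                                  uint64_t appDataSize)
{
    return round_up(shaderGroupHandleSize + appDataSize, shaderGroupHandleSize);
}
```

With a 16-byte handle and 20 bytes of per-record data, for example, this yields a 48-byte stride; the handle size itself must come from the device, not from a constant.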
33.2.1. Indexing Rules
In order to execute the correct shaders and access the correct resources during a ray tracing dispatch, the implementation must be able to locate shader binding table entries at various stages of execution. This is accomplished by defining a set of indexing rules that compute shader binding table record positions relative to the buffer’s base address in memory. The application must organize the contents of the shader binding table’s memory in a way that application of the indexing rules will lead to correct records.
Ray Generation Shaders
Only one ray generation shader is executed per ray tracing dispatch.
Its location is passed into vkCmdTraceRaysNV using the
raygenShaderBindingTableBuffer and
raygenShaderBindingOffset parameters; there is no indexing.
Hit Shaders
The base for the computation of intersection, any-hit and closest hit shader
locations is the instanceShaderBindingTableRecordOffset value stored
with each instance of a top-level acceleration structure.
This value determines the beginning of the shader binding table records for
a given instance.
Each geometry in the instance must have at least one hit program record.
In the following rule, geometryIndex refers to the location of the geometry within the instance.
The sbtRecordStride and sbtRecordOffset values are passed in as
parameters to traceNV() calls made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, these correspond to the SBTOffset and SBTStride
parameters to the OpTraceNV instruction.
The result of this computation is then added to
hitShaderBindingOffset, a base offset passed to
vkCmdTraceRaysNV.
The complete rule to compute a hit shader binding table record address in
the hitShaderBindingTableBuffer is:
- hitShaderBindingOffset + hitShaderBindingStride × (instanceShaderBindingTableRecordOffset + geometryIndex × sbtRecordStride + sbtRecordOffset)
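The hit shader rule can be written out as a small function. This is an illustrative sketch only: hit_record_offset is a hypothetical name, the parameters simply mirror the terms of the rule, and the result is a byte offset into hitShaderBindingTableBuffer.

```c
#include <stdint.h>

/* Byte offset of a hit shader binding table record, following the rule:
 *   hitShaderBindingOffset + hitShaderBindingStride *
 *     (instanceShaderBindingTableRecordOffset
 *      + geometryIndex * sbtRecordStride + sbtRecordOffset)
 * (Hypothetical helper for illustration.) */
static uint64_t hit_record_offset(uint64_t hitShaderBindingOffset,
                                  uint64_t hitShaderBindingStride,
                                  uint32_t instanceShaderBindingTableRecordOffset,
                                  uint32_t geometryIndex,
                                  uint32_t sbtRecordStride,
                                  uint32_t sbtRecordOffset)
{
    /* Record index within the hit table, computed in 64 bits. */
    uint64_t recordIndex = (uint64_t)instanceShaderBindingTableRecordOffset
                         + (uint64_t)geometryIndex * sbtRecordStride
                         + sbtRecordOffset;
    return hitShaderBindingOffset + hitShaderBindingStride * recordIndex;
}
```

For instance, with a base offset of 64, a stride of 32, an instance record offset of 2, geometry index 1, sbtRecordStride 2, and sbtRecordOffset 1, the record index is 5 and the byte offset is 224.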
Miss Shaders
A miss shader is executed whenever a ray query fails to find an intersection for the given scene geometry. Multiple miss shaders may be executed throughout a ray tracing dispatch.
The base for the computation of miss shader locations is
missShaderBindingOffset, a base offset passed into
vkCmdTraceRaysNV.
The missIndex value is passed in as a parameter to traceNV() calls
made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, this corresponds to the MissIndex parameter to the
OpTraceNV instruction.
The complete rule to compute a miss shader binding table record address in
the missShaderBindingTableBuffer is:
- missShaderBindingOffset + missShaderBindingStride × missIndex
Callable Shaders
A callable shader is executed when requested by a ray tracing shader. Multiple callable shaders may be executed throughout a ray tracing dispatch.
The base for the computation of callable shader locations is
callableShaderBindingOffset, a base offset passed into
vkCmdTraceRaysNV.
The sbtRecordIndex value is passed in as a parameter to
executeCallableNV() calls made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, this corresponds to the SBTIndex parameter to the
OpExecuteCallableNV instruction.
The complete rule to compute a callable shader binding table record address
in the callableShaderBindingTableBuffer is:
- callableShaderBindingOffset + callableShaderBindingStride × sbtRecordIndex
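The miss and callable rules share the same shape: a base offset plus a stride multiplied by a shader-supplied index. A minimal sketch of that shared computation follows; the function name sbt_indexed_record_offset is hypothetical and not part of the API.

```c
#include <stdint.h>

/* Byte offset of a record in a linearly indexed shader binding table.
 * For miss shaders:     base = missShaderBindingOffset,
 *                       stride = missShaderBindingStride, index = missIndex.
 * For callable shaders: base = callableShaderBindingOffset,
 *                       stride = callableShaderBindingStride,
 *                       index = sbtRecordIndex.
 * (Hypothetical helper for illustration.) */
static uint64_t sbt_indexed_record_offset(uint64_t base,
                                          uint64_t stride,
                                          uint32_t index)
{
    return base + stride * (uint64_t)index;
}
```

The application would lay out its miss and callable records so that each index it passes from a shader lands on the intended record under this computation.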
33.3. Acceleration Structures
Acceleration structures are data structures used by the implementation to efficiently manage the scene geometry as it is traversed during a ray tracing query. The application is responsible for managing acceleration structure objects (see Acceleration Structures), including allocation, destruction, executing builds or updates, and synchronizing resources used during ray tracing queries.
There are two types of acceleration structures: top level acceleration structures and bottom level acceleration structures.
33.3.1. Instances
Instances are found in top level acceleration structures and contain data that refer to a single bottom-level acceleration structure, a transform matrix, and shading information. Multiple instances can point to a single bottom level acceleration structure.
An instance is defined in a VkBuffer by a structure consisting of 64 bytes of data.
- transform is 12 floats representing a 4x3 transform matrix in row-major order.
- instanceCustomIndex is the low 24 bits of a 32-bit integer after the transform. This value appears in the builtin gl_InstanceCustomIndexNV.
- mask is the high 8 bits of the same integer as instanceCustomIndex. This is the visibility mask. The instance may only be hit if rayMask & instance.mask != 0.
- instanceOffset is the low 24 bits of the next 32-bit integer. It is the value contributed by this instance to the hit shader binding table index computation as instanceShaderBindingTableRecordOffset.
- flags is the high 8 bits of the same integer as instanceOffset. It holds the VkGeometryInstanceFlagBitsNV values that apply to this instance.
- accelerationStructure is the 8 byte value returned by vkGetAccelerationStructureHandleNV for the bottom level acceleration structure referred to by this instance.
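Note
The C language spec does not define the ordering of bit-fields, but in practice, a struct that declares the fields above in the order listed, using 24-bit and 8-bit bit-fields for the two packed integers, produces the layout described above.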
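The 64-byte instance layout can be sketched in two ways. The first struct below uses bit-fields and relies on compiler behavior, as the note cautions; the second packs the two 32-bit words explicitly with shifts, which does not depend on bit-field ordering. All struct and function names here are hypothetical reconstructions for illustration and are not declared by the API headers.

```c
#include <stdint.h>

/* Bit-field form: the field order follows the list above, but the resulting
 * bit layout is compiler-dependent. (Hypothetical name.) */
struct GeometryInstanceBitfields {
    float    transform[12];
    uint32_t instanceCustomIndex : 24;
    uint32_t mask : 8;
    uint32_t instanceOffset : 24;
    uint32_t flags : 8;
    uint64_t accelerationStructureHandle;
};

/* Explicit form: each packed word is an ordinary 32-bit integer that the
 * application fills with pack_low24_high8(). (Hypothetical name.) */
struct GeometryInstancePacked {
    float    transform[12];
    uint32_t customIndexAndMask;   /* low 24: custom index, high 8: mask  */
    uint32_t offsetAndFlags;       /* low 24: SBT offset,   high 8: flags */
    uint64_t accelerationStructureHandle;
};

/* Combine a 24-bit value (low bits) and an 8-bit value (high bits). */
static uint32_t pack_low24_high8(uint32_t low24, uint32_t high8)
{
    return (low24 & 0x00FFFFFFu) | ((high8 & 0xFFu) << 24);
}

/* Visibility test from the list above. The parentheses matter in C,
 * because != binds more tightly than &. */
static int instance_visible(uint32_t rayMask, uint32_t instanceMask)
{
    return (rayMask & instanceMask & 0xFFu) != 0;
}
```

The explicit form trades a little readability for a layout that is the same on any compiler, which may be preferable when instance data is written by code built with different toolchains.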
Possible values of flags in the instance modifying the behavior of
that instance are:
typedef enum VkGeometryInstanceFlagBitsNV {
VK_GEOMETRY_INSTANCE_TRIANGLE_CULL_DISABLE_BIT_NV = 0x00000001,
VK_GEOMETRY_INSTANCE_TRIANGLE_FRONT_COUNTERCLOCKWISE_BIT_NV = 0x00000002,
VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV = 0x00000004,
VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV = 0x00000008,
VK_GEOMETRY_INSTANCE_FLAG_BITS_MAX_ENUM_NV = 0x7FFFFFFF
} VkGeometryInstanceFlagBitsNV;
- VK_GEOMETRY_INSTANCE_TRIANGLE_CULL_DISABLE_BIT_NV disables face culling for this instance.
- VK_GEOMETRY_INSTANCE_TRIANGLE_FRONT_COUNTERCLOCKWISE_BIT_NV indicates that the front face of the triangle for culling purposes is the face that is counter clockwise in object space relative to the ray origin. Because the facing is determined in object space, an instance transform matrix does not change the winding, but a geometry transform does.
- VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV causes this instance to act as though VK_GEOMETRY_OPAQUE_BIT_NV were specified on all geometries referenced by this instance. This behavior can be overridden by the ray flag gl_RayFlagsNoOpaqueNV.
- VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV causes this instance to act as though VK_GEOMETRY_OPAQUE_BIT_NV were not specified on all geometries referenced by this instance. This behavior can be overridden by the ray flag gl_RayFlagsOpaqueNV.
VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV and
VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV must not be used in the same
flag.
typedef VkFlags VkGeometryInstanceFlagsNV;
VkGeometryInstanceFlagsNV is a bitmask type for setting a mask of zero
or more VkGeometryInstanceFlagBitsNV.
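The constraint that the two force-opacity bits must not be combined can be checked before instance data is written. The sketch below is an assumption-labeled illustration: the enumerant names are local stand-ins that copy the numeric values of the corresponding VkGeometryInstanceFlagBitsNV bits listed above, and instance_flags_valid is a hypothetical helper, not an API function. The check that the value fits in 8 bits follows from the flags field occupying the high 8 bits of its word.

```c
#include <stdint.h>

/* Local stand-ins mirroring the numeric values of
 * VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV and
 * VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV. */
enum {
    INSTANCE_FORCE_OPAQUE_BIT    = 0x00000004,
    INSTANCE_FORCE_NO_OPAQUE_BIT = 0x00000008
};

/* Returns 1 if flags can be stored in an instance, 0 otherwise.
 * Rejects values that do not fit in the 8-bit flags field and values
 * that set both force-opaque and force-no-opaque. */
static int instance_flags_valid(uint32_t flags)
{
    const uint32_t both = INSTANCE_FORCE_OPAQUE_BIT | INSTANCE_FORCE_NO_OPAQUE_BIT;
    if ((flags & ~0xFFu) != 0) {
        return 0;
    }
    return (flags & both) != both;
}
```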
33.3.2. Geometry
Geometries refer to a triangle or axis-aligned bounding box.
33.3.3. Top Level Acceleration Structures
A top level acceleration structure is an opaque acceleration structure for an array of instances. The descriptor referencing it is the starting point for tracing.
33.3.4. Bottom Level Acceleration Structures
A bottom level acceleration structure is an opaque acceleration structure for an array of geometries.
33.3.5. Building Acceleration Structures
To build an acceleration structure call:
void vkCmdBuildAccelerationStructureNV(
VkCommandBuffer commandBuffer,
const VkAccelerationStructureInfoNV* pInfo,
VkBuffer instanceData,
VkDeviceSize instanceOffset,
VkBool32 update,
VkAccelerationStructureNV dst,
VkAccelerationStructureNV src,
VkBuffer scratch,
VkDeviceSize scratchOffset);
- commandBuffer is the command buffer into which the command will be recorded.
- pInfo contains the shared information for the acceleration structure’s structure.
- instanceData is the buffer containing instance data that will be used to build the acceleration structure as described in Accelerator structure instances. This parameter must be NULL for bottom level acceleration structures.
- instanceOffset is the offset in bytes (relative to the start of instanceData) at which the instance data is located.
- update specifies whether to update the dst acceleration structure with the data in src.
- dst points to the target acceleration structure for the build.
- src points to an existing acceleration structure that is to be used to update the dst acceleration structure.
- scratch is the VkBuffer that will be used as scratch memory for the build.
- scratchOffset is the offset in bytes relative to the start of scratch that will be used as scratch memory.
33.3.6. Copying Acceleration Structures
An additional command exists for copying acceleration structures without updating their contents. The acceleration structure object can be compacted in order to improve performance. Before copying, an application must query the size of the resulting acceleration structure.
To query acceleration structure size parameters call:
void vkCmdWriteAccelerationStructuresPropertiesNV(
VkCommandBuffer commandBuffer,
uint32_t accelerationStructureCount,
const VkAccelerationStructureNV* pAccelerationStructures,
VkQueryType queryType,
VkQueryPool queryPool,
uint32_t firstQuery);
- commandBuffer is the command buffer into which the command will be recorded.
- accelerationStructureCount is the count of acceleration structures for which to query the property.
- pAccelerationStructures points to an array of existing previously built acceleration structures.
- queryType is a VkQueryType value specifying the type of queries managed by the pool.
- queryPool is the query pool that will manage the results of the query.
- firstQuery is the first query index within the query pool that will contain the accelerationStructureCount number of results.
To copy an acceleration structure call:
void vkCmdCopyAccelerationStructureNV(
VkCommandBuffer commandBuffer,
VkAccelerationStructureNV dst,
VkAccelerationStructureNV src,
VkCopyAccelerationStructureModeNV mode);
- commandBuffer is the command buffer into which the command will be recorded.
- dst points to the target acceleration structure for the copy.
- src points to the source acceleration structure for the copy.
- mode is a VkCopyAccelerationStructureModeNV value that specifies additional operations to perform during the copy.
Possible values of vkCmdCopyAccelerationStructureNV::mode,
specifying additional operations to perform during the copy, are:
typedef enum VkCopyAccelerationStructureModeNV {
VK_COPY_ACCELERATION_STRUCTURE_MODE_CLONE_NV = 0,
VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_NV = 1,
VK_COPY_ACCELERATION_STRUCTURE_MODE_MAX_ENUM_NV = 0x7FFFFFFF
} VkCopyAccelerationStructureModeNV;
- VK_COPY_ACCELERATION_STRUCTURE_MODE_CLONE_NV creates a direct copy of the acceleration structure specified in src into the one specified by dst. The dst acceleration structure must have been created with the same parameters as src.
- VK_COPY_ACCELERATION_STRUCTURE_MODE_COMPACT_NV creates a more compact version of an acceleration structure src into dst. The acceleration structure dst must have been created with a compactedSize corresponding to the one returned by vkCmdWriteAccelerationStructuresPropertiesNV after the build of the acceleration structure specified by src.