
NVIDIA REAL-TIME DENOISERS v4.8.1 (NRD)


For a quick start, see the NRD sample project.

OVERVIEW

NVIDIA Real-Time Denoisers (NRD) is a spatio-temporal, API-agnostic denoising library designed to work with low-rpp (rays per pixel) signals. NRD is a fast solution whose performance depends only slightly on the input signals and environment conditions.

NRD includes the following denoisers:

  • REBLUR - recurrent-blur-based denoiser
  • RELAX - A-trous-based denoiser, designed for RTXDI (RTX Direct Illumination)
  • SIGMA - shadow-only denoiser

Performance on RTX 4080 @ 1440p (native resolution, default denoiser settings):

  • REBLUR_DIFFUSE_SPECULAR - 2.45 ms
  • RELAX_DIFFUSE_SPECULAR - 2.90 ms
  • SIGMA_SHADOW - 0.30 ms (0.24 ms if temporal stabilization is off)
  • SIGMA_SHADOW_TRANSLUCENCY - 0.40 ms (0.30 ms if temporal stabilization is off)

Supported signal types:

  • RELAX:
    • Diffuse & specular radiance
  • REBLUR:
    • Diffuse & specular radiance
    • Diffuse (ambient) & specular occlusion (OCCLUSION variants)
    • Diffuse (ambient) directional occlusion (DIRECTIONAL_OCCLUSION variant)
    • Diffuse & specular radiance in spherical harmonics (spherical gaussians) (SH variants)
  • SIGMA:
    • Shadows from an infinite light source (sun, moon)
    • Shadows from a local light source (omni, spot)

For diffuse and specular signals, de-modulated irradiance (i.e. irradiance with "removed" materials) can be used instead of radiance (see the "Recommendations and Best Practices" section).

NRD is distributed both as source code and as a “ready-to-use” precompiled library. It can be integrated into any DX12, VULKAN or DX11 engine in two ways:

  1. Native implementation of the NRD API using engine capabilities
  2. Integration via an abstraction layer. In this case, the engine should expose native Graphics API pointers for certain types of objects. The integration layer, provided as a part of SDK, can be used to simplify this kind of integration.

HOW TO BUILD?

  • Install CMake 3.15+
  • Install platform dependencies:
    • Windows: latest Windows SDK and Vulkan SDK
    • Linux (x86-64): latest Vulkan SDK
    • Linux (aarch64): find a precompiled DXC binary or disable shader compilation via NRD_EMBEDS_SPIRV_SHADERS=OFF
  • Build (variant 1) - using Git and CMake explicitly
    • Clone project and init submodules
    • Generate and build the project using CMake
  • Build (variant 2) - by running scripts:
    • Run 1-Deploy
    • Run 2-Build

CMake options:

  • NRD_SHADERS_PATH - shader output path override
  • NRD_STATIC_LIBRARY - build static library (OFF by default)
  • NRD_DXC_CUSTOM_PATH - custom DXC to use if Vulkan SDK is not installed
  • NRD_NORMAL_ENCODING - normal encoding for the entire library
  • NRD_ROUGHNESS_ENCODING - roughness encoding for the entire library
  • NRD_EMBEDS_DXBC_SHADERS - NRD compiles and embeds DXBC shaders (ON by default on Windows)
  • NRD_EMBEDS_DXIL_SHADERS - NRD compiles and embeds DXIL shaders (ON by default on Windows)
  • NRD_EMBEDS_SPIRV_SHADERS - NRD compiles and embeds SPIRV shaders (ON by default)
  • NRD_DISABLE_SHADER_COMPILATION - disable shader compilation on the NRD side, NRD assumes that shaders are already compiled externally and have been put into NRD_SHADERS_PATH folder

NRD_NORMAL_ENCODING and NRD_ROUGHNESS_ENCODING can be defined only once during project deployment. These settings are dumped into the NRDEncoding.hlsli file, which needs to be included on the application side prior to NRD.hlsli inclusion to deliver encoding settings matching the NRD settings. LibraryDesc includes the encoding settings too; it can be used to verify that the library matches the application's expectations.

Tested platforms:

OS       Architectures  Compilers
Windows  AMD64          MSVC, Clang
Linux    AMD64, ARM64   GCC, Clang

SDK packaging:

  • Compile the solution (Debug / Release or both, depending on what you want to get in NRD package)
  • Run 3-Prepare NRD SDK
  • Grab the _NRD_SDK and _NRI_SDK (if needed) folders generated in the root directory and use them in your project

HOW TO UPDATE?

  • Clone latest with all dependencies
  • Run 4-Clean.bat
  • Run 1-Deploy
  • Run 2-Build

HOW TO REPORT ISSUES?

The NRD sample has a TESTS section at the bottom of the UI; a new test can be added if needed. The following procedure is recommended:

  • Try to reproduce a problem in the NRD sample first
    • if reproducible
      • add a test (by pressing Add button)
      • describe the issue and steps to reproduce on GitHub
      • attach the .bin file from the Tests folder corresponding to the selected scene
    • if not
      • verify the integration
  • If nothing helps
    • describe the issue, attach a video and steps to reproduce

Additionally, for any information, suggestions or general requests please feel free to contact us at [email protected]

API

Terminology:

  • Denoiser - a denoiser to use (for example: Denoiser::REBLUR_DIFFUSE)
  • Instance - a set of denoisers aggregated into a monolithic entity (the library is free to rearrange passes without dependencies). Each denoiser in the instance has an associated Identifier
  • Resource - an input, output or internal resource (currently can only be a texture)
  • Texture pool (or pool) - stores permanent or transient resources needed for denoising. Textures from the permanent pool are dedicated to NRD and can not be reused by the application (history buffers are stored here). Textures from the transient pool can be reused by the application right after denoising. NRD doesn’t allocate anything: NRD provides resource descriptions, but resource creation is done on the application side.

Flow:

  1. GetLibraryDesc - contains general NRD library information (supported denoisers, SPIRV binding offsets). This call can be skipped if this information is known in advance (for example, whether a diffuse denoiser is available), but it can’t be skipped if SPIRV binding offsets are needed for VULKAN
  2. CreateInstance - creates an instance for requested denoisers
  3. GetInstanceDesc - returns descriptions for pipelines, samplers, texture pools, constant buffer and descriptor set. All this stuff is needed during the initialization step
  4. SetCommonSettings - sets common (shared) per frame parameters
  5. SetDenoiserSettings - can be called to change parameters dynamically before applying the denoiser on each new frame / denoiser call
  6. GetComputeDispatches - returns per-dispatch data for the list of denoisers (bound subresources with required state, constant buffer data). Returned memory is owned by the instance and gets overwritten by the next GetComputeDispatches call
  7. DestroyInstance - destroys an instance

NRD doesn't make any graphics API calls. The application is supposed to invoke a set of compute Dispatch calls to actually denoise the input signals. Please refer to the NrdIntegration::Denoise() and NrdIntegration::Dispatch() calls in NRDIntegration.hpp as an example of an integration using a low-level RHI.

NRD doesn’t have "resize" functionality. On resolution change, the old denoiser needs to be destroyed and a new one created with new parameters. NRD does, however, support dynamic resolution scaling via CommonSettings::resolutionScale.

Some textures can be requested as inputs or outputs for a denoiser (see the next section). Required resources are specified near a denoiser declaration inside the Denoiser enum class. Additionally, NRD.hlsli has a comment near each front-end or back-end function, clarifying which resources that function is for.

NON-NOISY INPUTS

Common inputs for primary hits (if PSR is not used, the common use case) or for secondary hits (if PSR is used, valid only for 0-roughness):

  • IN_MV - non-jittered surface motion (old = new + MV)

    Modes:

    • 2D screen-space motion - 2D motion doesn't provide information about movement along the view direction. NRD can reject history on dynamic objects in this case
    • 2.5D screen-space motion (recommended) - similar to the 2D screen-space motion, but .z = viewZprev - viewZ
    • 3D world-space motion - camera motion should not be included (it's already in the matrices). In other words, if there are no moving objects, all motion vectors must be 0 even if the camera is moving

    Motion vector scaling can be provided via CommonSettings::motionVectorScale. NRD expectations:

    • Use CommonSettings::isMotionVectorInWorldSpace = true for 3D world-space motion
    • Use CommonSettings::isMotionVectorInWorldSpace = false and CommonSettings::motionVectorScale[2] == 0 for 2D screen-space motion
    • Use CommonSettings::isMotionVectorInWorldSpace = false and CommonSettings::motionVectorScale[2] != 0 for 2.5D screen-space motion
  • IN_NORMAL_ROUGHNESS - surface world-space normal and linear roughness

    Normal and roughness encoding must be controlled via the CMake parameters NRD_NORMAL_ENCODING and NRD_ROUGHNESS_ENCODING. The optional NRDEncoding.hlsli file is generated during project deployment and can be included prior to NRD.hlsli to make the encoding macro definitions visible in shaders (if NRD_NORMAL_ENCODING and NRD_ROUGHNESS_ENCODING are not defined in another way by the application). Encoding settings can be queried at runtime via GetLibraryDesc().normalEncoding and GetLibraryDesc().roughnessEncoding respectively. The NormalEncoding and RoughnessEncoding enums briefly describe the encoding variants. It's recommended to use NRD_FrontEnd_PackNormalAndRoughness from NRD.hlsli to match decoding.

    NRD computes local curvature using the provided normals. Less accurate normals can lead to banding in curvature and local flatness. RGBA8 normals are a good baseline, but R10G10B10A2 oct-packed normals improve curvature calculations and, as a result, specular tracking.

    If materialID is provided and supported by encoding, NRD diffuse and specular denoisers won't mix up surfaces with different material IDs.

  • IN_VIEWZ - .x - view-space Z coordinate of primary hits (linearized g-buffer depth)

    Positive and negative values are supported. Z values in all pixels must be in the same space, matching the space defined by the matrices passed to NRD. If, for example, the protagonist's hands are rendered using special matrices, Z values should be computed as follows:

    • reconstruct the world position using the special matrices for "hands"
    • project it on screen using the matrices passed to NRD
    • take the positive view Z from the .w component of the result (or just transform the world-space position into the main view space and take the .z component)
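As an illustration of the oct-packing mentioned for IN_NORMAL_ROUGHNESS, below is a generic octahedral normal encode/decode sketch. This shows the general technique only; NRD's actual encoder lives in NRD.hlsli and may differ in details:

```cpp
#include <cmath>

// Octahedral mapping: a unit vector is projected onto the octahedron
// |x| + |y| + |z| = 1, then the lower hemisphere is folded over the
// diagonals, yielding a point in [-1, 1]^2 with evenly distributed precision.
struct V2 { float x, y; };
struct V3 { float x, y, z; };

static float SignNotZero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

V2 OctEncode(V3 n)
{
    float l1 = std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z);
    V2 p = { n.x / l1, n.y / l1 };
    if (n.z < 0.0f)
    {
        // Fold the lower hemisphere over the diagonals
        V2 f = { (1.0f - std::fabs(p.y)) * SignNotZero(p.x),
                 (1.0f - std::fabs(p.x)) * SignNotZero(p.y) };
        p = f;
    }
    return p;
}

V3 OctDecode(V2 p)
{
    V3 n = { p.x, p.y, 1.0f - std::fabs(p.x) - std::fabs(p.y) };
    if (n.z < 0.0f)
    {
        // Undo the fold
        float nx = (1.0f - std::fabs(n.y)) * SignNotZero(n.x);
        float ny = (1.0f - std::fabs(n.x)) * SignNotZero(n.y);
        n.x = nx; n.y = ny;
    }
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    n.x /= len; n.y /= len; n.z /= len;
    return n;
}
```

Quantizing the two encoded components to 10 bits each (as in an R10G10B10A2 layout) loses far less angular precision than quantizing x/y/z to 8 bits each, which is why the oct-packed format improves curvature estimation.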

All textures should be NaN free at each pixel, even at pixels outside of denoising range.

The illustration below shows expected inputs for primary hits:

Input without PSR

hitDistance = length( B - A ); // hitT for 1st bounce (recommended baseline)

IN_VIEWZ = TransformToViewSpace( A ).z;
IN_NORMAL_ROUGHNESS = GetNormalAndRoughnessAt( A );
IN_MV = GetMotionAt( A );

See NRDDescs.h for more details and descriptions of other inputs and outputs.

NOISY INPUTS

The NRD sample is a good start to familiarize yourself with the input requirements and best practices, but the main requirements can be summarized as follows:

  • Since NRD denoisers accumulate signals for a limited number of frames, the input signal must converge reasonably well for this number of frames. REFERENCE denoiser can be used to estimate temporal signal quality
  • Since NRD denoisers process signals spatially, high-energy fireflies in the input signal should be avoided. Most of them can be removed by enabling anti-firefly filter in NRD, but it will only work if the "background" signal is confident. The worst case is having a single pixel with high energy divided by a very small PDF to represent the lack of energy in neighboring non-representative (black) pixels
  • Radiance must be separated into diffuse and specular at primary hit (or secondary hit in case of PSR)
  • hitT can't be negative
  • hitT must not include primary hit distance
  • hitT for the first bounce after the primary hit or PSR must be provided "as is"
  • hitT for subsequent bounces and for bounces before PSR must be adjusted by curvature and lobe energy dissipation on the application side
    • Do not pass the sum of the lengths of all segments as hitT. A solid baseline is to use the hit distance of the 1st bounce only; it works well for diffuse and specular signals
    • The NRD sample uses a more complex approach for accumulating hitT along the path, which takes into account energy dissipation due to lobe spread and curvature at the current hit
  • For rays pointing inside the surface (VNDF sampling can easily produce those), hitT must be set to 0 (but better to not cast such rays)
  • Noise in hit distances must follow a diffuse or specular lobe. It implies that hitT for roughness = 0 must be clean (if probabilistic sampling is not in use)
  • In case of probabilistic diffuse / specular selection at the primary hit, the provided hitT must follow these rules:
    • Should not be divided by PDF
    • If diffuse or specular sampling is skipped, hitT must be set to 0 for corresponding signal type
    • hitDistanceReconstructionMode must be set to something other than OFF, but bear in mind that the search area is limited to 3x3 or 5x5. In other words, it's the application's responsibility to guarantee a valid sample in this area. It can be achieved by clamping probabilities and using Bayer-like dithering (see the sample for more details)
    • Pre-pass must be enabled (i.e. diffusePrepassBlurRadius and specularPrepassBlurRadius must be set to 20-70 pixels) to compensate entropy increase, since radiance in valid samples is divided by probability to compensate 0 values in some neighbors
  • Probabilistic sampling for 2nd+ bounces is absolutely acceptable
  • In case of many paths per pixel, hitT for specular must be "averaged" using the NRD_FrontEnd_SpecHitDistAveraging_* functions from NRD.hlsli

See NRDDescs.h for more details and descriptions of other inputs and outputs.

IMPROVING OUTPUT QUALITY

The temporal part of NRD naturally suppresses jitter, which is essential for upscaling techniques. If an SH denoiser is in use, a high-quality resolve can be applied to the final output to regain macro details, micro details and per-pixel jittering. As an example, the image below demonstrates the results after and before the resolve with DLSS active (quality mode).

Resolve

The resolve process takes place on the application side and has the following modular structure:

  • construct an SG (spherical gaussian) light
  • apply diffuse or specular resolve function to reconstruct macro details
  • apply re-jittering to reconstruct micro details
  • (optional) just extract the unresolved color (fully matches the output of the corresponding non-SH denoiser)

Shader code:

// Diffuse
float4 diff = gIn_Diff.SampleLevel( gLinearSampler, pixelUv, 0 );
float4 diff1 = gIn_DiffSh.SampleLevel( gLinearSampler, pixelUv, 0 );
NRD_SG diffSg = REBLUR_BackEnd_UnpackSh( diff, diff1 );

// Specular
float4 spec = gIn_Spec.SampleLevel( gLinearSampler, pixelUv, 0 );
float4 spec1 = gIn_SpecSh.SampleLevel( gLinearSampler, pixelUv, 0 );
NRD_SG specSg = REBLUR_BackEnd_UnpackSh( spec, spec1 );

// ( Optional ) AO / SO ( available only for REBLUR )
diff.w = diffSg.normHitDist;
spec.w = specSg.normHitDist;

if( gResolve )
{
    // ( Optional ) replace "roughness" with "roughnessAA"
    roughness = NRD_SG_ExtractRoughnessAA( specSg );

    // Regain macro-details
    diff.xyz = NRD_SG_ResolveDiffuse( diffSg, N ); // or NRD_SH_ResolveDiffuse( sg, N )
    spec.xyz = NRD_SG_ResolveSpecular( specSg, N, V, roughness );

    // Regain micro-details & jittering // TODO: preload N and Z into SMEM
    float3 Ne = NRD_FrontEnd_UnpackNormalAndRoughness( gIn_Normal_Roughness[ pixelPos + int2( 1, 0 ) ] ).xyz;
    float3 Nw = NRD_FrontEnd_UnpackNormalAndRoughness( gIn_Normal_Roughness[ pixelPos + int2( -1, 0 ) ] ).xyz;
    float3 Nn = NRD_FrontEnd_UnpackNormalAndRoughness( gIn_Normal_Roughness[ pixelPos + int2( 0, 1 ) ] ).xyz;
    float3 Ns = NRD_FrontEnd_UnpackNormalAndRoughness( gIn_Normal_Roughness[ pixelPos + int2( 0, -1 ) ] ).xyz;

    float Ze = gIn_ViewZ[ pixelPos + int2( 1, 0 ) ];
    float Zw = gIn_ViewZ[ pixelPos + int2( -1, 0 ) ];
    float Zn = gIn_ViewZ[ pixelPos + int2( 0, 1 ) ];
    float Zs = gIn_ViewZ[ pixelPos + int2( 0, -1 ) ];

    float2 scale = NRD_SG_ReJitter( diffSg, specSg, Rf0, V, roughness, viewZ, Ze, Zw, Zn, Zs, N, Ne, Nw, Nn, Ns );

    diff.xyz *= scale.x;
    spec.xyz *= scale.y;
}
else
{
    // ( Optional ) Unresolved color matching the non-SH version of the denoiser
    diff.xyz = NRD_SG_ExtractColor( diffSg );
    spec.xyz = NRD_SG_ExtractColor( specSg );
}

The re-jittering math, with minorly modified inputs, can also be used with ReSTIR-produced sampling without involving SH denoisers. You only need to get the light direction in the current pixel from ReSTIR. Although ReSTIR produces noisy light selections, the resulting low variance can easily be handled by DLSS or other upscaling techniques.

VALIDATION LAYER

Validation

If CommonSettings::enableValidation = true, the REBLUR & RELAX denoisers render debug information into the OUT_VALIDATION output. The alpha channel contains the layer transparency to allow easy mixing with the final image on the application side. Currently the following viewport layout is used on the screen:

0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15

where:

  • Viewport 0 - world-space normals
  • Viewport 1 - linear roughness
  • Viewport 2 - linear viewZ
    • green = +
    • blue = -
    • red = out of denoising range
  • Viewport 3 - difference between MVs, coming from IN_MV, and expected MVs, assuming that the scene is static
    • blue = out of screen
    • pixels with moving objects have non-0 values
  • Viewport 4 - world-space grid & camera jitter:
    • 1 cube = 1 unit
    • the square in the bottom-right corner represents a pixel with accumulated samples
    • the red boundary of the square marks jittering outside of the pixel area

REBLUR specific:

  • Viewport 7 - amount of virtual history
  • Viewport 8 - number of accumulated frames for diffuse signal (red = history reset)
  • Viewport 11 - number of accumulated frames for specular signal (red = history reset)
  • Viewport 12 - input normalized hitT for diffuse signal (ambient occlusion, AO)
  • Viewport 15 - input normalized hitT for specular signal (specular occlusion, SO)
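Assuming the 4x4 layout above fills the screen, a pixel UV maps to a viewport index as in this sketch (hypothetical helper, not NRD code):

```cpp
// Maps a [0, 1)^2 screen UV to the index in the 4x4 validation viewport grid,
// numbered row-major (0..3 on the top row, 12..15 on the bottom row).
int ViewportIndexFromUv(float u, float v)
{
    int col = (int)(u * 4.0f); // 0..3, left to right
    int row = (int)(v * 4.0f); // 0..3, top to bottom
    return row * 4 + col;
}
```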

MEMORY REQUIREMENTS

The Persistent column (matches the NRD permanent pool) indicates how much of the working set must be left intact for subsequent frames of the application; this memory stores the history resources consumed by NRD. The Aliasable column (matches the NRD transient pool) shows how much of the working set may be aliased by textures or other resources used by the application outside of NRD's operating boundaries. For each entry, Working set = Persistent + Aliasable.

Denoiser Working set (Mb) Persistent (Mb) Aliasable (Mb)

1080p:
REBLUR_DIFFUSE 86.69 42.25 44.44
REBLUR_DIFFUSE_OCCLUSION 42.44 25.38 17.06
REBLUR_DIFFUSE_SH 137.31 59.12 78.19
REBLUR_SPECULAR 105.75 50.75 55.00
REBLUR_SPECULAR_OCCLUSION 50.94 33.88 17.06
REBLUR_SPECULAR_SH 156.38 67.62 88.75
REBLUR_DIFFUSE_SPECULAR 169.06 71.88 97.19
REBLUR_DIFFUSE_SPECULAR_OCCLUSION 72.12 38.12 34.00
REBLUR_DIFFUSE_SPECULAR_SH 270.31 105.62 164.69
REBLUR_DIFFUSE_DIRECTIONAL_OCCLUSION 86.69 42.25 44.44
SIGMA_SHADOW 15.00 0.00 15.00
SIGMA_SHADOW_TRANSLUCENCY 33.94 0.00 33.94
RELAX_DIFFUSE 99.25 63.31 35.94
RELAX_DIFFUSE_SH 158.31 88.62 69.69
RELAX_SPECULAR 101.44 63.38 38.06
RELAX_SPECULAR_SH 168.94 97.12 71.81
RELAX_DIFFUSE_SPECULAR 168.94 97.12 71.81
RELAX_DIFFUSE_SPECULAR_SH 303.94 164.62 139.31
REFERENCE 33.75 33.75 0.00

1440p:
REBLUR_DIFFUSE 153.81 75.00 78.81
REBLUR_DIFFUSE_OCCLUSION 75.06 45.00 30.06
REBLUR_DIFFUSE_SH 243.81 105.00 138.81
REBLUR_SPECULAR 187.56 90.00 97.56
REBLUR_SPECULAR_OCCLUSION 90.06 60.00 30.06
REBLUR_SPECULAR_SH 277.56 120.00 157.56
REBLUR_DIFFUSE_SPECULAR 300.06 127.50 172.56
REBLUR_DIFFUSE_SPECULAR_OCCLUSION 127.56 67.50 60.06
REBLUR_DIFFUSE_SPECULAR_SH 480.06 187.50 292.56
REBLUR_DIFFUSE_DIRECTIONAL_OCCLUSION 153.81 75.00 78.81
SIGMA_SHADOW 26.38 0.00 26.38
SIGMA_SHADOW_TRANSLUCENCY 60.12 0.00 60.12
RELAX_DIFFUSE 176.31 112.50 63.81
RELAX_DIFFUSE_SH 281.31 157.50 123.81
RELAX_SPECULAR 180.06 112.50 67.56
RELAX_SPECULAR_SH 300.06 172.50 127.56
RELAX_DIFFUSE_SPECULAR 300.06 172.50 127.56
RELAX_DIFFUSE_SPECULAR_SH 540.06 292.50 247.56
REFERENCE 60.00 60.00 0.00

2160p:
REBLUR_DIFFUSE 326.81 159.38 167.44
REBLUR_DIFFUSE_OCCLUSION 159.44 95.62 63.81
REBLUR_DIFFUSE_SH 518.06 223.12 294.94
REBLUR_SPECULAR 398.50 191.25 207.25
REBLUR_SPECULAR_OCCLUSION 191.31 127.50 63.81
REBLUR_SPECULAR_SH 589.75 255.00 334.75
REBLUR_DIFFUSE_SPECULAR 637.56 270.94 366.62
REBLUR_DIFFUSE_SPECULAR_OCCLUSION 271.00 143.44 127.56
REBLUR_DIFFUSE_SPECULAR_SH 1020.06 398.44 621.62
REBLUR_DIFFUSE_DIRECTIONAL_OCCLUSION 326.81 159.38 167.44
SIGMA_SHADOW 56.19 0.00 56.19
SIGMA_SHADOW_TRANSLUCENCY 127.81 0.00 127.81
RELAX_DIFFUSE 374.69 239.12 135.56
RELAX_DIFFUSE_SH 597.81 334.75 263.06
RELAX_SPECULAR 382.69 239.12 143.56
RELAX_SPECULAR_SH 637.69 366.62 271.06
RELAX_DIFFUSE_SPECULAR 637.69 366.62 271.06
RELAX_DIFFUSE_SPECULAR_SH 1147.69 621.62 526.06
REFERENCE 127.50 127.50 0.00

INTEGRATION VARIANTS

VARIANT 1: Black-box library (using the application-side Render Hardware Interface)

RHI must have the ability to do the following:

  • Create shaders from precompiled binary blobs
  • Create an SRV for a specific range of subresources
  • Create and bind 4 predefined samplers
  • Invoke a Dispatch call (no raster, no VS/PS)
  • Create 2D textures with SRV / UAV access

VARIANT 2: White-box library (using the application-side Render Hardware Interface)

Logically it's close to Variant 1, but the integration happens at the source level (only the NRD project is needed). In this case NRD shaders are handled by the application's shader compilation pipeline. The application should still use NRD via the NRD API to preserve forward compatibility. This variant best suits compilation for other platforms (consoles, ARM), unlocks NRD modification on the application side and increases portability.

VARIANT 3: Black-box library (using native API pointers)

If Graphics API's native pointers are retrievable from the RHI, the standard NRD integration layer can be used to greatly simplify the integration. In this case, the application should only wrap up native pointers for the Device, CommandList and some input / output Resources into entities, compatible with an API abstraction layer (NRI), and all work with NRD library will be hidden inside the integration layer:

Engine or App → native objects → NRD integration layer → NRI → NRD

NRI = NVIDIA Rendering Interface - an abstraction layer on top of Graphics APIs: DX11, DX12 and VULKAN. NRI has been designed to provide low overhead access to the Graphics APIs and simplify development of DX12 and VULKAN applications. NRI API has been influenced by VULKAN as the common denominator among these 3 APIs.

NRI and NRD are ready-to-use products. The application must expose native pointers only for Device, Resource and CommandList entities (no SRVs and UAVs - they are not needed, everything will be created internally). Native resource pointers are needed only for the denoiser inputs and outputs (all intermediate textures will be handled internally). Descriptor heap will be changed to an internal one, so the application needs to bind its original descriptor heap after invoking the denoiser.

In rare cases, when the integration via the engine’s RHI is not possible and the integration using native pointers is complicated, a "DoDenoising" call can be added explicitly to the application-side RHI. It helps to avoid increasing code entropy.

The pseudo code below demonstrates how NRD integration and NRI can be used to wrap native Graphics API pointers into NRI objects to establish connection between the application and NRD:

//=======================================================================================================
// INITIALIZATION - DECLARATIONS
//=======================================================================================================

#include "NRIDescs.hpp"
#include "Extensions/NRIWrapperD3D12.h"
#include "Extensions/NRIHelper.h"

#include "NRD.h"
#include "NRDIntegration.hpp"

// bufferedFramesNum (usually 2-3 frames):
//      The application must provide the number of buffered frames; it's needed to guarantee that
//      constant data and descriptor sets are not overwritten while being executed on the GPU.
// enableDescriptorCaching:
//      true - enables descriptor caching for the whole lifetime of an NrdIntegration instance
//      false - descriptors are cached only within a single "Denoise" call
NrdIntegration NRD = NrdIntegration(bufferedFramesNum, enableDescriptorCaching, "Name");

struct NriInterface
    : public nri::CoreInterface
    , public nri::HelperInterface
    , public nri::WrapperD3D12Interface
{};
NriInterface NRI;

//=======================================================================================================
// INITIALIZATION - WRAP NATIVE DEVICE
//=======================================================================================================

// Wrap the device
nri::DeviceCreationD3D12Desc deviceDesc = {};
deviceDesc.d3d12Device = ...;
deviceDesc.d3d12GraphicsQueue = ...;
deviceDesc.enableNRIValidation = false;

nri::Device* nriDevice = nullptr;
nri::Result nriResult = nri::nriCreateDeviceFromD3D12Device(deviceDesc, nriDevice);

// Get core functionality
nriResult = nri::nriGetInterface(*nriDevice,
  NRI_INTERFACE(nri::CoreInterface), (nri::CoreInterface*)&NRI);

nriResult = nri::nriGetInterface(*nriDevice,
  NRI_INTERFACE(nri::HelperInterface), (nri::HelperInterface*)&NRI);

// Get appropriate "wrapper" extension (XXX - can be D3D11, D3D12 or VULKAN)
nriResult = nri::nriGetInterface(*nriDevice,
  NRI_INTERFACE(nri::WrapperXXXInterface), (nri::WrapperXXXInterface*)&NRI);

//=======================================================================================================
// INITIALIZATION - INITIALIZE NRD
//=======================================================================================================

const nrd::DenoiserDesc denoiserDescs[] =
{
    // Put needed denoisers here, like:
    { identifier1, nrd::Denoiser::XXX },
    { identifier2, nrd::Denoiser::YYY },
};

nrd::InstanceCreationDesc instanceCreationDesc = {};
instanceCreationDesc.denoisers = denoiserDescs;
instanceCreationDesc.denoisersNum = GetCountOf(denoiserDescs);

// NRD itself is flexible and supports any kind of dynamic resolution scaling, but NRD INTEGRATION pre-
// allocates resources with statically defined dimensions. DRS is only supported by adjusting the viewport
// via "CommonSettings::rectSize"
bool result = NRD.Initialize(resourceWidth, resourceHeight, instanceCreationDesc, *nriDevice, NRI, NRI);

//=======================================================================================================
// INITIALIZATION or RENDER - WRAP NATIVE POINTERS
//=======================================================================================================

// Wrap the command buffer
nri::CommandBufferD3D12Desc commandBufferDesc = {};
commandBufferDesc.d3d12CommandList = (ID3D12GraphicsCommandList*)d3d12CommandList;

// Not needed for NRD integration layer, but needed for NRI validation layer
commandBufferDesc.d3d12CommandAllocator = (ID3D12CommandAllocator*)d3d12CommandAllocatorOrJustNonNull;

nri::CommandBuffer* nriCommandBuffer = nullptr;
NRI.CreateCommandBufferD3D12(*nriDevice, commandBufferDesc, nriCommandBuffer);

// Wrap required textures (better do it only once on initialization)
nri::TextureTransitionBarrierDesc entryDescs[N] = {};
nri::Format entryFormat[N] = {};

for (uint32_t i = 0; i < N; i++)
{
    nri::TextureTransitionBarrierDesc& entryDesc = entryDescs[i];
    const MyResource& myResource = GetMyResource(i);

    nri::TextureD3D12Desc textureDesc = {};
    textureDesc.d3d12Resource = myResource->GetNativePointer();
    NRI.CreateTextureD3D12(*nriDevice, textureDesc, (nri::Texture*&)entryDesc.texture );

    // You need to specify the current state of the resource here, after denoising NRD can modify
    // this state. Application must continue state tracking from this point.
    // Useful information:
    //    SRV = nri::AccessBits::SHADER_RESOURCE, nri::TextureLayout::SHADER_RESOURCE
    //    UAV = nri::AccessBits::SHADER_RESOURCE_STORAGE, nri::TextureLayout::GENERAL
    entryDesc.nextState.accessBits = ConvertResourceStateToAccessBits( myResource->GetCurrentState() );
    entryDesc.nextState.layout = ConvertResourceStateToLayout( myResource->GetCurrentState() );
}

//=======================================================================================================
// RENDER - DENOISE
//=======================================================================================================

// Set common settings
//  - for the first time use defaults
//  - currently NRD supports only the following view space: X - right, Y - top, Z - forward or backward
nrd::CommonSettings commonSettings = {};
PopulateCommonSettings(commonSettings);

NRD.SetCommonSettings(commonSettings);

// Set settings for each denoiser in the NRD instance
nrd::XxxSettings settings1 = {};
PopulateXxxSettings(settings1);

NRD.SetDenoiserSettings(identifier1, &settings1);

nrd::YyySettings settings2 = {};
PopulateYyySettings(settings2);

NRD.SetDenoiserSettings(identifier2, &settings2);

// Fill up the user pool
NrdUserPool userPool = {};
{
    // Fill only required "in-use" inputs and outputs in appropriate slots using entryDescs & entryFormat,
    // applying remapping if necessary. Unused slots will be {nullptr, nri::Format::UNKNOWN}
    NrdIntegration_SetResource(userPool, ...);
    ...
    NrdIntegration_SetResource(userPool, ...);
};

const nrd::Identifier denoisers[] = {identifier1, identifier2};
NRD.Denoise(denoisers, helper::GetCountOf(denoisers), *nriCommandBuffer, userPool);

// IMPORTANT: NRD integration binds own descriptor pool, don't forget to re-bind back your pool (heap)

//=======================================================================================================
// SHUTDOWN or RENDER - CLEANUP
//=======================================================================================================

// Better do it only once on shutdown
for (uint32_t i = 0; i < N; i++)
    NRI.DestroyTexture(entryDescs[i].texture);

NRI.DestroyCommandBuffer(*nriCommandBuffer);

//=======================================================================================================
// SHUTDOWN - DESTROY
//=======================================================================================================

// Release wrapped device
nri::nriDestroyDevice(*nriDevice);

// Also NRD needs to be recreated on "resize"
NRD.Destroy();

Shader part:

#if 1
    #include "NRDEncoding.hlsli"
#else
    // Or define the NRD encoding in CMake and deliver the macro definitions to the shader compilation command line
#endif

#include "NRD.hlsli"

// Call corresponding "front end" function to encode data for NRD (NRD.hlsli indicates which function
// needs to be used for a specific input for a specific denoiser). For example:

float4 nrdIn = RELAX_FrontEnd_PackRadianceAndHitDist(radiance, hitDistance);

// Call corresponding "back end" function to decode data produced by NRD. For example:

float4 nrdOut = RELAX_BackEnd_UnpackRadiance(nrdOutEncoded);

RECOMMENDATIONS AND BEST PRACTICES: GREATER TIPS

Denoising is not a panacea or miracle. Denoising works best with ray tracing results produced by a suitable form of importance sampling. Additionally, NRD has its own restrictions. The following suggestions should help to achieve best image quality:

MATERIAL DE-MODULATION (IRRADIANCE → RADIANCE)

NRD has been designed to work with pure radiance coming from a particular direction. This means that data in the form "something / probability" should be avoided if possible, because it increases the overall entropy of the input signal (though it doesn't mean that denoising won't work). Additionally, it means that materials need to be decoupled from the input signal, i.e. irradiance, typically produced by a path tracer, needs to be transformed into radiance, i.e. the BRDF should be applied after denoising. This is achieved by using the "demodulation" trick:

// Diffuse
Denoising( diffuseRadiance * albedo ) → NRD( diffuseRadiance / albedo ) * albedo

// Specular
float3 preintegratedBRDF = PreintegratedBRDF( Rf0, N, V, roughness )
Denoising( specularRadiance * BRDF ) → NRD( specularRadiance * BRDF / preintegratedBRDF ) * preintegratedBRDF

A good approximation for pre-integrated specular BRDF can be found [here](https://github.com/NVIDIAGameWorks/MathLib/blob/407ecd0d1892d12ee1ec98c3d46cbeed73b79a0d/STL.hlsli#L2147). Pre-integrated specular BRDF is also referred to as "specular albedo" or "environment BRDF".
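
The demodulation round-trip can be sketched as follows. This is a scalar sketch under stated assumptions: in practice albedo and the pre-integrated BRDF are per-channel float3 values, and the function names are illustrative, not part of the NRD API.

```cpp
#include <algorithm>

// Small epsilon guarding against division by zero for dark materials
constexpr float NRD_EPS = 1e-6f;

// Before denoising: strip the material from the path-traced signal
// (irradiance -> radiance)
float Demodulate(float irradiance, float albedo)
{
    return irradiance / std::max(albedo, NRD_EPS);
}

// After denoising: re-apply the material (radiance -> irradiance)
float Remodulate(float denoisedRadiance, float albedo)
{
    return denoisedRadiance * albedo;
}
```

For specular, `albedo` is replaced by the pre-integrated BRDF linked above; the structure of the round-trip is identical.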

COMBINED DENOISING OF DIRECT AND INDIRECT LIGHTING

  1. For specular signal use indirect hitT for both direct and indirect lighting

The reason is that the denoiser uses hitT mostly for calculating motion vectors for reflections. For that purpose, the denoiser expects to see hitT from surfaces that are in the specular reflection lobe. When calculating direct lighting (NEE/RTXDI), we select a light per pixel, and the distance to that light becomes the hitT for both diffuse and specular channels. In many cases, the light is selected for a surface because of its diffuse contribution, not specular, which makes the specular channel contain the hitT of a diffuse light. That confuses the denoiser and breaks reprojection. On the other hand, the indirect specular hitT is always computed by tracing rays in the specular lobe.

  2. For diffuse signal hitT can be further adjusted by mixing hitT from direct and indirect rays to get sharper shadows

Use the first bounce hit distance for the indirect part in the pseudo-code below:

float directHitDistContribution = directDiffuseLuminance / ( directDiffuseLuminance + indirectDiffuseLuminance + EPS );

float maxContribution = 0.5; // 0.65 works well too
directHitDistContribution = min( directHitDistContribution, maxContribution ); // avoid over-sharpening

hitDist = lerp( indirectDiffuseHitDist, directDiffuseHitDist, directHitDistContribution );

INTERACTION WITH PRIMARY SURFACE REPLACEMENTS (PSR)

When denoising reflections in pure mirrors, some advantages can be reached if NRD "sees" the first "non-pure mirror" point after a series of pure mirror bounces (delta events). This point is called Primary Surface Replacement.

Primary Surface Replacement (PSR) can be used with NRD.

Notes, requirements and restrictions:

  • the primary hit (0th bounce) gets replaced with the first "non-pure mirror" hit in the bounce chain - this hit becomes PSR
  • all associated data in the g-buffer gets replaced by PSR data
  • the camera "sees" PSRs as if the mirror surfaces in-between don't exist. This space is called virtual world space
    • virtual space position lies on the same view vector as the primary hit position, but the position is elongated. Elongation depends on hitT and curvature at bounces, starting from the primary hit
    • virtual space normal is the normal at PSR hit mirrored several times in the reversed order until the primary hit is reached
  • PSR data is NOT always data at the PSR hit!
    • material properties (albedo, metalness, roughness etc.) are from PSR hit
    • IN_VIEWZ contains viewZ of the virtual position
    • IN_MV contains motion of the virtual position
    • IN_NORMAL_ROUGHNESS contains normal at virtual world space and roughness at PSR
    • accumulated hitT for NRD starts at the PSR hit. Curvature must be taken into account on the application side only for 2nd+ bounces starting from this hit (similarly to hitT requirements in Noisy Inputs section)
    • ray direction for NRD must be transformed into virtual space

In case of PSR NRD disocclusion logic doesn't take curvature at primary hit into account, because data for primary hits is replaced. This can lead to more intense disocclusions on bumpy surfaces due to significant ray divergence. To mitigate this problem 2x-10x larger disocclusionThreshold can be used. This is an applicable solution if the denoiser is used to denoise surfaces with PSR only (glass only, for example). In a general case, when PSR and normal surfaces are mixed on the screen, higher disocclusion thresholds are needed only for pixels with PSR. This can be achieved by using IN_DISOCCLUSION_THRESHOLD_MIX input to smoothly mix baseline disocclusionThreshold into bigger disocclusionThresholdAlternate from CommonSettings. Most likely the increased disocclusion threshold is needed only for pixels with normal details at primary hits (local curvature is not zero).

The illustration below shows expected inputs for secondary hits:

Input with PSR

hitDistance = length( C - B ); // hitT for 2nd bounce, but it's 1st bounce in the reflected world
Bvirtual = A + viewVector * length( B - A );

IN_VIEWZ = TransformToViewSpace( Bvirtual ).z;
IN_NORMAL_ROUGHNESS = GetVirtualSpaceNormalAndRoughnessAt( B );
IN_MV = GetMotionAt( B );

INTERACTION WITH INFs AND NANs

  • NRD doesn't touch pixels outside of viewport: INFs / NANs are allowed
  • NRD doesn't touch pixels outside of denoising range: INFs / NANs are allowed
  • INFs / NANs are not allowed for pixels inside the viewport and denoising range
    • INFs can be used in IN_VIEWZ, but not recommended

INTERACTION WITH FRAME GENERATION TECHNIQUES

Frame generation (FG) techniques boost FPS by interpolating between the 2 last available frames. NRD works better when the framerate increases, because it gets more data per second. This is not the case for FG, because all underlying rendering pipeline passes (like denoising) continue to work at the original, non-boosted framerate.

HAIR DENOISING TIPS

NRD tries to preserve jittering at least on geometrical edges; this is essential for upscalers, which are usually applied at the end of the rendering pipeline. It naturally moves the problem of anti-aliasing to the application side. In turn, this implies the following suggestions:

  • trace at higher resolution, denoise, apply AA and downscale
  • apply a high-quality upscaler in "AA-only" mode, i.e. without reducing the tracing resolution (for example, DLSS in DLAA mode)

Sub-pixel thin geometry of strand-based hair turns the "normals guide" into a jittering & flickering pixel mess, i.e. the guide itself becomes noisy. This worsens denoising IQ. For NRD it's better to replace geometry normals in the "normals guide" with a vector = normalize( cross( T, B ) ), where:

  • T - hair strand tangent vector
  • B - is not a classic binormal, it's more an averaged direction to a bunch of closest hair strands (in many cases it's a binormal vector of underlying head / body mesh)
    • B can be simplified to normalize( cross( V, T ) ), where V is the view vector
    • in other words, B must follow the following rules:
      • cross( T, B ) != 0
      • B must not follow hair strand "tube"

Hair strands tangent vectors can't be used as "normals guide" for NRD due to BRDF and curvature related calculations, requiring a vector, which can be considered a "normal" vector.
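
The normal-guide construction above can be sketched in host code. This is a minimal illustration with a hypothetical float3 type; function names are not part of the NRD API.

```cpp
#include <cmath>

struct float3 { float x, y, z; };

float3 Cross(float3 a, float3 b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

float3 Normalize(float3 v)
{
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Replacement "normal" for a hair strand:
// B is the simplified binormal normalize( cross( V, T ) ),
// the guide itself is normalize( cross( T, B ) )
float3 HairNormalGuide(float3 T, float3 V)
{
    float3 B = Normalize(Cross(V, T)); // does not follow the strand "tube"
    return Normalize(Cross(T, B));     // guaranteed cross( T, B ) != 0
}
```

For a strand tangent along X viewed down the Z axis, the guide points back toward the viewer, as expected for a "normal"-like vector.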

RECOMMENDATIONS AND BEST PRACTICES: LESSER TIPS

[NRD] The NRD API has been designed to support integration into native VULKAN apps. If the RHI you work with is DX11-like, not all provided data will be needed.

[NRD] Read all comments in NRDDescs.h, NRDSettings.h and NRD.hlsli.

[NRD] If you are unsure of which parameters to use - use defaults via {} construction. It helps to improve compatibility with future versions and offers optimal IQ, because default settings are always adjusted by recent algorithmic changes.

[NRD] NRD requires linear roughness and world-space normals. See NRD.hlsli for more details and supported customizations.

[NRD] NRD requires non-jittered matrices.

[NRD] Most of denoisers do not write into output pixels outside of CommonSettings::denoisingRange.

[NRD] When upgrading to the latest version keep an eye on ResourceType enumeration. The order of the input slots can be changed or something can be added, you need to adjust the inputs accordingly to match the mapping. Or use NRD integration to simplify the process.

[NRD] All pixels in floating point textures should be INF / NAN free to avoid propagation, because such values are used in weight calculations and accumulation of a weighted sum. Functions XXX_FrontEnd_PackRadianceAndHitDist perform optional NAN / INF clearing of the input signal. There is a boolean to skip these checks.

[NRD] All denoisers work with positive RGB inputs (some denoisers can change color space in front end functions). For better image quality, HDR color inputs need to be in a sane range [0; 250], because the internal pipeline uses FP16 and RELAX tracks second moments of the input signal, i.e. x^2 must fit into FP16 range. If the color input is in a wider range, any form of non-aggressive color compression can be applied (linear scaling, pow-based or log-based methods). REBLUR supports wider HDR ranges, because it doesn't track second moments. Passing pre-exposured colors (i.e. color * exposure) is not recommended, because a significant momentary change in exposure is hard to react to in this case.
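
As an illustration of "non-aggressive color compression", a simple invertible log-based curve can look like the sketch below. The exact curve is an assumption, not an NRD requirement; any monotonic, invertible mapping that brings the signal into an FP16-friendly range works.

```cpp
#include <cmath>

// Compress a wide-range HDR value into an FP16-friendly range before
// denoising; invert after denoising. Applied per channel.
float CompressHdr(float c)   { return std::log2(1.0f + c); }
float DecompressHdr(float c) { return std::exp2(c) - 1.0f; }
```

Note that even an input of 250 compresses to roughly 8, so x^2 of the compressed signal stays comfortably inside the FP16 range.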

[NRD] NRD can track camera motion internally. For the first time pass all MVs set to 0 (you can use CommonSettings::motionVectorScale = {0} for this) and set CommonSettings::isMotionVectorInWorldSpace = true, it will allow you to simplify the initial integration. Enable application-provided MVs after getting denoising working on static objects.

[NRD] Using 2D MVs can lead to massive history reset on moving objects, because 2D motion provides information only about pixel screen position but not about real 3D world position. Consider using 2.5D or 3D MVs instead. 2.5D motion, which is 2D motion with additionally provided viewZ delta (i.e. viewZprev = viewZ + MV.z), is even better, because it has the same benefits as 3D motion, but doesn't suffer from imprecision problems caused by world-space delta rounding to FP16 during MV patching on the NRD side.
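
The 2.5D reprojection described above can be sketched as follows (types and names are illustrative, not NRD API):

```cpp
// A 2.5D motion vector: 2D screen-space motion plus a viewZ delta
struct MotionVector25D { float dx, dy, dViewZ; };

struct PrevSample { float x, y, viewZ; };

PrevSample Reproject(float pixelX, float pixelY, float viewZ, MotionVector25D mv)
{
    // The 2D part relocates the pixel on screen; the .z part reconstructs
    // the previous frame's linear depth: viewZprev = viewZ + MV.z
    return { pixelX + mv.dx, pixelY + mv.dy, viewZ + mv.dViewZ };
}
```

Unlike pure 2D motion, the reconstructed viewZprev lets the denoiser distinguish "same screen position, different depth" cases without FP16 world-space deltas.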

[NRD] Firstly, try to get a working reprojection on a diffuse signal for camera rotations only (without camera motion).

[NRD] Diffuse and specular signals must be separated at primary hit (or at secondary hit in case of PSR).

[NRD] Denoising logic is driven by provided hit distances. For indirect lighting denoising passing hit distance for the 1st bounce only is a good baseline. For direct lighting a distance to an occluder or a light source is needed. Primary hit distance must be excluded in any case.

[NRD] Importance sampling is recommended to achieve good results in case of complex lighting environments. Consider using:

  • Cosine distribution for diffuse from non-local light sources
  • VNDF sampling for specular
  • Custom importance sampling for local light sources (RTXDI).

[NRD] Additionally the quality of the input signal can be increased by re-using already denoised information from the current or the previous frame.

[NRD] Hit distances should come from an importance sampling method. But if denoising of AO/SO is needed, AO/SO can come from cos-weighted (or VNDF) sampling in a tradeoff of IQ.

[NRD] Low discrepancy sampling (blue noise) helps to get more stable output in 0.5-1 rpp mode. It's a must for REBLUR-based Ambient and Specular Occlusion denoisers and SIGMA.

[NRD] It's recommended to set CommonSettings::accumulationMode to RESET for a single frame, if a history reset is needed. If history buffers are recreated or contain garbage, it's recommended to use CLEAR_AND_RESET for a single frame. CLEAR_AND_RESET is not free because clearing is done in a compute shader. Render target clears on the application side should be prioritized over this solution.

[NRD] If there are areas (besides sky), which don't require denoising (for example, casting a specular ray only if roughness is less than some threshold), providing viewZ > CommonSettings::denoisingRange in IN_VIEWZ texture for such pixels will effectively skip denoising. Additionally, the data in such areas won't contribute to the final result.

[NRD] If there are areas (besides sky), which don't require denoising (for example, skipped diffuse rays for true metals), materialID and materialMask can be used to drive spatial passes.

[NRD] Input signal quality can be improved by enabling pre-pass via setting diffusePrepassBlurRadius and specularPrepassBlurRadius to a non-zero value. Pre-pass is needed more for specular and less for diffuse, because pre-pass outputs optimal hit distance for specular tracking (see the sample for more details).

[NRD] In case of probabilistic diffuse / specular split at the primary hit, hit distance reconstruction pass must be enabled, if exposed in the denoiser (see HitDistanceReconstructionMode).

[NRD] In case of probabilistic diffuse / specular split at the primary hit, pre-pass must be enabled, if exposed in the denoiser (see diffusePrepassBlurRadius and specularPrepassBlurRadius).

[NRD] Maximum number of accumulated frames can be FPS dependent. The following formula can be used on the application side to adjust maxAccumulatedFrameNum, maxFastAccumulatedFrameNum and potentially historyFixFrameNum too:

maxAccumulatedFrameNum = accumulationPeriodInSeconds * FPS

[NRD] Fast history is the input signal, accumulated for a few frames. Fast history helps to minimize lags in the main history, which is accumulated for more frames. The number of accumulated frames in the fast history needs to be carefully tuned to avoid introducing significant bias and dirt. Initial integration should be done with default settings. Bear in mind the following recommendation:

maxAccumulatedFrameNum > maxFastAccumulatedFrameNum > historyFixFrameNum
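
A sketch of FPS-dependent tuning that respects this ordering. The ratios below are illustrative assumptions, not NRD defaults; only the formula and the ordering come from the tips above.

```cpp
#include <algorithm>
#include <cstdint>

struct AccumulationSettings
{
    uint32_t maxAccumulatedFrameNum;
    uint32_t maxFastAccumulatedFrameNum;
    uint32_t historyFixFrameNum;
};

AccumulationSettings TuneAccumulation(float accumulationPeriodInSeconds, float fps)
{
    // maxAccumulatedFrameNum = accumulationPeriodInSeconds * FPS
    uint32_t maxFrames = (uint32_t)(accumulationPeriodInSeconds * fps + 0.5f);
    maxFrames = std::max(maxFrames, 4u);

    AccumulationSettings s;
    s.maxAccumulatedFrameNum = maxFrames;
    // Derived counts preserve: max > fast > historyFix
    s.maxFastAccumulatedFrameNum = std::max(maxFrames / 4u, 2u);
    s.historyFixFrameNum = std::max(maxFrames / 8u, 1u);
    return s;
}
```

At 60 FPS with a 0.5 s accumulation period this yields 30 / 7 / 3 frames, satisfying the recommended inequality.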

[NRD] In case of quarter resolution tracing and denoising use pixelPos / 2 as texture coordinates. Using a "rotated grid" approach (when a pixel gets selected from 2x2 footprint one by one) is not recommended because it significantly bumps entropy of non-noisy inputs, leading to more disocclusions. In case of REBLUR it's recommended to increase sigmaScale in antilag settings. "Nearest Z" upsampling works best for upscaling of the denoised output. Code, as well as upsampling function, can be found in NRD sample releases before 3.10.

[NRD] SH denoisers can use more relaxed lobeAngleFraction. It can help to improve stability, while details will be reconstructed back by SG resolve.

[REBLUR] If more performance is needed, consider using enablePerformanceMode = true.

[REBLUR] REBLUR expects hit distances in a normalized form. To avoid mismatching, REBLUR_FrontEnd_GetNormHitDist must be used for normalization. Normalization parameters should be passed into NRD as HitDistanceParameters for internal hit distance denormalization. Some tweaking can be needed here, but in most cases default HitDistanceParameters works well. REBLUR outputs denoised normalized hit distance, which can be used by the application as ambient or specular occlusion (AO & SO) (see unpacking functions from NRD.hlsli).

[REBLUR/RELAX] Antilag parameters need to be carefully tuned. Initial integration should be done with disabled antilag.

[RELAX] RELAX works well with signals produced by RTXDI or very clean high RPP signals. The Sweet Home of RELAX is RTXDI sample. Please, consider getting familiar with this application.

[SIGMA] Using "blue" noise can help to avoid shadow shimmering. It works best if the pattern is static on the screen.

[SIGMA] SIGMA can be used for multi-light shadow denoising if applied "per light". SigmaSettings::stabilizationStrength can be set to 0 to disable temporal history. This provides the following benefits:

  • light count independent memory usage
  • no need to manage history buffers for lights

[SIGMA] In theory SIGMA_TRANSLUCENT_SHADOW can be used as a "single-pass" shadow denoiser for shadows from multiple light sources:

L[i] - unshadowed analytical lighting from a single light source (not noisy)
S[i] - stochastically sampled light visibility for L[i] (noisy)
Σ( L[i] ) - unshadowed analytical lighting, typically a result of tiled lighting (HDR, not in range [0; 1])
Σ( L[i] × S[i] ) - final lighting (what we need to get)

The idea:
L1 × S1 + L2 × S2 + L3 × S3 = ( L1 + L2 + L3 ) × [ ( L1 × S1 + L2 × S2 + L3 × S3 ) / ( L1 + L2 + L3 ) ]

Or:
Σ( L[i] × S[i] ) = Σ( L[i] ) × [ Σ( L[i] × S[i] ) / Σ( L[i] ) ]
Σ( L[i] × S[i] ) / Σ( L[i] ) - normalized weighted sum, i.e. pseudo translucency (LDR, in range [0; 1])
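
The reorganization can be verified numerically. The sketch below uses scalar lights for brevity and covers the zero-spatial-filtering case, where the identity holds up to floating point precision.

```cpp
#include <cmath>

// Direct evaluation of the target: sum of per-light shadowed lighting
float ShadowedSumDirect(const float* L, const float* S, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += L[i] * S[i];
    return sum;
}

// Same result via the unshadowed sum multiplied by the normalized
// weighted sum ("pseudo translucency" in [0; 1])
float ShadowedSumViaTranslucency(const float* L, const float* S, int n)
{
    float Lsum = 0.0f, LSsum = 0.0f;
    for (int i = 0; i < n; i++)
    {
        Lsum += L[i];
        LSsum += L[i] * S[i];
    }
    float translucency = LSsum / std::fmax(Lsum, 1e-6f);
    return Lsum * translucency;
}
```

In the real pipeline only the translucency term goes through the denoiser, which is what introduces the bias discussed below once spatial filtering is enabled.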

Input data preparation example:

float3 Lsum = 0;
float3 LSsum = 0.0;
float Wsum = 0.0;
float Psum = 0.0;

for( uint i = 0; i < N; i++ )
{
    float3 L = ComputeLighting( i );
    Lsum += L;

    // "distanceToOccluder" should respect rules described in NRD.hlsli in "INPUT PARAMETERS" section
    float distanceToOccluder = SampleShadow( i );
    float shadow = !IsOccluded( distanceToOccluder );
    LSsum += L * shadow;

    // The weight should be zero if a pixel is not in the penumbra, but it is not trivial to compute...
    float weight = ...;
    weight *= Luminance( L );
    Wsum += weight;

    float penumbraRadius = SIGMA_FrontEnd_PackPenumbra( ... ).x;
    Psum += penumbraRadius * weight;
}

float3 translucency = LSsum / max( Lsum, NRD_EPS );
float penumbraRadius = Psum / max( Wsum, NRD_EPS );

After denoising the final result can be computed as:

Σ( L[i] × S[i] ) = Σ( L[i] ) × OUT_SHADOW_TRANSLUCENCY.yzw

Is this a biased solution? If spatial filtering is off - no, because we just reorganized the math equation. If spatial filtering is on - yes, because denoising will be driven by the most important light in a given pixel.

This solution is limited and hard to use:

  • obviously, can be used "as is" if shadows don't overlap (weight = 1)
  • if shadows overlap, a separate pass is needed to analyze noisy input and classify pixels as umbra - penumbra (and optionally empty space). Raster shadow maps can be used for this if available
  • it is not recommended to mix 1 cd and 100000 cd lights, since FP32 texture will be needed for a weighted sum. In this case, it's better to process the sun and other bright light sources separately.

raytracingdenoiser's People

Contributors

adamjmiles, dzhdannv, halldorfannar, nv-jdeligiannis, sirtsu55, tkh-esi


raytracingdenoiser's Issues

Depth of field, can it be denoised with NRD?

Hi, I've been tinkering with Nvidia's vk_raytrace sample and noticed it now supports non-zero aperture and a variable focal distance for Depth of Field.

It looks great so it would be fantastic if it could be done in realtime with NRD / Reblur.

Before I spent any time trying to modify NRD to support it, can you tell me if this is likely to succeed?

NRD integration already defined compiler error

Used headers:

//
#include <NRI.h>
#include <NRD.h>
#include <NRIDescs.hpp>
#include <Extensions/NRIWrapperVK.h>
#include <Extensions/NRIHelper.h>
#include <NRDIntegration.hpp>

Probably, you need to mark these functions as inline in the code.

2>Generating Code...
2>VMA.obj : error LNK2005: "public: bool __cdecl NrdIntegration::Initialize(struct nri::Device &,struct nri::CoreInterface const &,struct nri::HelperInterface const &,struct nrd::DenoiserCreationDesc const &)" (?Initialize@NrdIntegration@@QEAA_NAEAUDevice@nri@@AEBUCoreInterface@3@AEBUHelperInterface@3@AEBUDenoiserCreationDesc@nrd@@@Z) already defined in Alter.obj
2>VMA.obj : error LNK2005: "public: bool __cdecl NrdIntegration::SetMethodSettings(enum nrd::Method,void const *)" (?SetMethodSettings@NrdIntegration@@QEAA_NW4Method@nrd@@PEBX@Z) already defined in Alter.obj
2>VMA.obj : error LNK2005: "public: void __cdecl NrdIntegration::Denoise(unsigned int,struct nri::CommandBuffer &,struct nrd::CommonSettings const &,class std::array<struct NrdIntegrationTexture,26> const &,bool)" (?Denoise@NrdIntegration@@QEAAXIAEAUCommandBuffer@nri@@AEBUCommonSettings@nrd@@AEBV?$array@UNrdIntegrationTexture@@$0BK@@std@@_N@Z) already defined in Alter.obj
2>VMA.obj : error LNK2005: "public: void __cdecl NrdIntegration::Destroy(void)" (?Destroy@NrdIntegration@@QEAAXXZ) already defined in Alter.obj
2>VMA.obj : error LNK2005: "public: void __cdecl NrdIntegration::CreatePipelines(void)" (?CreatePipelines@NrdIntegration@@QEAAXXZ) already defined in Alter.obj
2>VMA.obj : error LNK2005: "private: void __cdecl NrdIntegration::CreateResources(void)" (?CreateResources@NrdIntegration@@AEAAXXZ) already defined in Alter.obj
2>VMA.obj : error LNK2005: "private: void __cdecl NrdIntegration::AllocateAndBindMemory(void)" (?AllocateAndBindMemory@NrdIntegration@@AEAAXXZ) already defined in Alter.obj
2>VMA.obj : error LNK2005: "private: void __cdecl NrdIntegration::Dispatch(struct nri::CommandBuffer &,struct nri::DescriptorPool &,struct nrd::DispatchDesc const &,class std::array<struct NrdIntegrationTexture,26> const &,bool)" (?Dispatch@NrdIntegration@@AEAAXAEAUCommandBuffer@nri@@AEAUDescriptorPool@3@AEBUDispatchDesc@nrd@@AEBV?$array@UNrdIntegrationTexture@@$0BK@@std@@_N@Z) already defined in Alter.obj
2>test.obj : error LNK2005: "public: bool __cdecl NrdIntegration::Initialize(struct nri::Device &,struct nri::CoreInterface const &,struct nri::HelperInterface const &,struct nrd::DenoiserCreationDesc const &)" (?Initialize@NrdIntegration@@QEAA_NAEAUDevice@nri@@AEBUCoreInterface@3@AEBUHelperInterface@3@AEBUDenoiserCreationDesc@nrd@@@Z) already defined in Alter.obj
2>test.obj : error LNK2005: "public: bool __cdecl NrdIntegration::SetMethodSettings(enum nrd::Method,void const *)" (?SetMethodSettings@NrdIntegration@@QEAA_NW4Method@nrd@@PEBX@Z) already defined in Alter.obj
2>test.obj : error LNK2005: "public: void __cdecl NrdIntegration::Denoise(unsigned int,struct nri::CommandBuffer &,struct nrd::CommonSettings const &,class std::array<struct NrdIntegrationTexture,26> const &,bool)" (?Denoise@NrdIntegration@@QEAAXIAEAUCommandBuffer@nri@@AEBUCommonSettings@nrd@@AEBV?$array@UNrdIntegrationTexture@@$0BK@@std@@_N@Z) already defined in Alter.obj
2>test.obj : error LNK2005: "public: void __cdecl NrdIntegration::Destroy(void)" (?Destroy@NrdIntegration@@QEAAXXZ) already defined in Alter.obj
2>test.obj : error LNK2005: "public: void __cdecl NrdIntegration::CreatePipelines(void)" (?CreatePipelines@NrdIntegration@@QEAAXXZ) already defined in Alter.obj
2>test.obj : error LNK2005: "private: void __cdecl NrdIntegration::CreateResources(void)" (?CreateResources@NrdIntegration@@AEAAXXZ) already defined in Alter.obj
2>test.obj : error LNK2005: "private: void __cdecl NrdIntegration::AllocateAndBindMemory(void)" (?AllocateAndBindMemory@NrdIntegration@@AEAAXXZ) already defined in Alter.obj
2>test.obj : error LNK2005: "private: void __cdecl NrdIntegration::Dispatch(struct nri::CommandBuffer &,struct nri::DescriptorPool &,struct nrd::DispatchDesc const &,class std::array<struct NrdIntegrationTexture,26> const &,bool)" (?Dispatch@NrdIntegration@@AEAAXAEAUCommandBuffer@nri@@AEAUDescriptorPool@3@AEBUDispatchDesc@nrd@@AEBV?$array@UNrdIntegrationTexture@@$0BK@@std@@_N@Z) already defined in Alter.obj
2>C:\VULKAN\Alter-Old\build\Debug\Alter.exe : fatal error LNK1169: one or more multiply defined symbols found
2>Done building project "Alter.vcxproj" -- FAILED.
========== Rebuild All: 1 succeeded, 1 failed, 0 skipped ==========

No swap chain recreation in Vulkan backend

When the window is resized, for example by moving it between two screens with different resolution, the application stops without a proper error message.
This problem is caused by directly exiting when present returns that the swap chain is out of date.
Fixing this issue would require the swap chain to be recreated when out of date.

Does NRDIntegration.hpp require using NRI for the entire engine?

NrdIntegration::Denoise asserts because the given nri::CommandBuffer is not in recording state.

Based on the source code, it seems that you have to call NRI::BeginCommandBuffer to start recording (passing your own descriptor pool, because the one in NrdIntegration is not publicly available). However, if I do that, I'm getting another assert because the wrapped native ID3D12GraphicsCommandList is not closed (since it already has previous commands of my engine in it).

I looked at the sample, which uses NRI for everything.

Does this mean, you can't use NRDIntegration.hpp if you don't want to base your entire engine on NRI?

Adding CMake config file for the shipped SDK.

Hi,

I would like to make a suggestion for a small improvement for the NRD project.

As a user, I highly appreciate the usage of CMake to make the compilation process a lot simpler.
However, the prepared SDK doesn't come with a CMake config file, which would mean my own project would have to setup everything manually.

Here is what I did in my experimental project CMake script

include_directories(${PROJECT_DEPENENCIES}/nrd/Include)
link_directories(${PROJECT_DEPENENCIES}/nrd/Lib/Release)
target_link_libraries(ProjectName NRD)

This is a reasonable solution to include the NRD project in my own project, except a few minor catches

  • In the future when the file structure is changed, further upgrading of NRD would require me, and in general all users who want to upgrade, to update their build script manually, since there is an implicit connection between the user project and the NRD project.
  • If the dependency is not there in the folder, CMake won't tell you during project generation time. It would only tell you during compilation, most likely with the compiler saying no header exists.

It would be nice if the NRD project could generate a CMake config file. This would make integration a lot easier for NRD users, I mean those who indeed use CMake in their system. Eventually, integrating NRD in their CMake project would simply be

find_package(NRD REQUIRED FATAL_ERROR HINTS ${PROJECT_DEPENENCIES}/nrd/CMake)
target_link_libraries(ProjectName NRD)

This doesn't just save one line of code. The two issues mentioned above would be gone. From the user's perspective, they won't care about how the files are structured inside the SDK, at least for the C++ host project.
Though for the shader header file, they would still need to inform dxc/fxc to add the include directory in some way.

This, of course, is only a minor improvement to the system. And it would only help users who use CMake in their own system, which does limit the benefit of this improvement unfortunately.

Thanks
Jiayin

Dynamic Resolution

Currently to change resolution, you must recreate the denoiser from scratch with the new resolution. This causes all the lighting history information to be lost, and results in poor image quality for the first few frames after creation.

Are there any recommendations for improving this behavior? Is dynamic resolution support being added at the library level any time soon?

History glitch

Hi there! I'm trying to resolve some artifacts with our NRD implementation, but I'm struggling with this particular issue, and I'm hoping that someone has a clue of where this could be originated.

As seen below:
image

It appears that some history frames are incorrectly mapped to the screen; they are recursively added to the illumination of the center of the screen. When I pass 0 to CommonSettings::frameIndex, this glitch doesn't appear (but of course the quality is much lower without previous frame textures).

The denoiser is in the RELAX_DIFFUSE_SPECULAR mode, and my implementation roughly follows the RTXDI sample's integration, but adjusted to a DirectX12 only environment. My matrices are left-handed (+y up, +z forward, +x right) and row-major, transposed into column-major.

Alpha Blending vs Masking issue, can shadow translucency denoiser solve it?

Hi, I'm using NRD in my path tracer and am having an issue with translucencies. I have some torch flames with 0-1 alpha transparencies, which are noisy in the output.

Here's what I see, post denoise (ReBlur):

image

Blend on the left, mask / threshold on the right.

From what it looks like to me in the NRD Sample code, the translucencies are done in a separate pass then composited on top of the final image. This won't work for me since I need the alpha channel to apply to secondary bounces so that smoke and flames show up properly. I could treat secondary bounces through such translucent objects as masked, but they are denoised anyway so I think that's not necessary..

I could potentially use the anyhit shader to accumulate RGB+A values as an additional storage channel separately, so that it's not noisy at all and can be recombined either in compositing or just stored in the noise-free emissive gbuffer target.

Is this the recommended route here? Or is there something else that would work better, say, using NRD's SIGMA_SHADOW or SIGMA_SHADOW_TRANSLUCENCY.

I'd rather not have to do hacky stuff like back-to-front alpha rendering using rasterization and hardware blending, which won't work properly with overlapping flames (which happens). But I could potentially see an easy solution for foreground flames by special-casing the anyhit to know it's being applied to the base pass / gbuffer fill pass, instead of secondary bounces, and do the blending between many potential transparent objects in the shader, leaving any solid "hits" out of it (which would be noisy). I don't really need the emissive lights themselves to react to lighting in the environment (treat their albedo as = 0).

add support for Linux

Would be really nice to run this in Linux Ubuntu/Fedora..
but I guess the Windows SDK would prevent one from translating the batch files to shell..

Copy NRD.hlsli by default

Hi,

I'd like to point out a minor thing in the scripts.
It is necessary to use code from NRD.hlsli in the user ray tracing app that integrates the denoiser, as it has a few helper functions that pack data, such as REBLUR_FrontEnd_PackRadianceAndNormHitDist. Of course, an alternative solution is for the users to define the same functions in their own code base, but that would make further upgrading of the denoiser in the app a bit troublesome, as there is an implicit connection between their code and the SDK that needs special attention.

It makes a lot of sense to copy this particular file by default regardless of whether the user chooses to add the copy_shaders argument or not. To my understanding, the copy_shaders option is specifically for people who want to integrate the SDK with the third variant mentioned here. For people who are interested in method 1 or 2, they won't care about the source code of the shaders, except this NRD.hlsli.

An extra suggestion for the script: it would be very nice to have it generate the definitions of the two macros NRD_USE_OCT_NORMAL_ENCODING and NRD_USE_MATERIAL_ID.
My current solution is to look at how I build the NRD project and put the macros right before where I include the header, but this could cause inconsistencies if people don't pay attention to it. The script has everything it needs to generate the two macros and put them in front of the hlsli file, so the user wouldn't have to define them.
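To illustrate the suggestion, the script could emit a small prelude ahead of the user's include of NRD.hlsli, with values mirroring the build configuration (the comment text and the example values below are hypothetical):

```cpp
// AUTO-GENERATED prelude (hypothetical) -- emitted by the deploy script so the
// app-side defines always match how the NRD binaries themselves were built.
#define NRD_USE_OCT_NORMAL_ENCODING 1
#define NRD_USE_MATERIAL_ID 0
// #include "NRD.hlsli" would follow in the user's shader
```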

And another very minor thing: it may also be worth considering not copying NRDIntegration.h and NRDIntegration.hpp by default, as these two files are only needed for the second integration method. If the user chooses the first or third method, they are not relevant.

Thanks
Jiayin

Problems with camera translation

Currently the denoiser works like a charm when I'm rotating the camera, but when I translate the camera the image lags behind and becomes blurry until I stop moving.

I suspect my problem is somehow related to this:
image

Is there a step-by-step guide for doing this? The sample code provided was not very helpful for this problem.

[RELAX/REBLUR] Dealing with "dancing" noise (fireflies)

Unrelated to my other post at #46, we're also dealing with another issue that we hope to improve:

We use RTXDI together with NRD, and one of the problems we've been running into is that certain clusters of fireflies result in boiling, which seems to be amplified by NRD. This isn't necessarily NRD's fault, but I'm curious about potential tips for reducing this noise. Video here: https://tc3-img-dev.s3.eu-central-1.amazonaws.com/b1cd9d461e9cb8ffb37ffd5e253654474fda5312818aa453971029aaf54e5dd3
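One common pre-denoiser mitigation, independent of NRD and only a sketch of the general idea, is to clamp each sample's radiance so its luminance never exceeds a scene-tuned threshold before handing it to the denoiser; this trades a little energy loss for far fewer boiling clusters:

```cpp
#include <algorithm>
#include <cmath>

struct Rgb { float r, g, b; };

// Rec. 709 luminance.
static float Luminance(const Rgb& c) {
    return 0.2126f * c.r + 0.7152f * c.g + 0.0722f * c.b;
}

// Scale the color down so its luminance does not exceed maxLuminance.
// Hue is preserved; only very bright outliers ("fireflies") are affected.
Rgb ClampFirefly(Rgb c, float maxLuminance) {
    float lum = Luminance(c);
    if (lum > maxLuminance) {
        float scale = maxLuminance / lum;
        c.r *= scale; c.g *= scale; c.b *= scale;
    }
    return c;
}
```

The threshold is a tuning knob: too low dims legitimate highlights, too high lets fireflies through, so it is usually set relative to the scene's exposure.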

Default app behaviour is to fire up with a window you can't move that straddles two monitors

Hi there. I managed to get the sample running now, thanks for your help with fixing the batch files.

However.

Now that I can run it, the app's behaviour is to fire up a window which attempts to centre itself across my two monitors. I get half the window on one monitor and half on the other. One monitor is 1080p, the other is 4K, so it all looks rather odd. The app also fires up with a window style that won't allow you to move it around.

Alt-Enter doesn't work, nor can you do anything with the standard Windows menu (Alt-Space) to move the window around.

Are there any command line options or similar that I can use to get the app displaying in a more useful way?

Thanks

[RELAX] Specular reflections behaving incorrectly upon camera translation

We recently updated from NRD 2.9.0 to 3.7.0 and started noticing issues with specular reflections, which seem to interact with surrounding shadows incorrectly upon camera translation.

See the video below:
https://youtu.be/PuE9Us28NZU&vq=hd1080

Left view = NRD on
Right view = NRD off

As is visible in the video (although a little difficult to see through the compression), the shadow of the box gets dragged along during camera translation around the area of the floor that is specularly lit. We didn't have this problem in previous versions of NRD, but we also didn't notice the issue in the 3.7.0 sample, which seems to imply that this is a problem on our side. I've tried comparing our NRD integration to the official sample, but unfortunately to no avail. I'm posting here in the hope that someone with more experience with the API might be able to identify the issue.

In case it is helpful, I've included a file with our integration, thank you in advance for any help 🙂
yar_nrd.txt

REBLUR VS. RELAX

Hi,

Maybe I missed it, but is there any recommendation on REBLUR vs. RELAX, for example, which denoising method works better in which case, in terms of diffuse signals?

Thanks

Checkerboard mode in Reblur causing issues with DLSS

Hi, I'm having an issue with REBLUR's checkerboard mode in combination with DLSS and was wondering if it's a known issue or if there's a fix for it (something to do with the viewport jittering scale factor?).

It makes sense that anyone who's using NRD also probably wants DLSS to upscale it, for performance and quality.

I've seen some ifdefs for WITH_DLSS in some of NVIDIA's samples (it may be in RTXDI), so it's probably been noticed already, but I was hoping there might be a fix for this, because checkerboard mode is simply too good for performance to ignore.

It just seems to mess up some parts of the image but not others, so my jitter values are properly set. In the background geometry, the oblique lines are properly anti-aliased and upscaled in both cases with no jaggies, but the statue of Venus herself has too many visual anomalies to ignore (it looks like old interlaced video viewed on an LCD).

Reblur_Checkerboard_On_Left_Off_Right_with_DLSS_

Licensing Issue?

Hi,
Not sure if this is the place to ask, but it doesn't seem to be mentioned on the wiki page:
is there any suggestion on obtaining the proper licensing for use in production in a game engine?

Thanks

Issue during camera translation

Hi,

I am having an issue when using the REBLUR denoiser. I am rendering a scene (Cornell box) and producing only a diffuse signal, which I pass to REBLUR using only the REBLUR_DIFFUSE method. When I translate the camera there are artefacts in the scene, but if I simply rotate it, everything looks fine. There are two videos below showing the correct rotation behaviour and the broken translation behaviour.

I am using the NRD integration code and have compiled NRD (3.7.0.0) and NRI (1.86.0.0) DLLs for use with my renderer (running on 2080 Ti). Currently I am passing the IN_MV texture as a texture filled with zeros since I am dealing with a simple static scene. All the inputs are listed below:

IN_MV = 0 (static scene)
IN_NORMAL_ROUGHNESS - xyz = normal, w = 0
IN_VIEWZ - linear view depth
IN_DIFF_RADIANCE_HITDIST - xyz = radiance (without albedo), w = hit distance (1st to 2nd bounce)

I am using DirectX 12 for rendering and treating the denoiser as a black box. I use a right-handed coordinate system, and matrices are stored row-major (DirectXMath library). I use the XMMatrixLookToRH function to build the view matrix and XMMatrixPerspectiveRH for the clip matrix. I convert all matrices to column-major before passing them in the CommonSettings structure. Previously I had a visually similar issue with rotation, but the problem there was that I passed a matrix incorrectly. I suspect this issue might still be matrix related, but I really can't tell. Is the view matrix format the same as the one used in OpenGL?
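For reference, the row-major to column-major conversion described above is just a transpose of the 16 floats; a bug here produces exactly the kind of view-dependent artifacts being described. A self-contained sketch without DirectXMath (where XMMatrixTranspose would normally do this):

```cpp
#include <cstddef>

// Transpose a 4x4 matrix stored as 16 contiguous floats, i.e. convert
// between row-major (DirectXMath default) and column-major storage.
void Transpose4x4(const float in[16], float out[16]) {
    for (size_t r = 0; r < 4; ++r)
        for (size_t c = 0; c < 4; ++c)
            out[c * 4 + r] = in[r * 4 + c];
}
```

A quick sanity check is to verify that the translation column of the view matrix ends up where the consumer expects it after the conversion; getting that wrong typically leaves rotation looking fine while translation breaks.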

image

I am not sure why these strange artefacts appear during camera translation and I hope someone can help. Please do let me know should you require any further information.

rotation.mp4
translation.mp4

[RFE] Writing documentation that is useful outside the C++ ecosystem

By now we have game engines in Go, Rust, JavaScript, WebGPU, etc. that can make use of HLSL/GLSL shaders.
Those could probably make good use of the shader side of this library to implement NRD denoising.
All we would need is something of this form:

  1. Render pass 1 - [name] (is pass optional?)
    Description.
    Recommended dispatch size, etc.
    Bindings:
    1. Binding 1 - name - description, format, etc.
    2. Binding 2 - name - description, format, etc.
  2. ...

Right now there's just the NRDSample, a huge 4k+ line codebase with everything in it, both NRD related and not directly NRD related.

Debug layer and REBLUR::DiffuseOcclusion - Post-blur

When using DX12 with the debug layer enabled, I get the following crash during the call to Denoise.

Since multiple REBLUR steps seem to succeed, I think my inputs are in the correct states.
What could cause this state error?

D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE BEGIN: RayTracingAO [ EXECUTION MESSAGE #1014: BEGIN_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE BEGIN: RayTracingAO::RayTracingAO [ EXECUTION MESSAGE #1014: BEGIN_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE BEGIN: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - Temporal accumulation [ EXECUTION MESSAGE #1014: BEGIN_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE END: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - Temporal accumulation [ EXECUTION MESSAGE #1015: END_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE BEGIN: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - Mip generation [ EXECUTION MESSAGE #1014: BEGIN_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE END: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - Mip generation [ EXECUTION MESSAGE #1015: END_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE BEGIN: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - History fix [ EXECUTION MESSAGE #1014: BEGIN_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE END: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - History fix [ EXECUTION MESSAGE #1015: END_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE BEGIN: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - Blur [ EXECUTION MESSAGE #1014: BEGIN_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE END: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - Blur [ EXECUTION MESSAGE #1015: END_EVENT]
D3D12 MESSAGE: GPU-BASED VALIDATION: TRACE BEGIN: RayTracingAO::RayTracingAO::REBLUR::DiffuseOcclusion - Post-blur [ EXECUTION MESSAGE #1014: BEGIN_EVENT]
D3D12 ERROR: GPU-BASED VALIDATION: Dispatch, Incompatible resource state: Resource: 0x000001CF222E0060:'RaytracingAoPass::hit_distance_buffers[0]', Subresource Index: [0], Descriptor heap index to DescriptorTableStart: [25], Descriptor heap index FromTableStart: [2], Binding Type In Descriptor: UAV, Resource State: D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE|D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE(0xc0), Index of Descriptor Range: 0, Shader Stage: COMPUTE, Root Parameter Index: [1], Dispatch Index: [4], Shader Code: <couldn't find file location in debug info>, Asm Instruction Range: [0x34-0xffffffff], Asm Operand Index: [0], Command List: 0x000001CEA07E0F60:'CommandList 0, Allocator 1', SRV/UAV/CBV Descriptor Heap: 0x000001CE941609A0:'Unnamed ID3D12DescriptorHeap Object', Sampler Descriptor Heap: <not set>, Pipeline State: 0x000001CF044CF9D0:'Unnamed ID3D12PipelineState Object',  [ EXECUTION ERROR #942: GPU_BASED_VALIDATION_INCOMPATIBLE_RESOURCE_STATE]
D3D12: **BREAK** enabled for the previous message, which was: [ ERROR EXECUTION #942: GPU_BASED_VALIDATION_INCOMPATIBLE_RESOURCE_STATE ]

Let me know if you need more information

Best regards

Reblur Compatibility with DLSS 2.1

Hi, with NVIDIA shipping DLSS to all Unreal developers now, I'm curious how that fits in with NRD / REBLUR and this repo.

Can we use both at the same time? Are there plans for further / tighter integration here?

It's not clear to me after reading news reports how DLSS on its own deals with noisy images in ray-traced scenes. I assume those scenes must be using a denoiser of some kind, but I'm curious whether DLSS does denoising on its own as part of the upscaling process.

Thanks.

Some questions about texture

Hi there. I'm a little confused about how to use NRD to denoise. Here are some questions I put together based on README.md and 09_RayTracing_NRD.cpp:

  1. I want to use NRD to denoise an image, so the input is an image path. How do I convert an image to an nri::Texture?

    At first I followed the steps in 09_RayTracing_NRD.cpp: using utils::LoadTexture() I loaded the image into a utils::Texture textureData, but when I wanted to use nri::createTexture() to convert the utils::Texture to an nri::Texture, I found that createTexture()'s parameters are just format, height, width and mip count, so it seems I only created a blank texture.

    Maybe I should use the method from "STEP 4 - WRAP NATIVE POINTERS" in README.md, but I have never used DirectX before, so it is a little difficult for me to understand.

  2. How can I get the image after denoising? From the userpool's output slot?

  3. What should I do about unused slots? If I use {nullptr, nullptr, nri::Format::UNKNOWN}, this error is generated:

    An exception was thrown: Read access permission conflict. state is nullptr.

    The error occurred in the Nrd::Dispatch() function:

    bool isStateChanged = nextAccess != state->nextAccess || nextLayout != state->nextLayout

Could you give me some advice?

Compiling on linux breaks due to "-Werror"

The CMakeLists.txt in release v2.12.1 introduced -Werror for non-MSVC compilers. This breaks compilation on Linux.
See the attached file: Werror_errs.txt. Most of the errors seem to be in MathLib.h (-Werror=strict-aliasing).

Compiler is gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4)

Resource State Transition Error (REBLUR - Diffuse)

Hi,

I am experiencing issues with the resource state of the output texture passed to the denoiser. The DirectX debug layer breaks with ERROR RESOURCE_MANIPULATION #527: RESOURCE_BARRIER_BEFORE_AFTER_MISMATCH. It seems as if the current state of the resource differs from the barrier being issued. The error is not deterministic; it sometimes occurs at frame 2, frame 3, frame 6, etc.

All input textures are in D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE and the output texture is in D3D12_RESOURCE_STATE_UNORDERED_ACCESS before executing the denoiser (with method nrd::Method::REBLUR_DIFFUSE). I used ID3D12DebugCommandList to verify this is the case. The NRI mapping of the textures is shown below:

nri::TextureD3D12Desc textureDesc = {};

// SRVs
textureDesc.d3d12Resource = *in_mv;
NRI->CreateTextureD3D12(*nriDevice, textureDesc, (nri::Texture*&)entryDescs[0].texture);
entryDescs[0].nextAccess = nri::AccessBits::SHADER_RESOURCE;
entryDescs[0].nextLayout = nri::TextureLayout::SHADER_RESOURCE;

.
.
.
// UAV
textureDesc.d3d12Resource = *out_diff_radiance_hitdist;
NRI->CreateTextureD3D12(*nriDevice, textureDesc, (nri::Texture*&)entryDescs[4].texture);
entryDescs[4].nextAccess = nri::AccessBits::SHADER_RESOURCE_STORAGE;
entryDescs[4].nextLayout = nri::TextureLayout::GENERAL;

I am using the NRD integration code and have compiled NRD (3.4.1.0) and NRI (1.84.0.0) DLLs for use with my renderer (running on 2080 Ti).

Part of the call stack:

NRI.dll!nri::CommandBufferD3D12::PipelineBarrier(const nri::TransitionBarrierDesc * transitionBarriers, const nri::AliasingBarrierDesc * aliasingBarriers, nri::BarrierDependency dependency) Line 526
	at C:\Users\kvnna\source\repos\NRI\Source\D3D12\CommandBufferD3D12.cpp(526)
NRI.dll!CmdPipelineBarrier(nri::CommandBuffer & commandBuffer, const nri::TransitionBarrierDesc * transitionBarriers, const nri::AliasingBarrierDesc * aliasingBarriers, nri::BarrierDependency dependency) Line 43
	at C:\Users\kvnna\source\repos\NRI\Source\D3D12\CommandBufferD3D12.hpp(43)
CandelaDXR.exe!NrdIntegration::Dispatch(nri::CommandBuffer & commandBuffer, nri::DescriptorPool & descriptorPool, const nrd::DispatchDesc & dispatchDesc, const std::array<NrdIntegrationTexture,34> & userPool, bool enableDescriptorCaching) Line 512
	at C:\Users\kvnna\source\repos\kvnnap\CandelaDXR\CandelaDXR\NVIDIA\NRDIntegration.hpp(512)
CandelaDXR.exe!NrdIntegration::Denoise(unsigned int consecutiveFrameIndex, nri::CommandBuffer & commandBuffer, const nrd::CommonSettings & commonSettings, const std::array<NrdIntegrationTexture,34> & userPool, bool enableDescriptorCaching) Line 406
	at C:\Users\kvnna\source\repos\kvnnap\CandelaDXR\CandelaDXR\NVIDIA\NRDIntegration.hpp(406)

Error:

D3D12 ERROR: ID3D12CommandList::ResourceBarrier: Before state (0x8: D3D12_RESOURCE_STATE_UNORDERED_ACCESS) of resource (0x000001FC972A13E0:'out_diff_radiance_hitdist') (subresource: 0) specified by transition barrier does not match with the current resource state (0x40: D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE) (assumed at first use) [ RESOURCE_MANIPULATION ERROR #527: RESOURCE_BARRIER_BEFORE_AFTER_MISMATCH]
D3D12: **BREAK** enabled for the previous message, which was: [ ERROR RESOURCE_MANIPULATION #527: RESOURCE_BARRIER_BEFORE_AFTER_MISMATCH ]
Exception thrown at 0x00007FFCB8A8474C (KernelBase.dll) in CandelaDXR.exe: 0x0000087A (parameters: 0x0000000000000001, 0x00000059CD548F40, 0x00000059CD54AD20).
Unhandled exception at 0x00007FFCB8A8474C (KernelBase.dll) in CandelaDXR.exe: 0x0000087A (parameters: 0x0000000000000001, 0x00000059CD548F40, 0x00000059CD54AD20).

Additionally, if I perform resource state assertions prior to calling the denoiser I get the following:

NRI.dll!nri::PipelineLayoutD3D12::SetDescriptorSetsImpl<0>(ID3D12GraphicsCommandList & graphicsCommandList, unsigned int baseIndex, unsigned int setNum, const nri::DescriptorSet * const * descriptorSets, const unsigned int * offsets) Line 130
	at C:\Users\kvnna\source\repos\NRI\Source\D3D12\PipelineLayoutD3D12.h(130)
NRI.dll!nri::PipelineLayoutD3D12::SetDescriptorSets(ID3D12GraphicsCommandList & graphicsCommandList, bool isGraphics, unsigned int baseIndex, unsigned int setNum, const nri::DescriptorSet * const * descriptorSets, const unsigned int * offsets) Line 156
	at C:\Users\kvnna\source\repos\NRI\Source\D3D12\PipelineLayoutD3D12.h(156)
NRI.dll!nri::CommandBufferD3D12::SetDescriptorSets(unsigned int baseIndex, unsigned int setNum, const nri::DescriptorSet * const * descriptorSets, const unsigned int * offsets) Line 293
	at C:\Users\kvnna\source\repos\NRI\Source\D3D12\CommandBufferD3D12.cpp(293)
NRI.dll!CmdSetDescriptorSets(nri::CommandBuffer & commandBuffer, unsigned int baseSlot, unsigned int descriptorSetNum, const nri::DescriptorSet * const * descriptorSets, const unsigned int * dynamicConstantBufferOffsets) Line 53
	at C:\Users\kvnna\source\repos\NRI\Source\D3D12\CommandBufferD3D12.hpp(53)
CandelaDXR.exe!NrdIntegration::Dispatch(nri::CommandBuffer & commandBuffer, nri::DescriptorPool & descriptorPool, const nrd::DispatchDesc & dispatchDesc, const std::array<NrdIntegrationTexture,34> & userPool, bool enableDescriptorCaching) Line 515
	at C:\Users\kvnna\source\repos\kvnnap\CandelaDXR\CandelaDXR\NVIDIA\NRDIntegration.hpp(515)
CandelaDXR.exe!NrdIntegration::Denoise(unsigned int consecutiveFrameIndex, nri::CommandBuffer & commandBuffer, const nrd::CommonSettings & commonSettings, const std::array<NrdIntegrationTexture,34> & userPool, bool enableDescriptorCaching) Line 406
	at C:\Users\kvnna\source\repos\kvnnap\CandelaDXR\CandelaDXR\NVIDIA\NRDIntegration.hpp(406)
D3D12 ERROR: CGraphicsCommandList::SetComputeRootDescriptorTable: Resource state (0x8: D3D12_RESOURCE_STATE_UNORDERED_ACCESS) (assumed at previous call to AssertResourceState) of resource (0x000002277F1C6BE0:'out_diff_radiance_hitdist') (subresource: 0) is invalid for use as a NON_PIXEL_SHADER_RESOURCE.  Expected State Bits (all): 0x40: D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE, Assumed Actual State: 0x8: D3D12_RESOURCE_STATE_UNORDERED_ACCESS, Missing State: 0x40: D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE. This validation is being enforced at bind during command list recording because the descriptor for this resource in the root signature is declared as DATA_STATIC_WHILE_SET_AT_EXECUTE. [ EXECUTION ERROR #538: INVALID_SUBRESOURCE_STATE]
D3D12: **BREAK** enabled for the previous message, which was: [ ERROR EXECUTION #538: INVALID_SUBRESOURCE_STATE ]
Exception thrown at 0x00007FFCB8A8474C (KernelBase.dll) in CandelaDXR.exe: 0x0000087A (parameters: 0x0000000000000001, 0x0000003D400F8250, 0x0000003D400FA030).
Unhandled exception at 0x00007FFCB8A8474C (KernelBase.dll) in CandelaDXR.exe: 0x0000087A (parameters: 0x0000000000000001, 0x0000003D400F8250, 0x0000003D400FA030).

I am not sure why this is happening and I hope someone can help. Please do let me know should you require any further information.

Can't get NRD to build

Hi devs from NV,

I've tried to get it to build for a bit, but no luck yet. I currently have these versions installed:

Latest Vulkan SDK 1.2.170.0
Latest Windows 20H2 19042.844
Latest Visual Studio 2017 15.9.33 and 2019 16.9.0 (neither work)
The latest version of MBulli.SmartCommandlineArguments.
(Irrelevant for building I think) Latest version of Nvidia driver (461.72)

Neither of the two ways listed in the README seems to work: not manually building via Visual Studio, nor automatically via MSBuild (via the second command).

1-Deploy & 2-Build (Entered 2: vs2019, but 1 vs2017 also doesn't work)

"D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\SANDBOX.sln" (Build target) (1) ->
"D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\09_RayTracing_NRD\09_RayTracing_NRD.vcxproj" (default target)
(2) ->
"D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\SampleBase\SampleBase.vcxproj" (default target) (8) ->
"D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\CompileShaders.vcxproj" (default target) (9) ->
(PostBuildEvent target) ->
  EXEC : error : failed to wait for process: 258 ('CompileShader.bat cs "D:\programming\work\RayTracingDenoiser\NRD\Sha
ders\REBLUR_DiffuseSpecular_PreBlur.cs.hlsl" "D:\programming\work\RayTracingDenoiser\_Data\Shaders\REBLUR_DiffuseSpecul
ar_PreBlur.cs" "D:\programming\work\RayTracingDenoiser\_Build\Shaders\REBLUR_DiffuseSpecular_PreBlur.cs"') [D:\programm
ing\work\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\CompileShaders.vcxproj]
  EXEC : error : the process is still alive: 1 [D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\
CompileShaders.vcxproj]
  EXEC : error : failed to execute the command line: 'CompileShader.bat cs "D:\programming\work\RayTracingDenoiser\NRD\
Shaders\REBLUR_DiffuseSpecular_PreBlur.cs.hlsl" "D:\programming\work\RayTracingDenoiser\_Data\Shaders\REBLUR_DiffuseSpe
cular_PreBlur.cs" "D:\programming\work\RayTracingDenoiser\_Build\Shaders\REBLUR_DiffuseSpecular_PreBlur.cs"' (result: 2
59) [D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\CompileShaders.vcxproj]
  EXEC : error : failed to wait for process: 258 ('CompileShader.bat cs "D:\programming\work\RayTracingDenoiser\NRD\Sha
ders\REBLUR_Diffuse_PostBlur.cs.hlsl" "D:\programming\work\RayTracingDenoiser\_Data\Shaders\REBLUR_Diffuse_PostBlur.cs"
 "D:\programming\work\RayTracingDenoiser\_Build\Shaders\REBLUR_Diffuse_PostBlur.cs"') [D:\programming\work\RayTracingDe
noiser\_Compiler\vs2019\CompileShaders\CompileShaders.vcxproj]
  EXEC : error : failed to wait for process: 258 ('CompileShader.bat cs "D:\programming\work\RayTracingDenoiser\NRD\Sha
ders\SVGF_FilterMoments.cs.hlsl" "D:\programming\work\RayTracingDenoiser\_Data\Shaders\SVGF_FilterMoments.cs" "D:\progr
amming\work\RayTracingDenoiser\_Build\Shaders\SVGF_FilterMoments.cs"') [D:\programming\work\RayTracingDenoiser\_Compile
r\vs2019\CompileShaders\CompileShaders.vcxproj]
  EXEC : error : the process is still alive: 1 [D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\
CompileShaders.vcxproj]
  EXEC : error : the process is still alive: 1 [D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\
CompileShaders.vcxproj]
  EXEC : error : failed to execute the command line: 'CompileShader.bat cs "D:\programming\work\RayTracingDenoiser\NRD\
Shaders\SVGF_FilterMoments.cs.hlsl" "D:\programming\work\RayTracingDenoiser\_Data\Shaders\SVGF_FilterMoments.cs" "D:\pr
ogramming\work\RayTracingDenoiser\_Build\Shaders\SVGF_FilterMoments.cs"' (result: 259) [D:\programming\work\RayTracingD
enoiser\_Compiler\vs2019\CompileShaders\CompileShaders.vcxproj]
  EXEC : error : failed to execute the command line: 'CompileShader.bat cs "D:\programming\work\RayTracingDenoiser\NRD\
Shaders\REBLUR_Diffuse_PostBlur.cs.hlsl" "D:\programming\work\RayTracingDenoiser\_Data\Shaders\REBLUR_Diffuse_PostBlur.
cs" "D:\programming\work\RayTracingDenoiser\_Build\Shaders\REBLUR_Diffuse_PostBlur.cs"' (result: 259) [D:\programming\w
ork\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\CompileShaders.vcxproj]
  D:\programs\vs\ide\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(155,5): error MSB3073: The command ""D:/prog
ramming/work/RayTracingDenoiser/_Build/vs2019/Bin/Release/CompileShaders.exe" [D:\programming\work\RayTracingDenoiser\_
Compiler\vs2019\CompileShaders\CompileShaders.vcxproj]
D:\programs\vs\ide\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(155,5): error MSB3073: :VCEnd" exited with cod
e 1. [D:\programming\work\RayTracingDenoiser\_Compiler\vs2019\CompileShaders\CompileShaders.vcxproj]

    0 Warning(s)
    10 Error(s)

I'd be interested to hear if I maybe forgot to update/install something that wasn't listed in the README, or if you know what's going on. I also tested f9067ef, since the latest commit is listed as WIP; no luck unfortunately.

If it helps, my install locations aren't the default ones, so maybe the shader compiler can't find the proper tools from the SPIR-V toolchain?

-Niels Brunekreef

Artifacts on Vulkan with beta drivers

When using the Vulkan backend, I get huge artifacts in the top region of the image as well as random flickering (horizontal lines, multiple pixels in height). This may be related to the glitches mentioned in #2.
issue2

I'm using:

  • Visual Studio 2019
  • latest Windows SDK
  • Vulkan SDK 1.2.141.2
  • Nvidia RTX 2070 Super (beta drivers: 457.00)

I need the beta drivers for support of VK_KHR_ray_tracing in a different project.
Furthermore, when debugging the application with Nvidia Nsight Graphics, everything seems fine after ray tracing, so the issue should lie elsewhere.
From a quick glance, I noticed that the artifacts first appear in the "Translucent shadow - pre blur" pass, specifically in the "gln_History" image.

Instructions on homepage for configuring project don't seem to work

Hi there,

I've just downloaded this denoiser and I'm keen to get it going. I thought I'd start off with building a sample and figuring out how it's all put together.

I tried following the instructions on the home page. My machine has the correct Windows 10 version and the latest SDK. I tried to run 1-Deploy.bat (not deploy.bat, as it says on the page), and I get the following errors:
Traceback (most recent call last):
  File "C:\packman-repo\packman-common\6.8.1\packman.py", line 58, in <module>
    import link as linkmodule
  File "C:\packman-repo\packman-common\6.8.1\link.py", line 17, in <module>
    from packman import CONSOLE_ENCODING
  File "C:\packman-repo\packman-common\6.8.1\packman.py", line 2184, in <module>
    read_configuration()
  File "C:\packman-repo\packman-common\6.8.1\packman.py", line 1537, in read_configuration
    configs.append(parser.parse_file(file_path))
  File "C:\packman-repo\packman-common\6.8.1\xmlparser.py", line 194, in parse_file
    root = self._parse_file(filename)
  File "C:\packman-repo\packman-common\6.8.1\xmlparser.py", line 225, in _parse_file
    p.ParseFile(fileobject)
  File "c:_work\16\s\modules\pyexpat.c", line 471, in EndElement
  File "C:\packman-repo\packman-common\6.8.1\xmlparser.py", line 167, in end_element
    element.end_handler(self)
  File "C:\packman-repo\packman-common\6.8.1\schemaparser.py", line 878, in end_handler
    ConfigElement.REMOTES,
  File "C:\packman-repo\packman-common\6.8.1\xmlparser.py", line 99, in raise_error
    raise PackmanError(self._create_log_message(msg, *args))
errors.PackmanError: C:\Users\tom.hammersley\source\repos\RayTracingDenoiser\ProjectBase\Packman\bootstrap..\config.packman.xml(line 3): Remote named 's3' is listed in attribute 'remotes' to but not defined!

Is there something else I need to do, configure or define?

Thanks.

Importance sampling cuts performance by 50% in Bistro test #4

Hi again, I'm trying to figure out if there's a bug in the importance sampling, or if it's just expensive to compute in dynamic scenes with moving emissive triangles.

When I go to the fourth test scenario in Bistro (makes a bunch of moving boxes), it's significantly slower than other scenes, and the importance sampling checkbox glows red, drawing attention to itself as being the bottleneck for perf. (nice!)

Out of curiosity, I took a couple of GPU captures using Nsight Graphics (since the CPU % doesn't change much, I figure it's all done on the GPU in the shaders; I haven't analyzed all the code yet) to compare the timings with importance sampling on vs. off.

image

As you can see, the main difference is that the raytracing debug marker jumps from 9.26 ms to 27.8 ms when importance sampling is enabled, which makes sense with a bunch of moving glowy boxes.

But I'm confused that the Nsight capture bar for "Raytracing" isn't three times longer, and beyond that, I'm trying to understand why the overall performance is cut in half, and more importantly, what can be done to mitigate that, because I need a bunch of moving lights in my game. Is that where RTXGI comes in?

It looks like the TLAS update is very fast for both the static scene and the dynamic box test scene, and there are no BLAS updates in these captures, so I guess all the meshes are static and remain in memory.

Can anyone explain why importance sampling triples the cost of the ray tracing? The image looks darker without it enabled, but still very nice, so I wonder if simply tone mapping could do if you need the extra performance. Also, if the importance sampling could be split into sub-parts so that individual aspects are toggleable, it might stay bright enough without being so slow (for example, just cosine / hemispherical sampling rather than iterating over all the meshes with emissive components and checking their visibility to the current shading point, which is what I assume is happening here).
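For context on the cost assumption above: a typical emissive importance sampler builds a power-weighted distribution over the lights and draws from it per shading point, then casts a visibility ray toward the chosen light, which is extra work on top of the regular path. This is only a sketch of the general technique, not the sample's actual code; the selection step alone looks roughly like:

```cpp
#include <algorithm>
#include <vector>

// Pick a light index proportionally to its emitted power, using a prefix-sum
// CDF; u is a uniform random number in [0, 1).
int PickLight(const std::vector<float>& power, float u) {
    std::vector<float> cdf(power.size());
    float total = 0.0f;
    for (size_t i = 0; i < power.size(); ++i) {
        total += power[i];
        cdf[i] = total;
    }
    // Binary search for the first CDF entry exceeding u * total.
    auto it = std::upper_bound(cdf.begin(), cdf.end(), u * total);
    int idx = static_cast<int>(it - cdf.begin());
    return std::min(idx, static_cast<int>(power.size()) - 1); // guard fp edge cases
}
```

The selection itself is cheap; the per-sample shadow ray toward the chosen light is usually where the extra GPU time goes, which would match the raytracing marker growing while other passes stay flat.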

The capture doesn't really add up, since toggling it goes from ~60 FPS (on) to 120 FPS (off), but if the ray tracing part alone takes 27.8 ms with importance sampling, it seems like it should run at more like 30 FPS. This is with VSync off on a 60 Hz monitor to avoid back pressure from the swap chain.

Thanks for your time and any insights here.

Initial window position

I'm currently having some issues with a window that is positioned weirdly (see the screenshot below; for privacy reasons I covered the unimportant regions with red, and I also highlighted the border between the screens with a red line).
issue

In particular, when starting the application (via 3b-RunNRDSample.bat or manually) the window is opened centered between my two screens (dual-monitor setup), so one half of the window is on one screen and the other half on the other.
It also seems to automatically enable borderless fullscreen if the resolution is set to the native resolution of the screen, which is generally fine, except that in my case I cannot move the window anymore and it is stuck between my two screens.

Using a resolution below the native resolution leads to a window with a border and title bar which can be moved without any issues.

CompileShaders concurrency issue on 32-core Threadripper

Hi there, I had some issues compiling the shaders, both using the batch file and in SANDBOX.sln. I tried to inspect why some of the shader compiles were timing out by checking their command line strings and running each manually in CMD, and they compiled just fine. Also, the hang was occurring on different shaders in debug and in release, so I figured it was a timing issue or a race condition.

In CompileShaders.cpp, I simply forced the number of hardware threads to 1 from my default of 64 (AMD TR4 2990WX CPU, 32 cores with 64 threads), and voila, it compiles them all successfully without hanging or timeouts.

const size_t threadNum = 1; // std::thread::hardware_concurrency();

I suspect there's something going on inside dxc.exe when operating on the same included files concurrently. Maybe it's just my CPU; I guess not everyone has 32-core CPUs :)
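Rather than hardcoding a single thread, a workaround in the same spirit could make the count overridable, so affected machines can dial concurrency down without editing the source (the environment variable name here is made up for this sketch):

```cpp
#include <algorithm>
#include <cstdlib>
#include <thread>

// Number of concurrent shader-compile jobs: defaults to the hardware thread
// count, but can be overridden via an environment variable as a workaround
// for machines where parallel dxc.exe invocations hang.
size_t GetCompileThreadCount() {
    size_t n = std::thread::hardware_concurrency();
    if (const char* env = std::getenv("NRD_COMPILE_THREADS"))
        n = static_cast<size_t>(std::strtoul(env, nullptr, 10));
    return std::max<size_t>(1, n); // never zero
}
```

The original line would then become `const size_t threadNum = GetCompileThreadCount();`, keeping full parallelism for machines that don't exhibit the hang.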

Compilation problem with HLSL 2021

Hi,

It recently came to my attention that NRD.hlsli contains these two lines:

float2 octWrap = ( 1.0 - abs( v.yx ) ) * ( v.xy >= 0.0 ? 1.0 : -1.0 );

line 233

n.xy += n.xy >= 0.0 ? -t : t;

line 246

It would be nice if the header file added a macro to identify the HLSL version, as these two lines won't compile under HLSL 2021:
https://github.com/microsoft/DirectXShaderCompiler/wiki/HLSL-2021#logical-operation-short-circuiting-for-scalars

So projects set up for HLSL 2021 currently have to adjust the code manually to work around the issue.
Ideally, this could be avoided by asking the user to define something like NRD_HLSL_2021

and do something like this:

#if NRD_HLSL_2021
    float2 octWrap = (1.0 - abs(v.yx)) * select(v.xy >= 0., 1., -1.);
#else
    float2 octWrap = ( 1.0 - abs( v.yx ) ) * ( v.xy >= 0.0 ? 1.0 : -1.0 );
#endif

This way, users of the denoiser would not need to touch the code inside the header to fix the compilation issue.

Thanks
Jiayin

how to fix the ImGui link error?

1>------ Build started: Project: 09_RayTracing_NRD, Configuration: Debug x64 ------
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::Separator(void)" (?Separator@ImGui@@YAXXZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::NewLine(void)" (?NewLine@ImGui@@YAXXZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::Spacing(void)" (?Spacing@ImGui@@YAXXZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::Dummy(struct ImVec2 const &)" (?Dummy@ImGui@@YAXAEBUImVec2@@@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::AlignTextToFramePadding(void)" (?AlignTextToFramePadding@ImGui@@YAXXZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextUnformatted(char const *,char const *)" (?TextUnformatted@ImGui@@YAXPEBD0@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::Text(char const *,...)" (?Text@ImGui@@YAXPEBDZZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextV(char const *,char *)" (?TextV@ImGui@@YAXPEBDPEAD@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextColored(struct ImVec4 const &,char const *,...)" (?TextColored@ImGui@@YAXAEBUImVec4@@PEBDZZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextColoredV(struct ImVec4 const &,char const *,char *)" (?TextColoredV@ImGui@@YAXAEBUImVec4@@PEBDPEAD@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextDisabled(char const *,...)" (?TextDisabled@ImGui@@YAXPEBDZZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextDisabledV(char const *,char *)" (?TextDisabledV@ImGui@@YAXPEBDPEAD@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextWrapped(char const *,...)" (?TextWrapped@ImGui@@YAXPEBDZZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::TextWrappedV(char const *,char *)" (?TextWrappedV@ImGui@@YAXPEBDPEAD@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::LabelText(char const *,char const *,...)" (?LabelText@ImGui@@YAXPEBD0ZZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::LabelTextV(char const *,char const *,char *)" (?LabelTextV@ImGui@@YAXPEBD0PEAD@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::BulletText(char const *,...)" (?BulletText@ImGui@@YAXPEBDZZ) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "void __cdecl ImGui::BulletTextV(char const *,char *)" (?BulletTextV@ImGui@@YAXPEBDPEAD@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "bool __cdecl ImGui::Button(char const *,struct ImVec2 const &)" (?Button@ImGui@@YA_NPEBDAEBUImVec2@@@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "bool __cdecl ImGui::SmallButton(char const *)" (?SmallButton@ImGui@@YA_NPEBD@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)
1>ImGui.lib(imgui_widgets.obj) : error LNK2005: "bool __cdecl ImGui::ArrowButton(char const *,int)" (?ArrowButton@ImGui@@YA_NPEBDH@Z) already defined in imguid.lib(imgui_widgets.cpp.obj)

NRI

Hi there.
Do I have to use NRI in order to use NRD?

VS 2019 build problem

Hi, thanks for the help fixing the build for VS 2017 today, but there's a compile error related to vector intrinsics when I call 1-Deploy.bat vs2019, or open the solution manually and build in VS 2019.

Also, 2-Build.bat is hardcoded to point to the vs2017 solution path instead of %1, but that's minor and doesn't matter as much (just wanted to let you know, for completeness).

Blue output in Vulkan mode (NRD Bistro scene)

Hello again, I ran into another issue. It looks like the blue and red channels are swapped in Vulkan mode.


Here's the repro command line:

_Build\vs2017\Bin\Release\09_RayTracing_NRD.exe --api=VULKAN --width=1280 --height=720 --testMode --scene=Bistro/BistroInterior.fbx

I just updated my drivers to the 457.09 release; before that I was on the 457.00 beta drivers (for KHR ray tracing support), which also showed blue but had some glitchy artifacts at the top.

I'm using Visual Studio 2019, latest Windows 10 SDK, and Vulkan SDK 1.2.154.1. I also tried two different versions of the dxc.exe, the one that's included in Windows 10 kits, and another version that's needed for Vulkan 1.2 support. Both give me blue, but otherwise work fine.

I just checked the 3a--.bat test suite, and Vulkan looks identical to DX12 in all the tests except the NRD one. The shaderballs scene uses the OBJ file format (instead of FBX) and also has swapped R<->B colour channels, so we know the issue isn't in the loader but in the NRD code. I'll look around to see if it's something obvious (it probably is).

Great job by the way, the bistro scene is stunning. What amazing tech.

Intrinsics like _mm_sincos_ps undeclared

Hi,

Using the Visual Studio 2019 Clang integration, I cannot seem to build the NRD library.
I keep getting these errors:

error : use of undeclared identifier '_mm_sincos_ps'; did you mean '_mm_min_ps'? 
error : use of undeclared identifier '_mm_tan_ps' 
error : use of undeclared identifier '_mm_tan_ps' 
error : use of undeclared identifier '_mm_asin_ps' 
error : use of undeclared identifier '_mm_asin_ps' 
 ...

Is there any extra architecture flag I should set via CMake?

Small problems in CMakeLists.txt and Wrapper.cpp

I had these small problems with CMakeLists.txt on Win32 and Linux. Maybe there are better solutions (I don't know enough CMake):

  • CMakeLists.txt line 4: set(NRD_DXC_CUSTOM_PATH "custom/path/to/dxc")
    I had to comment out ("remove") this line to be able to specify the dxc path via cmake -D.

  • CMakeLists.txt lines 48+49:
    set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /MT")
    set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /MTd")
    I had to remove "/MT" and "/MTd" to build a static lib that can be used in a "/MD" project.

And this prevents compilation on Linux:

  • Wrapper.cpp: line 13 contains backslashes in #include "..\Resources\Version.h".
    Changing them to forward slashes fixes the problem.

the value of m_JitterDelta

Hi

A question came up while integrating NRD v3.7.0:

The value of m_JitterDelta:
As far as I can see, DenoiserImpl.cpp line 654 does:

    float dx = ml::Abs(m_CommonSettings.cameraJitter[0] - m_JitterPrev.x);
    float dy = ml::Abs(m_CommonSettings.cameraJitter[1] - m_JitterPrev.y);
    m_JitterDelta = ml::Max(dx, dy);

but DenoiserImpl.cpp line 506 does:

    m_JitterPrev = ml::float2(m_CommonSettings.cameraJitter[0], m_CommonSettings.cameraJitter[1]);

It looks like m_JitterDelta will always be 0, since dx and dy are always 0? Or maybe I missed something?

Thanks

SSE instruction throws illegal instruction.

Are there any CPU requirements? My CPUs are two E5-2665s, which should support SSE instructions.
For any sample scene, the program always crashes at these SSE instructions:
(screenshot of the crash site)

Question on Hit Distance Reconstruction: Why use border of 1 instead of 2?

Hi,

With the new release I was very interested in how you attacked the hit distance reconstruction problem.
As I understand it, the sample code uses a 4x4 Bayer sequence with a neighbor search in the denoiser (RELAX, for example) to find a good estimate of the specular hit distance in case the sample shot a "diffuse ray".
Because the search is done with a border of 1, at least 1 in 4 (0.25) samples should be specular.
However, if a border of 2 is used, this threshold can come down to 1 in 16 samples (0.0625).
Some test code: https://godbolt.org/z/YGYo1rjnM

What is the reasoning behind choosing a border of 1 instead of 2? I am curious and want to learn 😄
Maybe allow an option? Providing more info in the readme could also be useful for others.

Thank you in advance.
