
VirtualTexture Study Notes 2024-02-20 18:26:49

Preface

Related concepts

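  • Virtual Texture: virtual texture, abbreviated below as VT.
  • Runtime Virtual Texture: the UE4 runtime virtual texture system, abbreviated below as RVT.
  • VT feedback: stores the VT page information for the pixels currently on screen; it is used to decide which VT data to load.
  • VT Physical Texture: the physical texture resource that backs a virtual texture.
  • PageTable: the virtual texture page table, used to address page data in the VT Physical Texture.
  • PageTable Texture: a texture resource containing the page table data; Physical Texture page information can be looked up through it. Some VT systems call this an Indirection Texture. Since these notes analyze UE4's VT, UE4 terminology is used here.
  • PageTable Buffer: a GPU buffer resource containing the page table data.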
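
To make the relationship between the PageTable and the VT Physical Texture concrete, here is a minimal CPU-side sketch of a virtual-to-physical lookup. All names (PageEntry, PageTable, VirtualToPhysical) are hypothetical and are not UE4 types; the layout is simplified to a single mip level with no page borders, which is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; UE4 packs its page table differently.
struct PageEntry {
    bool     resident = false; // is the page currently in the physical texture?
    uint32_t physX = 0;        // page column inside the physical texture
    uint32_t physY = 0;        // page row inside the physical texture
};

struct PageTable {
    uint32_t pagesX, pagesY;        // virtual texture size, in pages
    uint32_t pageSize;              // page (tile) size, in texels
    std::vector<PageEntry> entries; // pagesX * pagesY entries, row-major

    PageTable(uint32_t px, uint32_t py, uint32_t size)
        : pagesX(px), pagesY(py), pageSize(size), entries(px * py) {}

    PageEntry& At(uint32_t x, uint32_t y) {
        assert(x < pagesX && y < pagesY);
        return entries[y * pagesX + x];
    }
};

struct Texel { uint32_t x, y; };

// Translate a virtual texel coordinate into a physical texel coordinate.
// Returns false when the page is not resident; a real system would then
// fall back to a coarser mip and request the page through VT feedback.
bool VirtualToPhysical(PageTable& table, Texel virt, Texel& outPhys) {
    const uint32_t pageX = virt.x / table.pageSize;
    const uint32_t pageY = virt.y / table.pageSize;
    const PageEntry& e = table.At(pageX, pageY);
    if (!e.resident) return false;
    outPhys.x = e.physX * table.pageSize + virt.x % table.pageSize;
    outPhys.y = e.physY * table.pageSize + virt.y % table.pageSize;
    return true;
}
```

The page table only stores where a page lives; the texel offset inside the page is carried over unchanged.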

Related classes:

  • URuntimeVirtualTexture : UObject
    • FRuntimeVirtualTextureRenderResource
  • UVirtualTexture : UObject
  • UVirtualTexture2D : UTexture2D

UE5 VirtualHeightfieldMesh overview

https://zhuanlan.zhihu.com/p/575398476

Possibly related classes

  • VirtualHeightfieldMesh
    • UVirtualHeightfieldMeshComponent
      • UHeightfieldMinMaxTexture
        • BuildTexture()
    • FVirtualHeightfieldMeshSceneProxy
    • FVirtualHeightfieldMeshRendererExtension
      • AddWork()
      • SubmitWork()
    • FVirtualTextureFeedbackBuffer (see the section "Pass 1 addendum: VirtualTextureFeedback")
  • UNiagaraDataInterfaceLandscape
  • UNiagaraDataInterfaceVirtualTexture(NiagaraDataInterfaceVirtualTextureTemplate.ush)
    • GetAttributesValid()
    • SampleRVTLayer()
    • SampleRVT()
  • URuntimeVirtualTextureComponent

VirtualHeightfieldMesh

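First, the MinMaxTexture. Its full name is UHeightfieldMinMaxTexture (MinMaxTexture below), and it is one of the most important parts of the whole VirtualHeightfieldMesh (VHM). It is generated offline and serves the following purposes: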

  1. Culling instances: occlusion culling queries and frustum culling.
  2. Determining the LOD of the VHM.
  3. Smoothing the vertex positions of the VHM.

The key member variables are:

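  • TObjectPtr<UTexture2D> Texture: BGRA8 format; its size equals the number of RVT tiles; it has a full mip chain. Each pixel stores the minimum and maximum value within one RVT tile, 16 bits each, encoded across the four RGBA channels.
  • TObjectPtr<UTexture2D> LodBiasTexture: G8 format; its size equals the number of RVT tiles; no mipmaps. Each pixel stores the result of a 3x3 blur around the corresponding pixel of Texture.
  • TObjectPtr<UTexture2D> LodBiasMinMaxTexture: BGRA8 format; its size equals the number of RVT tiles; it has a full mip chain. Similar to an HZB, each pixel stores the minimum and maximum of LodBiasTexture, 8 bits each, in the R and G channels.
  • int32 MaxCPULevels: how many levels of data need to be stored on the CPU side.
  • TArray<FVector2D> TextureData: the CPU-side copy of Texture's data, MaxCPULevels mip levels in total.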
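
As an illustration of storing a 16-bit minimum and a 16-bit maximum in one four-channel 8-bit texel, here is a small sketch. The channel assignment (R/G for the minimum, B/A for the maximum, high byte first) is an assumption made for this example; the notes do not state which channel holds which byte, and Rgba8, EncodeMinMax and DecodeMinMax are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// One four-channel texel, 8 bits per channel.
struct Rgba8 { uint8_t r, g, b, a; };

// Assumed layout (not confirmed by the notes):
// R/G = high/low byte of the minimum, B/A = high/low byte of the maximum.
Rgba8 EncodeMinMax(uint16_t minHeight, uint16_t maxHeight) {
    assert(minHeight <= maxHeight);
    return Rgba8{
        static_cast<uint8_t>(minHeight >> 8), static_cast<uint8_t>(minHeight & 0xFF),
        static_cast<uint8_t>(maxHeight >> 8), static_cast<uint8_t>(maxHeight & 0xFF)};
}

// Inverse of EncodeMinMax: rebuild the two 16-bit values from the four bytes.
void DecodeMinMax(Rgba8 texel, uint16_t& minHeight, uint16_t& maxHeight) {
    minHeight = static_cast<uint16_t>((texel.r << 8) | texel.g);
    maxHeight = static_cast<uint16_t>((texel.b << 8) | texel.a);
}
```

Splitting each value over two channels keeps the full 16-bit height precision even though the texture format itself only offers 8 bits per channel.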

Obtaining TextureData

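So the key to generating the MinMaxTexture is obtaining TextureData. The entry point is VirtualHeightfieldMesh::BuildMinMaxTexture in HeightfieldMinMaxTextureBuilder.cpp. Because Texture stores the minimum and maximum of each RVT tile, the process breaks down roughly into the following steps: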

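  1. Iterate over every RVT tile, render it to an intermediate texture, compute the minimum and maximum of that intermediate texture, and store them at the corresponding position in the target texture.
  2. Generate mipmaps for the target texture.
  3. Read the target texture back to the CPU to obtain TextureData.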

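Rendering a tile to the intermediate texture uses the built-in RuntimeVirtualTexture::RenderPagesStandAlone function, and the minimum and maximum are computed by downsampling. [Figure: computing the min/max of Tile0 for an RVT with 2x2 tiles and a tile size of 4.]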

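The downsampling compute shader is TMinMaxTextureCS. Once the minimum and maximum of every tile have been computed, the same downsampling is used to generate the full mip chain of the target texture.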
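
The min/max downsampling can be sketched on the CPU as repeated 2x2 reductions. This is only an illustration of the idea, not the TMinMaxTextureCS shader; MinMax, Downsample and ReduceTile are hypothetical names, and the tile size is assumed to be a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct MinMax { float min, max; };

// One 2x2 reduction step: a size x size image becomes (size/2) x (size/2).
std::vector<MinMax> Downsample(const std::vector<MinMax>& src, std::size_t size) {
    assert(size >= 2 && size % 2 == 0 && src.size() == size * size);
    const std::size_t half = size / 2;
    std::vector<MinMax> dst(half * half);
    for (std::size_t y = 0; y < half; ++y) {
        for (std::size_t x = 0; x < half; ++x) {
            MinMax r = src[(2 * y) * size + 2 * x];
            for (std::size_t dy = 0; dy < 2; ++dy) {
                for (std::size_t dx = 0; dx < 2; ++dx) {
                    const MinMax& s = src[(2 * y + dy) * size + (2 * x + dx)];
                    r.min = std::min(r.min, s.min);
                    r.max = std::max(r.max, s.max);
                }
            }
            dst[y * half + x] = r;
        }
    }
    return dst;
}

// Reduce a power-of-two tile of heights to a single min/max pair.
MinMax ReduceTile(const std::vector<float>& heights, std::size_t tileSize) {
    assert(tileSize >= 1 && heights.size() == tileSize * tileSize);
    std::vector<MinMax> level(heights.size());
    for (std::size_t i = 0; i < heights.size(); ++i) {
        level[i] = MinMax{heights[i], heights[i]};
    }
    for (std::size_t size = tileSize; size > 1; size /= 2) {
        level = Downsample(level, size);
    }
    return level[0];
}
```

The same Downsample step, applied to the grid of per-tile results instead of to the texels of one tile, yields the mip chain of the target texture.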

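Finally, to read the texture back to the CPU, each mip of the target texture is first copied with CopyTexture into its own texture created with the CPUReadback flag, and then copied into CPU memory via MapStagingSurface/UnmapStagingSurface. This is a fairly routine operation, so it is not covered in detail here.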

This yields the CPU-side TextureData with all mip levels. It is then passed to UHeightfieldMinMaxTexture::BuildTexture to generate the rest: Texture, LodBiasTexture, and LodBiasMinMaxTexture.

FVirtualHeightfieldMeshSceneProxy

That completes the offline-generated MinMaxTexture; everything that follows is about real-time rendering. All of it revolves around the VHM's SceneProxy, FVirtualHeightfieldMeshSceneProxy.

Occlusion culling

For how hardware occlusion culling is used, see the official DX12 sample [8].

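First, occlusion culling: the VHM performs per-tile occlusion culling with LODs. The VHM's SceneProxy overrides GetOcclusionQueries, whose implementation simply returns OcclusionVolumes. OcclusionVolumes is built in BuildOcclusionVolumes; the basic idea is to take the CPU-side TextureData from the MinMaxTexture, read each tile's minimum and maximum height, and create that tile's bounds from them.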

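OcclusionVolumes therefore has LODs. In the actual code, LodIndex does not necessarily start from 0, because the component has a member NumOcclusionLods that specifies for how many mip levels OcclusionVolumes are created. Note that NumOcclusionLods defaults to 0, which means occlusion culling for the VHM is disabled by default.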
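
The idea behind building per-tile bounds from min/max heights can be sketched as follows. This is not the BuildOcclusionVolumes implementation; Box, MinMax and BuildLodVolumes are hypothetical names, and the boxes are expressed in normalized UV space with the height on Z, which is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MinMax { float min, max; };
struct Box { float minX, minY, minZ, maxX, maxY, maxZ; };

// Build one bounding box per tile for a single LOD whose grid is
// gridSize x gridSize. `mip` holds that LOD's min/max heights, row-major.
std::vector<Box> BuildLodVolumes(const std::vector<MinMax>& mip, std::size_t gridSize) {
    assert(gridSize > 0 && mip.size() == gridSize * gridSize);
    std::vector<Box> boxes;
    boxes.reserve(mip.size());
    const float inv = 1.0f / static_cast<float>(gridSize);
    for (std::size_t y = 0; y < gridSize; ++y) {
        for (std::size_t x = 0; x < gridSize; ++x) {
            const MinMax& h = mip[y * gridSize + x];
            // The tile's footprint spans one grid cell; the height range closes the box.
            boxes.push_back(Box{x * inv, y * inv, h.min,
                                (x + 1) * inv, (y + 1) * inv, h.max});
        }
    }
    return boxes;
}
```

Calling this once per mip level of TextureData gives one set of volumes per LOD, with coarser LODs producing fewer, larger boxes.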

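Because the VHM needs to build the IndirectArgs for instance drawing dynamically in a compute pass, the SceneProxy also overrides AcceptOcclusionResults to obtain the occlusion culling results. Specifically, it stores the results returned by UE in the OcclusionTexture so that later passes can access them as an SRV: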

void FVirtualHeightfieldMeshSceneProxy::AcceptOcclusionResults(FSceneView const* View, TArray<bool>* Results, int32 ResultsStart, int32 NumResults)
{
    // The IndirectArgs are built elsewhere than in the SceneProxy, so a global variable holds the occlusion culling results
    FOcclusionResults& OcclusionResults = GOcclusionResults.Emplace(FOcclusionResultsKey(this, View));
    OcclusionResults.TextureSize = OcclusionGridSize;
    OcclusionResults.NumTextureMips = NumOcclusionLods;
        
    // Create the texture and copy the occlusion culling results into it
    FRHIResourceCreateInfo CreateInfo(TEXT("VirtualHeightfieldMesh.OcclusionTexture"));
    OcclusionResults.OcclusionTexture = RHICreateTexture2D(OcclusionGridSize.X, OcclusionGridSize.Y, PF_G8, NumOcclusionLods, 1, TexCreate_ShaderResource, CreateInfo);
    bool const* Src = Results->GetData() + ResultsStart;
    FIntPoint Size = OcclusionGridSize;
    for (int32 MipIndex = 0; MipIndex < NumOcclusionLods; ++MipIndex)
    {
        uint32 Stride;
        uint8* Dst = (uint8*)RHILockTexture2D(OcclusionResults.OcclusionTexture, MipIndex, RLM_WriteOnly, Stride, false);
        for (int Y = 0; Y < Size.Y; ++Y)
        {
            for (int X = 0; X < Size.X; ++X)
            {
                Dst[Y * Stride + X] = *(Src++) ? 255 : 0;
            }
        }
        RHIUnlockTexture2D(OcclusionResults.OcclusionTexture, MipIndex, false);

        Size.X = FMath::Max(Size.X / 2, 1);
        Size.Y = FMath::Max(Size.Y / 2, 1);
    }       
}
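
The loop above consumes the flat Results array mip after mip, halving the grid size each time with a lower bound of 1. A small sketch of where each mip starts in that flat array (MipStartOffset is a hypothetical helper, not part of the engine):

```cpp
#include <algorithm>
#include <cassert>

// Index of the first result belonging to mip `mipIndex`, given the mip 0
// grid size. Mirrors the size-halving logic of the copy loop above.
int MipStartOffset(int gridX, int gridY, int mipIndex) {
    assert(gridX > 0 && gridY > 0 && mipIndex >= 0);
    int offset = 0;
    for (int m = 0; m < mipIndex; ++m) {
        offset += gridX * gridY;        // skip all results of mip m
        gridX = std::max(gridX / 2, 1); // next mip is half the size,
        gridY = std::max(gridY / 2, 1); // but never smaller than 1
    }
    return offset;
}
```

For an 8x4 grid the mips hold 32, 8, 2 and 1 results, so they start at offsets 0, 32, 40 and 42.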

Overall approach

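This is where the actual construction of the VHM's mesh data begins. To make the code details below easier to follow, here is the overall approach. Suppose we have a work queue, QueueBuffer. Each step takes one item (more precisely, one quad) out of QueueBuffer and decides whether that quad needs to be refined; if it does, the quad is split into 4 quads, which are pushed back into QueueBuffer.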

This take, process, put-back cycle repeats until QueueBuffer is empty. [Figure: schematic of the queue-based quad subdivision.]
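
A single-threaded CPU sketch of this queue-driven refinement is shown below. The real work happens in a compute shader with GPU buffers; here Quad, CollectQuads and the shouldSubdivide callback are hypothetical stand-ins, and level 0 is assumed to be the finest level.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

struct Quad { uint32_t x, y, level; }; // level 0 is the finest level

// Take a quad from the queue, either split it into 4 children that go back
// into the queue, or emit it as a leaf. Repeat until the queue is empty.
std::vector<Quad> CollectQuads(uint32_t rootLevel,
                               const std::function<bool(const Quad&)>& shouldSubdivide) {
    std::queue<Quad> work;
    std::vector<Quad> leaves;
    work.push(Quad{0, 0, rootLevel});
    while (!work.empty()) {
        const Quad q = work.front();
        work.pop();
        if (q.level > 0 && shouldSubdivide(q)) {
            for (uint32_t dy = 0; dy < 2; ++dy) {
                for (uint32_t dx = 0; dx < 2; ++dx) {
                    work.push(Quad{q.x * 2 + dx, q.y * 2 + dy, q.level - 1});
                }
            }
        } else {
            leaves.push_back(q); // cannot or need not be refined further
        }
    }
    return leaves;
}
```

For example, refining only the quad at the origin from a root at level 2 yields three leaves at level 1 and four at level 0.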

RVT-related code: Pass 1, CollectQuad

If a quad cannot be subdivided, an instance is added and its data is written to RWQuadBuffer. RWQuadBuffer is used by the later CullInstance pass to actually build the IndirectArgs:

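// The quad cannot be subdivided any further.
// Look up the physical address, used later to sample the RVT.
uint PhysicalAddress = PageTableTexture.Load(int3(Pos, Level));

// Atomically reserve a slot; Write receives the counter's previous value.
uint Write;
InterlockedAdd(RWQuadInfo.Write, 1, Write);
RWQuadBuffer[Write] = Pack(InitQuadRenderItem(Pos, Level, PhysicalAddress, bCull | bOcclude));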

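RWQuadInfo is a variable name I made up; it does not exist in the real code. The actual name is RWIndirectArgsBuffer, but it is not the IndirectArgs used for drawing that was mentioned earlier, so I renamed it here to avoid confusion.
This also suggests that the VHM may once have been intended to draw with an array of IndirectArgs, as in the DX sample, where an IndirectArgs entry is generated for each item that passes the test and appended to an array. In the end it was changed to a single IndirectArgs with instanced drawing.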

PS. PageTableTexture is of type RHITexture. The related shader code is in VirtualHeightfieldMesh.usf.

Pass 1 addendum: VirtualTextureFeedback

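Once a quad is no longer subdivided, the RVT will be sampled at that level, so the corresponding feedback information has to be written so that Unreal can load the matching page. [Figure: shader code that writes the feedback.]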

On the C++ side, this RWFeedbackBuffer is then handed to Unreal's SubmitVirtualTextureFeedbackBuffer function.

Relevant code snippet

FVertexFactoryIntermediates GetVertexFactoryIntermediates(FVertexFactoryInput Input)
{
...

// Sample height from virtual texture.  
VTUniform Uniform = VTUniform_Unpack(VHM.VTPackedUniform);  
Uniform.vPageBorderSize -= .5f * VHM.PhysicalTextureSize.y; // Half texel offset is used in VT write and in sampling because we want texel locations to match landscape vertices.  
VTPageTableUniform PageTableUniform = VTPageTableUniform_Unpack(VHM.VTPackedPageTableUniform0, VHM.VTPackedPageTableUniform1);  
VTPageTableResult VTResult0 = TextureLoadVirtualPageTableLevel(VHM.PageTableTexture, PageTableUniform, NormalizedPos, VTADDRESSMODE_CLAMP, VTADDRESSMODE_CLAMP, floor(SampleLevel));  
float2 UV0 = VTComputePhysicalUVs(VTResult0, 0, Uniform);  
float Height0 = VHM.HeightTexture.SampleLevel(VHM.HeightSampler, UV0, 0);  
VTPageTableResult VTResult1 = TextureLoadVirtualPageTableLevel(VHM.PageTableTexture, PageTableUniform, NormalizedPos, VTADDRESSMODE_CLAMP, VTADDRESSMODE_CLAMP, ceil(SampleLevel));  
float2 UV1 = VTComputePhysicalUVs(VTResult1, 0, Uniform);  
float Height1 = VHM.HeightTexture.SampleLevel(VHM.HeightSampler, UV1, 0);  
float Height = lerp(Height0.x, Height1.x, frac(SampleLevel));

...
}
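
The last lines of the snippet blend two height samples taken at the integer mip levels on either side of SampleLevel. A CPU sketch of just that blending step, with the actual texture lookup replaced by a hypothetical sampleAtLevel callback:

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Blend the heights sampled at the two integer mip levels surrounding
// sampleLevel. `sampleAtLevel` stands in for the page table lookup plus
// physical texture sample performed in the shader.
float SampleHeightBlended(float sampleLevel,
                          const std::function<float(int)>& sampleAtLevel) {
    assert(sampleLevel >= 0.0f);
    const float lo = std::floor(sampleLevel);
    const float hi = std::ceil(sampleLevel);
    const float t = sampleLevel - lo; // equivalent of frac(SampleLevel)
    const float h0 = sampleAtLevel(static_cast<int>(lo));
    const float h1 = sampleAtLevel(static_cast<int>(hi));
    return h0 + (h1 - h0) * t;        // equivalent of lerp(h0, h1, t)
}
```

When SampleLevel is an integer, both lookups hit the same level and the blend weight is zero, so the result is exactly that level's height.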

Render-thread logic for setting up the VT

void FVirtualHeightfieldMeshSceneProxy::CreateRenderThreadResources()
{
	if (RuntimeVirtualTexture != nullptr)
	{
		if (!bCallbackRegistered)
		{
			GetRendererModule().AddVirtualTextureProducerDestroyedCallback(RuntimeVirtualTexture->GetProducerHandle(), &OnVirtualTextureDestroyedCB, this);
			bCallbackRegistered = true;
		}

		//URuntimeVirtualTexture* RuntimeVirtualTexture;
		if (RuntimeVirtualTexture->GetMaterialType() == ERuntimeVirtualTextureMaterialType::WorldHeight)
		{
			AllocatedVirtualTexture = RuntimeVirtualTexture->GetAllocatedVirtualTexture();
			NumQuadsPerTileSide = RuntimeVirtualTexture->GetTileSize();

			if (AllocatedVirtualTexture != nullptr)
			{
				// Gather vertex factory uniform parameters.
				FVirtualHeightfieldMeshVertexFactoryParameters UniformParams;
				UniformParams.PageTableTexture = AllocatedVirtualTexture->GetPageTableTexture(0);
				UniformParams.HeightTexture = AllocatedVirtualTexture->GetPhysicalTextureSRV(0, false);
				UniformParams.HeightSampler = TStaticSamplerState<SF_Bilinear>::GetRHI();
				UniformParams.LodBiasTexture = LodBiasTexture ? LodBiasTexture->GetResource()->TextureRHI : GBlackTexture->TextureRHI;
				UniformParams.LodBiasSampler = TStaticSamplerState<SF_Point>::GetRHI();
				UniformParams.NumQuadsPerTileSide = NumQuadsPerTileSide;

				FUintVector4 PackedUniform;
				AllocatedVirtualTexture->GetPackedUniform(&PackedUniform, 0);
				UniformParams.VTPackedUniform = PackedUniform;
				FUintVector4 PackedPageTableUniform[2];
				AllocatedVirtualTexture->GetPackedPageTableUniform(PackedPageTableUniform);
				UniformParams.VTPackedPageTableUniform0 = PackedPageTableUniform[0];
				UniformParams.VTPackedPageTableUniform1 = PackedPageTableUniform[1];

				const float PageTableSizeX = AllocatedVirtualTexture->GetWidthInTiles();
				const float PageTableSizeY = AllocatedVirtualTexture->GetHeightInTiles();
				UniformParams.PageTableSize = FVector4f(PageTableSizeX, PageTableSizeY, 1.f / PageTableSizeX, 1.f / PageTableSizeY);

				const float PhysicalTextureSize = AllocatedVirtualTexture->GetPhysicalTextureSize(0);
				UniformParams.PhysicalTextureSize = FVector2f(PhysicalTextureSize, 1.f / PhysicalTextureSize);

				UniformParams.VirtualHeightfieldToLocal = FMatrix44f(UVToLocal);
				UniformParams.VirtualHeightfieldToWorld = FMatrix44f(UVToWorld);		// LWC_TODO: Precision loss

				UniformParams.MaxLod = AllocatedVirtualTexture->GetMaxLevel();
				UniformParams.LodBiasScale = LodBiasScale;

				// Create vertex factory.
				VertexFactory = new FVirtualHeightfieldMeshVertexFactory(GetScene().GetFeatureLevel(), UniformParams);
				VertexFactory->InitResource(FRHICommandListImmediate::Get());
			}
		}
	}
}

RVT generation

Summary of RVT operations

Creation on the CPU side


Passing to the GPU as uniform parameters

AllocatedVirtualTexture = RuntimeVirtualTexture->GetAllocatedVirtualTexture();

//PageTableTexture, Texture & Sampler
FVirtualHeightfieldMeshVertexFactoryParameters UniformParams;  
UniformParams.PageTableTexture = AllocatedVirtualTexture->GetPageTableTexture(0);  
UniformParams.HeightTexture = AllocatedVirtualTexture->GetPhysicalTextureSRV(0, false);  
UniformParams.HeightSampler = TStaticSamplerState<SF_Bilinear>::GetRHI();

//VTPackedUniform&VTPackedPageTableUniform
FUintVector4 PackedUniform;  
AllocatedVirtualTexture->GetPackedUniform(&PackedUniform, 0);  
UniformParams.VTPackedUniform = PackedUniform;  
FUintVector4 PackedPageTableUniform[2];  
AllocatedVirtualTexture->GetPackedPageTableUniform(PackedPageTableUniform);  
UniformParams.VTPackedPageTableUniform0 = PackedPageTableUniform[0];  
UniformParams.VTPackedPageTableUniform1 = PackedPageTableUniform[1];  

//PageTableSize
const float PageTableSizeX = AllocatedVirtualTexture->GetWidthInTiles();  
const float PageTableSizeY = AllocatedVirtualTexture->GetHeightInTiles();  
UniformParams.PageTableSize = FVector4f(PageTableSizeX, PageTableSizeY, 1.f / PageTableSizeX, 1.f / PageTableSizeY);  

//PhysicalTextureSize
const float PhysicalTextureSize = AllocatedVirtualTexture->GetPhysicalTextureSize(0);  
UniformParams.PhysicalTextureSize = FVector2f(PhysicalTextureSize, 1.f / PhysicalTextureSize);  

//Local <=> World Matrix
UniformParams.VirtualHeightfieldToLocal = FMatrix44f(UVToLocal);  
UniformParams.VirtualHeightfieldToWorld = FMatrix44f(UVToWorld);       // LWC_TODO: Precision loss  

//MaxLod
UniformParams.MaxLod = AllocatedVirtualTexture->GetMaxLevel();  

Sampling on the GPU side

VTUniform Uniform = VTUniform_Unpack(VHM.VTPackedUniform);  
Uniform.vPageBorderSize -= .5f * VHM.PhysicalTextureSize.y; // Half texel offset is used in VT write and in sampling because we want texel locations to match landscape vertices.  
VTPageTableUniform PageTableUniform = VTPageTableUniform_Unpack(VHM.VTPackedPageTableUniform0, VHM.VTPackedPageTableUniform1);  
VTPageTableResult VTResult0 = TextureLoadVirtualPageTableLevel(VHM.PageTableTexture, PageTableUniform, NormalizedPos, VTADDRESSMODE_CLAMP, VTADDRESSMODE_CLAMP, floor(SampleLevel));  
float2 UV0 = VTComputePhysicalUVs(VTResult0, 0, Uniform);  
float Height0 = VHM.HeightTexture.SampleLevel(VHM.HeightSampler, UV0, 0);  
VTPageTableResult VTResult1 = TextureLoadVirtualPageTableLevel(VHM.PageTableTexture, PageTableUniform, NormalizedPos, VTADDRESSMODE_CLAMP, VTADDRESSMODE_CLAMP, ceil(SampleLevel));  
float2 UV1 = VTComputePhysicalUVs(VTResult1, 0, Uniform);  
float Height1 = VHM.HeightTexture.SampleLevel(VHM.HeightSampler, UV1, 0);  
float Height = lerp(Height0.x, Height1.x, frac(SampleLevel));

Code from NiagaraDataInterfaceVirtualTextureTemplate.ush:

// Other VT helper functions are in VirtualTextureCommon.ush

float4 SampleRVTLayer_{ParameterName}(float2 SampleUV, Texture2D InTexture, Texture2D<uint4> InPageTable, uint4 InTextureUniforms)
{
	VTPageTableResult PageTable = TextureLoadVirtualPageTableLevel(InPageTable, VTPageTableUniform_Unpack({ParameterName}_PageTableUniforms[0], {ParameterName}_PageTableUniforms[1]), SampleUV, VTADDRESSMODE_CLAMP, VTADDRESSMODE_CLAMP, 0.0f);
	return TextureVirtualSample(InTexture, {ParameterName}_SharedSampler, PageTable, 0, VTUniform_Unpack(InTextureUniforms));
}

void SampleRVT_{ParameterName}(in float3 WorldPosition, out bool bInsideVolume, out float3 BaseColor, out float Specular, out float Roughness, out float3 Normal, out float WorldHeight, out float Mask)
{
	bInsideVolume = false;
	BaseColor = float3(0.0f, 0.0f, 0.0f);
	Specular = 0.5f;
	Roughness = 0.5f;
	Normal = float3(0.0f, 0.0f, 1.0f);
	WorldHeight = 0.0f;
	Mask = 1.0f;

	// Get Sample Location
	FLWCVector3 LWCWorldPosition = MakeLWCVector3({ParameterName}_SystemLWCTile, WorldPosition);
	FLWCVector3 LWCUVOrigin = MakeLWCVector3({ParameterName}_SystemLWCTile, {ParameterName}_UVUniforms[0].xyz);

	float2 SampleUV = VirtualTextureWorldToUV(LWCWorldPosition, LWCUVOrigin, {ParameterName}_UVUniforms[1].xyz, {ParameterName}_UVUniforms[2].xyz);

	// Test to see if we are inside the volume, but still take the samples as it will clamp to the edge
	bInsideVolume = all(SampleUV >- 0.0f) && all(SampleUV < 1.0f);

	// Sample Textures
	float4 LayerSample[3];
	LayerSample[0] = ({ParameterName}_ValidLayersMask & 0x1) != 0 ? SampleRVTLayer_{ParameterName}(SampleUV, {ParameterName}_VirtualTexture0, {ParameterName}_VirtualTexture0PageTable, {ParameterName}_VirtualTexture0TextureUniforms) : 0;
	LayerSample[1] = ({ParameterName}_ValidLayersMask & 0x2) != 0 ? SampleRVTLayer_{ParameterName}(SampleUV, {ParameterName}_VirtualTexture1, {ParameterName}_VirtualTexture1PageTable, {ParameterName}_VirtualTexture1TextureUniforms) : 0;
	LayerSample[2] = ({ParameterName}_ValidLayersMask & 0x4) != 0 ? SampleRVTLayer_{ParameterName}(SampleUV, {ParameterName}_VirtualTexture2, {ParameterName}_VirtualTexture2PageTable, {ParameterName}_VirtualTexture2TextureUniforms) : 0;

	// Sample Available Attributes
	switch ( {ParameterName}_MaterialType )
	{
		case ERuntimeVirtualTextureMaterialType_BaseColor:
		{
			BaseColor = LayerSample[0].xyz;
			break;
		}

		case ERuntimeVirtualTextureMaterialType_BaseColor_Normal_Roughness:
		{
			BaseColor = VirtualTextureUnpackBaseColorSRGB(LayerSample[0]);
			Roughness = LayerSample[1].y;
			Normal = VirtualTextureUnpackNormalBGR565(LayerSample[1]);
			break;
		}

		case ERuntimeVirtualTextureMaterialType_BaseColor_Normal_DEPRECATED:
		case ERuntimeVirtualTextureMaterialType_BaseColor_Normal_Specular:
		{
			BaseColor = LayerSample[0].xyz;
			Specular = LayerSample[1].x;
			Roughness = LayerSample[1].y;
			Normal = VirtualTextureUnpackNormalBC3BC3(LayerSample[0], LayerSample[1]);
			break;
		}
		
		case ERuntimeVirtualTextureMaterialType_BaseColor_Normal_Specular_YCoCg:
		{
			BaseColor = VirtualTextureUnpackBaseColorYCoCg(LayerSample[0]);
			Specular = LayerSample[2].x;
			Roughness = LayerSample[2].y;
			Normal = VirtualTextureUnpackNormalBC5BC1(LayerSample[1], LayerSample[2]);
			break;
		}
		
		case ERuntimeVirtualTextureMaterialType_BaseColor_Normal_Specular_Mask_YCoCg:
		{
			BaseColor = VirtualTextureUnpackBaseColorYCoCg(LayerSample[0]);
			Specular = LayerSample[2].x;
			Roughness = LayerSample[2].y;
			Normal = VirtualTextureUnpackNormalBC5BC1(LayerSample[1], LayerSample[2]);
			Mask = LayerSample[2].w;
			break;
		}
		
		case ERuntimeVirtualTextureMaterialType_WorldHeight:
		{
			WorldHeight = VirtualTextureUnpackHeight(LayerSample[0], {ParameterName}_WorldHeightUnpack);
			break;
		}
	}
}

VT also has a feedback mechanism; for details, see the section "Pass 1 addendum: VirtualTextureFeedback".

/** GPU fence pool. Contains a fence array that is kept in sync with the FeedbackItems ring buffer. Fences are used to know when a transfer is ready to Map() without stalling. */  
class FFeedbackGPUFencePool* Fences;