Profiling the Scriptable Render Pipeline

2022-03-14

A summary of how to profile Unity’s Scriptable Render Pipeline. This includes URP, HDRP, and custom SRPs.

TL;DR:

  1. Forget about GPU instancing.
  2. Forget about MaterialPropertyBlocks.
  3. Be careful with unintentional shader variants.

Profiling the Built-in Render Pipeline

Back in the old days, Unity only had a single render pipeline, now called the Built-in Render Pipeline (BiRP). Profiling what the BiRP is doing is easy enough.

Stats

The Rendering Statistics panel gives a quick overview of useful statistics such as triangles and vertices rendered, as well as how many draw calls were batched together.

Rendering Profiler

The Rendering Profiler module shows a breakdown of what got batched and why.

Batching

Besides obvious indicators like triangles and vertices, a lot of profiling effort goes into batching. Why? It turns out that draw calls are expensive, so optimizing rendering performance often involves reducing draw calls, e.g. through GPU instancing, static batching, or dynamic batching.

The reason why there are so many different methods addressing the same problem is that none of them are universally applicable.

Oh and all of them only work if it’s the same material, i.e. literally the same material instance. That’s why we get MaterialPropertyBlock to change material properties without breaking draw call batching. You might even go so far as to require a single material across your entire project, using only MaterialPropertyBlocks at runtime to assign textures.
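To make the MaterialPropertyBlock idea concrete, here is a minimal sketch of a per-renderer tint in BiRP. The component name PerRendererTint is made up for illustration, and the property name "_Color" is an assumption about the shader in use. Whether the affected renderers actually end up in one instanced draw call also depends on the shader declaring that property as a per-instance property, so treat this as a starting point rather than a guarantee.

```csharp
using UnityEngine;

// Hypothetical component; attach to any GameObject that has a Renderer.
public class PerRendererTint : MonoBehaviour
{
    // Assumption: the shader exposes a color property named "_Color".
    static readonly int ColorId = Shader.PropertyToID("_Color");

    [SerializeField] Color tint = Color.red;

    void Start()
    {
        var rend = GetComponent<Renderer>();
        var block = new MaterialPropertyBlock();

        // Start from whatever per-renderer values are already set.
        rend.GetPropertyBlock(block);
        block.SetColor(ColorId, tint);

        // Overrides the property for this renderer only;
        // the shared material asset is left untouched.
        rend.SetPropertyBlock(block);
    }
}
```

Every renderer keeps pointing at the same material instance, which is the precondition for the batching methods above.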

Profiling the Scriptable Render Pipeline

The first thing to note when profiling the Scriptable Render Pipeline is that the rendering profiler and the statistics panel are broken. The reported batching numbers don’t make sense.

Frame debugger

We have to use the Frame Debugger to get an accurate picture of what is really going on under the hood. The frame debugger correctly reports draw calls, batching, and instancing.

GPU instancing

Using SRP, GPU instancing does not work out of the box. Try as you might, the number of batched draw calls due to instancing is always zero. That is because SRP comes with its own batcher, which is enabled by default, and Unity prioritizes the SRP batcher over GPU instancing.

That sounds bad, but it turns out the SRP batcher is almost as fast as GPU instancing. Almost.

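| Render pipeline | GPU instancing | SRP batcher | Draw calls | CPU (ms) | Render thread (ms) |
|-----------------|----------------|-------------|------------|----------|--------------------|
| BiRP            | off            | N/A         | 10,000     | 7.9      | 6.2                |
| BiRP            | on             | N/A         | 20         | 6.6      | 4.5                |
| URP             | off            | off         | 10,000     | 7.4      | 5.9                |
| URP             | on             | off         | 30         | 6.2      | 4.5                |
| URP             | on/off         | on          | 15         | 6.8      | 4.9                |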

Interestingly, URP with the SRP batcher turned off is slightly faster than BiRP and supports GPU instancing just the same.
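If you want to reproduce this kind of comparison in your own scene, the SRP batcher can be switched at runtime through GraphicsSettings.useScriptableRenderPipelineBatching. The sketch below is only one way to do it: the component name and the key binding are made up, and it assumes the legacy Input Manager is active.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical helper for A/B testing the SRP batcher in Play mode.
public class SrpBatcherToggle : MonoBehaviour
{
    // Assumption: the legacy Input Manager is enabled in the project.
    [SerializeField] KeyCode toggleKey = KeyCode.B;

    void Update()
    {
        if (Input.GetKeyDown(toggleKey))
        {
            // Flip the global SRP batcher switch.
            GraphicsSettings.useScriptableRenderPipelineBatching =
                !GraphicsSettings.useScriptableRenderPipelineBatching;

            Debug.Log("SRP batcher enabled: " +
                GraphicsSettings.useScriptableRenderPipelineBatching);
        }
    }
}
```

Check the frame debugger after each toggle to see how the draw calls are grouped.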

Batching

Static batching works the same in SRP as in BiRP and produces the same speed-ups. Dynamic batching also works the same but is disabled in HDRP because it’s almost always a bad idea. Relying on the SRP batcher is usually the better choice.

SRP Batcher

The SRP batcher is a good default because it works in a wide variety of scenarios.

But most importantly, it works across multiple materials, with a few caveats:

  1. Must not use MaterialPropertyBlock.
  2. Must use the same shader variant.
  3. Shader must be compatible with SRP batcher.

These are important caveats so let’s break them down.

MaterialPropertyBlock

MaterialPropertyBlocks are simply not supported by the SRP batcher. That means they do the opposite of what they do in BiRP. Instead of making things faster, they make things slower. The affected renderers get taken out of the SRP batching queue and put into the regular old queue.

All the old tricks apply to the old queue and, under very specific circumstances, it is possible to beat the SRP batcher. However, chances are you just wanted to help reduce draw calls and facilitate GPU instancing but ended up increasing draw calls because you blocked the SRP batcher.

Just use material instances instead.
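A minimal sketch of the material-instance approach, assuming the renderer uses URP/Lit (whose base color property is "_BaseColor"). The component name is made up for illustration.

```csharp
using UnityEngine;

// Hypothetical component; attach to any GameObject that has a Renderer.
public class PerRendererMaterialTint : MonoBehaviour
{
    // Assumption: the material uses URP/Lit, which exposes "_BaseColor".
    static readonly int BaseColorId = Shader.PropertyToID("_BaseColor");

    [SerializeField] Color tint = Color.red;

    Material instance;

    void Start()
    {
        // Accessing .material (not .sharedMaterial) clones the material
        // for this renderer, so the change stays local to it.
        instance = GetComponent<Renderer>().material;
        instance.SetColor(BaseColorId, tint);
    }

    void OnDestroy()
    {
        // Materials created through .material are not cleaned up
        // automatically, so destroy the clone with its owner.
        if (instance != null) Destroy(instance);
    }
}
```

Each renderer now has its own material, which is fine for the SRP batcher as long as the shader variant stays the same.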

Shader variants

While the SRP batcher works across different materials, the shader variant must be the same. A single shader can have multiple variants by using shader keywords.

Most importantly, the URP/Lit shader is a “smart” uber shader that strips out unused code automatically. For example, if there is a normal map assigned, the _NORMALMAP keyword will be enabled, enabling additional shader code. That means materials with a normal map will not batch with materials that don’t use a normal map.

Use as few variants as possible. Use the frame debugger to verify.
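As a rough first pass before opening the frame debugger, you could list the distinct shader and keyword combinations used by the materials in the loaded scene. The sketch below is an approximation only: the component name is made up, and it looks at material keywords alone, so variants that differ because of global keywords will not show up.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical helper; run it from the component's context menu.
public class ShaderVariantReport : MonoBehaviour
{
    [ContextMenu("Log shader keyword sets")]
    void LogKeywordSets()
    {
        var counts = new Dictionary<string, int>();
        var seen = new HashSet<Material>();

        foreach (var rend in FindObjectsOfType<Renderer>())
        {
            foreach (var mat in rend.sharedMaterials)
            {
                // Skip empty slots and materials already counted.
                if (mat == null || !seen.Add(mat)) continue;

                // Sort so that keyword order does not create false differences.
                var keywords = (string[])mat.shaderKeywords.Clone();
                System.Array.Sort(keywords, System.StringComparer.Ordinal);

                var key = mat.shader.name + " [" + string.Join(" ", keywords) + "]";

                counts.TryGetValue(key, out var n);
                counts[key] = n + 1;
            }
        }

        // One log line per distinct shader + keyword combination.
        foreach (var pair in counts)
            Debug.Log(pair.Value + " material(s): " + pair.Key);
    }
}
```

Many lines for the same shader suggest that materials are being split across variants, for example by _NORMALMAP.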

Compatibility

All lit and unlit shaders in the High Definition Render Pipeline (HDRP) and the Universal Render Pipeline (URP) are compatible with the SRP batcher. That’s probably all you need to know.

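If you must use custom shaders, follow Unity's SRP batcher requirements (chiefly, declare all material properties in a single CBUFFER named UnityPerMaterial and all built-in engine properties in a CBUFFER named UnityPerDraw), then double-check the shader's inspector panel to see if it worked.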

Summary

The SRP batcher is great but the profilers have not been updated yet. Use the frame debugger to get accurate reports. Don’t use MaterialPropertyBlocks anymore. Don’t bother with GPU instancing settings. Use as many materials as you want. Use URP/Lit for everything. Use the frame debugger to double-check that the shader keywords are the same across your materials.
