Back in the old days, Unity only had a single render pipeline, now called the Built-in Render Pipeline (BiRP). Profiling what the BiRP is doing is easy enough.
The Rendering Statistics panel gives a quick overview of useful statistics like triangles and vertices rendered, as well as how many draw calls were batched together.
The Rendering Profiler module shows a breakdown of what got batched and why.
Besides obvious indicators like triangles and vertices, a lot of profiler effort goes into batching. Why? It turns out that draw calls are expensive, so optimizing rendering performance often involves reducing draw calls, e.g. through GPU instancing, static batching, or dynamic batching.
The reason why there are so many different methods addressing the same problem is that none of them are universally applicable.
Oh, and all of them only work if it’s the same material, i.e. literally the same material instance. That’s why we get MaterialPropertyBlock to change material properties without breaking draw call batching. You might even go so far as to require a single material across your entire project, using only MaterialPropertyBlocks at runtime to assign textures.
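For reference, here is a minimal sketch of the BiRP-style approach described above. The component name `TintWithPropertyBlock` is made up for illustration, and `_Color` assumes the Built-in Standard shader; other shaders may name their color property differently. Whether the draw calls actually end up batched still depends on the shader and the batching method in use, so treat this as a starting point, not a guarantee.

```csharp
using UnityEngine;

// Hypothetical component name; attach to any GameObject that has a Renderer.
[RequireComponent(typeof(Renderer))]
public class TintWithPropertyBlock : MonoBehaviour
{
    // Assumption: "_Color" is the main color property (true for the Built-in Standard shader).
    static readonly int ColorId = Shader.PropertyToID("_Color");

    [SerializeField] Color tint = Color.red;

    void Start()
    {
        var targetRenderer = GetComponent<Renderer>();
        var block = new MaterialPropertyBlock();

        // Start from any overrides the renderer already carries.
        targetRenderer.GetPropertyBlock(block);
        block.SetColor(ColorId, tint);

        // Overrides the value for this renderer only; the shared material asset is untouched.
        targetRenderer.SetPropertyBlock(block);
    }
}
```

The point of the pattern is that every renderer keeps referencing the same material while still getting its own color.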
With the Scriptable Render Pipeline (SRP), we have to use the Frame Debugger to get an accurate picture of what is really going on under the hood. The Frame Debugger correctly reports draw calls, batching, and instancing.
With SRP, GPU instancing does not work. Try as you might, the number of batched draw calls due to instancing is always zero. That is because SRP comes with its own batcher, and Unity prioritizes the SRP batcher over GPU instancing.
That sounds bad, but it turns out the SRP batcher is almost as fast as GPU instancing. Almost.
| Render pipeline | GPU instancing | SRP batcher | Draw calls | CPU (ms) | Render thread (ms) |
Interestingly, URP with the SRP batcher turned off is slightly faster than BiRP and supports GPU instancing just the same.
Static batching works the same in SRP as in BiRP and produces the same speed-ups. Dynamic batching also works the same but is disabled in HDRP because it’s almost always a bad idea. Relying on the SRP batcher is usually the better choice.
The SRP batcher is a good default because it works in a wide variety of scenarios.
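But most importantly, it works across multiple materials, with a few caveats: the renderers must not use MaterialPropertyBlocks, the materials must share the same shader variant, and the shader must be compatible with the SRP batcher.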
These are important caveats so let’s break them down.
MaterialPropertyBlocks are simply not supported by the SRP batcher. That means they do the opposite of what they do in BiRP. Instead of making things faster, they make things slower. The affected renderers get taken out of the SRP batching queue and put into the regular old queue.
All the old tricks apply to the old queue and, under very specific circumstances, it is possible to beat the SRP batcher. However, chances are you just wanted to help reduce draw calls and facilitate GPU instancing but ended up increasing draw calls because you blocked the SRP batcher.
Just use material instances instead.
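A minimal sketch of what that could look like, assuming a URP/Lit material (whose base color property is `_BaseColor`). The component name `TintWithMaterialInstance` is hypothetical.

```csharp
using UnityEngine;

// Hypothetical component name; attach to any GameObject that has a Renderer.
[RequireComponent(typeof(Renderer))]
public class TintWithMaterialInstance : MonoBehaviour
{
    // Assumption: the material uses URP/Lit, whose base color property is "_BaseColor".
    static readonly int BaseColorId = Shader.PropertyToID("_BaseColor");

    [SerializeField] Color tint = Color.red;

    Material instance;

    void Start()
    {
        // Reading .material (instead of .sharedMaterial) makes Unity clone the
        // material for this renderer the first time it is accessed.
        instance = GetComponent<Renderer>().material;
        instance.SetColor(BaseColorId, tint);
    }

    void OnDestroy()
    {
        // Materials created through .material are our responsibility to destroy.
        if (instance != null)
        {
            Destroy(instance);
        }
    }
}
```

Every renderer ends up with its own material, which would have been a batching disaster in BiRP but is fine for the SRP batcher as long as the shader variant stays the same.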
Most importantly, the URP/Lit shader is a “smart” uber shader that strips out unused code automatically. For example, if there is a normal map assigned, the _NORMALMAP keyword will be enabled, enabling additional shader code. Each keyword combination is a separate shader variant, and the SRP batcher only batches draw calls that use the same variant. That means materials with a normal map will not batch with materials that don’t use a normal map.
Use as few variants as possible. Use the frame debugger to verify.
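The Frame Debugger remains the authoritative check, but a quick script can give a rough overview of how many distinct keyword combinations are in play. The sketch below is only an approximation: the component name `ShaderVariantAudit` is made up, it looks only at the keywords stored on each material (keywords enabled globally by the pipeline are not included), and it only scans renderers below the GameObject it is attached to.

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;

// Hypothetical component name; attach to the root of the hierarchy to inspect.
public class ShaderVariantAudit : MonoBehaviour
{
    void Start()
    {
        // Key: shader name plus sorted material keywords. Value: number of material slots.
        var counts = new Dictionary<string, int>();

        foreach (var childRenderer in GetComponentsInChildren<Renderer>())
        {
            // sharedMaterials avoids creating material instances as a side effect.
            foreach (var material in childRenderer.sharedMaterials)
            {
                if (material == null) continue;

                // Sort a copy so the same keyword set always yields the same key.
                var keywords = (string[])material.shaderKeywords.Clone();
                Array.Sort(keywords, StringComparer.Ordinal);
                var key = material.shader.name + " [" + string.Join(" ", keywords) + "]";

                counts.TryGetValue(key, out var count);
                counts[key] = count + 1;
            }
        }

        foreach (var pair in counts)
        {
            Debug.Log($"{pair.Value} material slot(s) use {pair.Key}");
        }
    }
}
```

If the log shows many different keyword sets for the same shader, that is a hint that those materials are unlikely to end up in the same SRP batch.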
All lit and unlit shaders in the High Definition Render Pipeline (HDRP) and the Universal Render Pipeline (URP) are compatible with the SRP batcher. That’s probably all you need to know.
If you must use custom shaders, follow the requirements and double-check the shader’s Inspector panel, which reports whether the shader is compatible with the SRP batcher.
The SRP batcher is great, but the profilers have not been updated yet. Use the Frame Debugger to get accurate reports. Don’t use MaterialPropertyBlocks anymore. Don’t bother with GPU instancing settings. Use as many materials as you want. Use URP/Lit for everything. Use the Frame Debugger to double-check that the shader keywords are the same across your materials.