elaborating on this: if you're only able to draw 8 unique decals at a time, each draw locks the entire G-buffer, which means the next 8 have to wait on the previous 8, creating a serial bottleneck. with an instanced draw + all textures available in the pipeline, the GPU can potentially draw all of these decal meshes to all the attachments in parallel. since the cost of a single draw grows with pixel count (4k has 4x the pixels of 1080p), the parallelization helps A LOT
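a rough back-of-the-envelope sketch of the point above, in python. the function names are made up for illustration, and the 8-per-draw limit is just the number from this comment. it only counts draws and pixels; it doesn't model real GPU scheduling or any actual graphics API.

```python
import math


def draw_calls_batched(num_decals: int, decals_per_draw: int = 8) -> int:
    """Draws needed when each draw can only cover a fixed number of unique decals.

    Each of these draws has to wait for the previous one to finish with the
    G-buffer, so this is also the length of the serial chain.
    """
    return math.ceil(num_decals / decals_per_draw)


def draw_calls_instanced(num_decals: int) -> int:
    """Draws needed when every decal texture is reachable from one pipeline.

    Assumes a single instanced draw can cover all decals.
    """
    return 1 if num_decals > 0 else 0


def pixel_ratio(w_from: int, h_from: int, w_to: int, h_to: int) -> float:
    """How many times more pixels the second resolution has than the first."""
    return (w_to * h_to) / (w_from * h_from)


if __name__ == "__main__":
    decals = 100
    print("batched draws:  ", draw_calls_batched(decals))
    print("instanced draws:", draw_calls_instanced(decals))
    print("4k vs 1080p pixel ratio:", pixel_ratio(1920, 1080, 3840, 2160))
```

so for 100 decals you'd be looking at 13 draws that each wait on the one before, versus 1 instanced draw, and every one of those draws is touching 4x as many pixels at 4k as at 1080p.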
vulkan and family didn't add anything new in terms of pipeline capabilities (nvidia's task and mesh shader stages notwithstanding), but they do enable these sorts of massive-scale pipelines, which games are going to need to hit 4k