Why is writing to a buffer from within a fragment shader disallowed in Metal?

Question

As stated in the Metal Shading Language Guide:

Writes to a buffer or a texture are disallowed from a fragment function.

I understand that this is the case, but I'm curious as to why. Being able to write to a buffer from within a fragment shader is incredibly useful. I understand that it is likely more complex for the hardware when the destination of a thread's memory writes is not known ahead of time, as is often the case with raw buffer writes. But this capability is already exposed in Metal compute shaders, so why not in fragment shaders too?

Addendum

I should clarify why I think buffer writes from fragment functions are useful. In the most common use of the rasterization pipeline, triangles are rasterized, shaded (by the fragment shader), and written into predefined memory locations, known before each fragment shader invocation and determined by the predefined mapping from normalized device coordinates to the frame buffer. This fits most use cases, since most of the time you just want to render triangles directly to a buffer or the screen.

There are other cases in which you might want to do a lazy write within the fragment shader, the end location of which is based off of fragment properties and not the fragment's exact location; effectively, rasterization with side effects. For instance, most GPU-based voxelization works by rendering the scene with orthographic projection from some desirable angle, and then writing into a 3D texture, mapping the XY coordinates of the fragment and its associated depth value to a location in the 3D texture. This is described here.
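The mapping from a fragment's XY position and depth to a 3D texture location can be sketched on the CPU. This is only an illustration, assuming a cubic grid and coordinates already normalized to [0, 1]. `VoxelGrid`, `toCell`, and `voxelizeFragment` are hypothetical names, not part of Metal or any other graphics API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a writable 3D texture: a dim x dim x dim grid.
struct VoxelGrid {
    std::size_t dim;
    std::vector<std::uint8_t> cells; // 1 = occupied, 0 = empty

    explicit VoxelGrid(std::size_t d) : dim(d), cells(d * d * d, 0) {
        assert(d > 0);
    }

    // Flatten (x, y, z) into a linear index, x varying fastest.
    std::size_t index(std::size_t x, std::size_t y, std::size_t z) const {
        return (z * dim + y) * dim + x;
    }
};

// Map a normalized coordinate in [0, 1] to a cell in [0, dim - 1].
inline std::size_t toCell(float v, std::size_t dim) {
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    std::size_t c = static_cast<std::size_t>(v * static_cast<float>(dim));
    return c < dim ? c : dim - 1; // v == 1.0 lands in the last cell
}

// What each "fragment" would do: a scattered write whose destination
// depends on the fragment's depth, not only on its screen position.
inline void voxelizeFragment(VoxelGrid& grid, float u, float v, float depth) {
    grid.cells[grid.index(toCell(u, grid.dim),
                          toCell(v, grid.dim),
                          toCell(depth, grid.dim))] = 1;
}
```

The write target depends on `depth`, which is only known once the fragment has been produced, so this cannot be expressed as an ordinary one-output-per-pixel render target write.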

Other uses include some forms of order-independent transparency (transparency where draw order is unimportant, allowing for overlapping transparent objects). One solution is to use a multi-layered frame buffer, and then to sort and blend the fragments based upon their depth values in a separate pass. Since most GPUs have no hardware support for this (I believe Intel's do have hardware acceleration for it), you have to maintain atomic counters and perform manual texture/buffer writes from each pixel to coordinate writes to the layered frame buffer.
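One common way to coordinate such writes is a per-pixel linked list: a global atomic counter hands out node slots, and each pixel keeps a head index that is swapped atomically. The CPU-side sketch below only illustrates that bookkeeping, under the assumption of a fixed-size node pool. `FragmentLists`, `FragmentNode`, and `kNoNode` are hypothetical names, and this is not necessarily the layered frame buffer scheme described above.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kNoNode = 0xFFFFFFFFu; // end-of-list marker

struct FragmentNode {
    float depth;
    std::uint32_t color; // packed RGBA
    std::uint32_t next;  // index of the next node for the same pixel
};

// Hypothetical per-pixel linked lists backed by one shared node pool.
struct FragmentLists {
    std::vector<std::atomic<std::uint32_t>> heads; // one head per pixel
    std::vector<FragmentNode> nodes;               // shared node pool
    std::atomic<std::uint32_t> counter{0};         // next free slot

    FragmentLists(std::size_t pixelCount, std::size_t capacity)
        : heads(pixelCount), nodes(capacity) {
        for (auto& h : heads) h.store(kNoNode);
    }

    // What each "fragment" would do. Returns false when the pool is full.
    bool append(std::size_t pixel, float depth, std::uint32_t color) {
        assert(pixel < heads.size());
        std::uint32_t slot = counter.fetch_add(1);        // reserve a slot
        if (slot >= nodes.size()) return false;
        std::uint32_t prev = heads[pixel].exchange(slot); // become new head
        nodes[slot] = FragmentNode{depth, color, prev};
        return true;
    }

    // Separate resolve pass: collect one pixel's fragments, farthest first.
    std::vector<FragmentNode> sortedBackToFront(std::size_t pixel) const {
        std::vector<FragmentNode> out;
        for (std::uint32_t i = heads[pixel].load(); i != kNoNode;
             i = nodes[i].next) {
            out.push_back(nodes[i]);
        }
        std::sort(out.begin(), out.end(),
                  [](const FragmentNode& a, const FragmentNode& b) {
                      return a.depth > b.depth;
                  });
        return out;
    }
};
```

The lists are only safe to traverse once all appends have finished, which matches the idea of sorting and blending in a separate pass.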

Yet another example might be extraction of virtual point lights for GI through rasterization (i.e. you write out point lights for relevant fragments as you rasterize). In all of these usage cases, buffer writes from fragment shaders are required, because ROPs only store one resulting fragment for each pixel. The only way to achieve equivalent results without this feature is by some manner of depth peeling, which is horribly slow for scenes of high depth complexity.

Now I realize that the examples I gave aren't really all about buffer writes in particular, but more generally about the idea of dynamic memory writes from fragment shaders, ideally along with support for atomicity. Buffer writes just seem like a simple issue, and their inclusion would go a long way towards improving the situation.

Since I wasn't getting any answers here, I ended up posting the question on Apple's developer forums. I got more feedback there, but still no real answer. Unless I am missing something, it seems that virtually every OS X device which officially supports Metal has hardware support for this feature. And as I understand, this feature first started popping up in GPUs around 2009. It's a common feature in both current DirectX and OpenGL (not even considering DX12 or Vulkan), so Metal would be the only "cutting-edge" API which lacks it.

I realize that this feature might not be supported on PowerVR hardware, but Apple has had no issue differentiating the Metal Shading Language by feature set. For instance, Metal on iOS allows for "free" frame buffer fetches within fragment shaders, which is directly supported in hardware by the cache-heavy PowerVR architecture. This feature manifests itself directly in the Metal Shading Language, as it allows you to declare fragment function inputs with the [[color(m)]] attribute qualifier for iOS shaders. Arguably allowing declaration of buffers with the device storage space qualifier, or textures with access::write, as inputs to fragment shaders, would be no greater semantic change to the language than what Apple has done to optimize for iOS. So, as far as I'm concerned, a lack of support by PowerVR would not explain the lack of the feature I'm looking for on OS X.

Answer 1:

Writing to buffers from fragment shaders is now supported, as mentioned in What’s New in iOS 10, tvOS 10, and macOS 10.12:

Function Buffer Read-Writes Available in: iOS_GPUFamily3_v2, OSX_GPUFamily1_v2

Fragment functions can now write to buffers. Writable buffers must be declared in the device address space and must not be const. Use dynamic indexing to write to a buffer.

Moreover, the line specifying the restriction (quoted in the original question) no longer appears in the Metal Shading Language Specification 2.0.

Answer 2:

I don't think you can write arbitrary pixels or texels from a fragment function in OpenGL or DirectX either. The rendering API is one thing; the fragment and vertex functions are another.

A fragment function is intended to produce a single pixel / texel output per run, even if that output has multiple channels. Usually, if you want to write to a buffer or texture, you need to render something (a quad, a triangle, or some other primitive) over a surface (a buffer or texture) using your fragment function. As a result, each pixel / texel is rendered by your fragment function. For example, raycasting or raytracing fragment functions usually use this approach.

There is a good reason for not allowing you to write arbitrary pixels / texels: parallelization. On most GPUs the fragment function is executed for many different pixels / texels at once, in a highly parallel fashion. Each GPU has its own way of parallelizing (SMP, vector units...), but all of them are massively parallel. So, in order to avoid common parallelization problems such as races, you can only write by returning the channels of one pixel or texel as the result of the fragment function. This applies to every graphics library I know of.

Source: https://stackoverflow.com/questions/34459972/why-is-writing-to-a-buffer-from-within-a-fragment-shader-disallowed-in-metal

Tags

buffer

fragment-shader

metal