How to Speed Up Metal Code for iOS/Mac OS

轮回少年 2020-12-31 20:18
I'm trying to implement code in Metal that performs a 1D convolution between two vectors with lengths. I've implemented the following, which works correctly
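For readers unfamiliar with the operation, here is a minimal CPU sketch in C of a direct-form full 1D convolution, assuming the usual definition y[n] = sum over k of x[k] * h[n - k]. This is not the poster's Metal code; the function name `conv1d`, its parameters, and the output length `nx + nh - 1` are illustrative assumptions.

```c
#include <stddef.h>

/* Direct-form full 1D convolution: y[n] = sum over k of x[k] * h[n - k].
 * The caller must provide room for nx + nh - 1 floats in y.
 * All names here are hypothetical, chosen for illustration only. */
static void conv1d(const float *x, size_t nx,
                   const float *h, size_t nh,
                   float *y)
{
    if (nx == 0 || nh == 0) return;  /* nothing to compute */

    for (size_t n = 0; n < nx + nh - 1; ++n) {
        /* Only indices k with 0 <= n - k < nh contribute to y[n]. */
        size_t k_lo = (n >= nh - 1) ? n - (nh - 1) : 0;
        size_t k_hi = (n < nx - 1) ? n : nx - 1;

        float acc = 0.0f;
        for (size_t k = k_lo; k <= k_hi; ++k)
            acc += x[k] * h[n - k];
        y[n] = acc;
    }
}
```

In this formulation each output sample reads only the two input vectors, so a routine like this can serve as a reference for checking the results of a GPU kernel.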

      
      
        
2 Answers

悲哀的现实 (original poster) 2020-12-31 20:48

The following code shows how to render encoded commands in parallel on the GPU using the Objective-C Metal API. (The threading code above only divides rendering of the output into grid sections for parallel processing; the calculations themselves are still not performed in parallel.) This is what your question refers to, even if it is not exactly what you want. I've provided this answer to help anyone who stumbles upon this question expecting an answer about parallel rendering (which, in fact, it does not provide):

    - (void)drawInMTKView:(MTKView *)view
    {
        dispatch_async(((AppDelegate *)UIApplication.sharedApplication.delegate).cameraViewQueue, ^{
            id<CAMetalDrawable> drawable = [view currentDrawable]; //[(CAMetalLayer *)view.layer nextDrawable];
            MTLRenderPassDescriptor *renderPassDesc = [view currentRenderPassDescriptor];
            renderPassDesc.colorAttachments[0].loadAction = MTLLoadActionClear;
            renderPassDesc.colorAttachments[0].clearColor = MTLClearColorMake(0.0, 0.0, 0.0, 1.0);
            renderPassDesc.renderTargetWidth = self.texture.width;
            renderPassDesc.renderTargetHeight = self.texture.height;
            renderPassDesc.colorAttachments[0].texture = drawable.texture;
            if (renderPassDesc != nil)
            {
                // Block until a previously submitted frame has completed
                dispatch_semaphore_wait(self._inflight_semaphore, DISPATCH_TIME_FOREVER);
                id<MTLCommandBuffer> commandBuffer = [self.metalContext.commandQueue commandBuffer];
                [commandBuffer enqueue];

                // START PARALLEL RENDERING OPERATIONS HERE
                id<MTLParallelRenderCommandEncoder> parallelRCE = [commandBuffer parallelRenderCommandEncoderWithDescriptor:renderPassDesc];

                // FIRST PARALLEL RENDERING OPERATION
                id<MTLRenderCommandEncoder> renderEncoder = [parallelRCE renderCommandEncoder];

                [renderEncoder setRenderPipelineState:self.metalContext.renderPipelineState];

                [renderEncoder setVertexBuffer:self.metalContext.vertexBuffer offset:0 atIndex:0];
                [renderEncoder setVertexBuffer:self.metalContext.uniformBuffer offset:0 atIndex:1];

                [renderEncoder setFragmentBuffer:self.metalContext.uniformBuffer offset:0 atIndex:0];

                [renderEncoder setFragmentTexture:self.texture
                                          atIndex:0];

                [renderEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip
                                  vertexStart:0
                                  vertexCount:4
                                instanceCount:1];

                [renderEncoder endEncoding];

                // ADD SECOND, THIRD, ETC. PARALLEL RENDERING OPERATION HERE
                // .
                // .
                // .

                // SUBMIT ALL RENDERING OPERATIONS IN PARALLEL HERE
                [parallelRCE endEncoding];

                // Signal the semaphore once the GPU has finished this command buffer
                __block dispatch_semaphore_t block_sema = self._inflight_semaphore;
                [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer) {
                    dispatch_semaphore_signal(block_sema);
                }];

                if (drawable)
                    [commandBuffer presentDrawable:drawable];
                [commandBuffer commit];
                [commandBuffer waitUntilScheduled];
            }
        });
    }


In the above example, you would duplicate the renderEncoder-related code for each calculation you want to perform in parallel. I do not see how this would benefit your code example, as one operation appears to depend on another. The best you could hope for, then, is probably the code provided by warrenm, even though that doesn't really qualify as parallel rendering.
    
             
                                                        
            
            
              
                