Template-ing a 'for' loop in C++?

后端未结

关注

 10  676

I have a C++ snippet below with a run-time for loop,

for(int i = 0; i < I; i++)
  for (int j = 0; j < J; j++)
    A( row(i,j), column(i,j)


                      
              相关标签:


      
      
        
          10条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  南方客        
                
              
                            
                2020-12-25 13:20
              
            
            
                                                                       
I'm not a fan of template meta-programming, so you may want to take this answer with a pinch of salt. But, before I invested any time on this problem, I'd ask myself the following:


Is my for loop a bottleneck?
Is unrolling it going to make any difference? Easily tested by hand-unrolling and measuring.


In many compilers/cpus, the "looped" version can give better performance due to cache effects.

Remember: Measure first, optimise later - if at all.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  故里飘歌        
                
              
                            
                2020-12-25 13:21
              
            
            
                                                                       
If you are willing to modify the syntax a bit you can do something like this:

template <int i, int ubound>
struct OuterFor {
    void operator()() {
        InnerFor<i, 0, J>()();
        OuterFor<i + 1, ubound>()();
    }
};

template <int ubound>
struct OuterFor <ubound, ubound> {
    void operator()() {
    }
};


In InnerFor, i is the outer loops counter (compile time constant), j is the inner loops counter (initially 0 - also compile time constant), so you can evaluate row as a compile time template.

Its a bit more complicated, but as you say, row(), col(), and f() are your complicated parts anyways. At least try it and see if the performance is worth it. It may be worth it to investigate other options to simplify your row(), etc functions.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一向        
                
              
                            
                2020-12-25 13:22
              
            
            
                                                                       
You could use Boost MPL.

An example of loop unrolling is on this mpl::for_each page.

for_each< range_c<int,0,10> >( value_printer() );


It doesn't seem that it's all evaluated at compile time, but it may be a good starting point.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  天涯浪人        
                
              
                            
                2020-12-25 13:25
              
            
            
                                                                       
I've never tried to do this so take this idea with a grain of salt...

It seems like you could use Boost.Preprocessor to do the loop unrolling (specifically the BOOST_PP_FOR and BOOST_PP_FOR_r macros) and then use templates to generate the actual constant expression.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  后悔当初        
                
              
                            
                2020-12-25 13:26
              
            
            
                                                                       
You could use Boost.Mpl to implement the whole thing at compile-time, but I'm not sure it'll be faster. (Mpl essentially re-implements all the STL algorithms as compile-time metaprogramming templates)

The problem with that approach is that you end up unrolling and inlining a lot of code, which may thrash the instruction cache and eat up memory bandwidth that could have been saved. That may produce huge, bloated and slow code.

I would probably probably rather trust the compiler to inline the functions that make sense. As long as the row and column function definitions are visible from the loop, the compiler can trivially inline the calls and unroll as many iterations as it deems beneficial.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  臣服心动        
                
              
                            
                2020-12-25 13:29
              
            
            
                                                                       
I would say it is a false good-idea. 

In C++ this : 
row<i,j>::value

means you will have as many differents row<>() functions than you have i * j. You don't want this because it will increase the size of the code and do a lot of instruction cache misses. 

I observed this when I was doing template functions to avoid a single boolean check. 

If is a short function just inline it. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复