Why does foldr use a helper function?

后端未结

关注

 4  1961

In explaining foldr to Haskell newbies, the canonical definition is

foldr            :: (a -> b -> b) -> b -> [a] -> b
foldr _ z


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  春和景丽        
                
              
                            
                2020-12-30 21:57
              
            
            
                                                                       
I can add some important details about GHC's optimization system.

The naive definition of foldr passes around a function. There's an inherent overhead in calling a function - especially when the function isn't known at compile time. It'd be really nice to able to inline the definition of the function if it's known at compile time.

There are tricks available to perform that inlining in GHC - and this is an example of them.  First, foldr needs to be inlined (I'll get to why later). foldr's naive implementation is recursive, so cannot be inlined. So a worker/wrapper transformation is applied to the definition. The worker is recursive, but the wrapper is not. This allows foldr to be inlined, despite the recursion over the structure of the list.

When foldr is inlined, it creates a copy of all of its local bindings, too. It's more or less a direct textual inlining (modulo some renaming, and happening after the desugaring pass). This is where things get interesting. go is a local binding, and the optimizer gets to look inside it. It notices that it calls a function in the local scope, which it names k. GHC will often remove the k variable entirely, and will just replace it with the expression k reduces to. And then afterwards, if the function application is amenable to inlining, it can be inlined at this time - removing the overhead of calling a first-class function entirely.

Let's look at a simple, concrete example. This program will echo a line of input with all trailing 'x' characters removed:

dropR :: Char -> String -> String
dropR x r = if x == 'x' && null r then "" else x : r

main :: IO ()
main = do
    s <- getLine
    putStrLn $ foldr dropR "" s


First, the optimizer will inline foldr's definition and simplify, resulting in code that looks something like this:

main :: IO ()
main = do
    s <- getLine
    -- I'm changing the where clause to a let expression for the sake of readability
    putStrLn $ let { go [] = ""; go (x:xs) = dropR x (go xs) } in go s


And that's the thing the worker-wrapper transformation allows.. I'm going to skip the remaining steps, but it should be obvious that GHC can now inline the definition of dropR, eliminating the function call overhead. This is where the big performance win comes from.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  甜味超标        
                
              
                            
                2020-12-30 22:06
              
            
            
                                                                       
One tiny important detail not mentioned in other answers is that GHC, given a function definition like

f x y z w q = ...


cannot inline f until all of the arguments x, y, z, w, and q are applied. This means that it's often advantageous to use the worker/wrapper transformation to expose a minimal set of function arguments which must be applied before inlining can occur.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  慢半拍i        
                
              
                            
                2020-12-30 22:15
              
            
            
                                                                       
GHC cannot inline recursive functions, so

foldr            :: (a -> b -> b) -> b -> [a] -> b
foldr _ z []     =  z
foldr f z (x:xs) =  f x (foldr f z xs)


cannot be inlined. But 

foldr k z = go
      where
        go []     = z
        go (y:ys) = y `k` go ys


is not a recursive function. It is a non-recursive function with a local recursive definition!

This means that, as @bheklilr writes, in map (foldr (+) 0) the foldr can be inlined and hence f and z replaced by (+) and 0 in the new go, and great things can happen, such as unboxing of the intermediate value.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  温柔的废话        
                
              
                            
                2020-12-30 22:17
              
            
            
                                                                       
As the comments say:

-- Inline only in the final stage, after the foldr/cons rule has had a chance
-- Also note that we inline it when it has *two* parameters, which are the
-- ones we are keen about specialising!


In particular, note the "we inline it when it has two parameters, which are the ones we are keen about specialising!"

What this is saying is that when foldr gets inlined, it's getting inlined only for the specific choice of f and z, not for the choice of the list getting folded.  I'm not expert, but it would seem it would make it possible to inline it in situations like

map (foldr (+) 0) some_list


so that the inline happens in this line and not after map has been applied.  This makes it optimizable in more situations and more easily.  All the helper function does is mask the 3rd argument so {-# INLINE #-} can do its thing.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复