Can GHC really never inline map, scanl, foldr, etc.?

2018-06-28 19:53:05

I've noticed the GHC manual says "for a self-recursive function, the loop breaker can only be the function itself, so an INLINE pragma is always ignored."

Doesn't this mean that applications of common recursive functions like map, zip, scan*, fold*, sum, etc. cannot be inlined?

You could always rewrite all these functions when you use them, adding appropriate strictness annotations, or maybe employ fancy techniques like "stream fusion".

Yet, doesn't all this dramatically constrain our ability to write code that's simultaneously fast and elegant?

Indeed, GHC cannot at present inline recursive functions. However:

GHC will still specialise recursive functions. For instance, given

fac :: (Eq a, Num a) => a -> a
fac 0 = 1
fac n = n * fac (n-1)

f :: Int -> Int
f x = 1 + fac x

GHC will spot that fac is used at type Int -> Int and generate a specialised version of fac for that type, which uses fast integer arithmetic.

This specialisation happens automatically within a module (e.g. if fac and f are defined in the same module). For cross-module specialisation (e.g. if f and fac are defined in different modules), mark the function to be specialised with an INLINABLE pragma:

{-# INLINABLE fac #-}
fac :: (Eq a, Num a) => a -> a
...
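As a minimal sketch of the two points above, the definitions can be put together as follows. The explicit SPECIALISE pragma is an addition for illustration only: it asks for an Int copy directly instead of relying on GHC to spot the use in f, and whether specialisation actually happens still depends on the optimisation level.

```haskell
-- INLINABLE keeps fac's unfolding in the interface file so that
-- importing modules can specialise it at their own types.
{-# INLINABLE fac #-}
fac :: (Eq a, Num a) => a -> a
fac 0 = 1
fac n = n * fac (n - 1)

-- Illustrative addition: explicitly request an Int-specialised copy.
{-# SPECIALISE fac :: Int -> Int #-}

-- fac is used here at type Int -> Int.
f :: Int -> Int
f x = 1 + fac x
```

Loaded into GHCi, f 5 evaluates to 121.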

There are manual transformations which make functions nonrecursive. The lowest-power technique is the static argument transformation, which applies to recursive functions with arguments which don't change on recursive calls (e.g. many higher-order functions such as map, filter, fold*). This transformation turns

map f []     = []
map f (x:xs) = f x : map f xs

into

map f xs0 = go xs0
  where
    go []     = []
    go (x:xs) = f x : go xs

so that a call such as

 g :: [Int] -> [Int]
 g xs = map (2*) xs

will have map inlined and become

 g [] = []
 g (x:xs) = 2*x : g xs

This transformation has been applied to Prelude functions such as foldr and foldl.
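The same transformation can be applied by hand to your own functions. The sketch below uses a hypothetical myFilter (the name and the INLINE pragma are choices made here, not library code): the predicate p is the static argument, so the recursion moves into a local go and the outer function becomes non-recursive.

```haskell
-- Non-recursive wrapper: p never changes, so only go recurses.
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter p = go
  where
    go [] = []
    go (x:xs)
      | p x       = x : go xs
      | otherwise = go xs
-- Because myFilter itself is not recursive, this pragma is not ignored.
{-# INLINE myFilter #-}

-- A call site where GHC may inline the wrapper and specialise go to even.
evens :: [Int] -> [Int]
evens = myFilter even
```

Whether the call in evens is actually inlined still depends on the optimisation flags; the point is only that nothing prevents it any more.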

Fusion techniques also make many functions nonrecursive, and are more powerful than the static argument transformation. The main approach for lists, which is built into the Prelude, is shortcut fusion. The basic approach is to write as many functions as possible as non-recursive functions which use foldr and/or build; then all the recursion is captured in foldr, and there are special RULES for dealing with foldr.

Taking advantage of this fusion is in principle easy: avoid manual recursion, preferring library functions such as foldr, map, filter, and other functions that are subject to fusion. In particular, writing code in this style produces code which is "simultaneously fast and elegant".
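To make the foldr/build idea concrete, here is a small sketch of a list producer written with build and a consumer written with foldr. The names countdown and sumSquares are made up for this example; build is imported from GHC.Exts. With optimisation on, the foldr/build rule may remove the intermediate list, although that is up to the simplifier.

```haskell
import GHC.Exts (build)

-- Producer: the list is described abstractly in terms of cons and nil,
-- so a consumer can substitute its own operations for them.
countdown :: Int -> [Int]
countdown n = build (\cons nil ->
  let go k
        | k <= 0    = nil
        | otherwise = k `cons` go (k - 1)
  in go n)
{-# INLINE countdown #-}

-- Consumer: a plain foldr, which is what the fusion rule looks for.
sumSquares :: Int -> Int
sumSquares n = foldr (\x acc -> x * x + acc) 0 (countdown n)
```

For instance, countdown 3 is [3,2,1] and sumSquares 3 is 14, with or without fusion.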

Modern libraries such as text and vector use stream fusion behind the scenes. Don Stewart wrote a pair of blog posts (1, 2) demonstrating this in action in the now obsolete library uvector, but the same principles apply to text and vector.

As with shortcut fusion, taking advantage of stream fusion in text and vector is in principle easy: avoid manual recursion, preferring library functions which have been marked as "subject to fusion".
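For readers curious what happens behind the scenes, the following is a stripped-down sketch of the stream representation that stream fusion is built on. It is not the actual code of text or vector; the names Step, Stream, stream, unstream, mapS and filterS follow the usual presentation of the technique and are defined locally here.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- One step of a stream: finished, skip an element, or yield one.
data Step s a = Done | Skip s | Yield a s

-- A stream is a non-recursive step function plus a hidden state.
data Stream a = forall s. Stream (s -> Step s a) s

-- Convert a list to a stream (the list itself is the state).
stream :: [a] -> Stream a
stream xs0 = Stream next xs0
  where
    next []     = Done
    next (x:xs) = Yield x xs

-- Convert back; this is the only recursive function in the sketch.
unstream :: Stream a -> [a]
unstream (Stream next s0) = go s0
  where
    go s = case next s of
      Done       -> []
      Skip s'    -> go s'
      Yield x s' -> x : go s'

-- Non-recursive map: only the step function is wrapped.
mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s0) = Stream next' s0
  where
    next' s = case next s of
      Done       -> Done
      Skip s'    -> Skip s'
      Yield x s' -> Yield (f x) s'

-- Non-recursive filter: rejected elements become Skip.
filterS :: (a -> Bool) -> Stream a -> Stream a
filterS p (Stream next s0) = Stream next' s0
  where
    next' s = case next s of
      Done       -> Done
      Skip s'    -> Skip s'
      Yield x s'
        | p x       -> Yield x s'
        | otherwise -> Skip s'

-- A pipeline with a single stream/unstream pair around it.
oddsOfDoubledPlusOne :: [Int] -> [Int]
oddsOfDoubledPlusOne = unstream . filterS odd . mapS (+ 1) . mapS (* 2) . stream
```

Because mapS and filterS are not recursive, GHC is free to inline them; real libraries additionally use a rewrite rule to cancel adjacent unstream/stream pairs, which this sketch leaves out.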

There is ongoing work on improving GHC to support inlining of recursive functions. This falls under the general heading of supercompilation, and recent work on this seems to have been led by Max Bolingbroke and Neil Mitchell.

In short, not as often as you would think. The reason is that the "fancy techniques" such as stream fusion are employed when the libraries are implemented, and library users don't need to worry about them.

Consider Data.List.map. The base package defines map as

map :: (a -> b) -> [a] -> [b]
map _ []     = []
map f (x:xs) = f x : map f xs

This map is self-recursive, so GHC won't inline it.

However, base also defines the following rewrite rules:

{-# RULES
"map"       [~1] forall f xs.   map f xs                = build (\c n -> foldr (mapFB c f) n xs)
"mapList"   [1]  forall f.      foldr (mapFB (:) f) []  = map f
"mapFB"     forall c f g.       mapFB (mapFB c f) g     = mapFB c (f.g)
  #-}

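The "map" rule, which is active before phase 1, rewrites uses of map into foldr/build form so that they can fuse with neighbouring functions. The "mapList" rule, which is active from phase 1 onwards, turns anything that did not fuse back into the original map. Because the fusion happens automatically, it doesn't depend on the user being aware of it.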
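The same mechanism is available to user code. As a hedged illustration, here is a hypothetical myMap with a single rewrite rule that merges two adjacent traversals; the names myMap and doubleThenInc and the choice of phase are assumptions of this sketch, not anything defined in base.

```haskell
-- A map written with a local loop so the wrapper is non-recursive.
myMap :: (a -> b) -> [a] -> [b]
myMap f = go
  where
    go []     = []
    go (x:xs) = f x : go xs
-- Delay inlining until phase 1 so the rule below gets a chance to match.
{-# NOINLINE [1] myMap #-}

-- Two traversals become one; GHC trusts that both sides are equivalent.
{-# RULES
"myMap/myMap" forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

-- With -O the rule may rewrite this into a single myMap ((+ 1) . (* 2)).
doubleThenInc :: [Int] -> [Int]
doubleThenInc xs = myMap (+ 1) (myMap (* 2) xs)
```

The result is the same whether or not the rule fires; compiling with -ddump-rule-firings shows which rules were applied.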

As proof that this all works, you can examine what GHC produces for specific inputs. For this function:

proc1 = sum . take 10 . map (+1) . map (*2)

eval1 = proc1 [1..5]
eval2 = proc1 [1..]

when compiled with -O2, GHC fuses all of proc1 into a single recursive form (as seen in the Core output with -ddump-simpl).
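To try this yourself, the definitions can be given concrete types as in the sketch below. The Int signatures are an assumption added here so that the generated Core is easier to read.

```haskell
-- The pipeline from above, fixed to Int.
proc1 :: [Int] -> Int
proc1 = sum . take 10 . map (+ 1) . map (* 2)

-- A finite and an infinite input; take 10 bounds the second one.
eval1, eval2 :: Int
eval1 = proc1 [1 .. 5]
eval2 = proc1 [1 ..]
```

In GHCi these evaluate to 35 and 120. To look at the fused loop, compile the definitions as part of a module with -O2 -ddump-simpl; adding -dsuppress-all trims much of the noise from the Core output.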

Of course there are limits to what these techniques can accomplish. For example, the naive average function, mean xs = sum xs / fromIntegral (length xs), traverses the list twice. It is easily transformed by hand into a single fold, and frameworks exist that can do so automatically. However, at present there's no known way to automatically translate between standard functions and those frameworks. So in this case, the user does need to be aware of the limitations of the compiler-produced code.
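The manual transformation mentioned above can be sketched as follows. The accumulator type Acc is a helper invented for this example; its strict fields keep the running sum and count evaluated at each step.

```haskell
import Data.List (foldl')

-- Running sum and element count, both strict.
data Acc = Acc !Double !Int

-- Single-pass mean: one traversal instead of one each for sum and length.
mean :: [Double] -> Double
mean xs =
  case foldl' step (Acc 0 0) xs of
    Acc total count -> total / fromIntegral count
  where
    step (Acc s n) x = Acc (s + x) (n + 1)
```

For example, mean [1, 2, 3, 4] is 2.5. Like the naive version, this sketch returns NaN for an empty list.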

So in many cases compilers are sufficiently advanced to create code that's fast and elegant. Knowing when they will do so, and when the compiler is likely to fall down, is IMHO a large part of learning how to write efficient Haskell code.

for a self-recursive function, the loop breaker can only be the function itself, so an INLINE pragma is always ignored.

If something is recursive, then to inline it fully you would have to know at compile time how many times it is executed. Since the input will generally have variable length, that is not possible.

Yet, doesn't all this dramatically constrain our ability to write code that's simultaneously fast and elegant?

There are certain techniques, though, that can make recursive calls much faster than they would otherwise be, for example tail call optimization (see the SO wiki).
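As a small illustration of the tail-call style, here is a sketch of a summation loop written with an accumulator. The function sumTo is made up for this example; seq is used to force the accumulator so that no chain of unevaluated additions builds up.

```haskell
-- Sum of 1..n with the recursive call in tail position.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go acc k
      | k > n     = acc
      | otherwise =
          let acc' = acc + k
          in acc' `seq` go acc' (k + 1)
```

For instance, sumTo 100 is 5050.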
