Writing foldl using foldr
In Real World Haskell , Chapter 4. Functional Programming
Write foldl with foldr:
-- file: ch04/Fold.hs
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
where step x g a = g (f a x)
The above code confused me a lot, and some guy called dps rewrote it with some meaningful name to make it clearer a bit:
myFoldl stepL zeroL xs = (foldr stepR id xs) zeroL
where stepR lastL accR accInitL = accR (stepL accInitL lastL)
One guy Jef G then did a excellent job by providing a example and showint the underlying machanism step by step:
myFoldl (+) 0 [1, 2, 3]
= (foldR step id [1, 2, 3]) 0
= (step 1 (step 2 (step 3 id))) 0
= (step 1 (step 2 (a3 -> id ((+) a3 3)))) 0
= (step 1 (a2 -> (a3 -> id ((+) a3 3)) ((+) a2 2))) 0
= (a1 -> (a2 -> (a3 -> id ((+) a3 3)) ((+) a2 2)) ((+) a1 1)) 0
= (a1 -> (a2 -> (a3 -> (+) a3 3) ((+) a2 2)) ((+) a1 1)) 0
= (a1 -> (a2 -> (+) ((+) a2 2) 3) ((+) a1 1)) 0
= (a1 -> (+) ((+) ((+) a1 1) 2) 3) 0
= (+) ((+) ((+) 0 1) 2) 3
= ((0 + 1) + 2) + 3
But I still cannot fully understand that, here are my questions:
Is there anyone who can help me? Thanks a lot!
Some explanations are in order!
What is the id function for? What is the role of? Why should we need it here?
id
is the identity function, id x = x
, and is used as the equivalent of zero when building up a chain of functions with function composition, (.)
. You can find it defined in the Prelude.
In the above example, id function is the accumulator in the lambda function?
The accumulator is a function that is being built up via repeated function application. There's no explicit lambda, since we name the accumulator, step
. You can write it with a lambda if you want:
foldl f a bs = foldr (b g x -> g (f x b)) id bs a
Or as Graham Hutton would write:
foldr's prototype is foldr :: (a -> b -> b) -> b -> [a] -> b
A Haskell programmer would say that the type of foldr
is (a -> b -> b) -> b -> [a] -> b
.
and the first parameter is a function which need two parameters, but the step function in the myFoldl's implementation uses 3 parameters, I'm complelely confused
This is confusing and magical! We play a trick and replace the accumulator with a function, which is in turn applied to the initial value to yield a result.
Graham Hutton explains the trick to turn foldl
into foldr
in the above article. We start by writing down a recursive definition of foldl
:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v [] = v
foldl f v (x : xs) = foldl f (f v x) xs
And then refactor it via the static argument transformation on f
:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v xs = g xs v
where
g [] v = v
g (x:xs) v = g xs (f v x)
Let's now rewrite g
so as to float the v
inwards:
foldl f v xs = g xs v
where
g [] = v -> v
g (x:xs) = v -> g xs (f v x)
Which is the same as thinking of g
as a function of one argument, that returns a function:
foldl f v xs = g xs v
where
g [] = id
g (x:xs) = v -> g xs (f v x)
Now we have g
, a function that recursively walks a list, apply some function f
. The final value is the identity function, and each step results in a function as well.
But, we have handy already a very similar recursive function on lists, foldr
!
This looks like a very similar recursive scheme to our g
function. Now the trick: using all the available magic at hand (aka Bird, Meertens and Malcolm) we apply a special rule, the universal property of fold , which is an equivalence between two definitions for a function g
that processes lists, stated as:
So, the universal property of folds states that:
g = foldr k v
where g
must be equivalent to the two equations, for some k
and v
:
g [] = v
g (x:xs) = k x (g xs)
From our earlier foldl designs, we know v == id
. For the second equation though, we need to calculate the definition of k
:
g (x:xs) = k x (g xs)
<=> g (x:xs) v = k x (g xs) v -- accumulator of functions
<=> g xs (f v x) = k x (g xs) v -- definition of foldl
<= g' (f v x) = k x g' v -- generalize (g xs) to g'
<=> k = x g' -> (a -> g' (f v x)) -- expand k. recursion captured in g'
Which, substituting our calculated definitions of k
and v
yields a definition of foldl as:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f v xs =
foldr
(x g -> (a -> g (f v x)))
id
xs
v
The recursive g
is replaced with the foldr combinator, and the accumulator becomes a function built via a chain of compositions of f
at each element of the list, in reverse order (so we fold left instead of right).
This is definitely somewhat advanced, so to deeply understand this transformation, the universal property of folds, that makes the transformation possible, I recommend Hutton's tutorial, linked below.
References
Consider the type of foldr
:
foldr :: (b -> a -> a) -> a -> [b] -> a
Whereas the type of step
is something like b -> (a -> a) -> a -> a
. Since step is getting passed to foldr
, we can conclude that in this case the fold has a type like (b -> (a -> a) -> (a -> a)) -> (a -> a) -> [b] -> (a -> a)
.
Don't be confused by the different meanings of a
in different signatures; it's just a type variable. Also, keep in mind that the function arrow is right associative, so a -> b -> c
is the same thing as a -> (b -> c)
.
So, yes, the accumulator value for the foldr
is a function of type a -> a
, and the initial value is id
. This makes some sense, because id
is a function that doesn't do anything--it's the same reason you'd start with zero as the initial value when adding all the values in a list.
As for step
taking three arguments, try rewriting it like this:
step :: b -> (a -> a) -> (a -> a)
step x g = a -> g (f a x)
Does that make it easier to see what's going on? It takes an extra parameter because it's returning a function, and the two ways of writing it are equivalent. Note also the extra parameter after the foldr
: (foldr step id xs) z
. The part in parentheses is the fold itself, which returns a function, which is then applied to z
.
Here's my proof that foldl
can be expressed in terms of foldr
, which I find pretty simple apart from the name spaghetti the step
function introduces.
The proposition is that foldl fz xs
is equivalent to
myfoldl f z xs = foldr step_f id xs z
where step_f x g a = g (f a x)
The first important thing to notice here is that the right hand side of the first line is actually evaluated as
(foldr step_f id xs) z
since foldr
only takes three parameters. This already hints that the foldr
will calculate not a value but a curried function, which is then applied to z
. There are two cases to investigate to find out whether myfoldl
is foldl
:
Base case: empty list
myfoldl f z []
= foldr step_f id [] z (by definition of myfoldl)
= id z (by definition of foldr)
= z
foldl f z []
= z (by definition of foldl)
Non-empty list
myfoldl f z (x:xs)
= foldr step_f id (x:xs) z (by definition of myfoldl)
= step_f x (foldr step_f id xs) z (-> apply step_f)
= (foldr step_f id xs) (f z x) (-> remove parentheses)
= foldr step_f id xs (f z x)
= myfoldl f (f z x) xs (definition of myfoldl)
foldl f z (x:xs)
= foldl f (f z x) xs
Since in 2. the first and the last line have the same form in both cases, it can be used to fold the list down until xs == []
, in which case 1. guarantees the same result. So by induction, myfoldl == foldl
.
上一篇: 如何检查gcc是否执行尾部
下一篇: 使用foldr书写foldl