Why can't GHC reason about some infinite lists?

2018-06-15 06:51:09

This recent question got me thinking about Haskell's ability to work with infinite lists. There are plenty of other questions and answers about infinite lists on StackOverflow, and I understand why we can't have a general solution for all infinite lists, but why can't Haskell reason about some infinite lists?

Let's use the example from the first linked question:

list1 = [1..]
list2 = [x | x <- list1, x <= 4]
print list2
$ [1,2,3,4

@user2297560 writes in the comments:

Pretend you're GHCI. Your user gives you an infinite list and asks you to find all the values in that list that are less than or equal to 4. How would you go about doing it? (Keep in mind that you don't know that the list is in order.)

In this case, the user didn't give you an infinite list. GHC generated it! In fact, it generated it following it's own rules. The Haskell 2010 Standard states the following:

enumFrom       :: a -> [a]            -- [n..]

For the types Int and Integer, the enumeration functions have the following meaning:

The sequence enumFrom e1 is the list [ e1 , e1 + 1, e1 + 2,…].

In his answer to the other question, @chepner writes:

You know that the list is monotonically increasing, but Haskell does not.

The statements these users made don't seem to line up with the standard to me. Haskell created the list in an ordered fashion using a monotonic increase. Haskell should know that the list is both ordered and monotonic. So why can't it reason about this infinite list to turn [x | x <- list1, x <= 4] [x | x <- list1, x <= 4] into takeWhile (<= 4) list1 automatically?

Theoretically, one could imagine a rewrite rule such as

{-# RULES
  "filterEnumFrom" forall (n :: Int) (m :: Int).
                     filter (< n) (enumFrom m) = [m..(n-1)]
  #-}

And that automatically would convert expressions such as filter (< 4) (enumFrom 1) to [1..3] . So it is possible. There is a glaring problem though: any variation from this exact syntactical pattern won't work. The result is that you end up defining a bunch of rules and you can longer ever be sure if they are triggering or not. If you can't rely on the rules, you eventually just don't use them. (Also, note I've specialized the rule to Int s - as was briefly posted as a comment, this may break down in subtle ways for other types.)

At the end of the day, to perform more advanced analysis, GHC would have to have some tracking information attached to lists to say how they were generated. That would either make lists less lightweight of an abstraction or mean that GHC would have some special machinery in it just for optimizing lists at compile time. Neither of these options is nice.

That said, you can always add your own tracking information by making a list type on top of lists.

data List a where
  EnumFromTo :: Enum a => a -> Maybe a -> List a
  Filter :: (a -> Bool) -> List a -> List a 
  Unstructured :: [a] -> List a

This may end up being easier to optimize.

So why can't it reason about this infinite list to turn [x | x <- list1, x <= 4] [x | x <- list1, x <= 4] into takeWhile (<= 4) list1 automatically?

The answer isn't any more specific than "It doesn't use takeWhile because it doesn't use takeWhile ". The spec says:

Translation : List comprehensions satisfy these identities, which may be used as a translation into the kernel:

[ e | True ]         = [ e ]
[ e | q ]            = [ e | q, True ]
[ e | b, Q ]         = if b then [ e | Q ] else []
[ e | p <- l, Q ]    = let ok p = [ e | Q ]
                           ok _ = []
                       in concatMap ok l
[ e | let decls, Q ] = let decls in [ e | Q ]

That is, the meaning of a list comprehension is given by translation into a simpler language with if -expressions, let -bindings, and calls to concatMap . We can figure out the meaning of your example by translating it through the following steps:

[x | x <- [1..], x <= 4]

-- apply rule 4 --
let ok x = [ x | x <= 4 ]
    ok _ = []
in concatMap ok [1..]

-- eliminate unreachable clause in ok --
let ok x = [ x | x <= 4 ]
in concatMap ok [1..]

-- apply rule 2 --
let ok x = [ x | x <= 4, True ]
in concatMap ok [1..]

-- apply rule 3 --
let ok x = if x <= 4 then [ x | True ] else []
in concatMap ok [1..]

-- apply rule 1 --
let ok x = if x <= 4 then [ x ] else []
in concatMap ok [1..]

-- inline ok --
concatMap (x -> if x <= 4 then [ x ] else []) [1..]

链接地址: http://www.djcxy.com/p/43390.html

上一篇: GHC核心为“字节码”？

下一篇: 为什么GHC不能推理一些无限列表？