Preventing observable sharing for certain subtrees in Haskell
I have an EDSL which offers list-like combinators for arrays ( map
, zipWith
, etc..)
Some combinators require certain inputs to be random access. Eg the data array of Gather
picking the elements from data array at the indices specified by the other:
Gather (Manifest [10,11,12]) (Manifest [2,0,0,1,2]) = [12,10,10,11,12]
The language makes use of data-reify
package for recovering sharing. The problem is that sometimes the same subtree contains both nodes that need to provide random access and those that can be computed sequentially. Having them shared breaks the subsequent evaluator.
For example in the tree below I would like for [1,2,3]
to keep being shared, but the Manifests
wrapping them to be different nodes in the recovered graph:
[1, 2, 3]
/
Manifest Manifest
| |
| Map (+1)
/
Gather
A more complex example may include a number of shared nodes all of which should become distinct (even though Map (+1) (Manifest [1,2,3])
could be shared.
[1, 2, 3]
/
Manifest Manifest
| |
Map (+1) Map (+1)
| /|
Map (*2) / |
/ |
Gather |
|
ZipWith (+)
Even if I find a solution for the simple case ( Gather
references Manifest
directly), it will already cover most of the use cases.
Any pointers are welcome!
Below is a simple mock-up of the language.
module NoSharing where
data AST = Manifest [Int]
| Map (Int -> Int) AST
| ZipWith (Int -> Int -> Int) AST AST
| Gather AST -- ^ Data
AST -- ^ Indices
complex = ZipWith (+) gathered indexes
where
gathered = Gather (Map (*2) indexes) indexes
indexes = Map (+1) $ Manifest [1,2,3]
simple = Gather dat indexes
where
dat = Manifest [1,2,3]
indexes = Map (+1) dat
One way you could do this is to manually eliminate the sharing before you call data-reify
. For example, this function should hopefully unshare a top-level Manifest
constructor, but leave its argument shared:
rebuildManifest :: AST -> AST
rebuildManifest (Manifest xs) = Manifest xs
rebuildManifest t = t
Now to unshare any Manifest
under a Gather
, you could do the same thing recursively, taking care to reuse the original when appropriate
rebuildAllManifestsUnderGather :: AST -> (AST, Bool)
rebuildAllManifestsUnderGather t@(Map f t') =
let (newt', reuse) = rebuildAllManifestsUnderGather t'
in if reuse then (t, True) else (Map f newt', False)
rebuildAllManifestsUnderGather t@(ZipWith f t1 t2) =
let (newt1, reuse1) = rebuildAllManifestsUnderGather t1
(newt2, reuse2) = rebuildAllManifestsUnderGather t2
in if reuse1 && reuse2 then (t, True) else (ZipWith f newt1 newt2, False)
rebuildAllManifestsUnderGather t@(Gather t1 t2) =
let (newt1, reuse1) = rebuildManifest $ rebuildAllManifestsUnderGather t1
(newt2, reuse2) = rebuildManifest $ rebuildAllManifestsUnderGather t2
in if reuse1 && reuse2 then (t, True) else (Gather newt1 newt2, False)
where rebuildManifest (Manifest xs, _) = (Manifest xs, False)
rebuildManifest (t, reuse) = (t, reuse)
rebuildAllManifestsUnderGather t@(Manifest xs) = (t, True)
However, be warned : observable sharing is not guaranteed and might be unreliable in both directions. The GHC optimiser could quite legally "re-share" the attempts to unshare Manifest
above. I don't know what it would do in practice.
Also this solution is quite complex, so given the fragility, it might be better to have an explicit unsharing pass after calling data-reify
.
下一篇: 防止Haskell中某些子树的可观察共享