What is the monomorphism restriction?
I'm puzzled by how the haskell compiler sometimes infers types that are less polymorphic than what I'd expect, for example when using point-free definitions.
It seems like the issue is the "monomorphism restriction", which is on by default on older versions of the compiler.
Consider the following haskell program:
{-# LANGUAGE MonomorphismRestriction #-}
import Data.List(sortBy)
plus = (+)
plus' x = (+ x)
sort = sortBy compare
main = do
print $ plus' 1.0 2.0
print $ plus 1.0 2.0
print $ sort [3, 1, 2]
If I compile this with ghc
I obtain no erros and the output of the executable is:
3.0
3.0
[1,2,3]
If I change the main
body to:
main = do
print $ plus' 1.0 2.0
print $ plus (1 :: Int) 2
print $ sort [3, 1, 2]
I get no compile time errors and the output becomes:
3.0
3
[1,2,3]
as expected. However if I try to change it to:
main = do
print $ plus' 1.0 2.0
print $ plus (1 :: Int) 2
print $ plus 1.0 2.0
print $ sort [3, 1, 2]
I get a type error:
test.hs:13:16:
No instance for (Fractional Int) arising from the literal ‘1.0’
In the first argument of ‘plus’, namely ‘1.0’
In the second argument of ‘($)’, namely ‘plus 1.0 2.0’
In a stmt of a 'do' block: print $ plus 1.0 2.0
The same happens when trying to call sort
twice with different types:
main = do
print $ plus' 1.0 2.0
print $ plus 1.0 2.0
print $ sort [3, 1, 2]
print $ sort "cba"
produces the following error:
test.hs:14:17:
No instance for (Num Char) arising from the literal ‘3’
In the expression: 3
In the first argument of ‘sort’, namely ‘[3, 1, 2]’
In the second argument of ‘($)’, namely ‘sort [3, 1, 2]’
ghc
suddenly think that plus
isn't polymorphic and requires an Int
argument? The only reference to Int
is in an application of plus
, how can that matter when the definition is clearly polymorphic? ghc
suddenly think that sort
requires a Num Char
instance? Moreover if I try to place the function definitions into their own module, as in:
{-# LANGUAGE MonomorphismRestriction #-}
module TestMono where
import Data.List(sortBy)
plus = (+)
plus' x = (+ x)
sort = sortBy compare
I get the following error when compiling:
TestMono.hs:10:15:
No instance for (Ord a0) arising from a use of ‘compare’
The type variable ‘a0’ is ambiguous
Relevant bindings include
sort :: [a0] -> [a0] (bound at TestMono.hs:10:1)
Note: there are several potential instances:
instance Integral a => Ord (GHC.Real.Ratio a)
-- Defined in ‘GHC.Real’
instance Ord () -- Defined in ‘GHC.Classes’
instance (Ord a, Ord b) => Ord (a, b) -- Defined in ‘GHC.Classes’
...plus 23 others
In the first argument of ‘sortBy’, namely ‘compare’
In the expression: sortBy compare
In an equation for ‘sort’: sort = sortBy compare
ghc
able to use the polymorphic type Ord a => [a] -> [a]
for sort
? ghc
treat plus
and plus'
differently? plus
should have the polymorphic type Num a => a -> a -> a
and I don't really see how this is different from the type of sort
and yet only sort
raises an error. Last thing: if I comment the definition of sort
the file compiles. However if I try to load it into ghci
and check the types I get:
*TestMono> :t plus
plus :: Integer -> Integer -> Integer
*TestMono> :t plus'
plus' :: Num a => a -> a -> a
Why isn't the type for plus
polymorphic?
This is the canonical question about monomorphism restriction in Haskell as discussed in the meta question.
What is the monomorphism restriction?
The monomorphism restriction as stated by the Haskell wiki is:
a counter-intuitive rule in Haskell type inference. If you forget to provide a type signature, sometimes this rule will fill the free type variables with specific types using "type defaulting" rules.
What this means is that, in some circumstances, if your type is ambiguous (ie polymorphic) the compiler will choose to instantiate that type to something not ambiguous.
How do I fix it?
First of all you can always explicitly provide a type signature and this will avoid the triggering of the restriction:
plus :: Num a => a -> a -> a
plus = (+) -- Okay!
-- Runs as:
Prelude> plus 1.0 1
2.0
Alternatively, if you are defining a function, you can avoid point-free style, and for example write:
plus x y = x + y
Turning it off
It is possible to simply turn off the restriction so that you don't have to do anything to your code to fix it. The behaviour is controlled by two extensions: MonomorphismRestriction
will enable it (which is the default) while NoMonomorphismRestriction
will disable it.
You can put the following line at the very top of your file:
{-# LANGUAGE NoMonomorphismRestriction #-}
If you are using GHCi you can enable the extension using the :set
command:
Prelude> :set -XNoMonomorphismRestriction
You can also tell ghc
to enable the extension from the command line:
ghc ... -XNoMonomorphismRestriction
Note: You should really prefer the first option over choosing extension via command-line options.
Refer to GHC's page for an explanation of this and other extensions.
A complete explanation
I'll try to summarize below everything you need to know to understand what the monomorphism restriction is, why it was introduced and how it behaves.
An example
Take the following trivial definition:
plus = (+)
you'd think to be able to replace every occurrence of +
with plus
. In particular since (+) :: Num a => a -> a -> a
you'd expect to also have plus :: Num a => a -> a -> a
.
Unfortunately this is not the case. For example in we try the following in GHCi:
Prelude> let plus = (+)
Prelude> plus 1.0 1
We get the following output:
<interactive>:4:6:
No instance for (Fractional Integer) arising from the literal ‘1.0’
In the first argument of ‘plus’, namely ‘1.0’
In the expression: plus 1.0 1
In an equation for ‘it’: it = plus 1.0 1
You may need to :set -XMonomorphismRestriction
in newer GHCi versions.
And in fact we can see that the type of plus
is not what we would expect:
Prelude> :t plus
plus :: Integer -> Integer -> Integer
What happened is that the compiler saw that plus
had type Num a => a -> a -> a
, a polymorphic type. Moreover it happens that the above definition falls under the rules that I'll explain later and so he decided to make the type monomorphic by defaulting the type variable a
. The default is Integer
as we can see.
Note that if you try to compile the above code using ghc
you won't get any errors. This is due to how ghci
handles (and must handle) the interactive definitions. Basically every statement entered in ghci
must be completely type checked before the following is considered; in other words it's as if every statement was in a separate module. Later I'll explain why this matter.
Some other example
Consider the following definitions:
f1 x = show x
f2 = x -> show x
f3 :: (Show a) => a -> String
f3 = x -> show x
f4 = show
f5 :: (Show a) => a -> String
f5 = show
We'd expect all these functions to behave in the same way and have the same type, ie the type of show
: Show a => a -> String
.
Yet when compiling the above definitions we obtain the following errors:
test.hs:3:12:
No instance for (Show a1) arising from a use of ‘show’
The type variable ‘a1’ is ambiguous
Relevant bindings include
x :: a1 (bound at blah.hs:3:7)
f2 :: a1 -> String (bound at blah.hs:3:1)
Note: there are several potential instances:
instance Show Double -- Defined in ‘GHC.Float’
instance Show Float -- Defined in ‘GHC.Float’
instance (Integral a, Show a) => Show (GHC.Real.Ratio a)
-- Defined in ‘GHC.Real’
...plus 24 others
In the expression: show x
In the expression: x -> show x
In an equation for ‘f2’: f2 = x -> show x
test.hs:8:6:
No instance for (Show a0) arising from a use of ‘show’
The type variable ‘a0’ is ambiguous
Relevant bindings include f4 :: a0 -> String (bound at blah.hs:8:1)
Note: there are several potential instances:
instance Show Double -- Defined in ‘GHC.Float’
instance Show Float -- Defined in ‘GHC.Float’
instance (Integral a, Show a) => Show (GHC.Real.Ratio a)
-- Defined in ‘GHC.Real’
...plus 24 others
In the expression: show
In an equation for ‘f4’: f4 = show
So f2
and f4
don't compile. Moreover when trying to define these function in GHCi we get no errors, but the type for f2
and f4
is () -> String
!
Monomorphism restriction is what makes f2
and f4
require a monomorphic type, and the different behaviour bewteen ghc
and ghci
is due to different defaulting rules.
When does it happen?
In Haskell, as defined by the report, there are two distinct type of bindings. Function bindings and pattern bindings. A function binding is nothing else than a definition of a function:
f x = x + 1
Note that their syntax is:
<identifier> arg1 arg2 ... argn = expr
Modulo guards and where
declarations. But they don't really matter.
where there must be at least one argument.
A pattern binding is a declaration of the form:
<pattern> = expr
Again, modulo guards.
Note that variables are patterns, so the binding:
plus = (+)
is a pattern binding. It's binding the pattern plus
(a variable) to the expression (+)
.
When a pattern binding consists of only a variable name it's called a simple pattern binding.
The monomorphism restriction applies to simple pattern bindings!
Well, formally we should say that:
A declaration group is a minimal set of mutually dependent bindings.
Section 4.5.1 of the report.
And then (Section 4.5.5 of the report):
a given declaration group is unrestricted if and only if:
every variable in the group is bound by a function binding (eg fx = x
) or a simple pattern binding (eg plus = (+)
; Section 4.4.3.2 ), and
an explicit type signature is given for every variable in the group that is bound by simple pattern binding. (eg plus :: Num a => a -> a -> a; plus = (+)
).
Examples added by me.
So a restricted declaration group is a group where, either there are non-simple pattern bindings (eg (x:xs) = f something
or (f, g) = ((+), (-))
) or there is some simple pattern binding without a type signature (as in plus = (+)
).
The monomorphism restriction affects restricted declaration groups.
Most of the time you don't define mutual recursive functions and hence a declaration group becomes just a binding.
What does it do?
The monomorphism restriction is described by two rules in Section 4.5.5 of the report.
First rule
The usual Hindley-Milner restriction on polymorphism is that only type variables that do not occur free in the environment may be generalized. In addition, the constrained type variables of a restricted declaration group may not be generalized in the generalization step for that group. (Recall that a type variable is constrained if it must belong to some type class; see Section 4.5.2 .)
The highlighted part is what the monomorphism restriction introduces. It says that if the type is polymorphic (ie it contain some type variable) and that type variable is constrained (ie it has a class constraint on it: eg the type Num a => a -> a -> a
is polymorphic because it contains a
and also contrained because the a
has the constraint Num
over it.) then it cannot be generalized.
In simple words not generalizing means that the uses of the function plus
may change its type.
If you had the definitions:
plus = (+)
x :: Integer
x = plus 1 2
y :: Double
y = plus 1.0 2
then you'd get a type error. Because when the compiler sees that plus
is called over an Integer
in the declaration of x
it will unify the type variable a
with Integer
and hence the type of plus
becomes:
Integer -> Integer -> Integer
but then, when it will type check the definition of y
, it will see that plus
is applied to a Double
argument, and the types don't match.
Note that you can still use plus
without getting an error:
plus = (+)
x = plus 1.0 2
In this case the type of plus
is first inferred to be Num a => a -> a -> a
but then its use in the definition of x
, where 1.0
requires a Fractional
constraint, will change it to Fractional a => a -> a -> a
.
Rationale
The report says:
Rule 1 is required for two reasons, both of which are fairly subtle.
Rule 1 prevents computations from being unexpectedly repeated. For example, genericLength
is a standard function (in library Data.List
) whose type is given by
genericLength :: Num a => [b] -> a
Now consider the following expression:
let len = genericLength xs
in (len, len)
It looks as if len
should be computed only once, but without Rule 1 it might be computed twice, once at each of two different overloadings. If the programmer does actually wish the computation to be repeated, an explicit type signature may be added:
let len :: Num a => a
len = genericLength xs
in (len, len)
For this point the example from the wiki is, I believe, clearer. Consider the function:
f xs = (len, len)
where
len = genericLength xs
If len
was polymorphic the type of f
would be:
f :: Num a, Num b => [c] -> (a, b)
So the two elements of the tuple (len, len)
could actually be different values! But this means that the computation done by genericLength
must be repeated to obtain the two different values.
The rationale here is: the code contains one function call, but not introducing this rule could produce two hidden function calls, which is counter intuitive.
With the monomorphism restriction the type of f
becomes:
f :: Num a => [b] -> (a, a)
In this way there is no need to perform the computation multiple times.
Rule 1 prevents ambiguity. For example, consider the declaration group
[(n,s)] = reads t
Recall that reads
is a standard function whose type is given by the signature
reads :: (Read a) => String -> [(a,String)]
Without Rule 1, n
would be assigned the type ∀ a. Read a ⇒ a
∀ a. Read a ⇒ a
and s
the type ∀ a. Read a ⇒ String
∀ a. Read a ⇒ String
. The latter is an invalid type, because it is inherently ambiguous. It is not possible to determine at what overloading to use s
, nor can this be solved by adding a type signature for s
. Hence, when non-simple pattern bindings are used (Section 4.4.3.2 ), the types inferred are always monomorphic in their constrained type variables, irrespective of whether a type signature is provided. In this case, both n
and s
are monomorphic in a
.
Well, I believe this example is self-explanatory. There are situations when not applying the rule results in type ambiguity.
If you disable the extension as suggest above you will get a type error when trying to compile the above declaration. However this isn't really a problem: you already know that when using read
you have to somehow tell the compiler which type it should try to parse...
Second rule
This means that. If you have your usual definition:
plus = (+)
This will have a type Num a => a -> a -> a
where a
is a monomorphic type variable due to rule 1 described above. Once the whole module is inferred the compiler will simply choose a type that will replace that a
according to the defaulting rules.
The final result is: plus :: Integer -> Integer -> Integer
.
Note that this is done after the whole module is inferred.
This means that if you have the following declarations:
plus = (+)
x = plus 1.0 2.0
inside a module, before type defaulting the type of plus
will be: Fractional a => a -> a -> a
(see rule 1 for why this happens). At this point, following the defaulting rules, a
will be replaced by Double
and so we will have plus :: Double -> Double -> Double
and x :: Double
.
Defaulting
As stated before there exist some defaulting rules, described in Section 4.3.4 of the Report, that the inferencer can adopt and that will replace a polymorphic type with a monomorphic one. This happens whenever a type is ambiguous.
For example in the expression:
let x = read "<something>" in show x
here the expression is ambiguous because the types for show
and read
are:
show :: Show a => a -> String
read :: Read a => String -> a
So the x
has type Read a => a
. But this constraint is satisfied by a lot of types: Int
, Double
or ()
for example. Which one to choose? There's nothing that can tell us.
In this case we can resolve the ambiguity by telling the compiler which type we want, adding a type signature:
let x = read "<something>" :: Int in show x
Now the problem is: since Haskell uses the Num
type class to handle numbers, there are a lot of cases where numerical expressions contain ambiguities.
Consider:
show 1
What should the result be?
As before 1
has type Num a => a
and there are many type of numbers that could be used. Which one to choose?
Having a compiler error almost every time we use a number isn't a good thing, and hence the defaulting rules were introduced. The rules can be controlled using a default
declaration. By specifying default (T1, T2, T3)
we can change how the inferencer defaults the different types.
An ambiguous type variable v
is defaultable if:
v
appears only in contraints of the kind C v
were C
is a class (ie if it appears as in: Monad (mv)
then it is not defaultable). Num
or a subclass of Num
. A defaultable type variable is replaced by the first type in the default
list that is an instance of all the ambiguous variable's classes.
The default default
declaration is default (Integer, Double)
.
For example:
plus = (+)
minus = (-)
x = plus 1.0 1
y = minus 2 1
The types inferred would be:
plus :: Fractional a => a -> a -> a
minus :: Num a => a -> a -> a
which, by defaulting rules, become:
plus :: Double -> Double -> Double
minus :: Integer -> Integer -> Integer
Note that this explains why in the example in the question only the sort
definition raises an error. The type Ord a => [a] -> [a]
cannot be defaulted because Ord
isn't a numeric class.
Extended defaulting
Note that GHCi comes with extended defaulting rules (or herefor GHC8), which can be enabled in files as well using the ExtendedDefaultRules
extensions.
The defaultable type variables need not only appear in contraints where all the classes are standard and there must be at least one class that is among Eq
, Ord
, Show
or Num
and its subclasses.
Moreover the default default
declaration is default ((), Integer, Double)
.
This may produce odd results. Taking the example from the question:
Prelude> :set -XMonomorphismRestriction
Prelude> import Data.List(sortBy)
Prelude Data.List> let sort = sortBy compare
Prelude Data.List> :t sort
sort :: [()] -> [()]
in ghci we don't get a type error but the Ord a
constraints results in a default of ()
which is pretty much useless.
Useful links
There are a lot of resources and discussions about the monomorphism restriction.
Here are some links that I find useful and that may help you understand or deep further into the topic:
上一篇: 从参数的约束推断类型族的约束
下一篇: 什么是单态限制?