Load pure global variable from file
I have a file with some data in it. This data never changes and I want to make it available outside of the IO monad. How can I do that?
Example (note that this is just an example, my data is not computable):
primes.txt:
2 3 5 7 13
code.hs:
import System.IO.Unsafe (unsafePerformIO)
primes :: [Int]
primes = map read . words . unsafePerformIO . readFile $ "primes.txt"
Is this a "legal" use of unsafePerformIO? Are there alternatives?
You could use TemplateHaskell to read in the file at compile time. The data of the file would then be stored as an actual string in the program.
In one module (Text/Literal/TH.hs in this example), define this:
module Text.Literal.TH where
import Language.Haskell.TH
import Language.Haskell.TH.Quote
literally :: String -> Q Exp
literally = return . LitE . StringL
lit :: QuasiQuoter
lit = QuasiQuoter { quoteExp = literally }
litFile :: QuasiQuoter
litFile = quoteFile lit
In your module, you can then do:
{-# LANGUAGE QuasiQuotes #-}
module MyModule where
import Text.Literal.TH (litFile)
primes :: [Int]
primes = map read . words $ [litFile|primes.txt|]
When you compile your program, GHC will open the primes.txt file and insert its contents where the [litFile|primes.txt|] part is.
Using unsafePerformIO in that way isn't great.
The declaration primes :: [Int] says that primes is a list of numbers: one particular list of numbers that doesn't depend on anything.
In fact, however, it depends on the state of the file "primes.txt" at the moment the definition happens to be evaluated. Someone could alter this file to alter the value that primes appears to have, which shouldn't be possible according to its type.
In the presence of a hypothetical optimisation which decides that primes should be recomputed on demand rather than stored in memory in full (after all, its type says we'll get the same thing every time we recompute it), primes could even appear to have two different values during a single run of the program. This is the sort of problem that can come with using unsafePerformIO to lie to the compiler.
In practice, none of the above is likely to be a problem.
But the theoretically correct thing to do is to not make primes a global constant (because it's not a constant). Instead, you make the computation that needs it parameterised on it (i.e. it takes primes as an argument), and in the outer IO program you read the file and then call the pure computation, passing it the pure value the IO program extracted from the file. You get the best of both worlds: you don't have to lie to the compiler, and you don't have to put your entire program in IO. You can use constructs such as the Reader monad to avoid having to manually pass primes around everywhere, if that helps.
So you can use unsafePerformIO if you want to just get on with it. It's theoretically wrong, but unlikely to cause issues in practice.
Or you can refactor your program to reflect what's really going on.
Or, if primes really is a global constant and you just don't want to literally include a huge chunk of data in your program source, you can use TemplateHaskell as demonstrated by dflemstr.
Yes, it should be fine. You could add a {-# NOINLINE primes #-} pragma to be safe; I'm not sure whether GHC would ever inline a CAF.
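For reference, here is a sketch of the question's definition with that pragma and the import it needs. The main action is only there to make the snippet runnable on its own, and it assumes a primes.txt in the working directory at run time.

```haskell
module Main where

import System.IO.Unsafe (unsafePerformIO)

-- NOINLINE asks GHC not to copy the definition into its use sites,
-- so the file read behind unsafePerformIO is not duplicated by inlining.
{-# NOINLINE primes #-}
primes :: [Int]
primes = map read . words . unsafePerformIO . readFile $ "primes.txt"

-- Illustrative use only: the file is read when 'primes' is first demanded.
main :: IO ()
main = print (sum primes)
```

Because primes is a top-level value that is only evaluated on demand, the file is not opened until something first forces the list.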
The only alternative I can think of is to do the same thing at compile time (using Template Haskell), essentially embedding the primes into the binary. However, I prefer your version; note that the primes list will actually be read and created lazily!