如何提高Haskell程序的性能？

2018-06-08 00:13:09

作为学习Haskell的一种方式，我正在解决Project Euler中的问题，并且发现即使在编译时，我的程序也比可比较的C版本慢很多。我能做些什么来加快我的Haskell程序？

例如，我对问题14的蛮力解决方案是：

import Data.Int
import Data.Ord
import Data.List

searchTo = 1000000

nextNumber :: Int64 -> Int64
nextNumber n
    | even n    = n `div` 2
    | otherwise = 3 * n + 1

sequenceLength :: Int64 -> Int
sequenceLength 1 = 1
sequenceLength n = 1 + (sequenceLength next)
    where next = nextNumber n

longestSequence = maximumBy (comparing sequenceLength) [1..searchTo]

main = putStrLn $ show $ longestSequence

这需要大约220秒，而“等效”强力C版本只需要1.2秒。

#include <stdio.h>

int main(int argc, char **argv)
{
    int longest = 0;
    int terms = 0;
    int i;
    unsigned long j;

    for (i = 1; i <= 1000000; i++)
    {
        j = i;
        int this_terms = 1;

        while (j != 1)
        {
            this_terms++;

            if (this_terms > terms)
            {
                terms = this_terms;
                longest = i;
            }

            if (j % 2 == 0)
                j = j / 2;
            else
                j = 3 * j + 1;
        }
    }

    printf("%dn", longest);
    return 0;
}

我究竟做错了什么？或者我天真地认为Haskell甚至可以接近C的速度？

（我使用gcc -O2编译C版本，使用ghc -make -O编译Haskell版本）。

为了测试目的，我只设置searchTo = 100000 。所花费的时间是7.34s 。一些修改导致了一些重大改进：

使用Integer而不是Int64 。这将时间缩短到1.75秒 。

使用累加器（你不需要sequenceLength是懒惰的吗？） 1.54s 。

seqLen2 :: Int -> Integer -> Int
seqLen2 a 1 = a
seqLen2 a n = seqLen2 (a+1) (nextNumber n)

sequenceLength :: Integer -> Int
sequenceLength = seqLen2 1

重写nextNumber使用quotRem ，从而避免计算分裂两次（一次在even ，一次在div ）。 1.27s 。

nextNumber :: Integer -> Integer
nextNumber n 
    | r == 0    = q
    | otherwise = 6*q + 4
    where (q,r) = quotRem n 2

使用Schwartzian变换而不是maximumBy 。 maximumBy . comparing的问题maximumBy . comparing maximumBy . comparing而言， sequenceLength函数对于每个值都被调用多次。 0.32s 。

longestSequence = snd $ maximum [(sequenceLength a, a) | a <- [1..searchTo]]

注意：

我通过编译ghc -O检查时间，并使用+RTS -s运行）

我的机器在Mac OS X 10.6上运行。 GHC版本是6.12.2。编译的文件在i386架构中。）

C问题运行在0.078s与相应的参数。它是用gcc -O3 -m32编译的。

尽管这已经很老了，让我来参加，但还有一个关键问题没有得到解决。

首先，我的箱子上的不同节目的时间安排。由于我在64位Linux系统上，它们显示出不同的特征：使用Integer而不是Int64不会像使用32位GHC那样提高性能，其中每个Int64操作都会产生C调用的代价而Integer适用于带符号32位整数的计算不需要外部调用（因为在这里只有少数操作超过此范围， Integer在32位GHC上是更好的选择）。

C：0.3秒

原始Haskell：14.24秒，使用Integer而不是Int64 ：33.96秒

KennyTM的改进版本：5.55秒，使用Int ：1.85秒

Chris Kuklewicz的版本：5.73秒，使用Int ：1.90秒

FUZxxl的版本：3.56秒，使用quotRem而不是divMod ：1.79秒

那么我们有什么？

用累加器计算长度，以便编译器可以将它（基本上）转换为循环

不要重新计算比较的序列长度

不要使用div resp。 divMod当它不是必要的， quot resp。 quotRem要快得多

还有什么遗漏？

if (j % 2 == 0)
    j = j / 2;
else
    j = 3 * j + 1;

我使用的任何C编译器都将测试j % 2 == 0转换为位掩码，并且不使用分割指令。 GHC还没有（那）那样做。因此，测试even n或计算n `quotRem` 2是相当昂贵的操作。用nextNumber Integer版本替换nextNumber

nextNumber :: Integer -> Integer
nextNumber n
    | fromInteger n .&. 1 == (0 :: Int) = n `quot` 2
    | otherwise = 3*n+1

将其运行时间减少到3.25秒（注意：对于Integer ， n `quot` 2 n `shiftR` 1比n `shiftR` 1快，这需要12.69秒！）。

在Int版本中执行相同操作可将运行时间缩短到0.41秒。对于Int ，除2的比特移位比quot运算quot快一点，将其运行时间减少到0.39秒。

消除列表的构建（即不在C版本中出现），

module Main (main) where

import Data.Bits

result :: Int
result = findMax 0 0 1

findMax :: Int -> Int -> Int -> Int
findMax start len can
    | can > 1000000 = start
    | canlen > len = findMax can canlen (can+1)
    | otherwise = findMax start len (can+1)
      where
        canlen = findLen 1 can

findLen :: Int -> Int -> Int
findLen l 1 = l
findLen l n
    | n .&. 1 == 0  = findLen (l+1) (n `shiftR` 1)
    | otherwise     = findLen (l+1) (3*n+1)

main :: IO ()
main = print result

产生更小的加速，导致0.37秒的运行时间。

因此，与C版紧密对应的Haskell版本花费的时间并不长，这是1.3的一个因素。

那么，让我们公平一点，C版本中存在的效率低下，在Haskell版本中不存在，

if (this_terms > terms)
{
    terms = this_terms;
    longest = i;
}

出现在内部循环中。在C版本中取消内循环将其运行时间减少到0.27秒，使得系数为1.4。

比较可能是重新计算sequenceLength长度太长。这是我最好的版本：

type I = Integer
data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !I deriving (Eq,Ord,Show)

searchTo = 1000000

nextNumber :: I -> I
nextNumber n = case quotRem n 2 of
                  (n2,0) -> n2
                  _ -> 3*n+1

sequenceLength :: I -> Int
sequenceLength x = count x 1 where
  count 1 acc = acc
  count n acc = count (nextNumber n) (succ acc)

longestSequence = maximum . map (i -> P (sequenceLength i) i) $ [1..searchTo]

main = putStrLn $ show $ longestSequence

答案和时间比C慢，但它使用任意精度整数（通过Integer类型）：

ghc -O2 --make euler14-fgij.hs
time ./euler14-fgij
P 525 837799

real 0m3.235s
user 0m3.184s
sys  0m0.015s

链接地址: http://www.djcxy.com/p/24345.html

上一篇: How to improve the performance of this Haskell program?

下一篇: Hibernate/Gorm query with distinct results and JOIN on non primary key