为什么别名函数会使Haskell的性能下降

Question

[[澄清我的问题]：我想知道为什么简写的代码会降低性能当我使用State Monad。时，您会看到State -freeST的纯版本和the master branch of my repo monad版本代码和基准测试结果。

[当我尝试在我的代码中使用简写时，我的简写函数得到了意外的结果。

在LazinessTest branch of this repository中，

我试图简化this code，

runTimeSlot' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot' (target : idx : rest) inst operand = do
  d <- get
  case inst of
    -- this code ↓↓↓↓
    0 -> case (rem target sizeOfTarget) of
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      ...
    1 -> case (rem target sizeOfTarget) of
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      ...

如下所示

runTimeSlot'' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'' (target : idx : rest) inst operand = do
  d <- get
  case inst of
    -- as like as ↓↓↓↓
    0 -> case targetInData of
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      ...
    1 -> case targetInData of
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      ...
  where targetInData = rem target sizeOfTarget

它显示出性能从622μs大幅降低到1.767 ms。

即使仅在下一步targetInData中将评估值case，我认为我可以通过使targetInData严格类似于

来解决它

runTimeSlot''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''' (target : idx : rest) inst operand = do
  d <- get
  -- evaluate it ↓↓ here before it used
  targetInData `seq` case inst of
    0 -> case targetInData of
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      ...
    1 -> case targetInData of
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      ...
  where targetInData = rem target sizeOfTarget

但是这也不起作用。（耗时1.758毫秒）

基于@AndrásKovács的评论（感谢@AndrásKovács）我已将BangPatterns添加到targetInData，就像

runTimeSlot''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''b (target : idx : rest) inst operand = do
  d <- get
  case inst of
    0 -> case targetInData of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      ...
    1 -> case targetInData of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      ...
  where !targetInData = rem target sizeOfTarget

runTimeSlot'''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'''b (target : idx : rest) inst operand = do
  d <- get
  -- evaluate it ↓↓ here before it used
  targetInData `seq` case inst of
    0 -> case targetInData of
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      ...
    1 -> case targetInData of
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      ...
  where !targetInData = rem target sizeOfTarget

虽然有所帮助，但不能完全解决意外情况。

1.767毫秒@ runTimeSlot''-> 1.527毫秒@ runTimeSlot''b
1.758毫秒@ runTimeSlot'''-> 1.503毫秒@ runTimeSlot'''b

为什么不像[622] @ @ runTimeSlot那样大？

我自己不能懒惰地解释这种情况。

您能解释一下为什么仅将(rem target sizeOfTaregt)替换为速记代码会导致性能变差吗？

这里是单个可编译的示例代码：（很抱歉我无法减少不必要的代码）

-- dependencies: base, containers, criterion, deepseq, mtl, splitmix
{-# LANGUAGE BangPatterns #-}

module Main where


import           Criterion.Main
import           Criterion.Types

import           Control.DeepSeq
import           Control.Monad.State.Strict

import           Data.Bifunctor
import           Data.Maybe
import qualified Data.IntMap                   as IM
import           Data.List

import           System.Random.SplitMix


myConfig60s =
  defaultConfig { timeLimit = 60.0, resamples = 10000, verbosity = Verbose }

randomInts :: SMGen -> [Int]
randomInts = unfoldr (Just . (first fromIntegral . bitmaskWithRejection64 64))

main :: IO ()
main = do
  putStrLn "Initialize"
  let size                             = 10000
  let instSize                         = 2
  let targetSize                       = 16
  let operandSize                      = 256
  let i0Gen                            = (mkSMGen . fromIntegral) 0
  let (targetGen, i1Gen)               = splitSMGen i0Gen
  let (instGen, i2Gen)                 = splitSMGen i1Gen
  let (operGen, iGen)                  = splitSMGen i2Gen
  let infTargetList = map (\x -> rem x targetSize) $ randomInts targetGen
  let infInstList = map (\x -> rem x instSize) $ randomInts instGen
  let infOperandList = map (\x -> rem x operandSize + 1) $ randomInts operGen
  let (iTime : iBalance : iStatus : _) = randomInts iGen
  let targetList                       = take (size * 2) infTargetList
  let instList                         = take size infInstList
  let operandList                      = take size infOperandList

  targetList `deepseq` instList `deepseq` operandList `deepseq` putStrLn
    "Evaluated"

  let iData = Data iTime iBalance iStatus IM.empty

  let
    ssBench =
      bgroup "SingleState Simulation"
        $ [ bench "SingleState.StrictState'" $ nf
            ( runState
            $ runSimulator' size targetList instList operandList
            )
            iData
          , bench "SingleState.StrictState''" $ nf
            ( runState
            $ runSimulator'' size targetList instList operandList
            )
            iData
          , bench "SingleState.StrictState''b" $ nf
            ( runState
            $ runSimulator''b size targetList instList operandList
            )
            iData
          , bench "SingleState.StrictState'''" $ nf
            ( runState
            $ runSimulator''' size targetList instList operandList
            )
            iData
          , bench "SingleState.StrictState'''b" $ nf
            ( runState
            $ runSimulator'''b size targetList instList operandList
            )
            iData
          , bench "SingleState.StrictState''''" $ nf
            ( runState
            $ runSimulator'''' size targetList instList operandList
            )
            iData
          , bench "SingleState.StrictState'''''" $ nf
            ( runState
            $ runSimulator''''' size targetList instList operandList
            )
            iData
          ]
  putStrLn "Do bench"
  defaultMainWith myConfig60s [ssBench]

runSimulator :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator 0    _     _           _           = get
runSimulator size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot tList i o
  runSimulator (size - 1) restTList iList oList

runTimeSlot :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot (target : idx : rest) inst operand = do
  d <- get
  case inst of
    0 -> case (rem target sizeOfTarget) of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case (rem target sizeOfTarget) of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime rF d)
      1 -> state $ \s -> ((idx : rest), modifyBalance rF d)
      2 -> state $ \s -> ((idx : rest), modifyStatus rF d)
      3 -> state $ \s -> (rest, modifyEntry rF idx d)
    -- 2 -> Add
    -- 3 -> Div
  where rF x = rem x operand


runSimulator' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator' 0    _     _           _           = get
runSimulator' size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot' tList i o
  runSimulator' (size - 1) restTList iList oList

runTimeSlot' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot' (target : idx : rest) inst operand = do
  d <- get
  case inst of
    0 -> case (rem target sizeOfTarget) of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case (rem target sizeOfTarget) of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
      2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
      3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
    -- 2 -> Add
    -- 3 -> Div


runSimulator'' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator'' 0    _     _           _           = get
runSimulator'' size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot'' tList i o
  runSimulator'' (size - 1) restTList iList oList

runTimeSlot'' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'' (target : idx : rest) inst operand = do
  d <- get
  case inst of
    0 -> case targetInData of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case targetInData of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
      2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
      3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
    -- 2 -> Add
    -- 3 -> Div
  where targetInData = rem target sizeOfTarget


runSimulator''b :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator''b 0    _     _           _           = get
runSimulator''b size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot''b tList i o
  runSimulator''b (size - 1) restTList iList oList

runTimeSlot''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''b (target : idx : rest) inst operand = do
  d <- get
  case inst of
    0 -> case targetInData of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case targetInData of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
      2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
      3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
    -- 2 -> Add
    -- 3 -> Div
  where !targetInData = rem target sizeOfTarget


runSimulator''' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator''' 0    _     _           _           = get
runSimulator''' size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot''' tList i o
  runSimulator''' (size - 1) restTList iList oList

runTimeSlot''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''' (target : idx : rest) inst operand = do
  d <- get
  targetInData `seq` case inst of
    0 -> case targetInData of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case targetInData of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
      2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
      3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
    -- 2 -> Add
    -- 3 -> Div
  where targetInData = rem target sizeOfTarget


runSimulator'''b :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator'''b 0    _     _           _           = get
runSimulator'''b size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot'''b tList i o
  runSimulator'''b (size - 1) restTList iList oList

runTimeSlot'''b :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'''b (target : idx : rest) inst operand = do
  d <- get
  targetInData `seq` case inst of
    0 -> case targetInData of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case targetInData of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime (\x -> rem x operand) d)
      1 -> state $ \s -> ((idx : rest), modifyBalance (\x -> rem x operand) d)
      2 -> state $ \s -> ((idx : rest), modifyStatus (\x -> rem x operand) d)
      3 -> state $ \s -> (rest, modifyEntry (\x -> rem x operand) idx d)
    -- 2 -> Add
    -- 3 -> Div
  where !targetInData = rem target sizeOfTarget


runSimulator'''' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator'''' 0    _     _           _           = get
runSimulator'''' size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot'''' tList i o
  runSimulator'''' (size - 1) restTList iList oList

runTimeSlot'''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot'''' (target : idx : rest) inst operand = do
  d <- get
  case inst of
    0 -> case targetInData of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case targetInData of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime rF d)
      1 -> state $ \s -> ((idx : rest), modifyBalance rF d)
      2 -> state $ \s -> ((idx : rest), modifyStatus rF d)
      3 -> state $ \s -> (rest, modifyEntry rF idx d)
    -- 2 -> Add
    -- 3 -> Div
  where
    targetInData = rem target sizeOfTarget
    rF x = rem x operand


runSimulator''''' :: Int -> [Int] -> [Int] -> [Int] -> State Data Data
runSimulator''''' 0    _     _           _           = get
runSimulator''''' size tList (i : iList) (o : oList) = do
  restTList <- runTimeSlot''''' tList i o
  runSimulator''''' (size - 1) restTList iList oList

runTimeSlot''''' :: [Int] -> Int -> Int -> State Data [Int]
runTimeSlot''''' (target : idx : rest) inst operand = do
  d <- get
  targetInData `seq` case inst of
    0 -> case targetInData of -- Set
      0 -> state $ \s -> ((idx : rest), setTime operand d)
      1 -> state $ \s -> ((idx : rest), setBalance operand d)
      2 -> state $ \s -> ((idx : rest), setStatus operand d)
      3 -> state $ \s -> (rest, setEntry idx operand d)
    1 -> case targetInData of -- Mod
      0 -> state $ \s -> ((idx : rest), modifyTime rF d)
      1 -> state $ \s -> ((idx : rest), modifyBalance rF d)
      2 -> state $ \s -> ((idx : rest), modifyStatus rF d)
      3 -> state $ \s -> (rest, modifyEntry rF idx d)
    -- 2 -> Add
    -- 3 -> Div
  where
    targetInData = rem target sizeOfTarget
    rF x = rem x operand

type Balance = Int
type Time = Int
type Status = Int
type Idx = Int
type Datum = Int

data Data = Data
  { time :: Time
  , balance :: Balance
  , status :: Status
  , aMap :: IM.IntMap Datum
  } deriving (Show,Eq)

sizeOfTarget :: Int
sizeOfTarget = 4

instance NFData Data where
  rnf (Data t b s m) = rnf t `seq` rnf b `seq` rnf s `seq` rnf m

getTime    = time
getBalance = balance
getStatus  = status
getEntry idx = fromMaybe 0 . IM.lookup idx . aMap
setTime newTime d = d { time = newTime }
setBalance newBalance d = d { balance = newBalance }
setStatus newStatus d = d { status = newStatus }
setEntry idx aDatum d = d { aMap = IM.insert idx aDatum (aMap d) }
modifyTime f d = d { time = f (time d) }
modifyBalance f d = d { balance = f (balance d) }
modifyStatus f d = d { status = f (status d) }
modifyEntry f idx d = d { aMap = IM.adjust f idx (aMap d) }

更新

基于@AndrásKovács 's comment，我添加了新的测试功能。
使用新功能在每个测量的时间更新。

P.S。

我已经用GHC options without -threaded and with -O2运行此代码。
您可以在-threaded中查看基准测试的整个结果。
没有State monad版本的纯函数不会通过此简化代码显示任何性能变化。
您可以从-O2建立有关此问题的整个基准。

Answer 1

据我观察，通过here查看Core输出，如果执行以下操作，GHC只会错过this code元组拆箱和工作程序包装：

ghc -O2 -ddump-simpl -dsuppress-all

但是在以下情况下有效。我们也可以将State放在runTimeSlot2 :: [Int] -> Int -> Int -> State Data [Int] runTimeSlot2 (target : idx : rest) inst operand = do d <- get case inst of 0 -> ... 1 -> .. where targetInData = rem target sizeOfTarget之前的targetInData中。

let

是什么原因？我不知道。但这是一个示例，无论如何我们对GHC的信任度过高，并且该程序从一开始就不是最佳选择。首先，我将case设置为严格，并在基准测试中使用runTimeSlot2 :: [Int] -> Int -> Int -> State Data [Int] runTimeSlot2 (target : idx : rest) inst operand = do d <- get case inst of 0 -> ... 1 -> .. where targetInData = rem target sizeOfTarget代替Data：

whnf

第二，在这个特定示例中，我不认为nf会给我们带来很多好处，我们可以编写一个尾部递归函数：

data Data = Data
  { time :: !Time
  , balance :: !Balance
  , status :: !Status
  , aMap :: !(IM.IntMap Datum)
  } deriving (Show,Eq)

这在我的计算机上的运行速度是原始基准测试中性能更好的变体的两倍。

我注意到原始代码中的性能问题：

State

上面，所有返回的状态，例如runSimulator1 :: Int -> [Int] -> [Int] -> [Int] -> Data -> Data runSimulator1 = go where go 0 _ _ _ d = d go size (target : (idx : rest)) (i : iList) (o : oList) d = let targetInData = rem target sizeOfTarget in case i of 0 -> case targetInData of 0 -> go (size - 1) (idx : rest) iList oList (setTime o d) 1 -> go (size - 1) (idx : rest) iList oList (setBalance o d) 2 -> go (size - 1) (idx : rest) iList oList (setStatus o d) 3 -> go (size - 1) rest iList oList (setEntry idx o d) 1 -> case targetInData of 0 -> go (size - 1) (idx : rest) iList oList (modifyTime (\x -> rem x o) d) 1 -> go (size - 1) (idx : rest) iList oList (modifyBalance (\x -> rem x o) d) 2 -> go (size - 1) (idx : rest) iList oList (modifyStatus (\x -> rem x o) d) 3 -> go (size - 1) rest iList oList (modifyEntry (\x -> rem x o) idx d)都是惰性的。因此，我们得到了大量的重击。我们可以改为：

...
0 -> case targetInData of
  0 -> state $ \s -> ((idx : rest), setTime operand d)
  1 -> state $ \s -> ((idx : rest), setBalance operand d)
  2 -> state $ \s -> ((idx : rest), setStatus operand d)
  3 -> state $ \s -> (rest, setEntry idx operand d)
...

[这给我们带来了性能上的提升，但它仍然比不带setTime operand d的版本慢一些，因为GHC可以将0 -> case targetInData of -- Set 0 -> (idx : rest) <$ (put $! setTime operand d) 1 -> (idx : rest) <$ (put $! setBalance operand d) 2 -> (idx : rest) <$ (put $! setStatus operand d) 3 -> rest <$ (put $! setEntry idx operand d)拆箱为普通函数参数或结果，但无法将State内的Data拆箱。 ]元组。

通常，如果您确实要优化，则最可靠的解决方案是纯函数（非monadic）和严格的纯函数，最好是尾递归。是否需要进行如此程度的优化取决于开发情况。

为什么别名函数会使Haskell的性能下降

问题描述投票：5回答：1

1个回答

最新问题

为什么别名函数会使Haskell的性能下降

问题描述 投票：5回答：1

1个回答

最新问题

问题描述投票：5回答：1