如何加速旋转一个位块?

问题描述 投票:0回答:1

我正在编写一个程序,在 Rust 和 Haskell 中执行相同的操作。这是一个完整消息密码;它读取整个文件,对其进行加密,然后将其写出。给定 1 MiB,Rust 代码需要 2.4 秒,但 Haskell 代码需要 18.7 秒。我对其进行了分析,发现 36.6% 的时间和分配被

rotBitcount
函数占用。这会将一个数组复制到另一个数组,并将整个数组旋转 1 位的位数。这是代码:

module Cryptography.WringTwistree.RotBitcount
  ( rotBitcount
  ) where

{- This module is used in both Wring and Twistree.
 - It rotates an array of bytes by a multiple of its bitcount,
 - producing another array of the same size. As long as the multiplier
 - is relatively prime to the number of bits in the array, this
 - operation satisfies the strict avalanche criterion. Changing *two*
 - bits, however, has half a chance of changing only two bits in
 - the output.
 -
 - Bit 0 of byte 0 is bit 0 of the array. Bit 0 of byte 1 is bit 8 of the array.
 - e1 00 00 00 00 00 00 00, rotated by its bitcount (4), becomes
 - 10 0e 00 00 00 00 00 00.
 -}

import Data.Bits
import Data.Word
import Data.Array.Unboxed

rotBitcount :: (Integral a,Ix a,Bits a) => UArray a Word8 -> a -> UArray a Word8
-- The type a may be signed or unsigned, but the array index must begin at 0.
-- a should hold the square of eight times the bounds; so if the bounds are
-- (0..31), Word16 is adequate, but Int16 and Word8 are not.
rotBitcount src mult = array bnd
  [ (i, (src ! ((i+len-byte)   `mod` len) `shift` bit) .|.
        (src ! ((i+len-byte-1) `mod` len) `shift` (bit-8))) | i <- [0..(len-1)]]
  where
    bnd = bounds src
    len = (snd bnd) +1
    multmod = mult `mod` (len * 8)
    bitcount = fromIntegral $ sum $ map popCount $ elems src
    rotcount = (bitcount * multmod) `mod` (len * 8)
    byte = rotcount `shift` (-3)
    bit = fromIntegral (rotcount .&. 7)

我用一个简单的

stack run
来编译它,除了分析时。堆栈版本是2.11.1。存储库位于 https://github.com/phma/wring-twistree 。我怎样才能让它更快?

haskell optimization bit-manipulation bitwise-operators
1个回答
0
投票

在 IRC 上与一些人交谈后,我让它运行得更快了。步骤是:

  1. 专业化
    rotBitcount
    一直向上。主程序使用
    UArray Int Word8
    ,所以我添加了这些行:
{-# SPECIALIZE rotBitcount :: UArray Int Word8 -> Int -> UArray Int Word8 #-}
{-# SPECIALIZE roundEncrypt :: Int -> UArray (Word8,Word8) Word8 ->
    UArray Int Word8 -> UArray Int Word8 -> Int -> UArray Int Word8 #-}
{-# SPECIALIZE roundDecrypt :: Int -> UArray (Word8,Word8) Word8 ->
    UArray Int Word8 -> UArray Int Word8 -> Int -> UArray Int Word8 #-}
{-# SPECIALIZE encrypt :: Wring -> UArray Int Word8 -> UArray Int Word8 #-}
{-# SPECIALIZE decrypt :: Wring -> UArray Int Word8 -> UArray Int Word8 #-}

如果有人想将它与由

Word
Int64
索引的数组一起使用,我可以添加更多专业化。

  1. where
    子句中添加严格性注释。原因是,如果消息长度为零,这些数字将不会被延迟计算,因此编译器必须将它们装箱。
{-# LANGUAGE BangPatterns #-}
...
    !len = (snd bnd) +1
    !byte = rotcount .>>. 3
    !bit = fromIntegral (rotcount .&. 7)
© www.soinside.com 2019 - 2024. All rights reserved.