Using SIMD to compute a product of values selected by the bits of another vector

Question   Votes: 2   Answers: 1

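I have two vectors: a vector a of doubles of size N, and a vector b of unsigned chars of size ceil(N/8). The goal is to compute the product of some of the values of a. b is read bit by bit, and each bit indicates whether the corresponding double in a should be included in the product.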

  // Let's create some data      
  unsigned nbBits  = 1e7;
  unsigned nbBytes = nbBits / 8;
  unsigned char nbBitsInLastByte = nbBits % 8;
  assert(nbBits == nbBytes * 8 + nbBitsInLastByte);
  std::vector<double> a(nbBits, 0.999999);   // In practice a values will vary. It is just an easy to build example I am showing here
  std::vector<unsigned char> b(nbBytes, false); // I am not using `vector<bool>` nor `bitset`. I've got my reasons!
  assert(a.size() == b.size() * 8);

  // Set a few bits to true
  for (unsigned byte = 0 ; byte < (nbBytes-1) ; byte+=2)
  {
    b[byte] |= 1 << 2; // set second (zero-based counting) bit to 'true'
    b[byte] |= 1 << 7; // set last bit to 'true'
                //  ^ This is the bit index
  }

As stated above, my goal is to compute the product of the values in a whose corresponding bit in b is set. This can be done with:

  // Initialize the variable we want to compute
  double product = 1.0;

  // Product for the first nbBytes-1 bytes
  for (unsigned byte = 0 ; byte < (nbBytes-1) ; ++byte)
  {
    for (unsigned bit = 0 ; bit < 8 ; ++bit) // inner loop could be manually unrolled
    {
      if((b[byte] >> bit) & 1) // gets the bit value
        product *= a[byte*8+bit];
    }
  }

  // Product for the last byte
  for (unsigned bit = 0 ; bit < nbBitsInLastByte ; ++bit)
  {
    if((b[nbBytes-1] >> bit) & 1) // gets the bit value
      product *= a[(nbBytes-1)*8+bit];
  }

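This product computation is the slowest part of my code. I am wondering whether explicitly vectorizing (SIMD) this process would help here. I have been looking at the functions provided in 'xmmintrin.h', but I know very little about SIMD and have not managed to find anything that helps. Can you help me?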

c++ performance sse simd
1 Answer
0 votes

Try this change; it may give some improvement:

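// Unrolled inner loop: read each byte once, then test its bits against the eight doubles it governs.
for (unsigned byte = 0; byte < (nbBytes - 1); ++byte)
{
    const unsigned char bb = b[byte];
    const double* av = &a[byte * 8]; // av[0..7] are a[byte*8] .. a[byte*8+7]
    if (bb & (1 << 0)) product *= av[0];
    if (bb & (1 << 1)) product *= av[1];
    if (bb & (1 << 2)) product *= av[2];
    if (bb & (1 << 3)) product *= av[3];
    if (bb & (1 << 4)) product *= av[4];
    if (bb & (1 << 5)) product *= av[5];
    if (bb & (1 << 6)) product *= av[6];
    if (bb & (1 << 7)) product *= av[7];
}
// The last byte is still handled by the separate loop from the question.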
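
As for the explicit SIMD part of the question, here is a minimal sketch of what an SSE2 version could look like. It is not a tuned implementation, and it rests on a few assumptions: an x86 target with SSE2, and `<emmintrin.h>` instead of `<xmmintrin.h>` (the latter only covers single-precision SSE, while doubles need the SSE2 header). The function names `maskedProductScalar` and `maskedProductSimd` are made up for this example. The idea is to remove the branch: each bit is turned into an all-ones or all-zeros 64-bit lane mask, unselected lanes are replaced by 1.0, and two doubles are multiplied into the accumulator per step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h> // SSE2 intrinsics (__m128d, _mm_mul_pd, ...)
#define HAVE_SSE2 1
#endif

// Reference version: multiply every a[i] whose bit i is set in b.
// Bit i lives in byte b[i / 8] at position i % 8, as in the question.
double maskedProductScalar(const std::vector<double>& a,
                           const std::vector<unsigned char>& b)
{
    assert(b.size() * 8 >= a.size()); // b must provide one bit per element of a
    double product = 1.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        if ((b[i / 8] >> (i % 8)) & 1)
            product *= a[i];
    return product;
}

// SSE2 sketch: two doubles per step, unselected lanes are replaced by 1.0.
double maskedProductSimd(const std::vector<double>& a,
                         const std::vector<unsigned char>& b)
{
    assert(b.size() * 8 >= a.size());
    std::size_t i = 0;
    double product = 1.0;
#ifdef HAVE_SSE2
    const __m128d ones = _mm_set1_pd(1.0);
    __m128d acc = ones;
    // i is always even here, so bits i and i+1 sit in the same byte.
    for (; i + 2 <= a.size(); i += 2)
    {
        const unsigned bits = b[i / 8] >> (i % 8);
        // All-ones 64-bit lane where the bit is set, all-zeros otherwise.
        const __m128i maskInt = _mm_set_epi64x(
            -static_cast<long long>((bits >> 1) & 1u),  // high lane: element i+1
            -static_cast<long long>(bits & 1u));        // low lane:  element i
        const __m128d mask = _mm_castsi128_pd(maskInt);
        const __m128d values = _mm_loadu_pd(&a[i]);
        // selected = mask ? values : 1.0, lane by lane
        const __m128d selected = _mm_or_pd(_mm_and_pd(mask, values),
                                           _mm_andnot_pd(mask, ones));
        acc = _mm_mul_pd(acc, selected);
    }
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    product = lanes[0] * lanes[1];
#endif
    // Leftover element (odd size), or the whole range when SSE2 is unavailable.
    for (; i < a.size(); ++i)
        if ((b[i / 8] >> (i % 8)) & 1)
            product *= a[i];
    return product;
}
```

With the vectors from the question this would be called as `double product = maskedProductSimd(a, b);`. Because the multiplications happen in a different order than in the scalar loop, the result can differ in the last few bits. Whether it is actually faster has to be measured on your data: with 1e7 doubles the loop may be limited by memory bandwidth rather than by the multiplications.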