Pointer arithmetic on shared memory

I don't understand exactly what is happening in the following lines:

  1. unsigned char *membershipChanged = (unsigned char *)sharedMemory;
    
  2. float *clusters = (float *)(sharedMemory + blockDim.x);
    

I assume that in #1 sharedMemory is effectively renamed to membershipChanged, but why would you add blockDim.x to the sharedMemory pointer? Where does that address point to?

sharedMemory is declared as extern __shared__ char sharedMemory[];

I found this code in a CUDA kmeans implementation.

void find_nearest_cluster(int numCoords,
                          int numObjs,
                          int numClusters,
                          float *objects,           //  [numCoords][numObjs]
                          float *deviceClusters,    //  [numCoords][numClusters]
                          int *membership,          //  [numObjs]
                          int *intermediates)
{
    extern __shared__ char sharedMemory[];

    //  The type chosen for membershipChanged must be large enough to support
    //  reductions! There are blockDim.x elements, one for each thread in the
    //  block.
    unsigned char *membershipChanged = (unsigned char *)sharedMemory;
    float *clusters = (float *)(sharedMemory + blockDim.x);

    membershipChanged[threadIdx.x] = 0;

    //  BEWARE: We can overrun our shared memory here if there are too many
    //  clusters or too many coordinates!
    for (int i = threadIdx.x; i < numClusters; i += blockDim.x) {
        for (int j = 0; j < numCoords; j++) {
            clusters[numClusters * j + i] = deviceClusters[numClusters * j + i];
        }
    }
.....
c++ cuda pointer-arithmetic gpu-shared-memory
1 Answer (5 votes)

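sharedMemory + blockDim.x points blockDim.x bytes past the base of the shared memory region. sharedMemory is a char array, so pointer arithmetic on it advances one byte per element.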
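To make the byte counting concrete, here is a minimal host-side C++ sketch rather than kernel code. The helper names char_step_bytes and float_step_bytes are hypothetical and only serve the illustration; base stands in for sharedMemory and n for blockDim.x. It shows that the addition is done on the char pointer before the cast, so the offset is counted in bytes, whereas the same addition on a float pointer would move n floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte distance covered by `base + n` when base is a char pointer.
// Mirrors (float *)(sharedMemory + blockDim.x): the addition happens on the
// char pointer first, so it moves n bytes; the cast only changes the view.
inline std::ptrdiff_t char_step_bytes(char *base, std::size_t n) {
    char *second = base + n;
    // A float view is only usable if its address is float-aligned.
    assert(reinterpret_cast<std::uintptr_t>(second) % alignof(float) == 0);
    float *clusters = reinterpret_cast<float *>(second);
    return reinterpret_cast<char *>(clusters) - base;
}

// For contrast: the same addition on a float pointer moves n floats.
inline std::ptrdiff_t float_step_bytes(float *base, std::size_t n) {
    return reinterpret_cast<char *>(base + n) - reinterpret_cast<char *>(base);
}
```

The alignment assertion reflects a general constraint and is not something the question's code checks: the float view is only safe when the byte offset is a multiple of the size of a float.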

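The reason you might do something like this is to suballocate within shared memory. The kernel launch site for find_nearest_cluster dynamically allocates some amount of shared storage for the kernel. The code implies that two logically distinct arrays reside in the shared storage pointed to by sharedMemory: membershipChanged and clusters. The pointer arithmetic is simply a way to get a pointer to the second array.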
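The layout this implies can be sketched on the host as follows. This is an illustration under assumptions, not the launch code of the kmeans implementation, which the question does not show: SharedLayout and plan_layout are hypothetical names, and the size of the clusters array (numClusters * numCoords floats) is inferred from the indexing in the kernel's copy loop. In CUDA, a byte count like totalBytes would be passed as the third parameter of the <<<grid, block, bytes>>> launch configuration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of the two arrays packed into one buffer.
struct SharedLayout {
    std::size_t membershipBytes;  // blockDimX unsigned chars, at offset 0
    std::size_t clustersOffset;   // byte offset where the float array begins
    std::size_t totalBytes;       // size the launch would have to request
};

inline SharedLayout plan_layout(std::size_t blockDimX,
                                std::size_t numClusters,
                                std::size_t numCoords) {
    // The kernel places clusters at byte blockDimX, so that offset must be a
    // multiple of sizeof(float) for the float accesses to be aligned.
    assert(blockDimX % sizeof(float) == 0);
    SharedLayout l;
    l.membershipBytes = blockDimX * sizeof(unsigned char);
    l.clustersOffset = l.membershipBytes;
    l.totalBytes = l.clustersOffset + numClusters * numCoords * sizeof(float);
    return l;
}
```

For example, plan_layout(128, 4, 2) puts clusters at byte 128 and asks for 128 + 4 * 2 * sizeof(float) bytes in total. If the launch requests less than this, the copy loop overruns the buffer, which is the case the BEWARE comment in the kernel warns about.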
