Dynamic allocation of constant memory in CUDA
I'm trying to take advantage of constant memory, but I'm having a hard time figuring out how to nest arrays. I have an array of data entries, each of which holds counts for its internal data, and those counts differ from entry to entry. Based on the simplified code below, I have two problems. First, I don't know how to allocate the data pointed to by the members of my data structure. Second, since I can't use cudaGetSymbolAddress for constant memory, I'm not sure whether I can just pass the global pointer (which you cannot do with plain __device__ memory).
struct __align__(16) data{
int nFiles;
int nNames;
int* files;
int* names;
};
__device__ __constant__ data *mydata;
__host__ void initMemory(...)
{
cudaMalloc( (void **) &(mydata), sizeof(data)*dynamicsize );
for(int i = 0; i < dynamicsize; i++)
{
cudaMemcpyToSymbol(mydata, &(nFiles[i]), sizeof(int), sizeof(data)*i, cudaMemcpyHostToDevice);
//...
//Problem 1: Allocate & Set mydata[i].files
}
}
__global__ void myKernel(data *constDataPtr)
{
//Problem 2: Access constDataPtr[n].files, etc
}
int main()
{
//...
myKernel<<<grid, threads>>>(mydata);
}
Thanks for any help offered. :-)
I think constant memory is limited to 64 KB, and you cannot allocate it dynamically with cudaMalloc. It has to be declared statically as constant, for example:
__device__ __constant__ data mydata[100];
Similarly, you don't need to free it. Also, you shouldn't pass a reference to it via a pointer; just access it as a global variable. I tried doing something similar and it gave me a segfault (in deviceemu).
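To make the "declare it statically and access it as a global" advice concrete for the nested-array case, here is a minimal sketch of one possible layout, not necessarily what the asker ended up using. It assumes the int* members can be replaced by offsets into a single fixed-size constant pool. The names Entry, c_entries, c_pool, sumFiles, MAX_ENTRIES and MAX_POOL are all made up for illustration, and the sizes are arbitrary values that fit within 64 KB.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical capacities chosen for illustration; both arrays together
// must fit in the 64 KB of constant memory.
#define MAX_ENTRIES 100
#define MAX_POOL    4096

// Offsets into a shared pool stand in for the int* members of the
// original struct, so no per-entry allocation is needed.
struct __align__(16) Entry {
    int nFiles;
    int nNames;
    int filesOffset;  // index of this entry's first file id in c_pool
    int namesOffset;  // index of this entry's first name id in c_pool
};

// Statically sized constant symbols; they are never cudaMalloc'ed or freed.
__constant__ Entry c_entries[MAX_ENTRIES];
__constant__ int   c_pool[MAX_POOL];

// Each thread sums the file ids of one entry. The constant symbols are
// read directly as globals rather than passed in as kernel arguments.
__global__ void sumFiles(int *out, int nEntries)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= nEntries) return;

    int sum = 0;
    for (int k = 0; k < c_entries[n].nFiles; ++k)
        sum += c_pool[c_entries[n].filesOffset + k];
    out[n] = sum;
}

int main()
{
    // Two entries with different file counts; names are left empty here.
    const int nEntries = 2;
    int hostPool[5] = {1, 2, 3, 10, 20};
    Entry hostEntries[nEntries] = {
        {3, 0, 0, 0},   // files at pool[0..2]
        {2, 0, 3, 0}    // files at pool[3..4]
    };

    // Copy the host data into the constant symbols (host-to-device by default).
    cudaMemcpyToSymbol(c_entries, hostEntries, sizeof(hostEntries));
    cudaMemcpyToSymbol(c_pool, hostPool, sizeof(hostPool));

    int *dOut = nullptr;
    cudaMalloc(&dOut, nEntries * sizeof(int));
    sumFiles<<<1, 32>>>(dOut, nEntries);

    int hOut[nEntries] = {0, 0};
    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("%d %d\n", hOut[0], hOut[1]);  // prints "6 30"
    return 0;
}
```

One caveat with this layout: constant memory is fastest when all threads in a warp read the same address, so a kernel where every thread indexes a different entry, as in this sketch, may see those reads serialized. If that is the common access pattern, plain global memory could turn out to be the better fit.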