CUDA - How do I pass a shared pointer to a cuBLAS function?


I'm trying to run a cuBLAS function from within a kernel in the following way:

__device__ void dolinear(const float *w, const float *input, unsigned i, float *out, unsigned o) {
    unsigned idx = blockIdx.x * blockDim.x + threadIdx.x;

    const float alpha = 1.0f;
    const float beta = 0.0f;

    if (idx == 0) {
        cublasHandle_t cnphandle;
        cublasStatus_t status = cublasCreate(&cnphandle);
        cublasSgemv(cnphandle, CUBLAS_OP_N, o, i, &alpha, w, 1, input, 1, &beta, out, 1);
    }
    __syncthreads();
}

This function works if the input pointer is allocated using cudaMalloc.

My issue is that if the input pointer points to shared memory containing data generated within the kernel, I get the error: CUDA_EXCEPTION_14 - Warp Illegal Address.

Is it not possible to pass pointers to shared memory to a cuBLAS function called from a kernel?

What is the correct way to allocate memory here? (At the moment I'm doing a cudaMalloc and using that as my 'shared' memory, but it's making me feel a bit dirty.)

You can't pass shared memory to a cuBLAS device API routine, because doing so violates the CUDA dynamic parallelism memory model on which device-side cuBLAS is based. The best you can do is use malloc() or new to allocate thread-local memory on the runtime heap for the cuBLAS routine to use, or use a portion of a buffer allocated a priori with one of the host-side APIs (as you are presently doing).
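As a rough illustration of the malloc() route, here is a minimal sketch that copies a block's shared-memory data into a buffer on the device runtime heap before handing the pointer on. The names stageToHeap, demo and runDemo are hypothetical and not part of any library. The sketch uses a plain copy loop as a stand-in consumer instead of the cublasSgemv call, on the assumption that the device-side cuBLAS library may not be available in your toolkit (it is not shipped with recent CUDA releases).

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper: copy n floats from shared memory into a buffer on the
// device runtime heap. Heap allocations live in global memory, so the returned
// pointer can be passed where a shared-memory pointer cannot.
// Returns nullptr if the device heap is exhausted.
__device__ float *stageToHeap(const float *sharedSrc, unsigned n) {
    float *heapBuf = static_cast<float *>(malloc(n * sizeof(float)));
    if (heapBuf != nullptr) {
        for (unsigned k = 0; k < n; ++k) heapBuf[k] = sharedSrc[k];
    }
    return heapBuf;
}

__global__ void demo(float *out, unsigned n) {
    extern __shared__ float tile[];

    // Data generated inside the kernel, as in the question.
    for (unsigned k = threadIdx.x; k < n; k += blockDim.x) tile[k] = 2.0f * k;
    __syncthreads();

    if (threadIdx.x == 0) {
        float *input = stageToHeap(tile, n);
        if (input != nullptr) {
            // `input` is the pointer that would replace the shared-memory
            // pointer in the device-side cuBLAS call. A copy loop stands in
            // for that call here (assumption: cuBLAS device API unavailable).
            for (unsigned k = 0; k < n; ++k) out[k] = input[k];
            free(input);
        }
    }
}

// Hypothetical host wrapper: launches one block and returns the result.
std::vector<float> runDemo(unsigned n) {
    float *dOut = nullptr;
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemset(dOut, 0, n * sizeof(float));
    demo<<<1, 32, n * sizeof(float)>>>(dOut, n);
    std::vector<float> hOut(n);
    cudaMemcpy(hOut.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return hOut;
}
```

Calling runDemo(8) from a host main() should return the values the kernel wrote into shared memory. The other option from the answer, a scratch buffer allocated up front with cudaMalloc and passed to the kernel, avoids the per-call malloc()/free() cost and is likely the better choice when the buffer size is known in advance.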

