cuda - How do I pass a shared pointer to a cublas function?

- June 15, 2013

I'm trying to run a cuBLAS function from within a kernel in the following way:

    __device__ void dolinear(const float *w, const float *input, unsigned i, float *out, unsigned o) {
        unsigned idx = blockIdx.x * blockDim.x + threadIdx.x;

        const float alpha = 1.0f;
        const float beta = 0.0f;

        if (idx == 0) {
            cublasHandle_t cnphandle;
            cublasStatus_t status = cublasCreate(&cnphandle);
            cublasSgemv(cnphandle, CUBLAS_OP_N, o, i, &alpha, w, 1, input, 1, &beta, out, 1);
        }
        __syncthreads();
    }

This function works if the input pointer is allocated using cudaMalloc.

My issue is that if the input pointer points to shared memory, which contains data generated within the kernel, I get the error: CUDA_EXCEPTION_14 - Warp Illegal Address.

Is it not possible to pass pointers to shared memory to a cuBLAS function that is called from a kernel?

What is the correct way to allocate memory here? (At the moment I'm doing a cudaMalloc and using that as 'shared' memory, and it's making me feel a bit dirty.)

You can't pass shared memory to a cuBLAS device API routine, because doing so violates the CUDA dynamic parallelism memory model on which device-side cuBLAS is based. The best you can do is use malloc() or new to allocate thread-local memory on the runtime heap for the cuBLAS routine to use, or use a portion of an a priori allocated buffer allocated with one of the host-side APIs (as you are presently doing).
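To make the staging idea concrete, here is a minimal sketch rather than a definitive implementation. It assumes a plain child kernel as a stand-in for the device-side cuBLAS call, since both are launched through dynamic parallelism and so follow the same memory model. The names `child_sum`, `parent` and `run_staging_demo` are hypothetical and not from the question. It also assumes a device of compute capability 3.5 or higher and a build along the lines of `nvcc -rdc=true demo.cu -lcudadevrt`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical child kernel standing in for the device-side library call.
// It only ever sees a pointer into global memory (the device runtime heap).
__global__ void child_sum(float *staged, int n, float *out) {
    float acc = 0.0f;
    for (int k = 0; k < n; ++k) acc += staged[k];
    *out = acc;
    free(staged);  // device-heap memory may be freed by any device thread
}

// Parent kernel: generates data in shared memory, then stages it into
// heap-allocated global memory before launching the child grid.
__global__ void parent(int n, float *out) {
    extern __shared__ float tile[];
    for (int k = threadIdx.x; k < n; k += blockDim.x) tile[k] = (float)k;
    __syncthreads();

    if (threadIdx.x == 0) {
        // In-kernel malloc() returns memory from the device runtime heap.
        float *staged = (float *)malloc(n * sizeof(float));
        if (staged == nullptr) { *out = -1.0f; return; }
        for (int k = 0; k < n; ++k) staged[k] = tile[k];  // shared -> global
        // Passing `tile` here instead of `staged` would be invalid:
        // shared-memory pointers must not be handed to a child grid.
        child_sum<<<1, 1>>>(staged, n, out);
    }
}

// Host helper: runs one block and returns the value written by the child.
float run_staging_demo(int n) {
    float *d_out = nullptr;
    float h_out = 0.0f;
    cudaMalloc(&d_out, sizeof(float));
    parent<<<1, 32, n * sizeof(float)>>>(n, d_out);
    cudaDeviceSynchronize();  // the parent grid completes only after its child
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}

int main() {
    printf("sum = %.1f\n", run_staging_demo(64));
    return 0;
}
```

The copy from shared to global memory is done by a single thread to keep the sketch short. In real code you would likely have the whole block cooperate on the copy, and check the return codes of the runtime calls.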
