cuda - Thrust Histogram with weights -
i want compute density of particles on grid. therefore, have vector contains cellid
of each particle, vector given mass
not have uniform.
i have taken non-sparse example thrust
compute histogram of particles. however, compute density, need include weight of each particle, instead of summing number of particles per cell, i.e. i'm interested in rho[i] = sum w[j]
j
satify cellid[j]=i
(probably unnecessary explain, since knows that).
implementing thrust
has not worked me. tried use cuda kernel , thrust_raw_pointer_cast
, did not succeed either.
edit:
here minimal working example should compile via nvcc file.cu
under cuda 6.5 , thrust installed.
#include <thrust/device_vector.h> #include <thrust/sort.h> #include <thrust/copy.h> #include <thrust/binary_search.h> #include <thrust/adjacent_difference.h> // predicate struct is_out_of_bounds { __host__ __device__ bool operator()(int i) { return (i < 0); // out of bounds elements have negative id; } }; // cf.: https://code.google.com/p/thrust/source/browse/examples/histogram.cu, modified template<typename t1, typename t2> void computehistogram(const t1& input, t2& histogram) { typedef typename t1::value_type valuetype; // input value type typedef typename t2::value_type indextype; // histogram index type // copy input data (could skipped if input allowed modified) thrust::device_vector<valuetype> data(input); // sort data bring equal elements thrust::sort(data.begin(), data.end()); // there elements don't want count, have id -1; data.erase(thrust::remove_if(data.begin(), data.end(), is_out_of_bounds()),data.end()); // number of histogram bins equal maximum value plus 1 indextype num_bins = histogram.size(); // find end of each bin of values thrust::counting_iterator<indextype> search_begin(0); thrust::upper_bound(data.begin(), data.end(), search_begin, search_begin + num_bins, histogram.begin()); // compute histogram taking differences of cumulative histogram thrust::adjacent_difference(histogram.begin(), histogram.end(), histogram.begin()); } int main(void) { thrust::device_vector<int> cellid(5); cellid[0] = -1; cellid[1] = 1; cellid[2] = 0; cellid[3] = 2; cellid[4]=1; thrust::device_vector<float> mass(5); mass[0] = .5; mass[1] = 1.0; mass[2] = 2.0; mass[3] = 3.0; mass[4] = 4.0; thrust::device_vector<int> histogram(3); thrust::device_vector<float> density(3); computehistogram(cellid,histogram); std::cout<<"\nhistogram:\n"; thrust::copy(histogram.begin(), histogram.end(), std::ostream_iterator<int>(std::cout, " ")); std::cout << std::endl; // print: " histogram 1 2 1 " // meaning 1 element id 0, 2 elements id 1 // , 1 element id 2 /* here unable implement: * * * computedensity(cellid,mass,density); * * print(density): 2.0 5.0 3.0 * * */ }
i hope comment @ end of file makes clear mean computing density. if there question open, please feel free ask. thanks!
there still seems problem in understanding problem, sorry for! therefore added pictures. consider first picture. understanding, histogram count of particles per grid cell. in case histogram array of size 36, since there 36 cells. also, there lot of 0 entries in vector, since example in upper left corner no cell contains particle. have in code.
now consider more complicated case. here each particle has different mass, indicated different size in plot. compute density can't add number of particles per cell, have add mass of particles per cell. i'm unable implement.
what described in example not histogram rather segmented reduction.
the following example code uses thrust::reduce_by_key
sum masses of particles within same cell:
density.cu
#include <thrust/device_vector.h> #include <thrust/sort.h> #include <thrust/reduce.h> #include <thrust/copy.h> #include <thrust/scatter.h> #include <iostream> #define printer(name) print(#name, (name)) template <template <typename...> class v, typename t, typename ...args> void print(const char* name, const v<t,args...> & v) { std::cout << name << ":\t\t"; thrust::copy(v.begin(), v.end(), std::ostream_iterator<t>(std::cout, "\t")); std::cout << std::endl << std::endl; } int main() { const int particle_count = 5; const int cell_count = 10; thrust::device_vector<int> cellid(particle_count); cellid[0] = -1; cellid[1] = 1; cellid[2] = 0; cellid[3] = 2; cellid[4]=1; thrust::device_vector<float> mass(particle_count); mass[0] = .5; mass[1] = 1.0; mass[2] = 2.0; mass[3] = 3.0; mass[4] = 4.0; std::cout << "input data" << std::endl; printer(cellid); printer(mass); thrust::sort_by_key(cellid. begin(), cellid.end(), mass.begin()); std::cout << "after sort_by_key" << std::endl; printer(cellid); printer(mass); thrust::device_vector<int> reduced_cellid(particle_count); thrust::device_vector<float> density(particle_count); int new_size = thrust::reduce_by_key(cellid. begin(), cellid.end(), mass.begin(), reduced_cellid.begin(), density.begin() ).second - density.begin(); if (reduced_cellid[0] == -1) { density.erase(density.begin()); reduced_cellid.erase(reduced_cellid.begin()); new_size--; } density.resize(new_size); reduced_cellid.resize(new_size); std::cout << "after reduce_by_key" << std::endl; printer(density); printer(reduced_cellid); thrust::device_vector<float> final_density(cell_count); thrust::scatter(density.begin(), density.end(), reduced_cellid.begin(), final_density.begin()); printer(final_density); }
compile using
nvcc -std=c++11 density.cu -o density
output
input data cellid: -1 1 0 2 1 mass: 0.5 1 2 3 4 after sort_by_key cellid: -1 0 1 1 2 mass: 0.5 2 1 4 3 after reduce_by_key density: 2 5 3 reduced_cellid: 0 1 2 final_density: 2 5 3 0 0 0 0 0 0 0
Comments
Post a Comment