Thrust: how to get the number of elements copied by copy_if when using device_ptr

Question (0 votes, 1 answer)

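I am using the Thrust library's thrust::copy_if function together with counting iterators to get the indices of the nonzero elements of an array. I also need the number of elements that were copied.

I am using the code from the "counting_iterator.cu" example, but in my application I need to reuse preallocated arrays, so I wrap them in thrust::device_ptr and then pass them to thrust::copy_if. Here is the code: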

using namespace thrust;

int output[5];
thrust::device_ptr<int> tp_output = device_pointer_cast(output);

float stencil[5];
stencil[0] = 0;
stencil[1] = 0;
stencil[2] = 1;
stencil[3] = 0;
stencil[4] = 1;
device_ptr<float> tp_stencil = device_pointer_cast(stencil);

device_vector<int>::iterator output_end = copy_if(make_counting_iterator<int>(0), 
     make_counting_iterator<int>(5), 
     tp_stencil, 
     tp_output, 
     _1 == 1);

int number_of_ones = output_end - tp_output;

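If I comment out the last line, the function fills the output array correctly. However, when I uncomment it, I get the following compilation error: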

1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_adaptor.h(223): error : no operator "-" matches these operands

1>              operand types are: int *const - const thrust::device_ptr<int>

1>            detected during:
1>              instantiation of "thrust::iterator_adaptor<Derived, Base, Value, System, Traversal, Reference, Difference>::difference_type thrust::iterator_adaptor<Derived, Base, Value, System, Traversal, Reference, Difference>::distance_to(const thrust::iterator_adaptor<OtherDerived, OtherIterator, V, S, T, R, D> &) const [with Derived=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Base=thrust::device_ptr<int>, Value=thrust::use_default, System=thrust::use_default, Traversal=thrust::use_default, Reference=thrust::use_default, Difference=thrust::use_default, OtherDerived=thrust::device_ptr<int>, OtherIterator=int *, V=signed int, S=thrust::device_system_tag, T=thrust::random_access_traversal_tag, R=thrust::device_reference<int>, D=ptrdiff_t]" 
1>  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_facade.h(181): here
1>              instantiation of "Facade1::difference_type thrust::iterator_core_access::distance_from(const Facade1 &, const Facade2 &, thrust::detail::true_type) [with Facade1=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Facade2=thrust::device_ptr<int>]" 
1>  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_facade.h(202): here
1>              instantiation of "thrust::detail::distance_from_result<Facade1, Facade2>::type thrust::iterator_core_access::distance_from(const Facade1 &, const Facade2 &) [with Facade1=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Facade2=thrust::device_ptr<int>]" 
1>  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include\thrust/iterator/iterator_facade.h(506): here
1>              instantiation of "thrust::detail::distance_from_result<thrust::iterator_facade<Derived1, Value1, System1, Traversal1, Reference1, Difference1>, thrust::iterator_facade<Derived2, Value2, System2, Traversal2, Reference2, Difference2>>::type thrust::operator-(const thrust::iterator_facade<Derived1, Value1, System1, Traversal1, Reference1, Difference1> &, const thrust::iterator_facade<Derived2, Value2, System2, Traversal2, Reference2, Difference2> &) [with Derived1=thrust::detail::normal_iterator<thrust::device_ptr<int>>, Value1=signed int, System1=thrust::device_system_tag, Traversal1=thrust::random_access_traversal_tag, Reference1=thrust::device_reference<int>, Difference1=signed int, Derived2=thrust::device_ptr<int>, Value2=signed int, System2=thrust::device_system_tag, Traversal2=thrust::random_access_traversal_tag, Reference2=thrust::device_reference<int>, Difference2=signed int]" 
1>  C:/ProgramData/NVIDIA Corporation/CUDA Samples/v5.5/7_CUDALibraries/nsgaIIparallelo_23ott/rank_cuda.cu(70): here  

If I use a thrust::device_vector as the output array, everything works:

using namespace thrust;

thrust::device_vector<int> output(5);

float stencil[5];
stencil[0] = 0;
stencil[1] = 0;
stencil[2] = 1;
stencil[3] = 0;
stencil[4] = 1;
device_ptr<float> tp_stencil = device_pointer_cast(stencil);

device_vector<int>::iterator output_end = copy_if(make_counting_iterator<int>(0), 
     make_counting_iterator<int>(5), 
     tp_stencil, 
     output.begin(), 
     _1 == 1);

int number_of_ones = output_end - output.begin();

Can you suggest a solution to this problem? Thanks.

cuda thrust
1 Answer (4 votes)

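Try using a device_ptr instead of a device_vector::iterator for the result of the copy_if call. copy_if returns an iterator of the same type as the output iterator it is given, and the compiler error shows that device_vector<int>::iterator is thrust::detail::normal_iterator<thrust::device_ptr<int>>, a different type from the thrust::device_ptr<int> passed as tp_output: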

thrust::device_ptr<int> output_end = copy_if(make_counting_iterator<int>(0),
 make_counting_iterator<int>(5),
 tp_stencil,
 tp_output,
 _1 == 1); 
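
With output_end declared as a thrust::device_ptr<int>, the subtraction output_end - tp_output from the question has matching operand types and gives the number of copied elements. The following is a minimal, self-contained sketch of that idea, not the asker's actual program. It assumes the preallocated arrays live in device memory obtained with cudaMalloc (the question's snippet wraps host arrays, which is presumably a simplification), and the helper name indices_of_ones is hypothetical.

```cuda
#include <cstdio>
#include <vector>

#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_ptr.h>
#include <thrust/functional.h>
#include <thrust/iterator/counting_iterator.h>

// Hypothetical helper (not from the question): returns every index i
// for which h_stencil[i] == 1, using preallocated raw device buffers.
std::vector<int> indices_of_ones(const std::vector<float>& h_stencil)
{
    using namespace thrust::placeholders;  // provides _1
    const int n = static_cast<int>(h_stencil.size());

    // Assumption: the reused buffers are plain device allocations.
    float* d_stencil = nullptr;
    int* d_output = nullptr;
    cudaMalloc(&d_stencil, n * sizeof(float));
    cudaMalloc(&d_output, n * sizeof(int));
    cudaMemcpy(d_stencil, h_stencil.data(), n * sizeof(float),
               cudaMemcpyHostToDevice);

    // Wrap the raw device pointers so Thrust dispatches to the device.
    thrust::device_ptr<float> tp_stencil = thrust::device_pointer_cast(d_stencil);
    thrust::device_ptr<int> tp_output = thrust::device_pointer_cast(d_output);

    // copy_if returns the same iterator type as its output argument,
    // so the result is a device_ptr<int>, not a device_vector iterator.
    thrust::device_ptr<int> output_end = thrust::copy_if(
        thrust::make_counting_iterator<int>(0),
        thrust::make_counting_iterator<int>(n),
        tp_stencil,
        tp_output,
        _1 == 1);

    // Both operands are device_ptr<int>, so the difference compiles.
    const int number_of_ones = static_cast<int>(output_end - tp_output);

    // Copy only the valid prefix of the output buffer back to the host.
    std::vector<int> h_output(number_of_ones);
    cudaMemcpy(h_output.data(), d_output, number_of_ones * sizeof(int),
               cudaMemcpyDeviceToHost);

    cudaFree(d_stencil);
    cudaFree(d_output);
    return h_output;
}

int main()
{
    // Same stencil values as in the question.
    const std::vector<float> stencil = {0.f, 0.f, 1.f, 0.f, 1.f};
    const std::vector<int> idx = indices_of_ones(stencil);

    std::printf("number_of_ones = %d\n", static_cast<int>(idx.size()));
    for (int i : idx) std::printf("index %d\n", i);
    return 0;
}
```

For the stencil from the question, the two entries equal to 1 sit at positions 2 and 4, so those are the indices the sketch should report. Error checking on the CUDA runtime calls is omitted to keep the example short.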