Fast CUDA Thrust custom comparison operator
Question

I'm evaluating CUDA and am currently using the Thrust library to sort numbers. I'd like to supply my own comparer to thrust::sort, but doing so slows the sort down dramatically! I created my own less implementation by just copying the code from functional.h. However, it seems to be compiled in some other way and runs very slowly.

1. default comparer: thrust::less() - 94 ms
2. my own comparer: less() - 906 ms

I'm using Visual Studio 2010. What should I do to get the same performance as with option 1?

Complete code: #include
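
For illustration, a comparer copied from functional.h in the way described above would look roughly like the sketch below. This is not the original listing, which is cut off. The names `my_less`, `HOST_DEVICE` and `sort_with_my_less` are hypothetical, and the `std::sort` fallback is an assumption added only so that the sketch also builds with a plain host compiler when the CUDA toolkit is not available.

```cpp
#include <algorithm>
#include <vector>

#ifdef __CUDACC__
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
// Compiling with nvcc: the functor must be callable from device code.
#define HOST_DEVICE __host__ __device__
#else
// Plain host compiler (assumption for portability): no CUDA qualifiers.
#define HOST_DEVICE
#endif

// Hypothetical user-defined comparer modelled on thrust::less<T>.
template <typename T>
struct my_less {
    HOST_DEVICE bool operator()(const T& lhs, const T& rhs) const {
        return lhs < rhs;
    }
};

// Returns a copy of `values` sorted in ascending order with my_less.
// Under nvcc the data is copied to the device and sorted by thrust::sort;
// otherwise std::sort is used on the host.
inline std::vector<int> sort_with_my_less(std::vector<int> values) {
#ifdef __CUDACC__
    thrust::device_vector<int> d_values(values.begin(), values.end());
    thrust::sort(d_values.begin(), d_values.end(), my_less<int>());
    thrust::copy(d_values.begin(), d_values.end(), values.begin());
#else
    std::sort(values.begin(), values.end(), my_less<int>());
#endif
    return values;
}
```

Replacing `my_less<int>()` with `thrust::less<int>()` in the `thrust::sort` call gives the default-comparer variant, so the two cases in the timings above differ only in the functor type that is passed.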