CUDA PTX register declaration and usage
Question

I am trying to reduce the number of registers my kernel uses, so I decided to try inline PTX. This is the kernel:

    #define Feedback(a, b, c, d, e) d^e^(a&c)^(a&e)^(b&c)^(b&e)^(c&d)^(d&e)^(a&d&e)^(a&c&e)^(a&b&d)^(a&b&c)

    __global__ void Test(unsigned long a, unsigned long b, unsigned long c, unsigned long d,
                         unsigned long e, unsigned long f, unsigned long j, unsigned long h,
                         unsigned long* res)
    {
        res[0] = Feedback( a, b, c, d, e );
        res[1] = Feedback( b, c, d, e, f );
        res[2] = Feedback( c, d, e, f, j