Detect OpenCL device vendor in kernel code

Submitted by 两盒软妹~` on 2019-12-12 13:32:16

Question


I'm writing some platform-specific optimizations. I know I could parse the vendor string in the host code and pass the result to the kernel with the -D option, but it would be more convenient to detect the vendor directly in the kernel, without host involvement. That way, for example, kernels can be optimized even without access to the host source code.
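For comparison, the host-side route mentioned above could look roughly like the sketch below. It is a minimal illustration, not a definitive implementation: vendor_build_option is a hypothetical helper, the substrings it matches are assumptions about what drivers report for CL_DEVICE_VENDOR, and the INTEL macro name is invented here (only NVIDIA and AMD appear in this question).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: maps a vendor string, as obtained on the host from
 * clGetDeviceInfo(..., CL_DEVICE_VENDOR, ...), to a -D build option.
 * The matched substrings are assumptions about typical driver output. */
const char *vendor_build_option(const char *vendor)
{
    if (vendor == NULL)
        return "";
    if (strstr(vendor, "NVIDIA") != NULL)
        return "-DNVIDIA";
    if (strstr(vendor, "Advanced Micro Devices") != NULL ||
        strstr(vendor, "AMD") != NULL)
        return "-DAMD";
    if (strstr(vendor, "Intel") != NULL)
        return "-DINTEL"; /* INTEL is an illustrative macro name */
    return ""; /* unknown vendor: build without a vendor macro */
}
```

The returned string would then be passed as the options argument of clBuildProgram, which is exactly the host involvement the kernel-side detection is meant to avoid.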

So far, I have come up with the following:

#ifdef __NV_CL_C_VERSION
/**
 *  @def NVIDIA
 *  @brief defined when compiling on NVIDIA GPUs
 */
#define NVIDIA
#endif // __NV_CL_C_VERSION

#if defined(__WinterPark__) || defined(__BeaverCreek__) || defined(__Turks__) || \
    defined(__Caicos__) || defined(__Tahiti__) || defined(__Pitcairn__) || \
    defined(__Capeverde__) || defined(__Cayman__) || defined(__Barts__) || \
    defined(__Cypress__) || defined(__Juniper__) || defined(__Redwood__) || \
    defined(__Cedar__) || defined(__ATI_RV770__) || defined(__ATI_RV730__) || \
    defined(__ATI_RV710__) || defined(__Loveland__) || defined(__GPU__) || \
    defined(__Hawaii__)
/**
 *  @def AMD
 *  @brief defined when compiling on AMD GPUs
 *  @note This list was originally found at https://github.com/magnumripper/JohnTheRipper/wiki/Predefined-macros-in-OpenCL-(standard-and-proprietary) and copied shamelessly. It is most definitely incomplete and contains the troubling __GPU__.
 *  @note AMD also defines __CPU__ when compiling for CL_DEVICE_TYPE_CPU.
 */
#define AMD
#endif // ...

Any additions or corrections? Does anyone know what Intel defines?


Answer 1:


I have just tried this on an AMD Fury X with the 1912.5 driver. All three of the following tests print their message:

#ifdef cl_amd_device_attribute_query
#pragma message "here goes AMD"
#endif

#ifdef __GPU__
#pragma message "here goes AMD GPU"
#endif

#ifdef __Fiji__
#pragma message "here goes Fiji AMD"
#endif

However, note that cl_amd_device_attribute_query is not a good test for an AMD device: the AMD platform also exposes the Intel CPU as a device and reports the same extension for it. Bummer.
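One way to work around this might be to combine the extension macro with the __GPU__ and __CPU__ macros mentioned in the question. The sketch below is untested on real hardware and rests on that assumption; the VENDOR_* and DEVICE_VENDOR names are hypothetical and not part of OpenCL.

```c
/* Hypothetical vendor codes; these names are illustrative only. */
#define VENDOR_UNKNOWN          0
#define VENDOR_NVIDIA           1
#define VENDOR_AMD_GPU          2
#define VENDOR_AMD_PLATFORM_CPU 3 /* AMD platform, CPU device (may be an Intel CPU) */

#if defined(__NV_CL_C_VERSION)
#define DEVICE_VENDOR VENDOR_NVIDIA
#elif defined(cl_amd_device_attribute_query) && defined(__GPU__)
/* Assumption: the AMD compiler defines __GPU__ only for GPU devices. */
#define DEVICE_VENDOR VENDOR_AMD_GPU
#elif defined(cl_amd_device_attribute_query) && defined(__CPU__)
/* Assumption: the AMD compiler defines __CPU__ for CL_DEVICE_TYPE_CPU. */
#define DEVICE_VENDOR VENDOR_AMD_PLATFORM_CPU
#else
#define DEVICE_VENDOR VENDOR_UNKNOWN
#endif

/* Returns the vendor code chosen at compile time. */
int device_vendor(void)
{
    return DEVICE_VENDOR;
}
```

Kernel code could then branch with #if DEVICE_VENDOR == VENDOR_AMD_GPU. When none of the vendor macros is defined, for instance with a plain host C compiler, the code falls back to VENDOR_UNKNOWN.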

While going through amdocl64.dll, I noticed the following:

-cl-std=CL2.0
#define __clang__ 1
#define __clang_major__ 3
#define __clang_minor__ 6
#define __ENDIAN_LITTLE__ 1
#define __SPIR32 1
#define __SPIR32__ 1
#define __STDC__ 1
#define __STDC_HOSTED__ 1
#define __STDC_VERSION__ 199901L
#define __STDC_UTF_16__ 1
#define __STDC_UTF_32__ 1
#define __OPENCL_C_VERSION__ 200
#define __OPENCL_VERSION__ 200
-Wf,--force_disable_spir
-fno-lib-no-inline
-fno-sc-keep-calls
-fno-enable-dump
-cl-internal-kernel
-cl-std=CL
-cl-std=CL1.2
-just-kernel=
-DFP_FAST_FMAF=1
-DFP_FAST_FMA=1
-cl-denorms-are-zero
cl-kernel-arg-info
-fno-bin-llvmir
-fno-image-support
-mfast-fmaf
-mfast-fma kernel-arg-alignment

Note that neither __GPU__ nor __Fiji__ is found in this DLL. Otherwise, it looks like a bunch of interesting options. Not all of them work, and some of them likely need to be prefixed with a -.



Source: https://stackoverflow.com/questions/34244673/detect-opencl-device-vendor-in-kernel-code
