Enable code indexing of CUDA in CLion

Posted by ぃ、小莉子 on 2019-11-30 07:13:43

Right-click the file in the Project tool window -> Associate with File Type -> C++

However, CLion does not officially support CUDA yet, so it cannot parse CUDA syntax.

First, make sure you tell CLion to treat .cu and .cuh files as C++ using the File Types settings menu.

CLion is not able to parse CUDA's language extensions, but it does provide a preprocessor macro, __JETBRAINS_IDE__, that is defined only when CLion is parsing the code. You can use this to implement almost complete CUDA support yourself.

Much of the problem is that CLion's parser is derailed by keywords like __host__ or __device__, causing it to fail to do things it otherwise knows how to do:

CLion has failed to understand Dtype in this example, because the CUDA stuff confused its parsing.

The most minimal solution to this problem is to give CLion preprocessor macros that make it ignore the new keywords, fixing the worst of the brokenness:

#ifdef __JETBRAINS_IDE__
    #define __host__
    #define __device__
    #define __shared__
    #define __constant__
    #define __global__
#endif

This fixes the above example:
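
As a minimal sketch of what the indexer then sees, here is a templated kernel-shaped function with the qualifiers stubbed out. The names scale and scale_all are hypothetical, and the `!defined(__CUDACC__)` half of the guard is an assumption of this sketch (the answer above tests only __JETBRAINS_IDE__); it is there so the same file also compiles with a plain host C++ compiler.

```cpp
#include <cassert>
#include <cstddef>

// Stub the CUDA qualifiers whenever a real CUDA compiler is not in charge.
// Assumption of this sketch: the extra `!defined(__CUDACC__)` test, which
// lets a plain host compiler build the file as well as the IDE's parser.
#if defined(__JETBRAINS_IDE__) || !defined(__CUDACC__)
    #define __host__
    #define __device__
    #define __global__
#endif

// Hypothetical helper; in a real CUDA build it is callable from host and device.
template <typename Dtype>
__host__ __device__ Dtype scale(Dtype value, Dtype factor) {
    return value * factor;
}

// Hypothetical kernel-shaped function. With the qualifiers expanded to
// nothing it is an ordinary function template, so Dtype resolves normally.
// It loops serially because no thread-index builtins are stubbed yet.
template <typename Dtype>
__global__ void scale_all(Dtype* data, std::size_t n, Dtype factor) {
    assert(data != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = scale(data[i], factor);
    }
}
```

With the macros empty, scale_all can be called like any host function, which is roughly how the indexer treats it.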

However, CUDA functions like __syncthreads and __popc will still fail to index. So will CUDA builtins like threadIdx. One option is to provide endless preprocessor macros (or even struct definitions) for these, but that's ugly and sacrifices type-safety.

If you're using Clang's CUDA frontend, you can do better. Clang implements the implicitly-defined CUDA builtins by defining them in headers, which it then includes when compiling your code. These provide definitions of things like threadIdx. By pretending to be the CUDA compiler's preprocessor and including device_functions.h, we can get __popc and friends to work, too:

#ifdef __JETBRAINS_IDE__
    #define __host__
    #define __device__
    #define __shared__
    #define __constant__
    #define __global__

    // This is slightly mental, but gets it to properly index device function calls like __popc and whatever.
    #define __CUDACC__
    #include <device_functions.h>

    // These headers are all implicitly present when you compile CUDA with clang. CLion doesn't know that, so
    // we include them explicitly to make the indexer happy. Doing this when you actually build is, obviously,
    // a terrible idea :D
    #include <__clang_cuda_builtin_vars.h>
    #include <__clang_cuda_intrinsics.h>
    #include <__clang_cuda_math_forward_declares.h>
    #include <__clang_cuda_complex_builtins.h>
    #include <__clang_cuda_cmath.h>
#endif // __JETBRAINS_IDE__

This will get you perfect indexing of virtually all CUDA code. CLion even gracefully copes with <<<...>>> syntax. It puts a little red line under one character on each end of the launch block, but otherwise treats it as a function call - which is perfectly fine:
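
If the red underlines bother you, one workaround that is not part of the answer above is to hide the launch configuration behind a macro. This is only a sketch: KERNEL_ARGS, fill_kernel and launch_fill are hypothetical names, and the `!defined(__CUDACC__)` test is again an assumption added so the file also builds with a plain host compiler.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__JETBRAINS_IDE__) || !defined(__CUDACC__)
    #define __global__
    // The indexer (or a plain host compiler) sees an ordinary function call.
    #define KERNEL_ARGS(grid, block)
#else
    // A real CUDA compiler sees the usual launch configuration.
    #define KERNEL_ARGS(grid, block) <<<(grid), (block)>>>
#endif

// Hypothetical kernel: writes `value` into every element of `data`.
__global__ void fill_kernel(int* data, std::size_t n, int value) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = value;
    }
}

// Hypothetical wrapper. In a real CUDA build `data` must point to device
// memory; with the stubbed macro this is a plain host call.
inline void launch_fill(int* data, std::size_t n, int value) {
    assert(data != nullptr || n == 0);
    fill_kernel KERNEL_ARGS(1, 1)(data, n, value);
}
```

The trade-off is that launches no longer look like standard CUDA, so whether this is worth it over two small red marks is a matter of taste.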

Thanks! I added more "fake" declarations to allow CLion to parse CUDA better:

#ifdef __JETBRAINS_IDE__
#define __CUDACC__ 1
#define __host__
#define __device__
#define __global__
#define __forceinline__
#define __shared__
inline void __syncthreads() {}
inline void __threadfence_block() {}
template<class T> inline T __clz(const T val) { return val; }
struct __cuda_fake_struct { int x; };
extern __cuda_fake_struct blockDim;
extern __cuda_fake_struct threadIdx;
extern __cuda_fake_struct blockIdx;
#endif

I've expanded upon this answer using the method found in this answer to provide a more comprehensive parsing macro. The .x, .y and .z members now work without issue, and gridDim is available. In addition, I've updated the list to include most intrinsics and values found in the CUDA 8.0 documentation. Note that this should be fully compatible with C++, and maybe with C. It does not account for every function: atomics, math functions (just include math.h for most), texture, surface, timing, warp vote and shuffle, assertion, launch bounds, and video functions are missing.

#ifdef __JETBRAINS_IDE__
    #include "math.h"
    #define __CUDACC__ 1
    #define __host__
    #define __device__
    #define __global__
    #define __noinline__
    #define __forceinline__
    #define __shared__
    #define __constant__
    #define __managed__
    #define __restrict__  
    // CUDA Synchronization
    inline void __syncthreads() {}
    inline void __threadfence_block() {}
    inline void __threadfence() {}
    inline void __threadfence_system() {}
    inline int __syncthreads_count(int predicate) { return predicate; }
    inline int __syncthreads_and(int predicate) { return predicate; }
    inline int __syncthreads_or(int predicate) { return predicate; }
    template<class T> inline T __clz(const T val) { return val; }
    template<class T> inline T __ldg(const T* address) { return *address; }
    // CUDA TYPES
    typedef unsigned char uchar;
    typedef unsigned short ushort;
    typedef unsigned int uint;
    typedef unsigned long ulong;
    typedef unsigned long long ulonglong;
    typedef long long longlong;

    typedef struct uchar1{
        uchar x;
    }uchar1;

    typedef struct uchar2{
        uchar x;
        uchar y;
    }uchar2;

    typedef struct uchar3{
        uchar x;
        uchar y;
        uchar z;
    }uchar3;

    typedef struct uchar4{
        uchar x;
        uchar y;
        uchar z;
        uchar w;
    }uchar4;

    typedef struct char1{
        char x;
    }char1;

    typedef struct char2{
        char x;
        char y;
    }char2;

    typedef struct char3{
        char x;
        char y;
        char z;
    }char3;

    typedef struct char4{
        char x;
        char y;
        char z;
        char w;
    }char4;

    typedef struct ushort1{
        ushort x;
    }ushort1;

    typedef struct ushort2{
        ushort x;
        ushort y;
    }ushort2;

    typedef struct ushort3{
        ushort x;
        ushort y;
        ushort z;
    }ushort3;

    typedef struct ushort4{
        ushort x;
        ushort y;
        ushort z;
        ushort w;
    }ushort4;

    typedef struct short1{
        short x;
    }short1;

    typedef struct short2{
        short x;
        short y;
    }short2;

    typedef struct short3{
        short x;
        short y;
        short z;
    }short3;

    typedef struct short4{
        short x;
        short y;
        short z;
        short w;
    }short4;

    typedef struct uint1{
        uint x;
    }uint1;

    typedef struct uint2{
        uint x;
        uint y;
    }uint2;

    typedef struct uint3{
        uint x;
        uint y;
        uint z;
    }uint3;

    typedef struct uint4{
        uint x;
        uint y;
        uint z;
        uint w;
    }uint4;

    typedef struct int1{
        int x;
    }int1;

    typedef struct int2{
        int x;
        int y;
    }int2;

    typedef struct int3{
        int x;
        int y;
        int z;
    }int3;

    typedef struct int4{
        int x;
        int y;
        int z;
        int w;
    }int4;

    typedef struct ulong1{
        ulong x;
    }ulong1;

    typedef struct ulong2{
        ulong x;
        ulong y;
    }ulong2;

    typedef struct ulong3{
        ulong x;
        ulong y;
        ulong z;
    }ulong3;

    typedef struct ulong4{
        ulong x;
        ulong y;
        ulong z;
        ulong w;
    }ulong4;

    typedef struct long1{
        long x;
    }long1;

    typedef struct long2{
        long x;
        long y;
    }long2;

    typedef struct long3{
        long x;
        long y;
        long z;
    }long3;

    typedef struct long4{
        long x;
        long y;
        long z;
        long w;
    }long4;

    typedef struct ulonglong1{
        ulonglong x;
    }ulonglong1;

    typedef struct ulonglong2{
        ulonglong x;
        ulonglong y;
    }ulonglong2;

    typedef struct ulonglong3{
        ulonglong x;
        ulonglong y;
        ulonglong z;
    }ulonglong3;

    typedef struct ulonglong4{
        ulonglong x;
        ulonglong y;
        ulonglong z;
        ulonglong w;
    }ulonglong4;

    typedef struct longlong1{
        longlong x;
    }longlong1;

    typedef struct longlong2{
        longlong x;
        longlong y;
    }longlong2;

    typedef struct float1{
        float x;
    }float1;

    typedef struct float2{
        float x;
        float y;
    }float2;

    typedef struct float3{
        float x;
        float y;
        float z;
    }float3;

    typedef struct float4{
        float x;
        float y;
        float z;
        float w;
    }float4;  

    typedef struct double1{
        double x;
    }double1;

    typedef struct double2{
        double x;
        double y;
    }double2;

    typedef uint3 dim3;

    extern dim3 gridDim;
    extern uint3 blockIdx;
    extern dim3 blockDim;
    extern uint3 threadIdx;
    extern int warpSize;
#endif
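
The long run of near-identical structs could also be generated, and dim3 could be given the constructor that real CUDA code relies on. The following is a minimal sketch of that idea, not part of the answer above: FAKE_CUDA_VEC is a hypothetical helper macro, and only three element types are shown. In a real project it would sit inside the same `#ifdef __JETBRAINS_IDE__` guard; the guard is left out here only so the sketch can be compiled on its own.

```cpp
#include <cassert>

// Hypothetical helper macro: expands to the 1- to 4-component structs for one
// element type, e.g. FAKE_CUDA_VEC(float, float) defines float1..float4.
#define FAKE_CUDA_VEC(T, name)          \
    struct name##1 { T x; };            \
    struct name##2 { T x, y; };         \
    struct name##3 { T x, y, z; };      \
    struct name##4 { T x, y, z, w; };

FAKE_CUDA_VEC(int, int)
FAKE_CUDA_VEC(unsigned int, uint)
FAKE_CUDA_VEC(float, float)

#undef FAKE_CUDA_VEC

// Stand-in for CUDA's dim3: unspecified components default to 1, so
// declarations such as `dim3 grid(16, 8);` parse as they do under nvcc.
struct dim3 {
    unsigned int x, y, z;
    dim3(unsigned int x_ = 1, unsigned int y_ = 1, unsigned int z_ = 1)
        : x(x_), y(y_), z(z_) {}
};
```

With `typedef uint3 dim3;`, a declaration like `dim3 grid(16, 8);` is rejected before C++20 because a plain aggregate has no such constructor, which is why a separate struct is sketched here.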

If you want CLion to parse all your .cu files as .cpp, or as any other supported file type, you can do this:

  1. Go to File -> Settings -> Editor -> File Types
  2. Select the file type you want them to be parsed as in the first column (.cpp)
  3. Click the plus sign of the second column and enter *.cu
  4. Press Apply, and CLion will parse all your .cu files as if they were the file type you selected in the first column (.cpp)

You can see more documentation here

I've found that CLion seems to code-index all build targets, not just the target you've selected to build. My strategy has been to make .cpp symbolic links to my .cu files and add a child CLion/CMake C++ build target (for indexing only) that references those .cpp links. This approach appears to work on small CUDA/Thrust C++11 projects in CLion 2017.3.3 on Ubuntu 16.04.3.

I do this by:

  • register the .cu/.cuh files with CLion, as in the other answers
  • add the CUDA/CLion macro voodoo to your .cu files, as in the other answers (the position of the voodoo may be important, but I haven't run into any trouble yet)
  • make .cpp/.hpp symbolic links to your .cu/.cuh files in your project directory
  • make a new folder named clionShadow containing a single file, clionShadow/CMakeLists.txt, with these contents:
cmake_minimum_required(VERSION 3.9)
project(cudaNoBuild)
set(CMAKE_CXX_STANDARD 11)
add_executable(cudaNoBuild ../yourcudacode.cpp ../yourcudacode.hpp)
target_include_directories(cudaNoBuild PUBLIC ${CUDA_INCLUDE_DIRS})
  • pull clionShadow/CMakeLists.txt in at the end of your main CMakeLists.txt with a line like this:
add_subdirectory(clionShadow)

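Now, CLion parses and code-indexes .cu files 'through' the .cpp files. (Note that ${CUDA_INCLUDE_DIRS} is only populated if find_package(CUDA) has been called earlier in the main CMakeLists.txt.)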

Remember, the cudaNoBuild target is not for building: it uses the C++ toolchain, which won't work. If you suddenly get compilation errors, check CLion's build target settings. I've noticed that it sometimes mixes and matches the current build settings between the projects. In that case, go to the Edit Configurations dialog under the Run menu and make sure CLion has not changed the target executable to the one from the cudaNoBuild target.

Edit: Gah! Upon rebuilding the CMake and IDE caches after an update to CLion 2017.3.3, things are not really working the way they did before. Indexing only works for .cpp files and breakpoints only work for .cu files.
