Using Keras & Tensorflow with AMD GPU

Asked by 不思量自难忘° on 2020-12-07 08:22

I'm starting to learn Keras, which I believe is a layer on top of Tensorflow and Theano. However, I only have access to AMD GPUs, such as the AMD R9 280X.

How can I make use of GPU acceleration through Keras and TensorFlow with an AMD GPU?

8 Answers
  • 2020-12-07 08:49

    TensorFlow 1.3 is now supported on the AMD ROCm stack:

    • https://github.com/ROCmSoftwarePlatform/tensorflow

    A pre-built docker image has also been posted publicly:

    • https://hub.docker.com/r/rocm/tensorflow/
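    Once inside the container, a quick sanity check (a minimal sketch, assuming the TensorFlow 1.x API that ships in the image) is to list the devices TensorFlow has registered; the ROCm GPU should appear with device_type "GPU":

    # Inside the rocm/tensorflow container: verify the AMD GPU is visible.
    from tensorflow.python.client import device_lib

    # Look for an entry whose device_type is "GPU".
    for device in device_lib.list_local_devices():
        print(device.name, device.device_type)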
  • 2020-12-07 08:51

    Theano does have support for OpenCL, but it is still in its early stages. The Theano developers themselves are not prioritizing OpenCL and rely on community support.

    Most of the operations are already implemented; what remains is largely tuning and optimizing them.

    To use the OpenCL backend you have to build libgpuarray yourself.

    From personal experience I can tell you that you will get CPU-level performance if you are lucky. The memory allocation seems to be implemented very naively (so computation is slow) and will crash when it runs out of memory. But I encourage you to try it, and maybe even to optimize the code or help by reporting bugs.
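    Once libgpuarray is built, a minimal sketch for pointing Theano at an OpenCL device (assuming the gpuarray backend's "opencl<platform>:<device>" naming scheme) looks like this:

    import os

    # Must be set before Theano is imported; "opencl0:0" selects the
    # first device on the first OpenCL platform.
    os.environ["THEANO_FLAGS"] = "device=opencl0:0,floatX=float32"

    import theano
    print(theano.config.device)  # should report the OpenCL device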

  • 2020-12-07 08:53

    Technically you can, if you use something like OpenCL, but Nvidia's CUDA is much better supported and OpenCL requires extra steps that may or may not work. If you have an AMD GPU, I would recommend using something like Google Colab, which provides a free Nvidia GPU you can use while coding.
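    As a quick check once you've enabled a GPU runtime in Colab (a minimal sketch, assuming a TensorFlow 1.x runtime), you can confirm the free GPU is actually attached:

    import tensorflow as tf

    # Returns something like "/device:GPU:0" when a GPU runtime is
    # attached, or an empty string when running on CPU only.
    print(tf.test.gpu_device_name())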

  • 2020-12-07 08:58

    This is an old question, but since I spent the last few weeks trying to figure it out on my own:

    1. OpenCL support for Theano is hit and miss. They added a libgpuarray back-end which appears to still be buggy (i.e., the process runs on the GPU but the answer is wrong: around 8% accuracy on MNIST for a DL model that gets ~95%+ accuracy on CPU or Nvidia CUDA). Also, because ~50-80% of the performance boost on the Nvidia stack now comes from the cuDNN libraries, OpenCL will just be left in the dust. (SEE BELOW!) :)
    2. ROCm appears to be very cool, but the documentation (and even a clear statement of what ROCm is and what it does) is hard to understand. They're doing their best, but they're 4+ years behind. It does NOT work on an RX 550 (as of this writing), so don't waste your time (this is where one of the weeks went). At first it looks as if ROCm is an addition to the driver set (replacing AMDGPU-Pro, or augmenting it), but it is in fact a kernel module and a set of libraries that essentially replace AMDGPU-Pro. (Think of it as roughly the equivalent of the Nvidia-381 driver plus the CUDA libraries.) https://rocm.github.io/dl.html (Honestly, I still haven't tested the performance or tried to get it to work with more recent Mesa drivers; I will do that sometime.)
    3. Add MIOpen to ROCm, and that is essentially their cuDNN. They also have some pretty clear migration guides. But better yet:
    4. They created "HIP", an automagical translator from CUDA/cuDNN to MIOpen. It seems to work pretty well, since they lined the APIs up directly to be translatable. There are concepts that aren't perfect maps, but in general it looks good.

    Now, finally, after 3-4 weeks of trying to figure out OpenCL etc., I found this tutorial to help you get started quickly. It is a step-by-step guide to getting hipCaffe up and running. Unlike with Nvidia, though, please ensure you have supported hardware: https://rocm.github.io/hardware.html. Think you can get it working without their supported hardware? Good luck; you've been warned. Once you have ROCm up and running (AND have run the verification tests), here is the hipCaffe tutorial. If you got ROCm up, you'll be doing an MNIST validation test within 10 minutes: https://rocm.github.io/ROCmHipCaffeQuickstart.html

  • 2020-12-07 09:01

    I'm writing an OpenCL 1.2 backend for Tensorflow at https://github.com/hughperkins/tensorflow-cl

    This fork of tensorflow for OpenCL has the following characteristics:

    • it targets any/all OpenCL 1.2 devices. It doesn't need OpenCL 2.0, doesn't need SPIR-V or SPIR, doesn't need Shared Virtual Memory, and so on
    • it's based on an underlying library called 'cuda-on-cl', https://github.com/hughperkins/cuda-on-cl
      • cuda-on-cl aims to be able to take any NVIDIA® CUDA™ source code and compile it for OpenCL 1.2 devices. It's a very general goal, and a very general compiler
    • for now, the following functionalities are implemented:
      • per-element operations, using Eigen over OpenCL, (more info at https://bitbucket.org/hughperkins/eigen/src/eigen-cl/unsupported/test/cuda-on-cl/?at=eigen-cl )
      • blas / matrix-multiplication, using Cedric Nugteren's CLBlast https://github.com/cnugteren/CLBlast
      • reductions, argmin, argmax, again using Eigen, as per earlier info and links
      • learning, trainers, gradients. At least the StochasticGradientDescent trainer is working; the others are committed, but not yet tested
    • it is developed on Ubuntu 16.04 (using Intel HD5500, and NVIDIA GPUs) and Mac Sierra (using Intel HD 530, and Radeon Pro 450)

    This is not the only OpenCL fork of TensorFlow available. There is also a fork being developed by Codeplay https://www.codeplay.com , using ComputeCpp, https://www.codeplay.com/products/computesuite/computecpp . Their fork has stronger requirements than my own, as far as I know, in terms of which specific GPU devices it works on. You would need to check the Platform Support Notes (at the bottom of the ComputeCpp page) to determine whether your device is supported. The Codeplay fork is actually an official Google fork, which is here: https://github.com/benoitsteiner/tensorflow-opencl
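    If the fork builds on your machine, using it should look like stock TensorFlow (a minimal sketch, assuming the fork exposes the OpenCL device as an ordinary TensorFlow device):

    import tensorflow as tf

    # A per-element op plus a matmul: the kinds of kernels the fork
    # implements via Eigen-over-OpenCL and CLBlast.
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
    c = tf.matmul(a + 1.0, b)

    # log_device_placement shows which device each op actually ran on.
    with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
        print(sess.run(c))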

  • 2020-12-07 09:11

    One can use an AMD GPU via the PlaidML Keras backend.

    Fastest: PlaidML is often 10x faster (or more) than popular platforms (like TensorFlow CPU) because it supports all GPUs, independent of make and model. PlaidML accelerates deep learning on AMD, Intel, NVIDIA, ARM, and embedded GPUs.

    Easiest: PlaidML is simple to install and supports multiple frontends (Keras and ONNX currently).

    Free: PlaidML is completely open source and doesn't rely on any vendor libraries with proprietary and restrictive licenses.

    For most platforms, getting started with accelerated deep learning is as easy as running a few commands (assuming you have Python (v2 or v3) installed):

    virtualenv plaidml
    source plaidml/bin/activate
    pip install plaidml-keras plaidbench
    

    Choose which accelerator you'd like to use (many computers, especially laptops, have multiple):

    plaidml-setup
    

    Next, try benchmarking MobileNet inference performance:

    plaidbench keras mobilenet
    

    Or, try training MobileNet:

    plaidbench --batch-size 16 keras --train mobilenet
    

    To use it with Keras, set the backend before importing Keras:

    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    
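    Putting it together, a minimal training sketch with the PlaidML backend (the toy model and data here are just for illustration):

    import os
    # Must be set before Keras is imported.
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"

    import numpy as np
    import keras
    from keras.layers import Dense

    # Random toy data: 256 samples, 32 features, 10 classes.
    x = np.random.rand(256, 32).astype("float32")
    y = keras.utils.to_categorical(np.random.randint(10, size=256), 10)

    # Training runs on whichever device you picked with plaidml-setup.
    model = keras.models.Sequential([
        Dense(64, activation="relu", input_shape=(32,)),
        Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    model.fit(x, y, epochs=1, batch_size=16)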

    For more information:

    https://github.com/plaidml/plaidml

    https://github.com/rstudio/keras/issues/205#issuecomment-348336284
