How to compile TensorFlow with SSE4.2 and AVX instructions?

南笙 2020-11-22 04:14

This is the message received from running a script to check if Tensorflow is working:

I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUD         


        
12 answers
  •  死守一世寂寞
    2020-11-22 04:50

    Let's start by explaining why you see these warnings in the first place.


    Most likely you did not build TF from source and instead installed it with something like pip install tensorflow. That means you installed binaries pre-built by someone else, which were not optimized for your architecture. The warnings tell you exactly this: an instruction set is available on your CPU, but it will not be used because the binary was not compiled with it. Here is the relevant part of the documentation:

    TensorFlow checks on startup whether it has been compiled with the optimizations available on the CPU. If the optimizations are not included, TensorFlow will emit warnings, e.g. AVX, AVX2, and FMA instructions not included.

    The good news is that you most likely just want to learn or experiment with TF, so everything will work properly and you should not worry about it.


    What are SSE4.2 and AVX?

    Wikipedia has a good explanation of SSE4.2 and AVX. This knowledge is not required to be good at machine learning. You can think of them as sets of additional instructions that let a CPU apply a single instruction to multiple data points at once, for operations that parallelize naturally (for example, adding two arrays).

    Both SSE and AVX are implementations of the abstract idea of SIMD (single instruction, multiple data), which is

    a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously. Thus, such machines exploit data level parallelism, but not concurrency: there are simultaneous (parallel) computations, but only a single process (instruction) at a given moment

    This is enough to answer your next question.


    How do SSE4.2 and AVX improve CPU computations for TF tasks?

    They allow a more efficient computation of various vector (matrix/tensor) operations. You can read more in these slides
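To make the efficiency gain concrete, here is a conceptual sketch in plain Python. It does not use real SIMD instructions; it only counts how many "instructions" an element-wise addition would need if each one handled a single pair (scalar) versus a whole group of lanes (SIMD). The function names and the instruction counter are hypothetical, introduced only for illustration. The lane count of 8 assumes 32-bit floats in a 256-bit AVX register; a 128-bit SSE register would hold 4.

```python
def scalar_add(a, b):
    """Add two lists one pair at a time: one 'instruction' per element."""
    out = []
    instructions = 0
    for x, y in zip(a, b):
        out.append(x + y)
        instructions += 1
    return out, instructions


def simd_add(a, b, lanes=8):
    """Model of a SIMD add: one 'instruction' handles up to `lanes` pairs.

    This is only a counting model. Real SIMD happens in hardware, in code
    the compiler generated with flags such as -mavx.
    """
    out = []
    instructions = 0
    for i in range(0, len(a), lanes):
        chunk_a = a[i:i + lanes]
        chunk_b = b[i:i + lanes]
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return out, instructions


a = [float(i) for i in range(16)]
b = [1.0] * 16

scalar_result, scalar_count = scalar_add(a, b)
simd_result, simd_count = simd_add(a, b, lanes=8)

# Same result, fewer modeled instructions in the SIMD version.
print(scalar_result == simd_result, scalar_count, simd_count)
```

The results are identical; only the number of modeled instructions differs. That is the kind of saving a binary compiled with these instruction sets can get on vector and matrix operations.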


    How to make Tensorflow compile using the two libraries?

    You need a binary that was compiled to take advantage of these instructions. The easiest way is to compile it yourself. As Mike and Yaroslav suggested, you can use the following bazel command:

    bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package
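Passing a flag for an instruction set your CPU lacks can produce a binary that crashes with an illegal-instruction error, so it can help to pick the flags from what the CPU actually reports. The sketch below is not part of TensorFlow or bazel; the helper names are hypothetical. It assumes an x86 Linux machine, where /proc/cpuinfo contains a "flags" line with names such as sse4_2, avx, avx2 and fma, and it maps those to the matching GCC options used in the command above. It leaves out --config=cuda, -k and -mfpmath=both, which you would add yourself if you need them.

```python
# Mapping from /proc/cpuinfo flag names (x86 Linux) to GCC options.
FLAG_TO_COPT = {
    "sse4_2": "-msse4.2",
    "avx": "-mavx",
    "avx2": "-mavx2",
    "fma": "-mfma",
}

BUILD_TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def cpu_flags_from_cpuinfo(text):
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def bazel_copts(cpu_flags):
    """Return --copt options only for instruction sets the CPU reports."""
    return ["--copt=" + copt
            for name, copt in FLAG_TO_COPT.items()
            if name in cpu_flags]


def build_command(cpu_flags):
    """Assemble a bazel build command line (hypothetical helper)."""
    parts = ["bazel", "build", "-c", "opt"]
    parts += bazel_copts(cpu_flags)
    parts.append(BUILD_TARGET)
    return " ".join(parts)


if __name__ == "__main__":
    # Sample text standing in for the contents of /proc/cpuinfo.
    sample = "processor\t: 0\nflags\t\t: fpu sse sse2 sse4_1 sse4_2 avx fma\n"
    print(build_command(cpu_flags_from_cpuinfo(sample)))
```

On a real machine you would read the text with open("/proc/cpuinfo").read() instead of using the sample string.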
