First, you have to have CUDA installed. It was not obvious to me that it comes with:
$ sudo equo install nvidia-cuda-toolkit
After installing nvidia-cuda-toolkit, we have to tell Linux where the CUDA binaries and libraries are:
$ nano /home/$USER/.bashrc
Add the following lines:
export PATH="/usr/local/cuda-8.0/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH"
In my case, CUDA is located in /opt/cuda, so that replaces /usr/local/cuda-8.0. Save the file, then reload it:
$source /home/$USER/.bashrc
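To confirm that the new variables were actually picked up, something like the following minimal sketch can help. It only inspects the environment; the helper name find_cuda_entries is made up for illustration, and matching on the substring "cuda" is an assumption that fits paths such as /opt/cuda or /usr/local/cuda-8.0.

```python
import os
import shutil


def find_cuda_entries(env=None):
    """Return the PATH and LD_LIBRARY_PATH entries that mention 'cuda'.

    Hypothetical helper: matching on the substring 'cuda' is an assumption
    that suits install locations like /opt/cuda or /usr/local/cuda-8.0.
    """
    env = os.environ if env is None else env
    found = {}
    for var in ("PATH", "LD_LIBRARY_PATH"):
        entries = [p for p in env.get(var, "").split(os.pathsep) if p]
        found[var] = [p for p in entries if "cuda" in p.lower()]
    return found


if __name__ == "__main__":
    for var, entries in find_cuda_entries().items():
        print(var, "->", entries or "no CUDA entry found")
    # shutil.which looks nvcc up on PATH, much like the shell does
    print("nvcc:", shutil.which("nvcc") or "not found on PATH")
```

If nvcc is reported as not found, the export lines in .bashrc most likely point at the wrong directory, or the file has not been sourced in the current shell.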
Now you have nvcc.
We can test that it works with a simple program (see the comment "Compile and run a CUDA hello world").
You can also run
$nvcc -V
to see the version and everything.
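If you want the version inside a script or notebook rather than on the terminal, a rough sketch could look like this. The function names are made up for illustration, and the parsing assumes that the nvcc -V output contains a fragment of the form "release 8.0"; if your toolkit prints something different, the regular expression needs adjusting.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Extract the release number (e.g. '8.0') from `nvcc -V` output.

    Assumes the output contains text like 'release 8.0'.
    """
    match = re.search(r"release\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def nvcc_release():
    """Run `nvcc -V`; return the release string, or None if nvcc is unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "-V"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # nvcc is missing from PATH or exited with an error
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA release:", nvcc_release() or "nvcc not found")
```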
I think at some point I did:
export CUDA_VISIBLE_DEVICES=1
but then I added it to my notebook as:
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
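The same notebook lines can be wrapped in a small helper. This is only a sketch: select_gpus is a hypothetical name, and the returned flag reflects the usual advice that these variables should be set before TensorFlow is imported, so that they are in place when CUDA is initialised.

```python
import os
import sys


def select_gpus(device_ids):
    """Restrict CUDA to the given device indices via environment variables.

    Hypothetical helper. Returns True if TensorFlow had not been imported
    yet when the variables were set, False otherwise.
    """
    # Number devices by PCI bus ID, as in the notebook lines above
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return "tensorflow" not in sys.modules


if __name__ == "__main__":
    if not select_gpus([0]):
        print("TensorFlow was already imported; restart the kernel to be safe")
    print("CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])
```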
Now for the fun part: it seems that TensorFlow requires not only CUDA but also cuDNN, which doesn't come with the CUDA toolkit and has to be downloaded separately from Nvidia's website (https://developer.nvidia.com/cudnn).
After a registration, a survey, and a warning about the ethical use of AI (WTF?), the installation is very simple. You download the tar file, extract it, and then run:
$ sudo cp cuda/include/cudnn*.h /opt/cuda/include
$ sudo cp -P cuda/lib64/libcudnn* /opt/cuda/lib64
$ sudo chmod a+r /opt/cuda/include/cudnn*.h /opt/cuda/lib64/libcudnn*
where /opt/cuda is the path to my CUDA installation; yours may be in /usr/local/cuda instead. Run these commands from Downloads, i.e. from the directory that contains the extracted cuda folder, not from inside it. And that's it, TensorFlow works with the GPU.
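To double-check that the copy worked, a small sketch like the one below lists the cuDNN files under the CUDA directory. The helper name find_cudnn_files is made up, and the glob patterns simply mirror the cp commands above.

```python
import glob
import os


def find_cudnn_files(cuda_home):
    """List cuDNN headers and libraries found under a CUDA install directory.

    Hypothetical helper; the patterns mirror the cp commands
    (include/cudnn*.h and lib64/libcudnn*).
    """
    headers = sorted(glob.glob(os.path.join(cuda_home, "include", "cudnn*.h")))
    libs = sorted(glob.glob(os.path.join(cuda_home, "lib64", "libcudnn*")))
    return {"headers": headers, "libs": libs}


if __name__ == "__main__":
    # /opt/cuda is the location used in this post; yours may be /usr/local/cuda
    found = find_cudnn_files("/opt/cuda")
    for kind, paths in found.items():
        print(kind, "->", paths or "nothing found")
```

If either list comes back empty, the cp commands were probably run from the wrong directory or with a different target path.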
Also to test it, you can use:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
# Pin the ops to the first GPU
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
with tf.Session() as sess:
    print(sess.run(c))
Another way to test cuda with tensorflow is:
python3 -c "import os; os.environ['TF_XLA_FLAGS'] = '--tf_xla_enable_xla_devices'; import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
or just:
python3 -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
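The one-liners above only show that TensorFlow runs, not necessarily that it sees the GPU. A more direct check, sketched here under the assumption of TensorFlow 2.1 or newer (where tf.config.list_physical_devices is available), is to list the GPUs TensorFlow has detected. The wrapper name list_gpus is made up for illustration.

```python
def list_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Assumes TensorFlow 2.1+, where tf.config.list_physical_devices exists.
    Returns an empty list if TensorFlow is not installed or finds no GPU.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

An empty result after a successful import usually points back at the CUDA or cuDNN paths set up earlier.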