Is It Still Necessary to Install Cuda Before Using the Conda Tensorflow-Gpu Package

Is it still necessary to install CUDA before using the conda tensorflow-gpu package?

Do I now have two versions of CUDA installed, and how do I check this?

No.

conda installs the bare minimum of redistributable library components required to support the CUDA-accelerated packages it offers. The package name cudatoolkit is a complete misnomer; it is nothing of the sort. Even though it is now greatly expanded in scope from what it used to be (literally 5 files -- I think at some point they must have gotten a licensing deal from NVIDIA, because some of this wasn't/isn't on the official "freely redistributable" list AFAIK), it is still basically just a handful of libraries.

You can check this for yourself:

cat /opt/miniconda3/conda-meta/cudatoolkit-10.1.168-0.json 
{
"build": "0",
"build_number": 0,
"channel": "https://repo.anaconda.com/pkgs/main/linux-64",
"constrains": [],
"depends": [],
"extracted_package_dir": "/opt/miniconda3/pkgs/cudatoolkit-10.1.168-0",
"features": "",
"files": [
"lib/cudatoolkit_config.yaml",
"lib/libcublas.so",
"lib/libcublas.so.10",
"lib/libcublas.so.10.2.0.168",
"lib/libcublasLt.so",
"lib/libcublasLt.so.10",
"lib/libcublasLt.so.10.2.0.168",
"lib/libcudart.so",
"lib/libcudart.so.10.1",
"lib/libcudart.so.10.1.168",
"lib/libcufft.so",
"lib/libcufft.so.10",
"lib/libcufft.so.10.1.168",
"lib/libcufftw.so",
"lib/libcufftw.so.10",
"lib/libcufftw.so.10.1.168",
"lib/libcurand.so",
"lib/libcurand.so.10",
"lib/libcurand.so.10.1.168",
"lib/libcusolver.so",
"lib/libcusolver.so.10",
"lib/libcusolver.so.10.1.168",
"lib/libcusparse.so",
"lib/libcusparse.so.10",
"lib/libcusparse.so.10.1.168",
"lib/libdevice.10.bc",
"lib/libnppc.so",
"lib/libnppc.so.10",
"lib/libnppc.so.10.1.168",
"lib/libnppial.so",
"lib/libnppial.so.10",
"lib/libnppial.so.10.1.168",
"lib/libnppicc.so",
"lib/libnppicc.so.10",
"lib/libnppicc.so.10.1.168",
"lib/libnppicom.so",
"lib/libnppicom.so.10",
"lib/libnppicom.so.10.1.168",
"lib/libnppidei.so",
"lib/libnppidei.so.10",
"lib/libnppidei.so.10.1.168",
"lib/libnppif.so",
"lib/libnppif.so.10",
"lib/libnppif.so.10.1.168",
"lib/libnppig.so",
"lib/libnppig.so.10",
"lib/libnppig.so.10.1.168",
"lib/libnppim.so",
"lib/libnppim.so.10",
"lib/libnppim.so.10.1.168",
"lib/libnppist.so",
"lib/libnppist.so.10",
"lib/libnppist.so.10.1.168",
"lib/libnppisu.so",
"lib/libnppisu.so.10",
"lib/libnppisu.so.10.1.168",
"lib/libnppitc.so",
"lib/libnppitc.so.10",
"lib/libnppitc.so.10.1.168",
"lib/libnpps.so",
"lib/libnpps.so.10",
"lib/libnpps.so.10.1.168",
"lib/libnvToolsExt.so",
"lib/libnvToolsExt.so.1",
"lib/libnvToolsExt.so.1.0.0",
"lib/libnvblas.so",
"lib/libnvblas.so.10",
"lib/libnvblas.so.10.2.0.168",
"lib/libnvgraph.so",
"lib/libnvgraph.so.10",
"lib/libnvgraph.so.10.1.168",
"lib/libnvjpeg.so",
"lib/libnvjpeg.so.10",
"lib/libnvjpeg.so.10.1.168",
"lib/libnvrtc-builtins.so",
"lib/libnvrtc-builtins.so.10.1",
"lib/libnvrtc-builtins.so.10.1.168",
"lib/libnvrtc.so",
"lib/libnvrtc.so.10.1",
"lib/libnvrtc.so.10.1.168",
"lib/libnvvm.so",
"lib/libnvvm.so.3",
"lib/libnvvm.so.3.3.0"
]

.....

In other words, what you get is the following (keeping in mind that most of those "files" above are just symlinks):

  • CUBLAS runtime
  • The CUDA runtime library
  • CUFFT runtime
  • CUrand runtime
  • CUsparse runtime
  • CUsolver runtime
  • NPP runtime
  • nvblas runtime
  • NVTX runtime
  • NVgraph runtime
  • NVjpeg runtime
  • NVRTC/NVVM runtime
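
To collapse the file listing above into the distinct libraries it contains, something like the following Python sketch could be used. It reads a conda-meta record and strips the version suffixes from the shared-library names. The record path is the one from the listing and will differ on other installations, and `distinct_libraries` and `libraries_in_record` are names made up for this illustration, not part of conda.

```python
import json
import re
from pathlib import Path


def distinct_libraries(files):
    """Return the sorted set of shared-library names, ignoring version suffixes.

    'lib/libcublas.so', 'lib/libcublas.so.10' and 'lib/libcublas.so.10.2.0.168'
    all collapse to 'libcublas'. Entries that are not .so files are skipped.
    """
    names = set()
    for entry in files:
        match = re.match(r"(lib[^/]+?)\.so(\.[\d.]+)?$", Path(entry).name)
        if match:
            names.add(match.group(1))
    return sorted(names)


def libraries_in_record(record_path):
    """Load a conda-meta JSON record and summarise its 'files' list."""
    with open(record_path) as handle:
        record = json.load(handle)
    return distinct_libraries(record.get("files", []))


if __name__ == "__main__":
    # Assumed path, copied from the listing above; adjust to your own install.
    record = Path("/opt/miniconda3/conda-meta/cudatoolkit-10.1.168-0.json")
    if record.exists():
        print(libraries_in_record(record))
```

Run against the record shown above, this should print one name per library rather than three entries per library, which makes the "handful of libraries" point easier to see.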

The cuDNN package that conda installs is the same redistributable binary distribution that NVIDIA distributes: exactly two files, a header file and a library.

You still need a supported NVIDIA driver installed for the TensorFlow that conda installs to work.

If you want to actually compile and build CUDA code, you need to install a separate CUDA toolkit, which contains all the development components that conda deliberately omits from its distribution.
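
One rough way to tell whether a full toolkit is present in addition to conda's runtime libraries is to look for the `nvcc` compiler driver, which ships with NVIDIA's toolkit but does not appear in the conda file list. This is a minimal sketch, assuming `nvcc` would be on `PATH` if a full toolkit were installed (it may instead live somewhere that is not on `PATH`). `find_nvcc` and `nvcc_version_text` are hypothetical helper names.

```python
import shutil
import subprocess


def find_nvcc():
    """Return the path to nvcc if it is on PATH, else None."""
    return shutil.which("nvcc")


def nvcc_version_text(nvcc_path):
    """Run 'nvcc --version' and return its raw output as text."""
    result = subprocess.run(
        [nvcc_path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    path = find_nvcc()
    if path is None:
        # Assumption: no nvcc on PATH suggests no separately installed toolkit.
        print("No nvcc on PATH: only runtime libraries (if any) are available.")
    else:
        print(nvcc_version_text(path))
```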

Is it needed to install Cuda Toolkit from NVIDIA in Win10 to use Cuda when you already have cudatoolkit package in your Anaconda environment?

I think something is wrong with your environment variables. I can recommend these steps:

  • Create a new environment

  • Install required packages:
    conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
    python3 -m pip install tensorflow

  • Verify the installation by importing tensorflow and listing the GPUs it can see:
    python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

This should work.
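
If the one-line check prints an empty list, a slightly longer script can help narrow down why. The sketch below collects what TensorFlow itself reports about its CUDA build. It assumes a TensorFlow 2.x release recent enough to provide `tf.sysconfig.get_build_info()`, and `gpu_report` is an illustrative name rather than a TensorFlow function.

```python
def gpu_report():
    """Collect what TensorFlow reports about its CUDA build and visible GPUs.

    Returns None when TensorFlow cannot be imported in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Assumption: get_build_info() is available (recent TF 2.x releases).
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # CPU-only builds may not carry these keys, hence .get().
        "cuda_version": build.get("cuda_version"),
        "cudnn_version": build.get("cudnn_version"),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(gpu_report())
```

A report with `built_with_cuda` set to False points to a CPU-only TensorFlow build, whereas a CUDA build with an empty `gpus` list more likely points to a driver or library version problem.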

I already have a CUDA toolkit installed, why is conda installing CUDA again?

  1. Why is it happening?

Conda expects to manage every package you install, along with all of its dependencies. The intention is that you never have to install anything else by hand for the packages they distribute in their own channel. If a GPU-accelerated package requires a CUDA runtime, conda will try to select and install a correctly versioned CUDA runtime for the version of the Python package it has selected for installation.


  2. How to stop conda from installing cuda and cudnn again?

You probably can't, or at least not without winding up with a non-functional TensorFlow installation. But note that what conda installs is only the necessary, correctly versioned CUDA runtime components needed to make their GPU-accelerated packages work. The one thing they don't/can't install is a GPU driver for the hardware.


  3. Can I just use cuda and cudnn that I have already installed?

You say you installed CUDA 11.2. If you look at the conda output, you can see that it wants to install a CUDA 10.2 runtime. As you are now fully aware, versioning is critical to TensorFlow, and a TensorFlow build requiring CUDA 10.2 won't work with CUDA 11.2. So even if you were to stop conda from performing the dependency installation, the version mismatch means it wouldn't work.


  4. If yes, how?

See above.
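
The version mismatch described above can be written as a simple check. This is only a sketch under the simplifying assumption that a TensorFlow build needs exactly the CUDA major.minor release it was compiled against. `cuda_release` and `runtime_matches` are illustrative names, not part of conda or TensorFlow.

```python
def cuda_release(version):
    """Reduce a version string such as '10.2.89' to its (major, minor) tuple."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def runtime_matches(required, installed):
    """True when both versions share the same major.minor CUDA release.

    Assumption: patch-level differences are harmless, while any difference
    in major or minor version counts as a mismatch.
    """
    return cuda_release(required) == cuda_release(installed)


if __name__ == "__main__":
    # The case discussed above: a build wanting 10.2 next to an installed 11.2.
    print(runtime_matches("10.2", "11.2"))
    # Two patch releases of the same 10.1 runtime.
    print(runtime_matches("10.1.168", "10.1.243"))
```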

Nvidia Cudatoolkit vs Conda Cudatoolkit

Yes: if you use Anaconda to install tensorflow-gpu, it will install CUDA and cuDNN for you in the same conda environment as tensorflow-gpu. All you need to install yourself is the latest NVIDIA driver (so that it works with the latest CUDA level and all older CUDA levels you use).

This has many advantages over the pip install tensorflow-gpu method:

  1. Anaconda will always install the CUDA and CuDNN version that the TensorFlow code was compiled to use.
  2. You can have multiple conda environments with different levels of TensorFlow, CUDA, and CuDNN and just use conda activate to switch between them.
  3. You don't have to deal with installing CUDA and cuDNN manually at the system-wide level.

The disadvantage compared to pip install tensorflow-gpu is that the latest version of TensorFlow is added to PyPI weeks before Anaconda is able to update the conda recipe and publish its builds of that version.
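
To see which CUDA runtime each environment carries, the conda-meta records can be scanned directly. The sketch below assumes the default layout, with named environments under `<conda root>/envs` and record files named `<name>-<version>-<build>.json`. Environments kept in other locations are not found, and `cudatoolkit_by_env` is a hypothetical helper name.

```python
from pathlib import Path


def cudatoolkit_by_env(conda_root):
    """Map each environment name to the cudatoolkit version in its conda-meta.

    Record files are named '<name>-<version>-<build>.json', for example
    'cudatoolkit-10.1.168-0.json'.
    """
    root = Path(conda_root)
    env_dirs = [root]
    envs = root / "envs"
    if envs.is_dir():
        env_dirs += sorted(p for p in envs.iterdir() if p.is_dir())

    found = {}
    for env in env_dirs:
        meta = env / "conda-meta"
        if not meta.is_dir():
            continue
        for record in meta.glob("cudatoolkit-*.json"):
            # Split from the right: the last two fields are version and build.
            name, version, _build = record.stem.rsplit("-", 2)
            if name != "cudatoolkit":
                continue  # skip other packages whose names share the prefix
            found["base" if env == root else env.name] = version
    return found


if __name__ == "__main__":
    # Assumed install location; replace with your own conda root.
    print(cudatoolkit_by_env("/opt/miniconda3"))
```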

Why is Tensorflow not recognizing my GPU after conda install?

August 2021: conda install may be working now. According to @ComputerScientist in the comments below, conda install tensorflow-gpu==2.4.1 will give cudatoolkit-10.1.243 and cudnn-7.6.5.

The following was written in Jan 2021 and is out of date

Currently conda install tensorflow-gpu installs tensorflow v2.3.0 and does NOT install the conda cudnn or cudatoolkit packages. Installing them manually (e.g. with conda install cudatoolkit=10.1) does not seem to fix the problem either.

A solution is to install an earlier version of tensorflow, which does install cudnn and cudatoolkit, and then upgrade with pip:

conda install tensorflow-gpu=2.1
pip install tensorflow-gpu==2.3.1

(2.4.0 uses cuda 11.0 and cudnn 8.0, however cudnn 8.0 is not in anaconda as of 16/12/2020)

Edit: please also see @GZ0's answer, which links to a github discussion with a one-line solution

Is a conda environment necessary to get GPU support for creating neural networks?

No, you can just use pip install tensorflow-gpu to install TensorFlow with GPU support. Whether you create a conda environment or not is your choice. But before using that pip command, make sure you have CUDA 11.2 and cuDNN 8.1.

For PyTorch, just go to this site, copy the install command, and run it.


