Installing Nvidia’s CUDA 8.0 on Fedora 25

UPDATE (March 28, 2017): I have found a way to use at least some of the C++11 features. See the end of the post for the changes.

It’s been a long time since I’ve posted anything. Since my last post I had to update the operating system on my work desktop as Fedora 23 went end-of-life. While it is possible to simply use dnf to upgrade your system, I have the proprietary Nvidia driver installed on my system for two reasons. First, with my graphics cards (GeForce GTX 650’s), the Nouveau driver doesn’t seem to really work. Without fail, after about 30 minutes the screen no longer refreshes, making it seem as though the system is locked up. Second, I do some GPU programming with CUDA which requires the proprietary driver. With the proprietary driver installed, it’s a bit more difficult to upgrade, so I tend to just back everything up and do a clean install.

Having done that recently, I find myself this morning needing to re-install CUDA as I have a computational problem which could benefit from some massive parallelism. I figured I’d go ahead and post the procedure here for my future reference and in case anyone else might benefit from it.

  1. Start by installing the Nvidia proprietary driver using one of the following methods:
    • Download directly from Nvidia and follow the directions here. This works well for desktops.
    • If you have a laptop with Nvidia Optimus (i.e. an integrated Intel card with a discrete Nvidia card), use Bumblebee following the directions here. To check whether this applies to you, run $ lspci | egrep "VGA|3D". If the output looks like this (from my laptop):
      $ lspci | egrep "VGA|3D"
      00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
      01:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 960M] (rev ff)

      then you’ll want to use Bumblebee.

    • If you prefer, there are repos that have the Nvidia proprietary driver such as rpmfusion and negativo17. I have no recent experience with either of these repos so I can’t say how well they will work. I will note that negativo17 also has CUDA in their repo, but again I have no experience installing in that way.
  2. Once you have installed the driver, rebooted your system, and ensured that everything is working, we can move on to installing CUDA. Download CUDA 8.0 for Fedora 23 (I know, it’s ridiculous that they don’t officially support versions of Fedora that are not end-of-life) from here, making sure to select the “runfile (local)” option.
  3. Install a few libraries that CUDA would like to have around
    $ sudo dnf install libGLU-devel libXi-devel libXmu-devel
    $ sudo dnf groupinstall "C Development Tools and Libraries"
  4. Change directory to where you downloaded CUDA and then run the install script with the --override option which will allow it to install with an unsupported version of GCC and Fedora 25 instead of 23
    $ sudo sh cuda_8.0.44_linux.run --override
  5. You will first be presented with the End User Agreement which you should read carefully (i.e. simply press and hold enter until you reach the end), then type accept.
  6. You will be warned that you are installing on an unsupported configuration and asked if you’d like to continue. Enter y to proceed.
  7. You will be asked if you’d like to install the proprietary driver, which we already did in step 1, so enter n to proceed.
  8. You will be asked if you want to install the CUDA 8.0 Toolkit, which is kind of the whole point of this, so enter y, and then use either the default location or a location of your choosing for the installation.
  9. You’ll probably want to create the symbolic link when asked, and you’ll definitely want to install the samples as they will allow us to test that the installation is working and they will provide you with lots of example code to look at. You can specify the location in which to install the samples, or use the default of your home directory. After that the install will proceed; ignore the warning about the incomplete installation.
  10. Now we need to edit our .bash_profile a bit, so open it up in your favorite text editor and add/modify the following lines
    PATH=$PATH:<Other locations you may have>:/usr/local/cuda/bin
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<Other locations you may have>:/usr/local/cuda/lib64
    CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:$HOME/NVIDIA_CUDA-8.0_Samples/common/inc
    
    export PATH
    export LD_LIBRARY_PATH
    export CPLUS_INCLUDE_PATH
  11. Once that is out of the way, you will of course have to log out and back in again or type $ source $HOME/.bash_profile for the changes to take effect.
  12. Okay, now we have to deal with the fact that Nvidia is way behind the times and doesn’t officially support any GCC version past 5. Fedora 25 currently uses 6.3.1, so we need to edit one of the CUDA header files. Open /usr/local/cuda-8.0/include/host_config.h in your editor of choice and change line 117 to something like
    #if __GNUC__ > 8

    I find the easiest thing to do is to copy the file to your home directory, make the changes and then use sudo to copy it back.

  13. Now that all of that is done, there is one more caveat. Since GCC version 5 defaulted to the C++98 standard when compiling (GCC 6 defaults to C++14), this is the behavior that CUDA expects. So, when compiling any CUDA code, you’ll need to include the --compiler-options "-std=c++98" option. This will of course preclude you from using any C++11, C++14 or C++17 features in your CUDA code (see below for an update on this).
  14. To test that everything is working, change directory to
    $ cd <PATH_TO_SAMPLES>/1_Utilities/deviceQuery

    open the makefile and change line 155 to read

    CCFLAGS := -std=c++98

    save the changes and then in the terminal type $ make to build the code, then $ ./deviceQuery (add optirun or primusrun before it if you use Bumblebee) to run the code. If you see information about the CUDA devices on your system, then everything should be working!
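If you would rather script the header edit from step 12 than do it by hand, something like the following may work. It is only a sketch: it assumes the version check in host_config.h has the form "#if __GNUC__ > 5" (look at your copy first), and patch_gcc_limit is just a name made up for this example. It keeps a backup next to the original.

```shell
#!/bin/sh
# Sketch: raise the GCC version limit in a CUDA host_config.h.
# Assumption: the check looks like "#if __GNUC__ > N" at the start of a line;
# verify this in your own copy before running anything.

# patch_gcc_limit FILE NEW_LIMIT   (hypothetical helper name)
# Saves FILE.orig as a backup, then rewrites the limit in FILE.
patch_gcc_limit() {
    file=$1
    limit=$2
    cp "$file" "$file.orig" || return 1
    # Replace only the number after "#if __GNUC__ > "
    sed "s/^\(#if __GNUC__ > \)[0-9][0-9]*/\1$limit/" "$file.orig" > "$file"
}
```

Since the header lives under /usr/local, the function has to be called from a root shell, for example as patch_gcc_limit /usr/local/cuda-8.0/include/host_config.h 8.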

Unfortunately, I have yet to find a way to make all of the samples at once using the makefile in the top samples directory. But you can build them individually in a similar manner to how we made the deviceQuery sample.
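Building the samples one at a time can be wrapped in a small loop. The sketch below rests on an assumption that has not been checked against every sample: that passing CCFLAGS on the make command line has the same effect as editing the "CCFLAGS :=" line by hand. The names build_samples and MAKE_CMD are invented for this example.

```shell
#!/bin/sh
# Sketch: build every sample on its own instead of using the top-level makefile.
# Assumption: passing CCFLAGS on the make command line has the same effect as
# editing the "CCFLAGS :=" line in each sample's Makefile by hand.

MAKE_CMD=${MAKE_CMD:-make}   # hypothetical knob so another make can be swapped in

# build_samples SAMPLES_ROOT
# Runs make in each <category>/<sample> directory that contains a Makefile
# and prints one line for every directory where the build failed.
build_samples() {
    root=$1
    for mk in "$root"/*/*/Makefile; do
        [ -f "$mk" ] || continue          # the glob matched nothing
        dir=$(dirname "$mk")
        if ! "$MAKE_CMD" -C "$dir" CCFLAGS=-std=c++98 >/dev/null 2>&1; then
            echo "FAILED: $dir"
        fi
    done
}
```

Calling build_samples "$HOME/NVIDIA_CUDA-8.0_Samples" would then list only the samples that did not build, which can be looked at individually.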

I hope that this is helpful to others. The procedure above has worked on my laptop with Bumblebee, and works on my desktop. If you try this and it works on your system, please let me know in the comments by providing the output of $ uname -r, $ lspci | grep "VGA" (or $ lspci | egrep "VGA|3D" if you use Bumblebee), and $ nvcc --version. Also, if you have problems or have to deviate from the above procedures, let us all know. Happy coding!
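To make reporting easier, the three commands can be gathered into one small script. This is just a convenience sketch; the function names (gpu_lines, section, collect_info) are made up here, and it only prints what the commands above would print, or a note when a command is missing.

```shell
#!/bin/sh
# Sketch: print the details requested above in one go.

# gpu_lines: the VGA / 3D controller lines from lspci
gpu_lines() {
    lspci | grep -E 'VGA|3D'
}

# section LABEL CMD [ARGS...]
# Prints a heading, then the output of CMD, or a note if CMD is
# unavailable or fails.
section() {
    printf '== %s\n' "$1"
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command failed)"
    else
        echo "($1 not found)"
    fi
}

# collect_info: everything asked for in the post, in order
collect_info() {
    section 'uname -r' uname -r
    section 'lspci | egrep "VGA|3D"' gpu_lines
    section 'nvcc --version' nvcc --version
}
```

Running collect_info after defining these functions prints a block that can be pasted straight into a comment.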

UPDATE ABOUT C++11:

Today I needed to compile some code to run Markov Chain Monte Carlo (MCMC) where the model is computed on the GPU to speed things up from several minutes per model calculation to ~3.5 seconds. The header-only MCMC library that I wrote uses the standard library random number generators and distributions that are part of C++11. This made me revisit the issue caused by NVIDIA being behind the times.

To access C++11 features, you can run compilation commands with:

$ nvcc -std=c++11 -arch=...

where in place of the “...” you fill in the rest of the flags you need. When I did this on my system, it complained about a couple of lines in math_functions.h, so instead of just giving up, I opened the file to dig into what was going on. The offending lines were (I believe) 8897 and 8901, which read

__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ int isnan(double x) throw();
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ int isinf(double x) throw();

If we simply modify these lines to be in the same format as the others in that block, so they read

__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ constexpr bool isnan(double x);
__DEVICE_FUNCTIONS_DECL__ __cudart_builtin__ constexpr bool isinf(double x);

then we should no longer have problems using (most of) the C++11 features! However, there is at least one exception that I have found so far thanks to my custom parameter file parsing library. If you use the standard library map, you’ll get a bunch of output like

/usr/include/c++/6.3.1/tuple:605:4: error: parameter packs not expanded with ‘...’:
                bool>::type=true>
    ^
/usr/include/c++/6.3.1/tuple:605:4: note:         ‘_Elements’

However, you seem to be able to use vectors and tuples just fine. If you make these changes and run into other things that seem to break, please let me know in the comments. I’ll try to look into this in more detail soon.
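The isnan/isinf edit described above can also be done with sed instead of by hand. The following is a sketch under the assumption that the two declarations appear in your math_functions.h exactly as quoted; it matches on the text rather than on line numbers, and patch_math_decls is a name invented for this example.

```shell
#!/bin/sh
# Sketch: apply the isnan/isinf edit described above to a copy of
# math_functions.h. Assumption: the two declarations appear exactly as quoted.

# patch_math_decls IN OUT   (hypothetical helper name)
# Reads IN and writes the edited text to OUT; IN itself is left untouched.
patch_math_decls() {
    sed -e 's/__cudart_builtin__ int isnan(double x) throw();/__cudart_builtin__ constexpr bool isnan(double x);/' \
        -e 's/__cudart_builtin__ int isinf(double x) throw();/__cudart_builtin__ constexpr bool isinf(double x);/' \
        "$1" > "$2"
}
```

As with host_config.h, it is probably easiest to run this on a copy in your home directory, check the result with diff, and then use sudo to copy the edited file back.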


23 thoughts on “Installing Nvidia’s CUDA 8.0 on Fedora 25”

  1. Thank you! This worked great for me. I’m new to Fedora after using Ubuntu for a while. I started with the negativo17 repo (following the steps here: http://blog.mdda.net/oss/2016/11/25/nvidia-on-fedora-25) but wasn’t able to get the CUDA samples installed. So I found your site and started at step 2. Everything worked perfectly. BTW, when I compiled the example, I did get a warning:

    nvcc warning : The ‘compute_20’, ‘sm_20’, and ‘sm_21’ architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

    Not sure why that happens, but seems easy enough to suppress.

    Here is the info you requested:

    uname -r
    4.9.8-201.fc25.x86_64

    lspci | grep "VGA"
    00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)

    lspci | egrep "VGA|3D"
    00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)
    01:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)

    nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2016 NVIDIA Corporation
    Built on Sun_Sep__4_22:14:01_CDT_2016
    Cuda compilation tools, release 8.0, V8.0.44


    1. Looking at the compilation command in the make file (actually what it prints to the terminal window when typing make), it generates code for all the currently supported CUDA device capabilities, including 2.0. This triggers a warning because support for those architectures will likely be dropped in the next release. Considering devices that only support that compute capability are quite old at this point, it shouldn’t be a problem.

      Glad that the guide worked out for you! Thanks for sharing the info about your setup. I’m hoping to build a list of known working configurations if enough people share. I have to admit I’m a bit jealous of your Tesla card, I’m just using gamer cards which don’t like double precision very much.


  2. Thank you! This worked for me, too!
    Before I reached this website, my installation failed several times. My X server had gone and I had to re-install my OS… Anyway, now I have CUDA Toolkit 8.0!

    Here is the information you requested.

    uname -r
    4.9.14-200.fc25.x86_64

    lspci | grep "VGA"
    00:02.0 VGA compatible controller: Intel Corporation Device 5912 (rev 04)
    01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)

    lspci | egrep "VGA|3D"
    00:02.0 VGA compatible controller: Intel Corporation Device 5912 (rev 04)
    01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)

    nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2016 NVIDIA Corporation
    Built on Tue_Jan_10_13:22:03_CST_2017
    Cuda compilation tools, release 8.0, V8.0.61


  3. Thank you for the instructions!

    Here’s what you can further do:

    * rename your ‘nvcc’ as ‘nvcc2’;

    * create a script named ‘nvcc’, with the following content:
    #!/bin/sh
    #
    nvcc2 --compiler-options="-std=c++98" "$@"

    This way, nvcc is always invoked with the c++98 option and you no longer need to modify the Makefile.

    HTH!


    1. Thanks for sharing the info! I may work that in to the post at some point, crediting you with the contribution of course, as it is a good way of making it automatic. I usually have other compiler options to pass as my host side code often leverages libraries like FFTW (transforms that are too large to fit in my 2 GB of device memory) or GSL, so adding one more flag to deal with the c++98 issue is not a big deal for me, but I hope others find this useful!


  4. I have the following problem:

    When I type into the terminal:
    ./deviceQuery
    This is what I get:

    ./deviceQuery Starting…

    CUDA Device Query (Runtime API) version (CUDART static linking)

    cudaGetDeviceCount returned 30
    -> unknown error
    Result = FAIL

    Help please :/


    1. Could you tell me a little more about your system? Are you using a laptop with NVIDIA Optimus by chance? There is a thread on the CUDA developer forum where someone on Ubuntu was getting the result you are, and they effectively seemed to have fixed it by using optirun ./deviceQuery, which is necessary on Optimus laptops. Here’s the link to that forum post:

      https://devtalk.nvidia.com/default/topic/760872/ubuntu-12-04-error-cudagetdevicecount-returned-30/


      1. I’m using a desktop system with a GTX 970.
        I already tried to run it using optirun ./deviceQuery… but it can’t find this command.


      2. Optirun will only be on your system if you installed the driver through Bumblebee, which you should not do. Could you post the output of
        lspci | egrep "VGA|3D"
        and
        glxinfo | grep OpenGL
        The first will show whether your card is detected, the second should show which driver is being used.


    2. So I fixed it now… I simply run this command with sudo 😀

      But now I have a problem with Blender which can’t render out an image.
      Anyways thank you so much for your help.


  5. Hi!! I found your guide very interesting! Thank you!
    I found a solution that maybe is working on your system too.
    I was struggling trying to install gcc 5 until I stumbled upon this link:
    https://copr.fedorainfracloud.org/coprs/leebc/compat-gcc-5/

    It’s a repo to install a compat version 5 of gcc.
    So that’s the procedure I followed:

    sudo dnf copr enable leebc/compat-gcc-5
    sudo dnf install compat-gcc5
    # then you have to force nvcc to use this compiler instead of the system default one
    # you can do it easily linking gcc5 and g++5 in the cuda bin path so…
    sudo ln -s /usr/bin/gcc5 /usr/local/cuda/bin/gcc
    sudo ln -s /usr/bin/g++5 /usr/local/cuda/bin/g++

    It worked for me and I can compile all the examples in the CUDA directory.
    Hope you will find it useful too
    Enjoy


  6. Hi,
    thanks for this great tutorial, it helps me a lot!!
    I want to add some additional information about debugging with Nsight,
    I always got the error that libncurses 5.0 is missing; installing ncurses-compat-libs solves this problem:
    # dnf install ncurses-compat-libs
    (https://ask.fedoraproject.org/en/question/89534/parrallel-install-ncurses-6-ncurses-5-on-fedora-24/)

    Nsight has a checkbox for enabling C++11 (Project->Settings->Code Generation -> Tool Settings -> Enable c++11 Support); see http://stackoverflow.com/questions/29736820/cuda-nsight-c11-ubuntu-14-04

    Another solution with the problem of math_functions.h and c++11 is treated in (http://stackoverflow.com/questions/39827168/using-cuda-8-0-with-gcc-6-x-bad-function-overloading-complaint)
    In Nsight I added the preprocessor option "-D__CORRECT_ISO_CPP11_MATH_H_PROTO" under Project->Settings->NVCC Compiler -> Tool Settings and everything works fine.


    1. Thanks for sharing the info! I usually debug with cuda-gdb on the command line, so I wasn’t aware of any potential problems with Nsight. I have also tried the suggestions of the StackOverflow answer you link and it didn’t work for me which is why I had to resort to editing header files, but hopefully it helps someone else.


  7. Dear dpearson1983,

    Many thanks for your helpful tutorial – you’ve saved me hours of headache! If there’s a way to donate please let me know.

    On my Fedora 25 workstation (running Ryzen 1700x), I ran into a little problem when compiling the ‘Simulation’ CUDA samples:

    /usr/bin/ld: cannot find -lglut
    collect2: error: ld returned 1 exit status
    Makefile:279: recipe for target ‘particles’ failed
    make: *** [particles] Error 1

    This is simply solved by installing ‘freeglut’ libraries:

    $ sudo dnf install freeglut.x86_64 freeglut-devel.x86_64

    Now for the info you requested:

    $ uname -r
    4.10.13-200.fc25.x86_64

    $ lspci | grep "VGA"
    09:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)

    $ lspci | egrep "VGA|3D"
    09:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)

    $ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2016 NVIDIA Corporation
    Built on Tue_Jan_10_13:22:03_CST_2017
    Cuda compilation tools, release 8.0, V8.0.61


  8. Great post!!! All worked for me, thanks!!!

    The only problem is that I can’t solve this error:

    /usr/include/c++/6.3.1/tuple:752:4: note: ‘_Elements’
    /usr/include/c++/6.3.1/tuple:771:4: error: parameter packs not expanded with ‘...’

    Do you know of any solution for this error?


    1. I don’t have a solution for that just yet. Interestingly, even though it points at tuple, I believe the error comes into play when you use the standard library map (e.g. if you have

      #include<map>

      in your code). The workaround I’ve used so far is to separate out the code that uses the map and compile it to an object file which I then link with the GPU code. For example, I have a custom parameter file parsing library that makes use of maps so that parameters can be called by key name. When I use it with CUDA code that needs the C++11 standard, I create a separate .cpp file that loads the parameters into a struct defined in a header file with an initializer that takes a string (the parameter file name). The .cpp file then includes the code that uses map to load the parameters and simply passes them to the struct. This hides the map from CUDA and seems to prevent the above error. I hope this helps!


  9. Thank you!

    uname -r
    4.10.17-200.fc25.x86_64

    lspci | grep "VGA"
    01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2)

    nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2016 NVIDIA Corporation
    Built on Tue_Jan_10_13:22:03_CST_2017
    Cuda compilation tools, release 8.0, V8.0.61


  10. Great instructions. Worked perfectly! This never goes so well. Cheers to you!

    4.11.3-200.fc25.x86_64

    00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)
    03:00.0 3D controller: NVIDIA Corporation GF117M [GeForce 610M/710M/810M/820M / GT 620M/625M/630M/720M] (rev a1)


    1. That isn’t so much a step as a note on how to compile your programs. When you invoke nvcc from the command line you will need to pass that option. Essentially
      $ nvcc -arch=sm_52 --compiler-options "-std=c++98" YOUR_CUDA_CODE.cu -o YOUR_EXECUTABLE_NAME
      The -arch option should match your card’s compute capability. You can also pass any other g++ compiler options within the quotes. This is something that you have to do every time you compile CUDA code.

      I hope this clears things up!

      Cheers,
      David


  11. My approach for Fedora 26 was different. I simply compiled gcc 5.4.0 from source and then used the nvcc flag -ccbin=/usr/local/gcc-5.4.0/bin/g++. This worked for all sample programs, as well as for compiling TensorFlow, PyTorch, and Caffe.

