GROMACS 5.0.1, nVidia CUDA Toolkit, And FFTW3 Under Ubuntu 14.04 LTS (64-bit); The Virtues Of VirtualBox

Summarized below are the catches and fixes from a recent effort to build GROMACS 5.0.1 with FFTW3 (single- and double-precision) and GPU support (so, single-precision). Also included is a trick I’ve been using with great success lately: a virtual machine to keep my real machine as clean as possible.

0. The Virtues Of VirtualBox

Open source means never having to say you’re sorry.

I’ve made the above proclamation to anyone who’d listen lately who has any interest in using Linux software (because, regardless of what anyone says on the matter, it ain’t there yet as an operating system for general scientific users with general computing know-how). You will very likely find yourself stuck at a configure or make step in one or more prerequisite codes to some final build you’re trying to do, leaving yourself to google error messages to try to come up with some kind of solution. Invariably, you’ll try something that seems to work, only to find it doesn’t, potentially leaving a trail of orphaned files, version-breaking changes, and random downgrading only to find something else stupid (or not) fixed your build problems.

I’ve an install I’m quite happy with that has all of the code I want on it working, and I’ve no interest in having to perform re-installs to get back to a working state again.

My solution, which I’ve used to great success with GAMESS-US, GROMACS, NWChem, and Amber (so far), is to break a virtual instance in VirtualBox first. For those who don’t know (and briefly), VirtualBox lets you install a fully-working OS inside of your own OS that simply sits as a set of files in the VirtualBox VMs folder in your user directory. My procedure has been to install a 60 GB VirtualBox instance of (currently) Ubuntu 14.04 (which I will refer to here as PROTOTYPE), fully update it to the current state of my RealBox (updates, upgrades, program installs, etc.), then copy PROTOTYPE somewhere else on the machine. The only limitation of this approach is that VirtualBox doesn’t give you access to the GPU if you’re testing CUDA-specific calculations. That said, it does let you install the CUDA Development Toolkit and compile code just fine, so you can at least work your way through a full build to make sure you don’t run into problems.

When you’re done trashing your VirtualBox after a particularly heinous build, just delete PROTOTYPE from VirtualBox VMs and re-copy your copy back into VirtualBox VMs. Voila! You’re ready for another build operation (or to make sure your “final” build actually works flawlessly before committing the build to your RealBox).

That’s all I have to say on the matter. Consider it as your default procedure (at this point, I won’t touch my RealBox with new installs until I know it’s safe in VirtualBox).
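The copy-out and copy-back routine can be scripted. What follows is only a minimal sketch of the manual procedure described above, not a VirtualBox feature: the function names backup_vm and restore_vm and the BACKUP_DIR location are my own assumptions, and VM_DIR assumes the default VirtualBox VMs folder. Power the VM off before running either step.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the "trash it, then restore it" routine.
# VM_DIR assumes VirtualBox's default machine folder; BACKUP_DIR is an
# arbitrary location chosen for this sketch.
VM_DIR="${VM_DIR:-$HOME/VirtualBox VMs}"
BACKUP_DIR="${BACKUP_DIR:-$HOME/vm-backups}"

# backup_vm NAME: copy the (powered-off) VM folder to BACKUP_DIR.
backup_vm() {
    local name="$1"
    [ -n "$name" ] || { echo "usage: backup_vm NAME" >&2; return 1; }
    mkdir -p "$BACKUP_DIR" || return 1
    rm -rf "${BACKUP_DIR:?}/$name"
    cp -a "$VM_DIR/$name" "$BACKUP_DIR/$name"
}

# restore_vm NAME: throw away the trashed VM and copy the clean one back.
restore_vm() {
    local name="$1"
    [ -n "$name" ] || { echo "usage: restore_vm NAME" >&2; return 1; }
    [ -d "$BACKUP_DIR/$name" ] || { echo "no backup for $name" >&2; return 1; }
    rm -rf "${VM_DIR:?}/$name"
    cp -a "$BACKUP_DIR/$name" "$VM_DIR/$name"
}
```

Typical use would be backup_vm PROTOTYPE once the clean instance is fully updated, then restore_vm PROTOTYPE after each failed build experiment.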

1. The State Of My Machine Pre-GROMACS And All Other apt-get’s Used Below

What follows below is pretty straightforward. Errors you might get that don’t appear below might be related to the lack of certain installs on your machine that I installed on VirtualBox. That is, my standard PROTOTYPE comes with Intel’s Fortran and C Compilers (for code optimization). Those required a few packages above the base Ubuntu install. These are pretty standard, so I say install them anyway:

sudo apt-get install build-essential gcc-multilib rpm openjdk-7-jre-headless 

I could have just installed a fresh version of 14.04 onto a machine to try this myself, but I’m not that motivated. Also, note this list does not include the all-important cmake. We’ll get to that.

And for the rest of GROMACS, older versions had lots of mesa/gnuplot/motif-specific dependencies needed to build all of the files included in the GROMACS package. Regardless of GPU builds or not, I tend to default to installing all the packages below just to have them (all of which, for 14.04, currently apt-get properly).

sudo apt-get install openmpi-bin openmpi-common gfortran csh grace menu x11proto-print-dev motif-clients freeglut3-dev libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libgl1-mesa-dri libcurl-ocaml-dev libcurl4-gnutls-dev gnuplot

If you don’t install the libblas3gf libblas-doc libblas-dev liblapack3gf liblapack-doc liblapack-dev series, you’ll see the following note from your cmake steps in GROMACS.

— A library with BLAS API not found. Please specify library location.
— Using GROMACS built-in BLAS.
— LAPACK requires BLAS
— A library with LAPACK API not found. Please specify library location.
— Using GROMACS built-in LAPACK.

My own preference is to use the (assumedly newer) Ubuntu-specific libraries from apt-get.

sudo apt-get install libblas3gf libblas-doc libblas-dev liblapack3gf liblapack-doc liblapack-dev

GPU-Specific? One More apt-get

My first passes at proper GPU compilation involved several steps for the nVidia Developer Toolkit install. That’s now taken care of with apt-get, so perform the final apt-get to complete the component/dependency installations.

sudo apt-get install nvidia-cuda-dev nvidia-cuda-toolkit

With luck, your first attempt at a GPU-based installation will look like the following:

[0%] Building NVCC (Device) object src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir//./cuda_tools_generated_copyrite_gpu.cu.o

[100%] Building CXX object src/programs/CMakeFiles/gmx.dir/legacymodules.cpp.o
Linking CXX executable ../../bin/gmx
[100%] Built target gmx

2. Nothing Happens Without cmake

Install cmake! I’m reproducing the output below so you can check that you’re using the same versions for everything (in the event something breaks in the future).

sudo apt-get install cmake

Reading package lists… Done
Building dependency tree
Reading state information… Done
The following packages were automatically installed and are no longer required:
linux-headers-3.13.0-32 linux-headers-3.13.0-32-generic
linux-image-3.13.0-32-generic linux-image-extra-3.13.0-32-generic
Use ‘apt-get autoremove’ to remove them.
The following extra packages will be installed:
cmake-data
Suggested packages:
codeblocks eclipse
The following NEW packages will be installed:
cmake cmake-data
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 3,294 kB of archives.
After this operation, 16.6 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://us.archive.ubuntu.com/ubuntu/ trusty/main cmake-data all 2.8.12.2-0ubuntu3 [676 kB]
Get:2 http://us.archive.ubuntu.com/ubuntu/ trusty/main cmake amd64 2.8.12.2-0ubuntu3 [2,618 kB]
Fetched 3,294 kB in 30s (106 kB/s)
Selecting previously unselected package cmake-data.
(Reading database … 258157 files and directories currently installed.)
Preparing to unpack …/cmake-data_2.8.12.2-0ubuntu3_all.deb …
Unpacking cmake-data (2.8.12.2-0ubuntu3) …
Selecting previously unselected package cmake.
Preparing to unpack …/cmake_2.8.12.2-0ubuntu3_amd64.deb …
Unpacking cmake (2.8.12.2-0ubuntu3) …
Processing triggers for man-db (2.6.7.1-1) …
Setting up cmake-data (2.8.12.2-0ubuntu3) …
Setting up cmake (2.8.12.2-0ubuntu3) …

3. First Pass At GROMACS

The make install step will place GROMACS where you want it on your machine, so you’re just as good building in $HOME/Downloads as you are anywhere else. I will be performing all operations from $HOME/Downloads unless otherwise stated.

According to the GROMACS Installation Manual, your quick-and-dirty install need only involve the following:

$ tar xvfz gromacs-src.tar.gz
$ ls
gromacs-src
$ mkdir build
$ cd build
$ cmake ../gromacs-src
$ make

This allows you to build “out-of-source” as they put it. Frankly, I just dive right into the GROMACS folder and have at it.

CMake Error: The source directory “/home/user/Downloads/gromacs-5.0.1/build” does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.

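And did you see the above error? If so, you read the GROMACS quick-and-dirty procedure backwards: cmake was pointed at the (empty) build directory instead of being run from inside build and pointed at the source directory. I’m not running it this way, so it doesn’t matter to what follows.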
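For anyone who does want the out-of-source route, here is a hedged sketch of a wrapper that refuses to call cmake unless the directory it is told is the source really contains CMakeLists.txt. The function names has_cmakelists and configure_out_of_source are hypothetical, made up for this illustration.

```shell
#!/usr/bin/env bash
# has_cmakelists DIR: succeed only if DIR looks like a CMake source tree.
has_cmakelists() {
    [ -f "$1/CMakeLists.txt" ]
}

# configure_out_of_source SRC BUILD [cmake options...]
# Creates BUILD if needed and runs cmake from inside it, pointed at SRC.
configure_out_of_source() {
    local src build
    src="$(cd "$1" && pwd)" || return 1   # absolute path to the source tree
    build="$2"
    shift 2
    if ! has_cmakelists "$src"; then
        echo "no CMakeLists.txt in $src (source and build swapped?)" >&2
        return 1
    fi
    mkdir -p "$build" && (cd "$build" && cmake "$src" "$@")
}
```

From $HOME/Downloads, something like configure_out_of_source gromacs-5.0.1 build -DGMX_GPU=OFF would then do the same job as the manual steps from the Installation Manual.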

My first attempt at building GROMACS produced the following output from PROTOTYPE (reproducing all the text below).

user@PROTOTYPE:~$ cd Downloads/
user@PROTOTYPE:~/Downloads$ gunzip gromacs-5.0.1.tar.gz 
user@PROTOTYPE:~/Downloads$ tar xvf gromacs-5.0.1.tar 

gromacs-5.0.1/README
gromacs-5.0.1/INSTALL

gromacs-5.0.1/tests/CppCheck.cmake
gromacs-5.0.1/tests/CMakeLists.txt

user@PROTOTYPE:~/Downloads$ cd gromacs-5.0.1/
user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ cmake -DGMX_GPU=OFF

NOTE: If you just run cmake, you’ll get the following…

cmake version 2.8.12.2
Usage

cmake [options] <path-to-source>
cmake [options] <path-to-existing-build>

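… which is to say, cmake needs at least one argument (normally the path to the source tree). Above, I’m just using -DGMX_GPU=OFF to start the process, which with this cmake version is enough to have it configure in the current directory.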

The C compiler identification is GNU 4.8.2
— The CXX compiler identification is GNU 4.8.2
— Check for working C compiler: /usr/bin/cc
— Check for working C compiler: /usr/bin/cc — works
— Detecting C compiler ABI info
— Detecting C compiler ABI info – done
— Check for working CXX compiler: /usr/bin/c++
— Check for working CXX compiler: /usr/bin/c++ — works
— Detecting CXX compiler ABI info
— Detecting CXX compiler ABI info – done
— Checking for GCC x86 inline asm
— Checking for GCC x86 inline asm – supported
— Detecting best SIMD instructions for this CPU
— Detected best SIMD instructions for this CPU – SSE2
— Try OpenMP C flag = [-fopenmp]
— Performing Test OpenMP_FLAG_DETECTED
— Performing Test OpenMP_FLAG_DETECTED – Success
— Try OpenMP CXX flag = [-fopenmp]
— Performing Test OpenMP_FLAG_DETECTED
— Performing Test OpenMP_FLAG_DETECTED – Success
— Found OpenMP: -fopenmp
— Performing Test CFLAGS_WARN
— Performing Test CFLAGS_WARN – Success
— Performing Test CFLAGS_WARN_EXTRA
— Performing Test CFLAGS_WARN_EXTRA – Success
— Performing Test CFLAGS_WARN_REL
— Performing Test CFLAGS_WARN_REL – Success
— Performing Test CFLAGS_WARN_UNINIT
— Performing Test CFLAGS_WARN_UNINIT – Success
— Performing Test CFLAGS_EXCESS_PREC
— Performing Test CFLAGS_EXCESS_PREC – Success
— Performing Test CFLAGS_COPT
— Performing Test CFLAGS_COPT – Success
— Performing Test CFLAGS_NOINLINE
— Performing Test CFLAGS_NOINLINE – Success
— Performing Test CXXFLAGS_WARN
— Performing Test CXXFLAGS_WARN – Success
— Performing Test CXXFLAGS_WARN_EXTRA
— Performing Test CXXFLAGS_WARN_EXTRA – Success
— Performing Test CXXFLAGS_WARN_REL
— Performing Test CXXFLAGS_WARN_REL – Success
— Performing Test CXXFLAGS_EXCESS_PREC
— Performing Test CXXFLAGS_EXCESS_PREC – Success
— Performing Test CXXFLAGS_COPT
— Performing Test CXXFLAGS_COPT – Success
— Performing Test CXXFLAGS_NOINLINE
— Performing Test CXXFLAGS_NOINLINE – Success
— Looking for include file unistd.h
— Looking for include file unistd.h – found
— Looking for include file pwd.h
— Looking for include file pwd.h – found
— Looking for include file dirent.h
— Looking for include file dirent.h – found
— Looking for include file time.h
— Looking for include file time.h – found
— Looking for include file sys/time.h
— Looking for include file sys/time.h – found
— Looking for include file io.h
— Looking for include file io.h – not found
— Looking for include file sched.h
— Looking for include file sched.h – found
— Looking for include file regex.h
— Looking for include file regex.h – found
— Looking for C++ include regex
— Looking for C++ include regex – not found
— Looking for posix_memalign
— Looking for posix_memalign – found
— Looking for memalign
— Looking for memalign – found
— Looking for _aligned_malloc
— Looking for _aligned_malloc – not found
— Looking for gettimeofday
— Looking for gettimeofday – found
— Looking for fsync
— Looking for fsync – found
— Looking for _fileno
— Looking for _fileno – not found
— Looking for fileno
— Looking for fileno – found
— Looking for _commit
— Looking for _commit – not found
— Looking for sigaction
— Looking for sigaction – found
— Looking for sysconf
— Looking for sysconf – found
— Looking for rsqrt
— Looking for rsqrt – not found
— Looking for rsqrtf
— Looking for rsqrtf – not found
— Looking for sqrtf
— Looking for sqrtf – not found
— Looking for sqrt in m
— Looking for sqrt in m – found
— Looking for clock_gettime in rt
— Looking for clock_gettime in rt – found
— Checking for sched.h GNU affinity API
— Performing Test sched_affinity_compile
— Performing Test sched_affinity_compile – Success
— Check if the system is big endian
— Searching 16 bit integer
— Looking for sys/types.h
— Looking for sys/types.h – found
— Looking for stdint.h
— Looking for stdint.h – found
— Looking for stddef.h
— Looking for stddef.h – found
— Check size of unsigned short
— Check size of unsigned short – done
— Using unsigned short
— Check if the system is big endian – little endian
— Found LibXml2: /usr/lib/x86_64-linux-gnu/libxml2.so (found version “2.9.1”)
— Looking for xmlTextWriterEndAttribute in /usr/lib/x86_64-linux-gnu/libxml2.so
— Looking for xmlTextWriterEndAttribute in /usr/lib/x86_64-linux-gnu/libxml2.so – found
— Looking for include file libxml/parser.h
— Looking for include file libxml/parser.h – found
— Looking for include file pthread.h
— Looking for include file pthread.h – found
— Looking for pthread_create
— Looking for pthread_create – not found
— Looking for pthread_create in pthreads
— Looking for pthread_create in pthreads – not found
— Looking for pthread_create in pthread
— Looking for pthread_create in pthread – found
— Found Threads: TRUE
— Looking for include file pthread.h
— Looking for include file pthread.h – found
— Atomic operations found
— Performing Test PTHREAD_SETAFFINITY
— Performing Test PTHREAD_SETAFFINITY – Success
— Could NOT find Boost
Boost >= 1.44 not found. Using minimal internal version. This may cause trouble if you plan on compiling/linking other software that uses Boost against Gromacs.
— Looking for zlibVersion in /usr/lib/x86_64-linux-gnu/libz.so
— Looking for zlibVersion in /usr/lib/x86_64-linux-gnu/libz.so – found
— Setting build user/date/host/cpu information
— Setting build user & time – OK
— Checking floating point format
— Checking floating point format – IEEE754 (LE byte, LE word)
— Checking for 64-bit off_t
— Checking for 64-bit off_t – present
— Checking for fseeko/ftello
— Checking for fseeko/ftello – present
— Checking for SIGUSR1
— Checking for SIGUSR1 – found
— Checking for pipe support
— Checking for isfinite
— Performing Test isfinite_compile_ok
— Performing Test isfinite_compile_ok – Success
— Checking for isfinite – yes
— Checking for _isfinite
— Performing Test _isfinite_compile_ok
— Performing Test _isfinite_compile_ok – Failed
— Checking for _isfinite – no
— Checking for _finite
— Performing Test _finite_compile_ok
— Performing Test _finite_compile_ok – Failed
— Checking for _finite – no
— Performing Test CXXFLAG_STD_CXX0X
— Performing Test CXXFLAG_STD_CXX0X – Success
— Performing Test GMX_CXX11_SUPPORTED
— Performing Test GMX_CXX11_SUPPORTED – Success
— Checking for system XDR support
— Checking for system XDR support – present
— Try C compiler SSE2 flag = [-msse2]
— Performing Test C_FLAG_msse2
— Performing Test C_FLAG_msse2 – Success
— Performing Test C_SIMD_COMPILES_FLAG_msse2
— Performing Test C_SIMD_COMPILES_FLAG_msse2 – Success
— Try C++ compiler SSE2 flag = [-msse2]
— Performing Test CXX_FLAG_msse2
— Performing Test CXX_FLAG_msse2 – Success
— Performing Test CXX_SIMD_COMPILES_FLAG_msse2
— Performing Test CXX_SIMD_COMPILES_FLAG_msse2 – Success
— Enabling SSE2 SIMD instructions
— Performing Test _callconv___vectorcall
— Performing Test _callconv___vectorcall – Failed
— Performing Test _callconv___regcall
— Performing Test _callconv___regcall – Failed
— Performing Test _callconv_
— Performing Test _callconv_ – Success
— checking for module ‘fftw3f’
— package ‘fftw3f’ not found
— pkg-config could not detect fftw3f, trying generic detection
Could not find fftw3f library named libfftw3f, please specify its location in CMAKE_PREFIX_PATH or FFTWF_LIBRARY by hand (e.g. -DFFTWF_LIBRARY=’/path/to/libfftw3f.so’)
CMake Error at cmake/gmxManageFFTLibraries.cmake:76 (MESSAGE):
Cannot find FFTW 3 (with correct precision – libfftw3f for mixed-precision
GROMACS or libfftw3 for double-precision GROMACS). Either choose the right
precision, choose another FFT(W) library (-DGMX_FFT_LIBRARY), enable the
advanced option to let GROMACS build FFTW 3 for you
(-GMX_BUILD_OWN_FFTW=ON), or use the really slow GROMACS built-in fftpack
library (-DGMX_FFT_LIBRARY=fftpack).
Call Stack (most recent call first):
CMakeLists.txt:733 (include)

— Configuring incomplete, errors occurred!
See also “/home/user/Downloads/gromacs-5.0.1/CMakeFiles/CMakeOutput.log”.
See also “/home/user/Downloads/gromacs-5.0.1/CMakeFiles/CMakeError.log”.

Lots of little things to address here. We’ll get to the Boost problem later. Meantime, you can see the critical error is in (1) the lack of FFTW3 and (2) the lack of my specifically asking for -DGMX_BUILD_OWN_FFTW=ON in the cmake process.

NOTE: If you try to fix the FFTW3 problem as described above, you’ll get the following error:

-GMX_BUILD_OWN_FFTW=ON

CMake Error: Could not create named generator MX_BUILD_OWN_FFTW=ON

Make sure to put the “D” in:

-DGMX_BUILD_OWN_FFTW=ON
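The missing “D” is easy to catch before cmake ever sees it, because cmake reads -G as “select a generator,” which is where the odd error text comes from. Below is a small sketch of a pre-flight check; check_cmake_args is a hypothetical helper written for this post, not part of cmake or GROMACS, and the patterns it looks for are only the GMX_ and CMAKE_ prefixes used in this guide.

```shell
#!/usr/bin/env bash
# check_cmake_args ARG...: flag GROMACS/CMake cache options that are
# missing the -D prefix. Returns 1 if anything suspicious was found.
check_cmake_args() {
    local arg bad=0
    for arg in "$@"; do
        case "$arg" in
            -D*=*)
                ;;  # well-formed cache definition, nothing to do
            -GMX_*|GMX_*=*|-CMAKE_*|CMAKE_*=*)
                echo "suspicious option '$arg': did you mean '-D${arg#-}'?" >&2
                bad=1
                ;;
        esac
    done
    return "$bad"
}
```

Running check_cmake_args -DGMX_GPU=OFF -GMX_BUILD_OWN_FFTW=ON prints a hint about the second option and returns non-zero, so it can gate the real cmake call with &&.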

4. If You Don’t Use -DGMX_BUILD_OWN_FFTW=ON To Build FFTW3…

This is a skip-able section if you’re letting cmake do the dirty work (and letting cmake do it is preferred, at least for getting GROMACS built). In trying sudo apt-get install fftw*, you see (currently) the following: fftw2 fftw-dev fftw-docs

No good. So, the procedure is to build FFTW3 from source (which is just as easy as installing from .deb or .rpm files if you installed everything I mentioned above). That said, your attempts to build FFTW3 and build GROMACS may have run into several errors because of how you built FFTW3. Beginning with your extracting and prep for make:

user@PROTOTYPE:~/Downloads$ tar xvf fftw-3.3.4.tar 
user@PROTOTYPE:~/Downloads$ cd fftw-3.3.4/

Any of the configure combinations below (each followed by make and sudo make install) leads to the same error in the GROMACS cmake step:

user@PROTOTYPE:~/Downloads/fftw-3.3.4$ ./configure 
user@PROTOTYPE:~/Downloads/fftw-3.3.4$ ./configure -enable-shared=yes
user@PROTOTYPE:~/Downloads/fftw-3.3.4$ ./configure --enable-threads --enable-float

checking for a BSD-compatible install… /usr/bin/install -c
checking whether build environment is sane… yes

config.status: executing depfiles commands
config.status: executing libtool commands

user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ cmake -DGMX_GPU=OFF
user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ cmake -DGMX_GPU=OFF -DFFTWF_LIBRARY='/usr/local/lib/libfftw3.a'

— The C compiler identification is GNU 4.8.2
— The CXX compiler identification is GNU 4.8.2
— Check for working C compiler: /usr/bin/cc

— Performing Test PTHREAD_SETAFFINITY
— Performing Test PTHREAD_SETAFFINITY – Success
— Could NOT find Boost
Boost >= 1.44 not found. Using minimal internal version. This may cause trouble if you plan on compiling/linking other software that uses Boost against Gromacs.
— Looking for zlibVersion in /usr/lib/x86_64-linux-gnu/libz.so
— Looking for zlibVersion in /usr/lib/x86_64-linux-gnu/libz.so – found

— checking for module ‘fftw3f’
— package ‘fftw3f’ not found
— pkg-config could not detect fftw3f, trying generic detection
Could not find fftw3f library named libfftw3f, please specify its location in CMAKE_PREFIX_PATH or FFTWF_LIBRARY by hand (e.g. -DFFTWF_LIBRARY=’/path/to/libfftw3f.so’)
CMake Error at cmake/gmxManageFFTLibraries.cmake:76 (MESSAGE):
Cannot find FFTW 3 (with correct precision – libfftw3f for mixed-precision
GROMACS or libfftw3 for double-precision GROMACS). Either choose the right
precision, choose another FFT(W) library (-DGMX_FFT_LIBRARY), enable the
advanced option to let GROMACS build FFTW 3 for you
(-GMX_BUILD_OWN_FFTW=ON), or use the really slow GROMACS built-in fftpack
library (-DGMX_FFT_LIBRARY=fftpack).
Call Stack (most recent call first):
CMakeLists.txt:733 (include)

— Configuring incomplete, errors occurred!
See also “/home/user/Downloads/gromacs-5.0.1/CMakeFiles/CMakeOutput.log”.
See also “/home/user/Downloads/gromacs-5.0.1/CMakeFiles/CMakeError.log”.

Including --enable-shared takes care of this error and gets you to a successful GROMACS build.

user@PROTOTYPE:~/Downloads/fftw-3.3.4$ ./configure --enable-threads --enable-float --enable-shared

— The C compiler identification is GNU 4.8.2
— The CXX compiler identification is GNU 4.8.2
— Check for working C compiler: /usr/bin/cc

— Performing Test PTHREAD_SETAFFINITY
— Performing Test PTHREAD_SETAFFINITY – Success
— Could NOT find Boost
Boost >= 1.44 not found. Using minimal internal version. This may cause trouble if you plan on compiling/linking other software that uses Boost against Gromacs.
— Looking for zlibVersion in /usr/lib/x86_64-linux-gnu/libz.so
— Looking for zlibVersion in /usr/lib/x86_64-linux-gnu/libz.so – found

— checking for module ‘fftw3f’
— found fftw3f, version 3.3.4
— Looking for fftwf_plan_r2r_1d in /usr/local/lib/libfftw3f.so
— Looking for fftwf_plan_r2r_1d in /usr/local/lib/libfftw3f.so – found
— Looking for fftwf_have_simd_avx in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_simd_avx in /usr/local/lib/libfftw3f.so – not found
— Looking for fftwf_have_simd_sse2 in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_simd_sse2 in /usr/local/lib/libfftw3f.so – not found
— Looking for fftwf_have_simd_avx in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_simd_avx in /usr/local/lib/libfftw3f.so – not found
— Looking for fftwf_have_simd_altivec in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_simd_altivec in /usr/local/lib/libfftw3f.so – not found
— Looking for fftwf_have_simd_neon in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_simd_neon in /usr/local/lib/libfftw3f.so – not found
— Looking for fftwf_have_sse2 in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_sse2 in /usr/local/lib/libfftw3f.so – not found
— Looking for fftwf_have_sse in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_sse in /usr/local/lib/libfftw3f.so – not found
— Looking for fftwf_have_altivec in /usr/local/lib/libfftw3f.so
— Looking for fftwf_have_altivec in /usr/local/lib/libfftw3f.so – not found
CMake Warning at cmake/gmxManageFFTLibraries.cmake:89 (message):
The fftw library found is compiled without SIMD support, which makes it
slow. Consider recompiling it or contact your admin
Call Stack (most recent call first):
CMakeLists.txt:733 (include)

— Using external FFT library – FFTW3
— Looking for sgemm_

— Configuring done
— Generating done
— Build files have been written to: /home/user/Downloads/gromacs-5.0.1
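The SIMD warning in the output above comes from how FFTW3 itself was configured. As a sketch only (I did not rebuild this way for the post), FFTW’s configure script accepts --enable-sse2 alongside the flags used above, which should produce a library that passes the GROMACS SIMD check. The helper names fftw_configure_flags and build_fftw are hypothetical.

```shell
#!/usr/bin/env bash
# fftw_configure_flags single|double: print FFTW3 configure flags.
# single adds --enable-float (libfftw3f, what mixed-precision GROMACS wants);
# double leaves it off (libfftw3).
fftw_configure_flags() {
    local common="--enable-sse2 --enable-threads --enable-shared"
    case "$1" in
        single) echo "--enable-float $common" ;;
        double) echo "$common" ;;
        *) echo "usage: fftw_configure_flags single|double" >&2; return 1 ;;
    esac
}

# build_fftw SRC_DIR single|double: configure, build, and install FFTW3.
# Installs to the default /usr/local prefix, hence the sudo.
build_fftw() {
    local flags
    flags="$(fftw_configure_flags "$2")" || return 1
    # $flags is deliberately unquoted so each flag becomes its own argument.
    (cd "$1" && ./configure $flags && make && sudo make install)
}
```

Usage would be build_fftw "$HOME/Downloads/fftw-3.3.4" single, followed by re-running the GROMACS cmake step to see whether the warning goes away.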

And out of a first-pass GROMACS build…

user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ make

Scanning dependencies of target libgromacs
[0%] Building C object src/gromacs/CMakeFiles/libgromacs.dir/__/external/tng_io/src/compression/bwlzh.c.o
[0%] Building C object src/gromacs/CMakeFiles/libgromacs.dir/__/external/tng_io/src/compression/bwt.c.o

[100%] Building CXX object src/programs/CMakeFiles/gmx.dir/legacymodules.cpp.o
Linking CXX executable ../../bin/gmx
[100%] Built target gmx

5. But You Let cmake Build FFTW3. So, Continuing The Build Process

With all of the dependencies above installed, the one note I wanted to address was that for Boost:


— Performing Test PTHREAD_SETAFFINITY – Success
— Could NOT find Boost
Boost >= 1.44 not found. Using minimal internal version. This may cause trouble if you plan on compiling/linking other software that uses Boost against Gromacs.
— Looking for zlibVersion in /usr/lib/x86_64-linux-gnu/libz.so

It certainly isn’t a major issue, but I wanted to try to get a warning-free build. Installing Boost 1.56 produced the following negative result:

user@PROTOTYPE:~/Downloads/boost_1_56_0$ ./bootstrap.sh 

Building Boost.Build engine with toolset gcc… tools/build/src/engine/bin.linuxx86_64/b2
Detecting Python version… 2.7
Detecting Python root… /usr
Unicode/ICU support for Boost.Regex?… not found.
Generating Boost.Build configuration in project-config.jam…

Bootstrapping is done. To build, run:

./b2

To adjust configuration, edit ‘project-config.jam’.
Further information:

– Command line help:
./b2 --help

– Getting started guide:
http://www.boost.org/more/getting_started/unix-variants.html

– Boost.Build documentation:
http://www.boost.org/boost-build2/doc/html/index.html

user@PROTOTYPE:~/Downloads/boost_1_56_0$ sudo ./b2 install

Performing configuration checks

– 32-bit : no (cached)
– 64-bit : yes (cached)
– arm : no (cached)

…failed updating 58 targets…
…skipped 12 targets…
…updated 11322 targets…

user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ cmake -DGMX_GPU=ON -DGMX_DOUBLE=OFF
user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ make

[0%] Building NVCC (Device) object src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir//./cuda_tools_generated_copyrite_gpu.cu.o
[0%] Building NVCC (Device) object src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir//./cuda_tools_generated_pmalloc_cuda.cu.o

[7%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/commandline/cmdlinehelpwriter.cpp.o
In file included from /home/user/Downloads/gromacs-5.0.1/src/gromacs/options/basicoptions.h:52:0,
from /home/user/Downloads/gromacs-5.0.1/src/gromacs/commandline/cmdlinehelpwriter.cpp:55:
/home/user/Downloads/gromacs-5.0.1/src/gromacs/options/../utility/gmxassert.h:47:57: fatal error: boost/exception/detail/attribute_noreturn.hpp: No such file or directory
#include <boost/exception/detail/attribute_noreturn.hpp>
^
compilation terminated.
make[2]: *** [src/gromacs/CMakeFiles/libgromacs.dir/commandline/cmdlinehelpwriter.cpp.o] Error 1
make[1]: *** [src/gromacs/CMakeFiles/libgromacs.dir/all] Error 2
make: *** [all] Error 2

Sadly, the solution is to then include -DGMX_EXTERNAL_BOOST=off and stick with the internal boost, which then “makes” just fine. One page references the use of -DGMX_INTERNAL_BOOST=on, but that produced the following:

CMake Warning:
Manually-specified variables were not used by the project:

GMX_INTERNAL_BOOST

— Build files have been written to: /home/user/Downloads/gromacs-5.0.1

There’s more on this issue at: gerrit.gromacs.org/#/c/1232/ and t24960.science-biology-gromacs-development.biotalk.us/compiling-boost-problem-and-error-with-icc-t24960.html, but I’ve opted not to worry about it.

So, with Boost installed, I simply ignore it (and have not installed Boost on my RealBox).

user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ cmake -DGMX_GPU=ON -DGMX_EXTERNAL_BOOST=off

6. Finishing Step If All Above Goes Well: CUDA-Based GROMACS Build

If everything else above has gone smoothly, you should be able to cleanly run a cmake for a GPU version of GROMACS. (If you installed Boost anyway, remember to add -DGMX_EXTERNAL_BOOST=off to the cmake below.) Below, the final result is to be placed into /opt/gromacs_gpu; you then add its bin directory to your $PATH afterwards and run with it.

user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ cmake -DGMX_GPU=ON -DCMAKE_INSTALL_PREFIX=/opt/gromacs_gpu -DGMX_BUILD_OWN_FFTW=ON

— The C compiler identification is GNU 4.8.2
— The CXX compiler identification is GNU 4.8.2

— Generating done
— Build files have been written to: /home/damianallis/Downloads/gromacs-5.0.1

user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ make

The make starts with the FFTW3 download and build…

Scanning dependencies of target fftwBuild
[ 0%] Performing pre-download step for ‘fftwBuild’
— downloading…
src=’http://www.fftw.org/fftw-3.3.3.tar.gz’
dest=’/home/damianallis/Downloads/gromacs-5.0.1/src/contrib/fftw/fftw.tar.gz’
— [download 0% complete]

[100%] Building CXX object src/programs/CMakeFiles/gmx.dir/legacymodules.cpp.o
Linking CXX executable ../../bin/gmx
[100%] Built target gmx

Finally, your (sudo) make install places everything into /opt/gromacs_gpu.

user@PROTOTYPE:~/Downloads/gromacs-5.0.1$ sudo make install

— The GROMACS-managed build of FFTW 3 will configure with the following optimizations: --enable-sse2
— Configuring done
— Generating done
— Build files have been written to: /home/damianallis/Downloads/gromacs-5.0.1
[1%] Built target fftwBuild

[100%] Building CXX object src/programs/CMakeFiles/gmx.dir/legacymodules.cpp.o
Linking CXX executable ../../bin/gmx
[100%] Built target gmx
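With the install in /opt/gromacs_gpu, the remaining step is getting gmx onto your $PATH. A minimal sketch, suitable for the end of ~/.bashrc: GROMACS installs a GMXRC script in its bin directory that sets up the environment, and the fallback below simply prepends the bin directory by hand. The prepend_path helper and the GMX_PREFIX variable are names I made up for this example.

```shell
#!/usr/bin/env bash
# Install prefix used in the cmake step above; override if yours differs.
GMX_PREFIX="${GMX_PREFIX:-/opt/gromacs_gpu}"

# prepend_path DIR: put DIR at the front of PATH unless it is already there.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;
    esac
}

if [ -f "$GMX_PREFIX/bin/GMXRC" ]; then
    # Preferred: let GROMACS set up its own environment.
    . "$GMX_PREFIX/bin/GMXRC"
else
    # Fallback when GMXRC is missing: just add the bin directory.
    prepend_path "$GMX_PREFIX/bin"
fi
export PATH
```

After opening a new terminal, running gmx with no arguments is a quick way to confirm that the shell finds the GPU build.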

CudaMiner Installation In Ubuntu 12.04 LTS Using CUDA Toolkit 5.5 And “Additional NVIDIA Drivers”

Author’s Note 1: It is my standard policy to put too much info into guides so that those who are searching for specific problems they come across will find the offending text in their searches. With luck, your “build error” search sent you here.

Author’s Note 2: It’s not as bad as it looks (I’ve included lots of output and error messages for easy searching)!

Author’s Note 3: I won’t be much help for you in diagnosing your errors, but am happy to tweak the text below if something is unclear.

Conventions: I include both the commands you type in your Terminal and some of the output from these commands, the output being where most of the errors appear that I work on in the discussion.

Input is formatted as below:

Text you put in (copy + paste should be fine)

Output is formatted as below:

Text you get out (for checking results and reproducing errors)

1. Introduction

This work began as an attempt to build a CUDA-friendly version of the molecular dynamics package GROMACS (which will come later) but, for reasons stemming from a new local Syracuse Meetup Group (Bitcoin’s of New York – Miner’s of Syracuse. Consider joining!), the formation of our very own local mining pool (Salt City Miners, miner.saltcityminers.com. Consider joining!), plus a “what the hell” to see if it was an easy build or not, transformed into the CudaMiner-centric compiling post you see here.

NOTE: This will be a 64-bit-centric install but I’ll include 32-bit content as I’ve found the info on other sites.

2. Installing The NVIDIA Drivers (Two Methods, The Easy One Described)

Having run through this process many times in a fresh install of Ubuntu 12.04 LTS (so nothing else is on the machine except 12.04 LTS, its updates, a few extra installs, and the CUDA/CudaMiner codes), I can say that what is below should work without hitch AFTER you install the NVIDIA drivers. Once your NVIDIA card is installed and Ubuntu recognizes it, you’ve two options.

2A. Install The Drivers From An NVIDIA Download (The Hard Version)

A few websites (and several repostings of the same content) describe the process of installing the NVIDIA drivers the olde-fashioned way, in which you’ll see references to “blacklist nouveau,” “sudo service lightdm stop,” Ctrl+Alt+F1 (to get you to a text-only session), etc. You hopefully don’t need to do this much work for your own NVIDIA install, as Ubuntu will do it for you (with only one restart required).

2B. Install The Drivers After The “Restricted Drivers Available” Pop-Up Or Go To System Settings > Available Drivers (The Easy, Teenage New York Version)

I took the easy way out by letting Ubuntu do the dirty work. The result is the installation of the (currently, as of 28 Dec 2013) v. 319 NVIDIA accelerated graphics driver. For my NVIDIA cards (GTX 690 and a GTX 650 Ti, although I assume it’s similar for a whole class of NVIDIA cards), you’re (currently, check the date again) given the option of v. 304. Don’t! I’ve seen several mentions of CudaMiner (and some of the CUDA toolkit) requiring v. 319.

Caption: You may see it in the upper right (after an install or if you’ve not clicked on it before)

Caption: Or System Settings > Additional Drivers

Caption: Either way, you’ll hopefully get to an NVIDIA driver list like above.

3. Pre-CUDA Toolkit Install

There are a few apt-get’s you need to do before installing the CUDA Toolkit (or, at least, the consensus is that these must be done. I’ve not seen a different list in any posts and I didn’t bother to install one-by-one to see which of these might not be needed).

If you perform the most commonly posted apt-get (plus an update and upgrade if you’ve not done so lately):

user@host:~/$ sudo apt-get update
user@host:~/$ sudo apt-get upgrade
user@host:~/$ sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

You’ll get the following error from a fresh 12.04 LTS install:

Reading package lists… Done
Building dependency tree
Reading state information… Done
libglu1-mesa is already the newest version.
libglu1-mesa set to manually installed.
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
libgl1-mesa-glx : Depends: libglapi-mesa (= 8.0.4-0ubuntu0.6)
Recommends: libgl1-mesa-dri (>= 7.2)
E: Unable to correct problems, you have held broken packages.

The solution here is simple. Add libglapi-mesa and libgl1-mesa-dri to your install.

user@host:~/$ sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libglapi-mesa libgl1-mesa-dri

Doing this will add a bunch of programs and libraries (listed below):

The following extra packages will be installed:
dpkg-dev freeglut3 g++ g++-4.6 libalgorithm-diff-perl libalgorithm-diff-xs-perl
libalgorithm-merge-perl libdpkg-perl libdrm-dev libgl1-mesa-dev libice-dev libkms1 libllvm3.0
libpthread-stubs0 libpthread-stubs0-dev libsm-dev libstdc++6-4.6-dev libtimedate-perl libx11-doc
libxau-dev libxcb1-dev libxdmcp-dev libxext-dev libxmu-headers libxt-dev mesa-common-dev
x11proto-core-dev x11proto-input-dev x11proto-kb-dev x11proto-xext-dev xorg-sgml-doctools
xserver-xorg xserver-xorg-core xserver-xorg-input-evdev xtrans-dev
Suggested packages:
debian-keyring g++-multilib g++-4.6-multilib gcc-4.6-doc libstdc++6-4.6-dbg libglide3
libstdc++6-4.6-doc libxcb-doc xfonts-100dpi xfonts-75dpi
The following packages will be REMOVED:
libgl1-mesa-dri-lts-raring libgl1-mesa-glx-lts-raring libglapi-mesa-lts-raring
libxatracker1-lts-raring x11-xserver-utils-lts-raring xserver-common-lts-raring
xserver-xorg-core-lts-raring xserver-xorg-input-all-lts-raring xserver-xorg-input-evdev-lts-raring
xserver-xorg-input-mouse-lts-raring xserver-xorg-input-synaptics-lts-raring
xserver-xorg-input-vmmouse-lts-raring xserver-xorg-input-wacom-lts-raring xserver-xorg-lts-raring
xserver-xorg-video-all-lts-raring xserver-xorg-video-ati-lts-raring
xserver-xorg-video-cirrus-lts-raring xserver-xorg-video-fbdev-lts-raring
xserver-xorg-video-intel-lts-raring xserver-xorg-video-mach64-lts-raring
xserver-xorg-video-mga-lts-raring xserver-xorg-video-modesetting-lts-raring
xserver-xorg-video-neomagic-lts-raring xserver-xorg-video-nouveau-lts-raring
xserver-xorg-video-openchrome-lts-raring xserver-xorg-video-r128-lts-raring
xserver-xorg-video-radeon-lts-raring xserver-xorg-video-s3-lts-raring
xserver-xorg-video-savage-lts-raring xserver-xorg-video-siliconmotion-lts-raring
xserver-xorg-video-sis-lts-raring xserver-xorg-video-sisusb-lts-raring
xserver-xorg-video-tdfx-lts-raring xserver-xorg-video-trident-lts-raring
xserver-xorg-video-vesa-lts-raring xserver-xorg-video-vmware-lts-raring
The following NEW packages will be installed:
build-essential dpkg-dev freeglut3 freeglut3-dev g++ g++-4.6 libalgorithm-diff-perl
libalgorithm-diff-xs-perl libalgorithm-merge-perl libdpkg-perl libdrm-dev libgl1-mesa-dev
libgl1-mesa-dri libgl1-mesa-glx libglapi-mesa libglu1-mesa-dev libice-dev libkms1 libllvm3.0
libpthread-stubs0 libpthread-stubs0-dev libsm-dev libstdc++6-4.6-dev libtimedate-perl libx11-dev
libx11-doc libxau-dev libxcb1-dev libxdmcp-dev libxext-dev libxi-dev libxmu-dev libxmu-headers
libxt-dev mesa-common-dev x11proto-core-dev x11proto-input-dev x11proto-kb-dev x11proto-xext-dev
xorg-sgml-doctools xserver-xorg xserver-xorg-core xserver-xorg-input-evdev xtrans-dev

And, remarkably, that’s it for the pre-install.
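
Given how long the REMOVED list above is, a dry run before committing may be reassuring. The sketch below assumes apt-get's simulate mode (-s), which prints one line per planned action beginning with Inst, Conf, or Remv and changes nothing. The function name planned_removals is a hypothetical helper introduced only for illustration.

```shell
# Package list taken from the install command above.
PKGS="freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev \
libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libglapi-mesa libgl1-mesa-dri"

# Hypothetical helper: read "apt-get -s" output on stdin and print the
# names of packages that would be removed (lines beginning with "Remv").
planned_removals() {
    awk '$1 == "Remv" { print $2 }'
}

# Usage (a dry run; nothing is installed or removed):
#   apt-get -s install $PKGS | planned_removals
```

An empty result means the install would remove nothing. Anything listed is worth a look before running the real install.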

4. CUDA Toolkit 5.5(.22) Install

The CUDA Toolkit install starts with its 810 MB download at developer.nvidia.com/cuda-downloads.

Obviously, be aware of the 32- and 64-bit options. Also, the .deb doesn’t currently download, leaving you to grab the .run file (same difference, I haven’t bothered to find out why the .deb doesn’t fly yet).

Off to your Terminal and into the Downloads folder:

user@host:~/$ cd Downloads
user@host:~/Downloads$ chmod +x cuda_5.5.22_linux_64.run
user@host:~/Downloads$ sudo ./cuda_5.5.22_linux_64.run

Which will produce:

Logging to /tmp/cuda_install_14755.log
Using more to view the EULA.
End User License Agreement
————————–
. . .
and cannot be linked to any personally identifiable
information. Personally identifiable information such as your
username or hostname is not collected.

————————————————————-

Finally, some input to be had after the scrolling:

Do you accept the previously read EULA? (accept/decline/quit): accept     
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 319.37? ((y)es/(n)o/(q)uit): n
Install the CUDA 5.5 Toolkit? ((y)es/(n)o/(q)uit): y
Enter Toolkit Location [ default is /usr/local/cuda-5.5 ]: 
Install the CUDA 5.5 Samples? ((y)es/(n)o/(q)uit): y
Enter CUDA Samples Location [ default is /home/user/NVIDIA_CUDA-5.5_Samples ]: 

NOTE 1: Don’t install the NVIDIA Accelerated Graphics Driver!
NOTE 2: Yes, install the Toolkit.
NOTE 3: I will assume this location for all of the below, setting the location in the PATH.
NOTE 4: I installed the samples for testing (and found a few extra things that need installation for them).
NOTE 5: Default is fine. Once built and tested, can be deleted (although the Mandelbrot is a keeper)

Installing the CUDA Toolkit in /usr/local/cuda-5.5 …
Installing the CUDA Samples in /home/user/NVIDIA_CUDA-5.5_Samples …
Copying samples to /home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples now…
Finished copying samples.

===========
= Summary =
===========

Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-5.5
Samples: Installed in /home/user/NVIDIA_CUDA-5.5_Samples

* Please make sure your PATH includes /usr/local/cuda-5.5/bin

* Please make sure your LD_LIBRARY_PATH
* for 32-bit Linux distributions includes /usr/local/cuda-5.5/lib
* for 64-bit Linux distributions includes /usr/local/cuda-5.5/lib64:/lib
* OR
* for 32-bit Linux distributions add /usr/local/cuda-5.5/lib
* for 64-bit Linux distributions add /usr/local/cuda-5.5/lib64 and /lib
* to /etc/ld.so.conf and run ldconfig as root

* To uninstall CUDA, remove the CUDA files in /usr/local/cuda-5.5
* Installation Complete

Please see CUDA_Getting_Started_Linux.pdf in /usr/local/cuda-5.5/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 319.00 is required for CUDA 5.5 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_14755.log

And ignore the WARNING.

As per the “make sure” above, add the CUDA distro folders to your path and LD_LIBRARY_PATH (I chose not to modify ld.so.conf)

user@host:~/Downloads$ cd
user@host:~/$ nano .bashrc

Add the PATH and LD_LIBRARY_PATH as follows:

export PATH=$PATH:/usr/local/cuda-5.5/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-5.5/lib64:/lib

And then source the .bashrc file.

user@host:~/$ source .bashrc
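
A quick sanity check that the new entries took effect may save some head-scratching later. This is a minimal sketch, assuming the default Toolkit location used above. The helper path_contains and the CUDA_HOME variable are names made up for this example.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

CUDA_HOME=/usr/local/cuda-5.5   # assumption: the default Toolkit location

if path_contains "$CUDA_HOME/bin" "$PATH"; then
    echo "PATH includes $CUDA_HOME/bin"
else
    echo "PATH is missing $CUDA_HOME/bin"
fi

if path_contains "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}"; then
    echo "LD_LIBRARY_PATH includes $CUDA_HOME/lib64"
else
    echo "LD_LIBRARY_PATH is missing $CUDA_HOME/lib64"
fi

# nvcc should now resolve from any directory.
command -v nvcc || echo "nvcc not found on PATH"
```

The case-based match wraps both strings in colons so that /usr/local/cuda-5.5/bin does not falsely match a longer entry that merely starts with the same text.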

5. NVIDIA_CUDA-5.5_Samples (And Finishing The Toolkit Install To Build CudaMiner)

The next set of installs and file modifications came from attempting to build the Samples in the NVIDIA_CUDA-5.5_Samples (or NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples depending on how your install did it) library, during which time I think I managed to hit all of the post-Toolkit install modifications needed to make the CudaMiner build problem-free. The OpenMPI install is optional, but I do hate error messages.

5A. Error 1: /usr/bin/ld: cannot find -lcuda

My first make attempt produced the following error:

user@host:~/$ cd NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples
user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples$ make

make[1]: Entering directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/0_Simple/asyncAPI'
"/usr/local/cuda-5.5"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=\"sm_35,compute_35\" -o asyncAPI.o -c asyncAPI.cu
. . .
"/usr/local/cuda-5.5"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -o vectorAddDrv.o -c vectorAddDrv.cpp
"/usr/local/cuda-5.5"/bin/nvcc -ccbin g++ -m64 -o vectorAddDrv vectorAddDrv.o -L/usr/lib/nvidia-current -lcuda
/usr/bin/ld: cannot find -lcuda
collect2: ld returned 1 exit status
make[1]: *** [vectorAddDrv] Error 1
make[1]: Leaving directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/0_Simple/vectorAddDrv'
make: *** [0_Simple/vectorAddDrv/Makefile.ph_build] Error 2

This is solved by making a symbolic link for libcuda.so out of /usr/lib/nvidia-319/ and into /usr/lib/.

NOTE: It doesn’t matter what directory you do this from. I’ve left off the NVIDIA_CUDA-5.5_Samples/ yadda yadda below.

user@host:~/$ sudo ln -s /usr/lib/nvidia-319/libcuda.so /usr/lib/libcuda.so

If you’re working through the build process and hit the error, run a “make clean” before rerunning.
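
If your driver version (and so the directory name) differs from mine, it may help to locate libcuda.so first and only link when nothing is already in place. The following is a sketch under that assumption. The function link_if_missing is a hypothetical helper, not a standard command.

```shell
# Hypothetical helper: create a symbolic link at $2 pointing to $1, but
# only when the source exists and nothing is already at the destination.
link_if_missing() {
    if [ ! -e "$1" ]; then
        echo "source not found: $1" >&2
        return 1
    fi
    if [ -e "$2" ] || [ -L "$2" ]; then
        echo "already present: $2"
        return 0
    fi
    ln -s "$1" "$2"
}

# To see where the driver package actually put libcuda.so:
#   find /usr/lib -maxdepth 2 -name 'libcuda.so*'
# Then, from a root shell (sudo cannot call a shell function directly):
#   link_if_missing "$(find /usr/lib -maxdepth 2 -name libcuda.so | head -n 1)" /usr/lib/libcuda.so
```

Running it a second time is harmless, since an existing link is reported and left alone.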

5B. WARNING – No MPI compiler found.

The second build attempt produced the MPI Warning above.

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples$ make

make[1]: Entering directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/0_Simple/asyncAPI'
"/usr/local/cuda-5.5"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=\"sm_35,compute_35\" -o asyncAPI.o -c asyncAPI.cu
"/usr/local/cuda-5.5"/bin/nvcc -ccbin g++ -m64 -o asyncAPI asyncAPI.o
. . .
cp simpleCubemapTexture ../../bin/x86_64/linux/release
make[1]: Leaving directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/0_Simple/simpleCubemapTexture'
-----------------------------------------------------------------------------------------------
WARNING - No MPI compiler found.
-----------------------------------------------------------------------------------------------
CUDA Sample "simpleMPI" cannot be built without an MPI Compiler.
This will be a dry-run of the Makefile.
For more information on how to set up your environment to build and run this
sample, please refer the CUDA Samples documentation and release notes
-----------------------------------------------------------------------------------------------
make[1]: Entering directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/0_Simple/simpleMPI'
[@] mpicxx -I../../common/inc -o simpleMPI.o -c simpleMPI.cpp
. . .
mkdir -p ../../bin/x86_64/linux/release
cp histEqualizationNPP ../../bin/x86_64/linux/release
make[1]: Leaving directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/7_CUDALibraries/histEqualizationNPP'
Finished building CUDA samples

But otherwise finishes successfully.

To get around this warning, install OpenMPI (which is needed for multi-board GROMACS runs anyway. But, again, not needed for CudaMiner). The specific issue is the need for mpicc, which is in libopenmpi-dev (not openmpi-bin or openmpi-common).

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples$ mpicc
The program ‘mpicc’ can be found in the following packages:
* lam4-dev
* libmpich-mpd1.0-dev
* libmpich-shmem1.0-dev
* libmpich1.0-dev
* libmpich2-dev
* libopenmpi-dev
* libopenmpi1.5-dev
Try: sudo apt-get install <selected package>

For completeness, I grab all three (and I’m ignoring the NVIDIA_CUDA-5.5_Samples directory structure below).

user@host:~/$ sudo apt-get install openmpi-bin openmpi-common libopenmpi-dev

Running mpicc will now produce the following (so it’s there):

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples$ mpicc
gcc: fatal error: no input files
compilation terminated.
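
Rather than poking at each compiler by hand, a small loop can report everything the Samples build is still missing in one pass. This is only a sketch. The list of tools is my reading of what the build output above calls for, and missing_tools is a hypothetical helper name.

```shell
# Hypothetical helper: print each named command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools the Samples build in this section appears to lean on.
missing=$(missing_tools nvcc g++ make mpicc mpicxx)
if [ -z "$missing" ]; then
    echo "All build tools found"
else
    echo "Still missing:" $missing
fi
```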

Now run a “make clean” if needed and make. The build should go without problem.

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples$ make

make[1]: Entering directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/0_Simple/asyncAPI'
"/usr/local/cuda-5.5"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=\"sm_35,compute_35\" -o asyncAPI.o -c asyncAPI.cu
"/usr/local/cuda-5.5"/bin/nvcc -ccbin g++ -m64 -o asyncAPI asyncAPI.o
mkdir -p ../../bin/x86_64/linux/release
. . .
mkdir -p ../../bin/x86_64/linux/release
cp histEqualizationNPP ../../bin/x86_64/linux/release
make[1]: Leaving directory `/home/user/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/7_CUDALibraries/histEqualizationNPP'
Finished building CUDA samples

5C. Needed post-processing (lib glut, cuda.conf, NVIDIA.conf, and ldconfig)

The next round of problems stemmed from not being able to run the randomFog program in the new ~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release folder. I suspect the steps taken to remedy this also make all future CUDA-specific work easier, so I list the issues and clean-up steps below.

Out of the list of built samples, I selected a few that worked without issue and, finally, randomFog that decidedly had issues:

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/$ cd bin/x86_64/linux/release
user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ls
alignedTypes HSOpticalFlow simpleCUBLAS
asyncAPI imageDenoising simpleCUDA2GL
bandwidthTest imageSegmentationNPP simpleCUFFT
batchCUBLAS inlinePTX simpleDevLibCUBLAS
bicubicTexture interval simpleGL
bilateralFilter jpegNPP simpleHyperQ
bindlessTexture lineOfSight simpleIPC
binomialOptions Mandelbrot simpleLayeredTexture
BlackScholes marchingCubes simpleMPI
boxFilter matrixMul simpleMultiCopy
boxFilterNPP matrixMulCUBLAS simpleMultiGPU
cdpAdvancedQuicksort matrixMulDrv simpleP2P
cdpBezierTessellation matrixMulDynlinkJIT simplePitchLinearTexture
cdpLUDecomposition matrixMul_kernel64.ptx simplePrintf
cdpQuadtree MC_EstimatePiInlineP simpleSeparateCompilation
cdpSimplePrint MC_EstimatePiInlineQ simpleStreams
cdpSimpleQuicksort MC_EstimatePiP simpleSurfaceWrite
clock MC_EstimatePiQ simpleTemplates
concurrentKernels MC_SingleAsianOptionP simpleTexture
conjugateGradient mergeSort simpleTexture3D
conjugateGradientPrecond MersenneTwisterGP11213 simpleTextureDrv
convolutionFFT2D MonteCarloMultiGPU simpleTexture_kernel64.ptx
convolutionSeparable nbody simpleVoteIntrinsics
convolutionTexture newdelete simpleZeroCopy
cppIntegration oceanFFT smokeParticles
cppOverload particles SobelFilter
cudaOpenMP postProcessGL SobolQRNG
dct8x8 ptxjit sortingNetworks
deviceQuery quasirandomGenerator stereoDisparity
deviceQueryDrv radixSortThrust template
dwtHaar1D randomFog template_runtime
dxtc recursiveGaussian threadFenceReduction
eigenvalues reduction threadMigration
fastWalshTransform scalarProd threadMigration_kernel64.ptx
FDTD3d scan transpose
fluidsGL segmentationTreeThrust vectorAdd
freeImageInteropNPP shfl_scan vectorAddDrv
FunctionPointers simpleAssert vectorAdd_kernel64.ptx
grabcutNPP simpleAtomicIntrinsics volumeFiltering
histEqualizationNPP simpleCallback volumeRender
histogram simpleCubemapTexture

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ./randomFog

And you get the following error:

./randomFog: error while loading shared libraries: libcurand.so.5.5: cannot open shared object file: No such file or directory

I originally thought this error might have something to do with libglut based on other install sites I ran across. I therefore took the step of adding the symbolic link from /usr/lib/x86_64-linux-gnu to /usr/lib

user@host:~$ sudo ln -s /usr/lib/x86_64-linux-gnu/libglut.so.3 /usr/lib/libglut.so

That said, same issue:

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ./randomFog
./randomFog: error while loading shared libraries: libcurand.so.5.5: cannot open shared object file: No such file or directory

I then found references to adding a cuda.conf file to /etc/ld.so.conf.d – and so did that (doesn’t help but it came up enough that I suspect it doesn’t hurt either).

user@host:~$ sudo nano /etc/ld.so.conf.d/cuda.conf 

This file should contain the following:

/usr/local/cuda-5.5/lib64
/usr/local/cuda-5.5/lib

Which also didn’t help.

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ./randomFog 

./randomFog: error while loading shared libraries: libcurand.so.5.5: cannot open shared object file: No such file or directory

To find the location (or presence) of libcurand, run ldconfig -v:

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ldconfig -v
/sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
/usr/local/cuda-5.5/lib64:
libcuinj64.so.5.5 -> libcuinj64.so.5.5.22
libcufft.so.5.5 -> libcufft.so.5.5.22
libcurand.so.5.5 -> libcurand.so.5.5.22
libcusparse.so.5.5 -> libcusparse.so.5.5.22
. . .
libnvToolsExt.so.1 -> libnvToolsExt.so.1.0.0
/usr/local/cuda-5.5/lib:
libcufft.so.5.5 -> libcufft.so.5.5.22
libcurand.so.5.5 -> libcurand.so.5.5.22
libcusparse.so.5.5 -> libcusparse.so.5.5.22
. . .
/usr/lib/nvidia-319/tls: (hwcap: 0x8000000000000000)
libnvidia-tls.so.319.32 -> libnvidia-tls.so.319.32
/usr/lib32/nvidia-319/tls: (hwcap: 0x8000000000000000)
libnvidia-tls.so.319.32 -> libnvidia-tls.so.319.32
/sbin/ldconfig.real: Can't create temporary cache file /etc/ld.so.cache~: Permission denied

Present twice. Instead of risking making multiple symbolic links as I walked through the dependency gauntlet, I stumbled across another reference in the form of a new /etc/ld.so.conf.d/NVIDIA.conf that contains the same content as cuda.conf (so one may not be needed, but I didn’t bother to backtrack to see. Happy to change the page if someone says otherwise).

user@host:~/$ sudo nano /etc/ld.so.conf.d/NVIDIA.conf

/usr/local/cuda-5.5/lib64
/usr/local/cuda-5.5/lib

Then run ldconfig.

user@host:~/$ sudo ldconfig

With that, randomFog works just fine (and you can assume that a problem in one is a problem in several). Having not taken the full symbolic link route in favor of adding to /etc/ld.so.conf.d, I’m assuming I hit most of the potential errors for the other programs.

user@host:~/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ./randomFog 

Random Fog
==========

CURAND initialized

Random number visualization
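
To check whether the loader now resolves every library a sample needs (instead of finding out one launch at a time), the ldd output can be filtered for anything marked "not found". This is a rough sketch, assuming you are in the release directory with randomFog present. The function unresolved_libs is a hypothetical helper.

```shell
# Hypothetical helper: read "ldd" output on stdin and print the name of
# every shared library the loader could not resolve.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Assumption: run from the Samples release directory, where randomFog lives.
if [ -x ./randomFog ]; then
    missing=$(ldd ./randomFog | unresolved_libs)
    if [ -z "$missing" ]; then
        echo "randomFog: all shared libraries resolve"
    else
        echo "randomFog still cannot find:" $missing
    fi
fi

# The loader cache can also be queried directly (no root needed):
#   ldconfig -p | grep libcurand
```

The same filter works on any of the other built samples, which is a quicker test than launching each one.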

6. Build CudaMiner

The good news is that there are only a few more steps. The bad news is that any errors you come across in your attempt to build CudaMiner that relate to NOT having done the above are (likely) not represented here, so hopefully your search was sufficiently vague.

Download CudaMiner-master.zip from Christian Buchner’s GitHub account. Extracting CudaMiner-master.zip (with unzip, not gunzip. Damn Windows users) and running configure produces only one obvious error.

user@host:~/WHEREVER_YOU_ARE/$ cd
user@host:~/$ cd Downloads
user@host:~/Downloads$ unzip CudaMiner-master.zip 
user@host:~/Downloads$ cd CudaMiner-master/
user@host:~/Downloads/CudaMiner-master$ chmod a+wrx configure
user@host:~/Downloads/CudaMiner-master$ ./configure

checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
. . .
checking for gawk... (cached) mawk
checking for curl-config... no
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.15.2

This error is remedied by installing libcurl4-gnutls-dev.

user@host:~/Downloads/CudaMiner-master$ sudo apt-get install libcurl4-gnutls-dev 

Which adds and modifies the following from my clean 12.04 LTS install and update

The following packages were automatically installed and are no longer required:
gir1.2-ubuntuoneui-3.0 libxcb-dri2-0 libxrandr-ltsr2 libubuntuoneui-3.0-1 libxvmc1 thunderbird-globalmenu
libllvm3.2
Use ‘apt-get autoremove’ to remove them.
The following extra packages will be installed:
comerr-dev krb5-multidev libgcrypt11-dev libgnutls-dev libgnutls-openssl27 libgnutlsxx27 libgpg-error-dev
libgssrpc4 libidn11-dev libkadm5clnt-mit8 libkadm5srv-mit8 libkdb5-6 libkrb5-dev libldap2-dev
libp11-kit-dev librtmp-dev libtasn1-3-dev zlib1g-dev
Suggested packages:
krb5-doc libcurl3-dbg libgcrypt11-doc gnutls-doc gnutls-bin krb5-user
The following NEW packages will be installed:
comerr-dev krb5-multidev libcurl4-gnutls-dev libgcrypt11-dev libgnutls-dev libgnutls-openssl27
libgnutlsxx27 libgpg-error-dev libgssrpc4 libidn11-dev libkadm5clnt-mit8 libkadm5srv-mit8 libkdb5-6
libkrb5-dev libldap2-dev libp11-kit-dev librtmp-dev libtasn1-3-dev zlib1g-dev
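
Before re-running configure, it is possible to check by hand what configure is looking for, namely a curl-config that reports libcurl 7.15.2 or newer. This is a minimal sketch that assumes GNU sort (for its -V version-sort option). The function version_ge is a hypothetical helper.

```shell
# Hypothetical helper: succeed if version $1 >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v curl-config >/dev/null 2>&1; then
    # curl-config --version prints a line such as "libcurl 7.22.0"
    found=$(curl-config --version | awk '{ print $2 }')
    if version_ge "$found" 7.15.2; then
        echo "libcurl $found satisfies configure"
    else
        echo "libcurl $found is too old"
    fi
else
    echo "curl-config not found; libcurl development files are missing"
fi
```

A plain string comparison would get this wrong (7.9 sorts after 7.15 alphabetically), which is why the sketch leans on version sort.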

After a make clean, configure and make for CudaMiner went without problem.

user@host:~/Downloads/CudaMiner-master$ make clean
user@host:~/Downloads/CudaMiner-master$ ./configure

checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
. . .
configure: creating ./config.status
config.status: creating Makefile
config.status: creating compat/Makefile
config.status: creating compat/jansson/Makefile
config.status: creating cpuminer-config.h
config.status: cpuminer-config.h is unchanged
config.status: executing depfiles commands

user@host:~/Downloads/CudaMiner-master$ make

make all-recursive
make[1]: Entering directory `/home/user/Downloads/CudaMiner-master'
Making all in compat
make[2]: Entering directory `/home/user/Downloads/CudaMiner-master/compat'
Making all in jansson
make[3]: Entering directory `/home/user/Downloads/CudaMiner-master/compat/jansson'
. . .
./spinlock_kernel.cu(387): Warning: Cannot tell what pointer points to, assuming global memory space
./spinlock_kernel.cu(387): Warning: Cannot tell what pointer points to, assuming global memory space
./spinlock_kernel.cu(387): Warning: Cannot tell what pointer points to, assuming global memory space
. . .
nvcc -g -O2 -Xptxas "-abi=no -v" -arch=compute_10 --maxrregcount=64 --ptxas-options=-v -I./compat/jansson -o legacy_kernel.o -c legacy_kernel.cu
./legacy_kernel.cu(310): Warning: Cannot tell what pointer points to, assuming global memory space
./legacy_kernel.cu(310): Warning: Cannot tell what pointer points to, assuming global memory space
./legacy_kernel.cu(310): Warning: Cannot tell what pointer points to, assuming global memory space
. . .
g++ -g -O2 -pthread -L/usr/local/cuda/lib64 -o cudaminer cudaminer-cpu-miner.o cudaminer-util.o cudaminer-sha2.o cudaminer-scrypt.o salsa_kernel.o spinlock_kernel.o legacy_kernel.o fermi_kernel.o kepler_kernel.o test_kernel.o titan_kernel.o -L/usr/lib/x86_64-linux-gnu -lcurl -Wl,-Bsymbolic-functions -Wl,-z,relro compat/jansson/libjansson.a -lpthread -lcudart -fopenmp
make[2]: Leaving directory `/home/user/Downloads/CudaMiner-master'
make[1]: Leaving directory `/home/user/Downloads/CudaMiner-master'

A few warnings (well, several hundred of the same warning) appeared during the build process, as shown above, but they don’t affect the program’s operation.

With luck, you should be able to run a benchmark calculation immediately.

user@host:~/Downloads/CudaMiner-master$ ./cudaminer -d 0 -i 0 --benchmark

*** CudaMiner for NVIDIA GPUs by Christian Buchner ***
This is version 2013-12-18 (beta)
based on pooler-cpuminer 2.3.2 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

[2013-12-25 00:05:38] 1 miner threads started, using ‘scrypt’ algorithm.
[2013-12-25 00:05:58] GPU #0: GeForce GTX 690 with compute capability 3.0
[2013-12-25 00:05:58] GPU #0: the ‘K’ kernel requires single memory allocation
[2013-12-25 00:05:58] GPU #0: interactive: 0, tex-cache: 0 , single-alloc: 1
[2013-12-25 00:05:58] GPU #0: Performing auto-tuning (Patience…)
[2013-12-25 00:05:58] GPU #0: maximum warps: 447
[2013-12-25 00:07:40] GPU #0: 288.38 khash/s with configuration K27x4
[2013-12-25 00:07:40] GPU #0: using launch configuration K27x4
[2013-12-25 00:07:40] GPU #0: GeForce GTX 690, 6912 hashes, 0.06 khash/s
[2013-12-25 00:07:40] Total: 0.06 khash/s
[2013-12-25 00:07:40] GPU #0: GeForce GTX 690, 3456 hashes, 141.56 khash/s
[2013-12-25 00:07:40] Total: 141.56 khash/s
[2013-12-25 00:07:43] GPU #0: GeForce GTX 690, 708480 hashes, 251.11 khash/s
[2013-12-25 00:07:43] Total: 251.11 khash/s
[2013-12-25 00:07:48] GPU #0: GeForce GTX 690, 1257984 hashes, 251.19 khash/s
[2013-12-25 00:07:48] Total: 251.19 khash/s
. . .
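
When comparing settings, the only number that usually matters from all of that output is the last "Total:" rate. The sketch below pulls it out of a saved log. It assumes the log lines keep the layout shown above (timestamp, then "Total:", then the rate), and both last_total_khash and bench.log are names invented for the example.

```shell
# Hypothetical helper: read CudaMiner output on stdin and print the most
# recent "Total:" hash rate (the number just before "khash/s").
last_total_khash() {
    awk '$3 == "Total:" { rate = $4 } END { if (rate != "") print rate }'
}

# Usage, assuming a benchmark run was saved with something like
#   timeout 120 ./cudaminer -d 0 -i 0 --benchmark 2>&1 | tee bench.log
# (timeout is from GNU coreutils; bench.log is an arbitrary file name):
#   last_total_khash < bench.log
```

Capping the run with timeout is a convenience I am assuming here, since the benchmark otherwise keeps going until interrupted.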

Then spend the rest of the week optimizing parameters for your particular card and mining proclivity:

user@host:~/Downloads/CudaMiner-master$ ./cudaminer -h
	   *** CudaMiner for NVIDIA GPUs by Christian Buchner ***
	             This is version 2013-12-18 (beta)
	based on pooler-cpuminer 2.3.2 (c) 2010 Jeff Garzik, 2012 pooler
	       Cuda additions Copyright 2013 Christian Buchner
	   My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

Usage: cudaminer [OPTIONS]
Options:
  -a, --algo=ALGO       specify the algorithm to use
                          scrypt    scrypt(1024, 1, 1) (default)
                          sha256d   SHA-256d
  -o, --url=URL         URL of mining server (default: http://127.0.0.1:9332/)
  -O, --userpass=U:P    username:password pair for mining server
  -u, --user=USERNAME   username for mining server
  -p, --pass=PASSWORD   password for mining server
      --cert=FILE       certificate for mining server using SSL
  -x, --proxy=[PROTOCOL://]HOST[:PORT]  connect through a proxy
  -t, --threads=N       number of miner threads (default: number of processors)
  -r, --retries=N       number of times to retry if a network call fails
                          (default: retry indefinitely)
  -R, --retry-pause=N   time to pause between retries, in seconds (default: 30)
  -T, --timeout=N       network timeout, in seconds (default: 270)
  -s, --scantime=N      upper bound on time spent scanning current work when
                          long polling is unavailable, in seconds (default: 5)
      --no-longpoll     disable X-Long-Polling support
      --no-stratum      disable X-Stratum support
  -q, --quiet           disable per-thread hashmeter output
  -D, --debug           enable debug output
  -P, --protocol-dump   verbose dump of protocol-level activities
      --no-autotune     disable auto-tuning of kernel launch parameters
  -d, --devices         takes a comma separated list of CUDA devices to use.
                        This implies the -t option with the threads set to the
                        number of devices.
  -l, --launch-config   gives the launch configuration for each kernel
                        in a comma separated list, one per device.
  -i, --interactive     comma separated list of flags (0/1) specifying
                        which of the CUDA device you need to run at inter-
                        active frame rates (because it drives a display).
  -C, --texture-cache   comma separated list of flags (0/1) specifying
                        which of the CUDA devices shall use the texture
                        cache for mining. Kepler devices will profit.
  -m, --single-memory   comma separated list of flags (0/1) specifying
                        which of the CUDA devices shall allocate their
                        scrypt scratchbuffers in a single memory block.
  -H, --hash-parallel   1 to enable parallel SHA256 hashing on the CPU. May
                        use more CPU overall, but distributes hashing load
                        neatly across all CPU cores. 0 is now the default
                        which assigns one static CPU core to each GPU.
  -S, --syslog          use system log for output messages
  -B, --background      run the miner in the background
      --benchmark       run in offline benchmark mode
  -c, --config=FILE     load a JSON-format configuration file
  -V, --version         display version information and exit
  -h, --help            display this help text and exit

I’ve only had a few problems with CudaMiner to date. The most annoying problem has been the inability to run tests to optimize card performance without having to put the machine to sleep and wake it back up again (better than a full restart). Without this, CudaMiner will simply hang on the scrypt line:

[2013-12-25 00:49:08] 1 miner threads started, using ‘scrypt’ algorithm.

The sleep + wake does the trick, although I’d love to find out how to not have this happen.

The second annoying problem was:

. . . result does not validate on CPU (i=NNNN, s=0)!

This error is due to your “K16x16” configuration (the most prominent one I’ve found in Google searches, so placed here to help others find it. Your values may vary) being too much for the card (so vary them down a spell until you don’t get the error). There’s a wealth of proper card settings available on the litecoin hardware comparison site, so I direct you there:

litecoin.info/Mining_hardware_comparison
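
The "vary them down a spell" step can be scripted, at least roughly. The sketch below walks a list of launch configurations from largest to smallest and stops at the first one whose benchmark log shows no validation error. The candidate values are purely illustrative (only K16x16 comes from this post), log_validates is a hypothetical helper, and the 60-second cap via timeout is my own assumption.

```shell
# Hypothetical helper: read CudaMiner output on stdin; fail if any result
# was rejected by the CPU check.
log_validates() {
    ! grep -q 'result does not validate on CPU'
}

# Candidate launch configurations, largest first. Illustrative values only;
# take realistic ones for your card from the comparison site.
CANDIDATES="K16x16 K16x8 K8x8 K8x4"

# Assumption: run from the CudaMiner-master build directory.
if [ -x ./cudaminer ]; then
    for cfg in $CANDIDATES; do
        # Cap each benchmark at 60 seconds and keep its output for review.
        timeout 60 ./cudaminer -d 0 -i 0 -l "$cfg" --benchmark \
            > "bench-$cfg.log" 2>&1 || true
        if log_validates < "bench-$cfg.log"; then
            echo "$cfg ran without validation errors"
            break
        fi
        echo "$cfg failed validation; trying a smaller configuration"
    done
fi
```

This only finds a configuration that runs clean, not the fastest one, so the hash rates in the saved logs are still worth comparing by eye.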

7. And Finally. . .

By all accounts, CudaMiner is a much faster mining tool for NVIDIA owners. To that end, please note that Christian Buchner has made your life much easier (and your virtual wallet hopefully a little fuller). As mentioned above, his donation address is:

LKS1WDKGED647msBQfLBHV3Ls8sveGncnm

Do consider showing him some love.

This post was made in the interest of helping others get their mining going. If this guide helped and you score blocks early, my wallet’s always open as well (can’t blame someone for trying).

Litecoin: LTmicpwpGgrZiyiJmMUdyqq4CG8CqiBqrm
Dogecoin: DBwXMoQ4scAqZfYUJgc3SYqTED7eywSHdB

The timing for getting the guide up is based on a new mining operation here in Syracuse, NY in the form of Salt City Miners, currently the Cloud City of mining operations (also appropriate for the weather conditions). Parties interested in adding their power to the fold are more than welcome to sign up at miner.saltcityminers.com/.

And don’t forget the Meetup group: Syracuse Meetup Group – Bitcoin’s of New York – Miner’s of Syracuse