Optimising JAGS

Just Another Gibbs Sampler (JAGS) is a really nice piece of software, but computing long Markov chains takes some time. I tried to speed this up by comparing different builds, including some self-compiled binaries.

All in all I tried five different builds of JAGS 4.2.0 on the same hardware (virtualized Ubuntu guest, 256 GB memory and 16 dedicated Intel Xeon Nehalem cores). The results are quite surprising, because gcc beats clang: the self-compiled clang version is slower than the pre-built binaries, and the HPC-optimized clang version is even slower. The differences are small (within about 3% of the pre-built CPU time), but still surprising. I will take a deeper look at the results in the future if time permits.

If you use virtualization, the compiler might not be able to detect the right CPU architecture. Use the following command to check what -march=native resolves to:
gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
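To read the detected architecture out of that rather long cc1 line, a small filter can help. This is only a sketch: the function name detected_march is made up for this example, and it assumes the cc1 line contains a -march=... token, which is what gcc normally prints when -march=native is resolved.

```shell
# Hypothetical helper: reads gcc's verbose output on stdin and prints
# only the value of the first -march=... token on the cc1 line.
detected_march() {
  grep cc1 | grep -o -- '-march=[^ ]*' | head -n 1 | cut -d= -f2
}

# Usage (assumes gcc is installed):
#   gcc -march=native -E -v - </dev/null 2>&1 | detected_march
```

On a Nehalem host I would expect this to print nehalem. If a generic or older architecture shows up instead, the guest probably does not expose the real CPU model.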

All self-compiled versions use the same source code. Only the configuration was modified, by adding flags or changing the compiler. The following bash commands were used to download the source files:

wget -O jags.tar https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/Source/JAGS-4.2.0.tar.gz/download
tar -xvf jags.tar
cd JAGS-*

For the non-HPC version, no extra flags were provided. I just called ./configure followed by make -j 4 to build it.

Self-compiled GCC-HPC

This version should be optimized for high-performance settings. Nice side note: these compiler flags are recommended by Intel for HPC :)

./configure OPT="-m64 -Ofast -flto -march=native -funroll-loops"
make -j 4 clean all

Self-compiled Clang++

Clang++ is a really nice C++ compiler front end, and in combination with LLVM you may get nice results. There are several ways to switch your compiler; I used environment variables:

export CC=clang
export CXX=clang++

The same source and flags as before were used (Intel's recommendation for the HPC build and no additional flags for the default clang++ build). Then I compiled it:

make -j 4 clean all

Install

To install the self-compiled versions, I also used make like this:

sudo make install

Michael Rutter provides pre-built binaries in his repository. You can install them like this:

sudo add-apt-repository ppa:marutter/rrutter
sudo apt-get update
sudo apt-get install r-cran-rjags

Benchmark

For benchmark purposes I used the mcmc-jags examples and measured the runtime.

wget -O jags-examples.tar https://sourceforge.net/projects/mcmc-jags/files/Examples/4.x/classic-bugs.tar.gz/download
tar -xvf jags-examples.tar
cd classic-bugs/vol1
export JAGS=`which jags`
time make bench -j 10

build        real        user         sys         sys+user   diff
pre-built    7m59.518s   13m40.708s   0m54.184s   874.892s   0
gcc          7m41.468s   13m28.392s   0m47.596s   855.988s   +2.16%
gcc-hpc      7m38.224s   13m24.400s   0m45.924s   850.324s   +2.8%
clang++      7m53.486s   13m49.228s   0m49.852s   879.080s   -0.48%
clang++-hpc  8m16.854s   14m11.280s   0m44.744s   896.024s   -2.41%

The diff column is the CPU time (sys+user) saved relative to the pre-built binaries; positive values mean the build is faster.
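The sys+user and diff columns can be recomputed from the raw output of time with a bit of awk. The following is a minimal sketch, not the script I used for the table; the function names to_seconds, cpu_seconds and diff_percent are made up for this example, and it assumes the NmS.SSSs format that bash prints.

```shell
# Convert a bash `time` value such as 13m40.708s into seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# CPU time in seconds: user + sys, both given in bash `time` format.
cpu_seconds() {
  awk -v u="$(to_seconds "$1")" -v s="$(to_seconds "$2")" \
    'BEGIN { printf "%.3f\n", u + s }'
}

# CPU time saved relative to a baseline, in percent (positive = faster).
diff_percent() {
  awk -v base="$1" -v t="$2" \
    'BEGIN { printf "%+.2f\n", (base - t) / base * 100 }'
}

# Example with the pre-built and gcc rows:
base=$(cpu_seconds 13m40.708s 0m54.184s)   # pre-built: user, sys
gcc=$(cpu_seconds 13m28.392s 0m47.596s)    # gcc: user, sys
diff_percent "$base" "$gcc"
```

For the gcc row this prints +2.16, which matches the table. The comparison uses CPU time rather than wall-clock time, because the real column also depends on how well make -j 10 keeps the cores busy.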

Versions

  • gcc 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2)
  • clang 3.8.0-2ubuntu4 (tags/RELEASE_380/final)

Links