Why?
VSIPL++ is "a C++ library for developing high performance signal- and image-processing applications", so the first question would seem to be: WTF?
Well, as far as field manipulation is concerned, it does offer many of the primitive operations and much of the algebra that we need, including support for accelerators such as GPUs. So maybe it's worth a look.
Evaluation
I would like to keep track of my evaluation, and this is as good a place as any.
I did a first evaluation on satch, my Ubuntu laptop with 2 dual cores and an Nvidia Quadro FX 2700M.
The install was made from the binary tarball. Both VSIPL++ and CUDA are installed in their default locations, /opt/codesourcery/sourceryvsipl++-cuda-3.1 and /usr/local/cuda respectively.
Getting started
Installation
Went just fine.
Building Applications
- Following the instructions, we create a workspace using the ia32-ser-cuda variant. This variant is mentioned in the tarball distribution, but not on the web site.
It will be needed later on.
alainm@satch:~/vsi$ vsip-create-workspace --variant=ia32-ser-cuda ./ws
alainm@satch:~/vsi$ ls
ws
alainm@satch:~/vsi$ ls ws/
benchmarks  dda       eval          Makefile   pi       signal   ustorage
common.mk   dispatch  example1.cpp  mprod.cpp  profile  solvers  views
cuda        dsp       io            parallel   radar    ssar     vmul.cpp
alainm@satch:~/vsi$
- We can build the example by running make, but there is a small glitch with the CUDA toolkit libraries: the current CUDA toolkit version, as of Sep 7th 2011, is 4.0.17, while VSIPL++ appears to have been linked against a CUDA 3.x.y release.
As a result:
g++ -I . -I/opt/codesourcery/sourceryvsipl++-cuda-3.1/include -I/opt/codesourcery/sourceryvsipl++-cuda-3.1/include/fftw3 -I/opt/codesourcery/sourceryvsipl++-cuda-3.1/include -DVSIP_IMPL_ENABLE_THREADING=1 -DVSIP_IMPL_FFTW3=1 -DVSIP_IMPL_FFTW3_HAVE_FLOAT -DVSIP_IMPL_FFTW3_HAVE_DOUBLE -DVSIP_IMPL_FFTW3_HAVE_LONG_DOUBLE -I/usr/local/cuda/include -DVSIP_IMPL_HAVE_CUDA=1 -DVSIP_IMPL_CUDA_FFT=1 -DVSIP_IMPL_HAVE_CULA=1 -DVSIP_IMPL_FFT_USE_FLOAT=1 -DVSIP_IMPL_FFT_USE_DOUBLE=1 -DVSIP_IMPL_FFT_USE_LONG_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_FLOAT=1 -DVSIP_IMPL_PROVIDE_FFT_DOUBLE=1 -DVSIP_IMPL_PROVIDE_FFT_LONG_DOUBLE=1 -DVSIP_IMPL_USE_F2C_ABI -DVSIP_IMPL_USE_CBLAS=1 -O2 -DNDEBUG -funswitch-loops -fgcse-after-reload --param max-inline-insns-single=2000 --param large-function-insns=6000 --param large-function-growth=800 --param inline-unit-growth=300 -m32 -march=pentium4 -mmmx -msse -msse2 -fPIC -o mprod mprod.cpp -Wl,--allow-shlib-undefined -m32 -march=pentium4 -mmmx -msse -msse2 -fPIC -L/usr/local/cuda/lib32 -L/usr/local/cuda/lib -L/opt/codesourcery/sourceryvsipl++-cuda-3.1/lib/ia32 -L/opt/codesourcery/sourceryvsipl++-cuda-3.1/lib/ia32/ser-cuda -lvsip_csl -lsvpp -lfftw3f -lfftw3 -lfftw3l -latlas_lapack -llapack -lcblas -latlas -lF77 -lcula -lcufft -lcublas -lcuda -lcudart
/usr/bin/ld: warning: libcufft.so.3, needed by /opt/codesourcery/sourceryvsipl++-cuda-3.1/lib/ia32/ser-cuda/libvsip_csl.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libcublas.so.3, needed by /opt/codesourcery/sourceryvsipl++-cuda-3.1/lib/ia32/ser-cuda/libvsip_csl.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libcudart.so.3, needed by /opt/codesourcery/sourceryvsipl++-cuda-3.1/lib/ia32/ser-cuda/libvsip_csl.so, not found (try using -rpath or -rpath-link)
/tmp/ccYPPiUL.o: In function `main':
mprod.cpp:(.text+0x18ea): undefined reference to `cublasCgemm'
mprod.cpp:(.text+0x1a8f): undefined reference to `cublasSgemm'
collect2: ld returned 1 exit status
make: *** [mprod] Error 1
alainm@satch:~/vsi/ws$
More precisely:
alainm@satch:~/vsi/ws$ ldd /opt/codesourcery/sourceryvsipl++-cuda-3.1/lib/ia32/ser-cuda/libvsip_csl.so | egrep 'libcu.*so.3'
libcufft.so.3 => not found
libcublas.so.3 => not found
libcudart.so.3 => not found
alainm@satch:~/vsi/ws$
while:
alainm@satch:~/Desktop$ ls /usr/local/cuda/lib/*.so.?
/usr/local/cuda/lib/libcublas.so.4  /usr/local/cuda/lib/libcufft.so.4  /usr/local/cuda/lib/libcusparse.so.4
/usr/local/cuda/lib/libcudart.so.4  /usr/local/cuda/lib/libcurand.so.4  /usr/local/cuda/lib/libnpp.so.4
alainm@satch:~/Desktop$
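The ldd output and the directory listing above can be cross-checked mechanically. The following is only a minimal sketch: the helper names missing_sonames and check_sonames are hypothetical (they are not part of VSIPL++ or CUDA), and the functions read their input on stdin so they can be tried without the real library at hand.

```shell
#!/bin/sh
# Sketch: list the sonames that ldd reports as "not found" and check
# whether each of them exists in a given directory.
# Both function names are hypothetical helpers, not part of any tool.

# Read ldd output on stdin, print the missing sonames one per line.
missing_sonames() {
    awk '/=> not found/ { print $1 }'
}

# Read sonames on stdin, report which ones exist in directory $1.
check_sonames() {
    dir=$1
    while read -r soname; do
        if [ -e "$dir/$soname" ]; then
            echo "found   $soname"
        else
            echo "missing $soname"
        fi
    done
}

# Sample input: the three lines printed by ldd in the session above.
sample='libcufft.so.3 => not found
libcublas.so.3 => not found
libcudart.so.3 => not found'

# Prints the three .so.3 names, one per line.
printf '%s\n' "$sample" | missing_sonames
```

On the real machine the two helpers would be chained, feeding the output of ldd on libvsip_csl.so through missing_sonames and then through check_sonames /usr/local/cuda/lib. With only the 4.0 toolkit installed, all three .so.3 names should come back as missing, which matches the linker warnings.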
We need to install the last CUDA toolkit 3.x.y release (cudatoolkit_3.2.16_linux_32_ubuntu10.04.run) to get a clean build.
A closer look at the documentation (1.1.4.2) indicates that version 3.1 is supported.
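After swapping toolkits, it may be worth confirming which release is actually active before rebuilding. The sketch below assumes that nvcc --version prints a line of the form "Cuda compilation tools, release X.Y, V..."; that format is my assumption and is not taken from the sessions above, and the helper names cuda_release and cuda_major_is are hypothetical.

```shell
#!/bin/sh
# Sketch: extract the CUDA release number from `nvcc --version` output.
# Assumes a line such as "Cuda compilation tools, release 3.2, V..."
# (assumed format; check it against the toolkit actually installed).

# Read nvcc --version output on stdin, print the release (e.g. 3.2).
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if the major release number read on stdin equals $1.
cuda_major_is() {
    [ "$(cuda_release | cut -d. -f1)" = "$1" ]
}

# Sample line in the assumed format, not copied from a real run.
echo 'Cuda compilation tools, release 3.2, V0.2.1221' | cuda_release
```

In practice this would be used as nvcc --version | cuda_major_is 3, for instance as a guard at the top of a build script, so that a 4.x toolkit is caught before the link step fails.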