If you are new to GPU computing with MATLAB, see the Useful Links section at the bottom of this page.
On the Shared Computing Cluster (SCC), a number of nodes are equipped with GPUs. The Parallel Computing Toolbox provides functions that let MATLAB use these GPUs for better computational performance. The example below multiplies two matrices on the Client and on the GPU:
%%%%%%%%%%%%%%%%%%
% gpuExample.m %
%%%%%%%%%%%%%%%%%%
function gpuExample(A, B)
%function gpuExample(A, B)
% This function computes the matrix product of A and B on the Client and on the GPU,
% collects wall-clock times, and checks whether both results agree (to 5 decimal places)
% A - MATLAB Client array (N x N)
% B - MATLAB Client array (N x N)
% Usage example:
% >> N = 3000; A = rand(N); B = rand(N);
% >> gpuExample(A, B)
tic
C = A*B; % matrix product on Client
tC = toc;
% copy A and B from Client to GPU
a = gpuArray(A); b = gpuArray(B);
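tic
c = a*b; % matrix product on GPU
% Note: GPU operations can return before the computation has finished,
% so this timing may understate the actual cost of the multiply
tgpu = toc;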
tic
CC = gather(c); % copy data from GPU to Client
tg = toc;
disp(['Matrix multiply time on Client is ' num2str(tC)])
disp(['Matrix multiply time on GPU is ' num2str(tgpu)])
disp(['Time for gathering data from GPU back to Client is ' num2str(tg)])
% Verify that GPU and Client computations agree
tol = 1e-5;
% Compare all elements: any() on a matrix works column by column, so flatten first
if any(abs(CC(:) - C(:)) > tol)
disp('Matrix product on Client and GPU disagree')
else
disp('Matrix product on Client and GPU agree')
end
end % function gpuExample
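Because MATLAB can return from a GPU operation before the GPU has finished the work, the tic/toc timing of the GPU product in gpuExample may look much faster than it really is. The following is a minimal sketch of two ways to time the product more reliably, assuming a GPU is available and the Parallel Computing Toolbox is installed. The variable names (dev, tSync, tTimeit) are illustrative and not part of the original example.

```matlab
% Sketch: time a GPU matrix product including device synchronization.
N = 3000;                    % same size as the usage example above
a = rand(N, 'gpuArray');     % create the operands directly on the GPU
b = rand(N, 'gpuArray');

% Option 1: explicit synchronization around tic/toc
dev = gpuDevice;             % handle to the currently selected GPU
wait(dev);                   % make sure earlier GPU work has finished
tic
c = a*b;                     % matrix product on GPU
wait(dev);                   % block until the product has been computed
tSync = toc;

% Option 2: gputimeit calls the function several times and
% synchronizes with the GPU itself
tTimeit = gputimeit(@() a*b);

fprintf('tic/toc with wait: %g s, gputimeit: %g s\n', tSync, tTimeit);
```

Either timing can then be compared with the Client time and the gather time reported by gpuExample.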
There are two ways to run the GPU code:
- For debugging and code development, run MATLAB job in interactive batch:
To run MATLAB code with GPU instructions, you need an SCC node with at least 1 GPU (the SCC login nodes do not have any).
- First, launch an interactive batch session with:
scc1% qrsh -l gpus=1
The "-l gpus=1" option specifies that 1 GPU is requested. Without an explicit runtime request (i.e., by default), a twelve-hour wallclock time limit is imposed.
- When an SCC node with GPUs is available, the interactive batch job will be accepted and a new X window appears.
- Launch MATLAB from this window:
scc-ha2% matlab &
In the MATLAB window, run the gpuExample function:
>> N=3000; A=rand(N); B=rand(N);
>> gpuExample(A, B)
Matrix multiply time on Client is 1.236
Matrix multiply time on GPU is 0.000501
Time for gathering data from GPU back to Client is 0.20443
Matrix product on Client and GPU agree
>>
- For production, run MATLAB job in batch:
Create a batch script mybatch as follows:
#!/bin/csh
#
# Batch submission procedure:
# scc1% qsub mybatch
#
# Note: A line of the form "#$ qsub_option" is interpreted
# by qsub as if "qsub_option" was passed to qsub on
# the commandline.
#
# Set the hard runtime (aka wallclock) limit for this job,
# default is 12 hours. Format: -l h_rt=HH:MM:SS
#
#$ -l h_rt=12:00:00
#
# Merge stderr into the stdout file, to reduce clutter.
#
#$ -j y
#
# Specifies number of GPUs wanted
#
#$ -l gpus=1
#
# end of qsub options
#
matlab -nodisplay -singleCompThread -batch "N=3000;gpuExample(rand(N),rand(N))"
# end of script
In the last statement, the string enclosed in double quotes (") must consist of valid MATLAB commands, which may include calls to your own application m-files (without the .m suffix).
Submit the batch job using the above batch script, which requests 1 CPU and 1 GPU:
scc1% qsub mybatch
For other options, please visit GPU Computing on the SCC.
Use qstat to query the status of your job:
scc1% qstat -u userid
job-ID  prior   name    user   state submit/start at     queue . . .
----------------------------------------------------------------------
 477578 0.00000 mybatch userid qw    03/14/2013 08:50:06
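In a batch job there is no MATLAB window in which to notice that no GPU is visible, so it may help to check for one before starting the computation. The wrapper below is a minimal sketch of that idea. The function name runGpuExample and its error identifier are hypothetical (they are not part of the original example), and the sketch assumes that gpuExample.m is on the MATLAB path.

```matlab
function runGpuExample(N)
% runGpuExample  Hypothetical batch wrapper around gpuExample.
% Checks that MATLAB can see a GPU before running the example.
% N - size of the random N x N test matrices (defaults to 3000)
if nargin < 1
    N = 3000;                 % same size as in the batch script above
end
if gpuDeviceCount < 1
    % Stop early with a clear message if the node exposes no GPU
    error('runGpuExample:noGPU', 'No GPU is visible to MATLAB on this node.');
end
dev = gpuDevice;              % select and query the default GPU
fprintf('Using GPU: %s\n', dev.Name);
gpuExample(rand(N), rand(N)); % assumes gpuExample.m is on the path
end
```

With this wrapper saved as runGpuExample.m, the last statement of the batch script could read matlab -nodisplay -singleCompThread -batch "runGpuExample(3000)", so that the GPU name, or the error message, appears in the job's output file.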