# How do I use Matlab's Distributed Computing features?

Matlab Distributed Computing Server (MDCS) is installed for Matlab versions R2014a and later. The latest available version is R2020b.

NOTE: This page describes the commands for R2017a and later. See this guide for earlier versions.

Using MATLAB on the cluster enables you to utilize high performance facilities like:

- Parallel computing
  - Parallel for-loops
  - Evaluate functions in the background
- Big data processing
  - Analyze big data sets in parallel
- Batch Processing
  - Offload execution of functions to run in the background
- GPU computing (Available on Snowy)
  - Accelerate your code by running it on a GPU
- Machine & Deep learning

For a complete user guide from MathWorks, see:

https://se.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav

Some online tutorials and courses:

- Parallel computing
- Machine Learning
- Deep Learning

### Get started

With MATLAB you can, for example, submit jobs directly to our job queue scheduler without having to use Slurm commands yourself. To do this, log in to the cluster's login node (e.g. rackham.uppmax.uu.se) with SSH (X-forwarding enabled), or perhaps more efficiently with ThinLinc (see our ThinLinc guide).

Please use interactive mode to avoid unexpected overuse of resources!

You load the matlab module with e.g.:

$ module load matlab/R2019a

or, to load the latest stable installed release:

$ module load matlab

and launch the GUI with:

$ matlab &

The "&" runs MATLAB in the background so that the terminal remains usable.

Begin by running the command "configCluster" in the MATLAB Command Window to choose a cluster configuration. MATLAB sets up the configuration and then prints some instructions. Follow them; they tell you what is needed in your script or on the command line to run in parallel on the cluster.

You can also set default preferences that are used when you do not specify them explicitly. Go to HOME > ENVIRONMENT > Parallel > Parallel preferences.

A simple test case that can be run is the following:

    >> configCluster   % (on Bianca it will look a little different)
        [1] rackham
        [2] snowy
    Select a cluster [1-2]: 1
    >>
    >> c = parcluster('rackham');   % on Bianca 'bianca Rxxxxx'
    >> c.AdditionalProperties.AccountName = 'snic2021-X-YYY';
    >> c.AdditionalProperties.QueueName = 'node';
    >> c.AdditionalProperties.WallTime = '00:10:00';
    >> c.saveProfile
    >> job = c.batch(@parallel_example, 1, {90, 5}, 'pool', 19)   % 19 is for 20 cores. On Snowy and Bianca use 15.
    >> job.wait
    >> job.fetchOutputs{:}

where parallel_example.m is a file with the following matlab function:

    function t = parallel_example(nLoopIters, sleepTime)
      t0 = tic;
      parfor idx = 1:nLoopIters
        A(idx) = idx;
        pause(sleepTime);
      end
      t = toc(t0);

This schedules a 20-task node job (19 pool workers + 1) on Rackham under the given project, so change the account to your own project name. For the moment, jobs are hard-coded to be node jobs. This means that if you request 21 tasks (20 + 1), you get a 2-node job in which only 1 core is used on the second node. In that case you would instead request 40 tasks (39 + 1) to fill both nodes.
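The node arithmetic above can be sketched in a few lines of MATLAB. The names `coresPerNode`, `nodesNeeded` and `fullPool` are illustrative only and are not part of any MathWorks or cluster API; the value 20 is the Rackham core count mentioned in the text.

```matlab
% Sketch: choose a pool size that fills whole nodes.
% Assumption: 20 cores per node (Rackham, per the text above).
coresPerNode = 20;

% A batch job with 'pool', N occupies N + 1 tasks
% (N pool workers plus one task running the function itself).
nodesNeeded = @(poolSize) ceil((poolSize + 1) / coresPerNode);

% Largest pool that exactly fills nNodes nodes.
fullPool = @(nNodes) nNodes * coresPerNode - 1;

disp(nodesNeeded(19))   % 1 node, all 20 cores used
disp(nodesNeeded(20))   % 2 nodes, but only 1 core used on the second
disp(fullPool(2))       % 39, i.e. 40 tasks on 2 nodes
```

On a cluster with a different core count per node, only `coresPerNode` would need to change.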

The second argument in the call to c.batch(), 1 in this example, is the number of output arguments expected from the function to be called. A function that returns no output arguments needs 0 here instead.

The curly brackets {90, 5} in the example contain the input arguments for the function to be called, in this example nLoopIters=90 and sleepTime=5.
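As a hedged illustration of how the output count and the input cell array map onto a call, the sketch below runs the built-in `size` function as a batch job with two outputs. It assumes that `parcluster` without arguments returns a profile that can run jobs as-is; on the cluster you would use the configured `c` from the example above instead.

```matlab
% Sketch: how the arguments to batch map onto a function call.
% Assumption: the default cluster profile can run jobs as-is.
c = parcluster;

% Equivalent to [r, k] = size(zeros(3, 4)) executed on a worker:
% 2 output arguments, one input argument in the cell array.
job = batch(c, @size, 2, {zeros(3, 4)});
wait(job);

% fetchOutputs returns a cell array with one column per output.
out = fetchOutputs(job);
nRows = out{1};
nCols = out{2};
fprintf('rows = %d, cols = %d\n', nRows, nCols);

delete(job);   % remove the finished job's data from job storage
```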

For jobs using several nodes (in this case 2) you may modify the call to:

    >> configCluster
        [1] rackham
        [2] snowy
    Select a cluster [1-2]: 1
    >>
    >> c = parcluster('rackham');   % on Bianca 'bianca R<version>'
    >> c.AdditionalProperties.AccountName = 'snic2021-X-YYY';
    >> c.AdditionalProperties.QueueName = 'node';
    >> c.AdditionalProperties.WallTime = '00:10:00';
    >> c.saveProfile
    >> job = c.batch(@parallel_example_hvy, 1, {1000, 1000000}, 'pool', 39)   % 31 on Bianca or Snowy
    >> job.wait
    >> job.fetchOutputs{:}

where parallel_example_hvy.m is a file with the following matlab function:

    function cmdout = parallel_example_hvy(nLoopIters, sleepTime)
      t0 = tic;
      ml = 'module list';
      [status, cmdout] = system(ml);
      parfor idx = 1:nLoopIters
        A(idx) = idx;
        for foo = 1:nLoopIters*sleepTime
          A(idx) = A(idx) + A(idx);
          A(idx) = A(idx)/3;
        end
      end

To see the screen output from jobs, use job.Tasks.Diary. Output from the submitted function is fetched with fetchOutputs().
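A minimal sketch of inspecting a finished job follows. It submits the built-in `disp` with zero output arguments, so the only result is screen output. The message text is made up for the example, and `parcluster` without arguments is assumed to return a profile that can run jobs as-is (substitute your configured `c` on the cluster).

```matlab
% Sketch: reading screen output from a batch job.
% Assumption: the default cluster profile can run jobs as-is.
c = parcluster;

% disp returns nothing, so the number of outputs is 0.
job = batch(c, @disp, 0, {'hello from the worker'});
wait(job);

state = job.State;                % 'finished' once the job has run
diaryText = job.Tasks(1).Diary;   % captured screen output of the first task
fprintf('State: %s\nDiary: %s\n', state, diaryText);

delete(job);   % remove the finished job's data from job storage
```

For a pool job submitted with the 'pool' option, the first task is the one that runs the submitted function, so its diary is usually the one of interest.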

For more information about Matlab's Distributed Computing features please see Matlab's HPC Portal.

### Using GPU

Running MATLAB with GPU is, as of now, only possible on the Snowy cluster. Uppsala University affiliated staff and students with allocation on Snowy can use this resource.

Start an interactive session. For one CPU core and one GPU:

interactive -A <project> -p core -N 1 -M snowy --gres=gpu:1 --gpus-per-node=1 -t 3:00:00

For more CPU cores or a full node:

interactive -A <project> -p core -N 1 -n 16 -M snowy --gres=gpu:1 --gpus-per-node=1 -t 3:00:00

Note that the wall time ("-t") should be set to *more than* one hour so that the job is not automatically placed in the "devel" or "devcore" queue, which is not allowed for GPU jobs. Also see the GPU guide for Snowy, "Using the GPU nodes on Snowy".

Load the MATLAB module and start MATLAB as usual (with &) in the new session. Then test whether the GPU device is found by typing:

gpuDevice

and/or

gpuDeviceCount
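Once a device is found, a short computation can confirm that it is usable. The sketch below is guarded so that it also runs, without doing any GPU work, in a session where no device is visible; the matrix size is arbitrary.

```matlab
% Sketch: check for a GPU and run a small computation on it.
nGpus = gpuDeviceCount;

if nGpus > 0
    g = gpuDevice;                 % query the currently selected device
    fprintf('Using %s (%.1f GB)\n', g.Name, g.TotalMemory / 1e9);

    A = gpuArray(rand(1000));      % copy a matrix to GPU memory
    B = A * A;                     % the multiplication runs on the GPU
    C = gather(B);                 % copy the result back to host memory
    fprintf('Result is a %d-by-%d %s array\n', size(C, 1), size(C, 2), class(C));
else
    disp('No GPU device found in this session.');
end
```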