Julia user guide
JULIA INSTALLATIONS
There is no system-installed Julia on the clusters, so you need to load Julia through the module system. Different versions of Julia are available as modules on Rackham, Snowy, and Bianca, and a number of preinstalled packages are available through these modules.
At the time of writing we have the following modules:
[user@rackham1 ~]$ module available julia
------------------------------------------------------ julia: ------------------------------------------------------
Versions:
        julia/1.0.5_LTS
        julia/1.1.1
        julia/1.4.2
        julia/1.6.1
        julia/1.6.3
        julia/1.7.2 (Default)
To load a specific version of Julia into your environment, just type e.g.
$ module load julia/1.6.1
Doing:
$ module load julia
will load the default version (1.7.2), which is usually the latest installed version.
We strongly recommend that you always specify a particular version. This makes your work reproducible, which is essential in research!
You can run a Julia script from the shell with:
$ julia example_script.jl
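For instance, a minimal example_script.jl (the content here is just a hypothetical illustration) could look like:

# example_script.jl -- a minimal Julia script
println("Hello from Julia!")
println("2 + 2 = ", 2 + 2)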
You start a Julia session by typing in the shell:
$ julia
The julia prompt looks like this:
julia>
Exit with <Ctrl-D> or 'exit()'.
INTRODUCTION
According to https://julialang.org/, Julia is:
- Fast
- Dynamic
- Reproducible
- Composable
- General
- Open source
Official documentation for each version (e.g. 1.6.3) is available at https://docs.julialang.org/.
PACKAGES
Some packages are preinstalled, which means that they are also available on Bianca. These include:
- "BenchmarkTools"
- "CSV"
- "CUDA"
- "MPI"
- "Distributed"
- "IJulia"
- "Plots"
- "Gadfly"
- "DataFrames"
- "DistributedArrays"
- "PlotlyJS"
+ all "standard" libraries.
This list will be extended as you, the users, request more packages.
You can check the current "central library" by typing in the Julia shell:
using Pkg
Pkg.activate(DEPOT_PATH[2]*"/environments/v1.7");   # change version accordingly
Pkg.status()
Pkg.activate(DEPOT_PATH[1]*"/environments/v1.7");   # to return to user library
Packages are imported or loaded by the commands ``import`` and ``using``, respectively. The difference is shown here. Or briefly:
- To use module functions, use ``import Module`` to import the module, and ``Module.fn(x)`` to call its functions.
- Alternatively, ``using Module`` will import all exported ``Module`` functions into the current namespace.
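As a short illustration, using the standard library Statistics (which ships with Julia):

import Statistics             # functions must be called with a qualified name
Statistics.mean([1, 2, 3])    # works
# mean([1, 2, 3])             # would fail: mean is not in scope

using Statistics              # exported names are brought into the namespace
mean([1, 2, 3])               # now works unqualified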
HOW TO INSTALL PERSONAL PACKAGES
To make sure that the package is not already installed, type in Julia:
using Pkg
Pkg.activate(DEPOT_PATH[2]*"/environments/v1.7");   # change version accordingly
Pkg.status()
To go back to your own personal packages:
Pkg.activate(DEPOT_PATH[1]*"/environments/v1.7");
Pkg.status()
You can load (with ``using``/``import``) ANY package from both the local and the central installation, irrespective of which environment you have activated. If a package exists in both, your own installation takes priority.
To install personal packages you type within Julia:
Pkg.add("<package_name>")
This installs the package under the path ~/.julia/packages/. You can then load it with ``using`` or ``import``:
using <package_name>
You can also activate the "package prompt" in Julia with ']':
(@v1.7) pkg> add <package name>
To install a specific version, specify it as <package name>@<X.Y.Z>.
After adding a package you may be asked to precompile or build it. Do so according to the instructions given on the screen. Otherwise, the first time you import or use the package, Julia may start a precompilation that takes from a few seconds up to several minutes.
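If you want to trigger these steps manually, the corresponding Pkg commands are (a sketch; replace the package name accordingly):

using Pkg
Pkg.build("<package_name>")   # run the package's build step, if it has one
Pkg.precompile()              # precompile all packages in the active environment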
Exit with <backspace>:
julia>
Own packages on Bianca
You can make the installation on Rackham and then use the wharf to copy it over to your ~/.julia/ directory on Bianca.
Otherwise, send an email to support@uppmax.uu.se and we'll help you.
Running IJulia from Jupyter notebook
As with Python, it is possible to run Julia in a notebook, i.e. in a web interface with support for inline figures and debugging. An easy way to do this is to also load the python module. In the shell:
$ module load julia/1.7.2
$ module load python/3.9.5
$ julia
In Julia:
using IJulia
notebook(dir="</path/to/work/dir/>")
A Firefox session will start with the Jupyter notebook interface.
HOW TO RUN PARALLEL JOBS
There are several packages available for Julia that let you run parallel jobs. Some of them are only able to run on one node, while others try to leverage several machines. You'll find an introduction here.
Threading
Threading divides up your work among a number of cores within a node. The threads share their memory. Below is an example from within Julia. First, in the shell type:
$ export JULIA_NUM_THREADS=4
$ julia
in Julia:
using Base.Threads

nthreads()            # number of threads available (4 here)

a = zeros(10)
@threads for i = 1:10
    a[i] = Threads.threadid()   # each element records which thread wrote it
end
a
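Since the threads share memory, you have to be careful when several threads write to the same variable. A minimal extra sketch (not part of the example above) that sums a vector by letting each thread accumulate into its own slot:

using Base.Threads

x = rand(1_000_000)
partial = zeros(nthreads())          # one accumulator per thread
@threads for i in eachindex(x)
    partial[threadid()] += x[i]      # no two threads write to the same slot
end
println(sum(partial) ≈ sum(x))       # true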
Distributed computing
Distributed processing uses individual processes with individual memory that communicate with each other. In this case, data movement and communication are explicit.
Julia supports various forms of distributed computing.
- A native master-worker system based on remote procedure calls: Distributed.jl
- MPI, through MPI.jl: a Julia wrapper for the MPI protocol, see further down.
- DistributedArrays.jl: distribute an array among workers (a short sketch follows below).
If choosing between distributed and MPI, distributed is easier to program, whereas MPI may be more suitable for multi-node applications.
For more detailed information, please consult the manual for distributed computing and Julia MPI.
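As a small sketch of DistributedArrays.jl (assuming the package is available, as listed above), an array can be spread over the workers like this:

using Distributed
addprocs(4)                          # start 4 workers
@everywhere using DistributedArrays  # the package must be loaded on all workers

D = drand(1000, 1000)                # a distributed 1000x1000 matrix of random numbers
println(sum(D))                      # reductions are computed on the pieces where they live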
Master-Worker model
We need to launch Julia with
$ julia -p 4
then inside Julia you can check
nprocs()
workers()
which should print 5 and [2,3,4,5]. Why 5, you ask? Because *"worker 1"* is the *"boss"*. And bosses don't work.
As you can see, you can run distributed computing directly from the julia shell.
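For example, a small sketch of spreading independent function calls over the workers with pmap (assuming Julia was started with julia -p 4 as above):

using Distributed

@everywhere function slow_square(x)  # define the function on all workers
    sleep(1)
    return x^2
end

# pmap hands out the calls to the available workers
result = pmap(slow_square, 1:8)
println(result)                      # [1, 4, 9, 16, 25, 36, 49, 64]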
Batch example
Julia script hello_world_distributed.jl:
using Distributed

# launch worker processes
num_cores = parse(Int, ENV["SLURM_CPUS_PER_TASK"])
addprocs(19)

println("Number of cores: ", nprocs())
println("Number of workers: ", nworkers())

# each worker gets its id, process id and hostname
for i in workers()
    id, pid, host = fetch(@spawnat i (myid(), getpid(), gethostname()))
    println(id, " ", pid, " ", host)
end

# remove the workers
for i in workers()
    rmprocs(i)
end

Batch script job_distributed.slurm:
#!/bin/bash
#SBATCH -A <proj>
#SBATCH -p devel
#SBATCH --job-name=distrib_jl     # create a short name for your job
#SBATCH --nodes=1                 # node count
#SBATCH --ntasks=20               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1         # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:01:00           # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin         # send email when job begins
#SBATCH --mail-type=end           # send email when job ends
#SBATCH --mail-user=<email>

module load julia/1.7.2

julia hello_world_distributed.jl
Put job in queue:
$ sbatch job_distributed.slurm
Interactive example
$ salloc -A <proj> -p node -N 1 -n 10 -t 1:0:0
$ julia hello_world_distributed.jl
MPI
Remember that you will also have to load the openmpi module before starting Julia, so that the MPI header files can be found ("module load gcc/x.x.x openmpi/y.y.y"). Because of how MPI works, we need to explicitly write our code into a file, juliaMPI.jl:
import MPI
using Printf          # needed for @printf

MPI.Init()

comm = MPI.COMM_WORLD
MPI.Barrier(comm)

root = 0
r = MPI.Comm_rank(comm)

# sum the ranks of all processes onto the root process
sr = MPI.Reduce(r, MPI.SUM, root, comm)

if MPI.Comm_rank(comm) == root
    @printf("sum of ranks: %s\n", sr)
end

MPI.Finalize()
You can execute your code in the normal way with:

$ mpirun -np 3 julia juliaMPI.jl
A batch script, job_MPI.slurm, should load the gcc and openmpi modules matching your Julia version.

For julia/1.6.3 and earlier:

$ module load gcc/9.3.0 openmpi/3.1.5

For julia/1.7.2:

$ module load gcc/10.3.0 openmpi/3.1.6
#!/bin/bash
#SBATCH -A <proj>
#SBATCH -p devel
#SBATCH --job-name=MPI_jl         # create a short name for your job
#SBATCH --nodes=1                 # node count
#SBATCH --ntasks=20               # total number of tasks across all nodes
#SBATCH --cpus-per-task=1         # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:05:00           # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin         # send email when job begins
#SBATCH --mail-type=end           # send email when job ends
#SBATCH --mail-user=<email>

module load julia/1.7.2
module load gcc/10.3.0 openmpi/3.1.6

mpirun -n 20 julia juliaMPI.jl
See the MPI.jl examples for more inspiration!
GPU
Example Julia script, juliaCUDA.jl:
using CUDA, Test

N = 2^20
x_d = CUDA.fill(1.0f0, N)    # a vector of 1.0 (Float32) stored on the GPU
y_d = CUDA.fill(2.0f0, N)    # a vector of 2.0 (Float32) stored on the GPU

y_d .+= x_d                  # broadcasted addition runs on the GPU

@test all(Array(y_d) .== 3.0f0)
println("Success")
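Before submitting a longer job it can be useful to check that Julia actually sees a GPU. A minimal extra sketch (assuming you are on a GPU node with the CUDA package available):

using CUDA

CUDA.functional() || error("no usable GPU found")
CUDA.versioninfo()                   # driver, toolkit and device information
println(CUDA.name(CUDA.device()))    # name of the current GPU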
Batch script juliaGPU.slurm, note settings for Bianca vs. Snowy:
#!/bin/bash
#SBATCH -A <proj-id>
#SBATCH -M <snowy OR bianca>
#SBATCH -p node
#SBATCH -C gpu                    # NB: only for Bianca
#SBATCH -N 1
#SBATCH --job-name=juliaGPU       # create a short name for your job
#SBATCH --gpus-per-node=<1 OR 2>  # number of gpus per node (Bianca 2, Snowy 1)
#SBATCH --time=00:15:00           # total run time limit (HH:MM:SS)
#SBATCH --qos=short               # if test run t<15 min
#SBATCH --mail-type=begin         # send email when job begins
#SBATCH --mail-type=end           # send email when job ends
#SBATCH --mail-user=<email>

module purge
module load julia/1.7.2           # system CUDA works as of today

julia juliaCUDA.jl
Put job in queue:
$ sbatch juliaGPU.slurm
Interactive session with GPU
On Snowy, getting 1 CPU and 1 GPU:
$ interactive -A <proj> -n 1 -M snowy --gres=gpu:1 -t 3:00:00
On Bianca, getting 2 CPUs and 1 GPU:
$ interactive -A <proj> -n 2 -C gpu --gres=gpu:1 -t 01:10:00
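Once the interactive session has started on the GPU node, load Julia and run your script as usual (the module version here matches the batch example above):

$ module load julia/1.7.2
$ julia juliaCUDA.jl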