Julia user guide

Julia installations

There is no system-installed Julia on the clusters, so you need to load Julia via the module system. Different versions of Julia are available on Rackham, Snowy, and Bianca, and some installed packages are available via the module as well.

At the time of writing we have the following modules:

[user@rackham1 ~]$ module available julia
------------------------------------------------------
julia:
------------------------------------------------------
Versions:
        julia/1.0.5_LTS
        julia/1.1.1
        julia/1.4.2
        julia/1.6.1
        julia/1.6.3 
        julia/1.7.2 (Default)

To load a specific version of Julia into your environment, just type e.g.

        $ module load julia/1.6.1

Doing:

        $ module load julia

will give you the default version (1.7.2), which is usually the latest one.

We strongly suggest that you always specify a certain version. This makes your work reproducible, which is essential in research!

You can run a Julia script from the shell by typing:

        $ julia example_script.jl

You start a Julia session by typing in the shell:

        $ julia

The Julia prompt looks like this:

julia> 

Exit with <Ctrl-D> or 'exit()'. 

INTRODUCTION

According to https://julialang.org/, Julia is:

  • Fast
  • Dynamic
  • Reproducible
  • Composable
  • General
  • Open source

See also:

  • Documentation for version 1.6.3
  • Julia discussions

PACKAGES

Some packages are preinstalled, which means that they are also available on Bianca. These include:

  1.   "BenchmarkTools"
  2.   "CSV"
  3.   "CUDA"
  4.   "MPI"
  5.   "Distributed"
  6.   "IJulia"
  7.   "Plots"
  8.   "Gadfly"
  9.   "DataFrames"
  10.   "DistributedArrays"
  11.   "PlotlyJS"

+ all "standard" libraries.

This list will be extended as you, the users, request more packages.

You can inspect the current "central library" by typing, in the Julia shell:

      using Pkg
      Pkg.activate(DEPOT_PATH[2]*"/environments/v1.7");     #change version accordingly
      Pkg.status()
      Pkg.activate(DEPOT_PATH[1]*"/environments/v1.7");     #to return to user library

Packages are imported or loaded with the commands ``import`` and ``using``, respectively. The difference is explained in the Julia documentation. Briefly:

  • To use module functions, use import Module to import the module, and Module.fn(x) to use the functions.
  • Alternatively, using Module will import all exported Module functions into the current namespace.
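For example, with the standard library module ``Statistics``:

```julia
import Statistics            # only the module name comes into scope
Statistics.mean([1, 2, 3])   # functions must be qualified: 2.0

using Statistics             # exported names (mean, std, ...) come into scope
mean([1, 2, 3])              # 2.0
```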

HOW TO INSTALL PERSONAL PACKAGES

To make sure that the package is not already installed, type in Julia:

      using Pkg
      Pkg.activate(DEPOT_PATH[2]*"/environments/v1.7");  #change version accordingly
      Pkg.status()

To go back to your own personal packages:

      Pkg.activate(DEPOT_PATH[1]*"/environments/v1.7");
      Pkg.status()

You can load (``using``/``import``) any package from both the local and the central installation, irrespective of which environment you have activated. However, if a local and a central package have the same name, your local package takes priority.

To install personal packages you type within Julia:

      Pkg.add("<package_name>")

This will install the package under the path ~/.julia/packages/. Then you can load it with ``using``/``import``:

      using <package_name>

You can also activate a "package prompt" in Julia with ']':

(@v1.7) pkg> add <package name>

To install a specific version, specify it as <package name>@<X.Y.Z>.
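The same can be done non-interactively with the Pkg API; the package name and version below are placeholders for illustration only:

```julia
using Pkg
# install a specific version of a package
# ("Example" and "0.5.3" are hypothetical placeholders)
Pkg.add(PackageSpec(name="Example", version="0.5.3"))
```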

After adding, you may be asked to precompile or build; do so according to the instructions on the screen. Otherwise, the first time you import or use the package, Julia may start a precompilation that takes anywhere from a few seconds to several minutes.

Exit the package prompt with <backspace>:

julia> 

Own packages on Bianca

You can make an installation on Rackham and then use the wharf to copy it over to your ~/.julia/ directory.

Otherwise, send an email to support@uppmax.uu.se and we'll help you.

Running IJulia from Jupyter notebook

As with Python, it is possible to run Julia in a notebook, i.e. in a web interface with support for inline figures and debugging. An easy way to do this is to load a Python module as well. In the shell:

$ module load julia/1.7.2
$ module load python/3.9.5
$ julia

In Julia:

using IJulia
notebook(dir="</path/to/work/dir/>")

A Firefox session will start with the Jupyter notebook interface.

HOW TO RUN PARALLEL JOBS

There are several packages available for Julia that let you run parallel jobs. Some of them can only run on a single node, while others can leverage several machines. This section gives a short introduction to the most common approaches.

Threading

Threading divides your work among a number of cores within a node, and the threads share their memory. Below is an example. First, in the shell, type:

        $ export JULIA_NUM_THREADS=4
        $ julia

In Julia:

using Base.Threads
nthreads()
a = zeros(10)
@threads for i = 1:10
    a[i] = Threads.threadid()
end
a
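As a sketch of an actual threaded computation, each thread can accumulate into its own slot of an array, which avoids a data race on a single shared variable:

```julia
using Base.Threads

# partial sums of 1/i^2; one slot per thread avoids a data race
partial = zeros(nthreads())
@threads for i in 1:1_000_000
    partial[threadid()] += 1 / i^2
end
println(sum(partial))    # ≈ π²/6 ≈ 1.6449
```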

Distributed computing

Distributed processing uses individual processes with individual memory, that communicate with each other. In this case, data movement and communication is explicit.
Julia supports various forms of distributed computing.

  •     A native master-worker system based on remote procedure calls: Distributed.jl
  •     MPI through MPI.jl : a Julia wrapper for the MPI protocol, see further down.
  •     DistributedArrays.jl: distribute an array among workers

If you are choosing between Distributed and MPI, Distributed is easier to program, whereas MPI may be more suitable for multi-node applications.

For more detailed information, please consult the Julia manual sections on distributed computing and MPI.

Master-Worker model

We need to launch Julia with

$ julia -p 4

then inside Julia you can check

nprocs()
workers()

which should print 5 and [2,3,4,5]. Why 5, you ask? Because *"worker 1"* is the *"boss"*. And bosses don't work.

As you can see, you can run distributed computing directly from the julia shell. 
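For instance, after starting ``julia -p 4`` a loop can be reduced across the workers with ``@distributed``, or a function mapped over them with ``pmap`` (a minimal sketch):

```julia
using Distributed

# parallel reduction: each worker sums part of the range
s = @distributed (+) for i in 1:100
    i
end
println(s)              # 5050

# pmap runs the function on the workers, one input at a time
hosts = pmap(i -> gethostname(), 1:nworkers())
println(unique(hosts))
```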

Batch example

Julia script hello_world_distributed.jl:

using Distributed
# launch worker processes: one per Slurm task, minus one for the master
num_tasks = parse(Int, ENV["SLURM_NTASKS"])
addprocs(num_tasks - 1)
println("Number of cores: ", nprocs())
println("Number of workers: ", nworkers())
# each worker gets its id, process id and hostname
for i in workers()
    id, pid, host = fetch(@spawnat i (myid(), getpid(), gethostname()))
    println(id, " " , pid, " ", host)
end
# remove the workers
for i in workers()
    rmprocs(i)
end

Batch script job_distributed.slurm:

#!/bin/bash
#SBATCH -A <proj>
#SBATCH -p devel
#SBATCH --job-name=distrib_jl     # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=20              # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:01:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-user=<email>
module load julia/1.7.2
julia hello_world_distributed.jl

Put job in queue:

$ sbatch job_distributed.slurm

Interactive example 

$ salloc -A <proj> -p node -N 1 -n 10 -t 1:0:0 

$ julia hello_world_distributed.jl

MPI 

Remember that you will also have to load an openmpi module before starting Julia, so that the MPI header files can be found (with a command like "module load gcc/x.x.x openmpi/y.y.y"). Because of how MPI works, we need to explicitly write our code into a file, juliaMPI.jl:

import MPI
using Printf       # provides @printf
MPI.Init()
comm = MPI.COMM_WORLD
MPI.Barrier(comm)
root = 0
r = MPI.Comm_rank(comm)
sr = MPI.Reduce(r, MPI.SUM, root, comm)
if MPI.Comm_rank(comm) == root
    @printf("sum of ranks: %s\n", sr)
end
MPI.Finalize()

You can execute your code the normal way as

      $ mpirun -np 3 julia juliaMPI.jl

A batch script, job_MPI.slurm, should also load a compiler and MPI module matching your Julia version:

For julia/1.6.3 and earlier: 

      $ module load gcc/9.3.0 openmpi/3.1.5

For julia/1.7.2: 

      $ module load gcc/10.3.0 openmpi/3.1.6

#!/bin/bash
#SBATCH -A <proj>
#SBATCH -p devel
#SBATCH --job-name=MPI_jl        # create a short name for your job
#SBATCH --nodes=1                # node count
#SBATCH --ntasks=20              # total number of tasks across all nodes
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:05:00          # total run time limit (HH:MM:SS)
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-user=<email>
module load julia/1.7.2
module load gcc/10.3.0 openmpi/3.1.6
mpirun -n 20 julia juliaMPI.jl

See the MPI.jl examples for more input!

GPU

Example Julia script, juliaCUDA.jl:

using CUDA, Test
N = 2^20
x_d = CUDA.fill(1.0f0, N)
y_d = CUDA.fill(2.0f0, N)
y_d .+= x_d
@test all(Array(y_d) .== 3.0f0)
println("Success")

Batch script juliaGPU.slurm, note settings for Bianca vs. Snowy:

#!/bin/bash
#SBATCH -A <proj-id>
#SBATCH -M <snowy OR bianca>
#SBATCH -p node
#SBATCH -C gpu   #NB: Only for Bianca
#SBATCH -N 1
#SBATCH --job-name=juliaGPU         # create a short name for your job 
#SBATCH --gpus-per-node=<1 OR 2>             # number of gpus per node (Bianca 2, Snowy 1)
#SBATCH --time=00:15:00          # total run time limit (HH:MM:SS)
#SBATCH --qos=short              # if test run t<15 min
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-user=<email>
module purge
module load julia/1.7.2          # system CUDA works as of today
julia juliaCUDA.jl

Put job in queue:

$ sbatch juliaGPU.slurm

Interactive session with GPU

On Snowy, getting 1 CPU and 1 GPU:

$ interactive -A <proj> -n 1 -M snowy --gres=gpu:1  -t 3:00:00

On Bianca, getting 2 CPUs and 1 GPU:

$ interactive -A <proj> -n 2 -C gpu --gres=gpu:1 -t 01:10:00 
     

Last modified: 2022-06-09