Uppsala Multidisciplinary Center for Advanced Computational Science

Getting Started with Life Science Projects

You have submitted samples for sequencing, and the NGS facility has told you to apply for a project at UPPMAX to analyse the results. This page guides you through the process. It assumes you are not working with sensitive data; if you work with human data, follow the separate instructions for sensitive data instead.

To skip the background information, go straight to the instructions under "Getting your projects".

Background:

UPPMAX is a supercomputing facility hosted by Uppsala University and is a part of the Swedish National Infrastructure for Computing (SNIC). As a SNIC center, we provide computational resources for a wide variety of researchers all over Sweden. Access to our resources is granted through the SNIC project application portal, SUPR.

In order to do any kind of sequence analysis, you need two resources:

  1. Computations. It takes time for a CPU to run programs. Computational resources are measured in core-hours, and allocations are granted in core-hours per month.
    • For example, if you have a hundred samples and it takes a single core a week to run a pipeline on one sample, then the total core-hours needed is 100 samples * 7 days/sample * 24 hours/day * 1 core = 16800 core-hours. If you're planning to do this analysis over the course of 6 months, then you'll need a project that provides about 16800/6 = 2800 core-hours/month.
    • Our current SNIC-funded compute cluster is called Rackham.
    • If a project exceeds its allocation of CPU time, you can keep working but at a lower priority in the queue. We call this the bonus queue.
  2. Data storage. It takes disk space to store sequences and related data. Space is usually measured in GB.
    • The sequencing facility should be able to tell you roughly how much space the raw data will take. When you work with the data, it usually expands to between 1.5 and 3 times its raw size, and you will need to account for this when you apply for a storage project.
    • The storage system attached to Rackham is named Crex.
    • If a project exceeds its storage quota, no one can write any more data to its directory.
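
The core-hour arithmetic in the example above can be sketched as a small Python calculation. This is only an illustration: the function name and its parameters are hypothetical and not part of any UPPMAX or SUPR tool, and the input values are the ones used in the example.

```python
HOURS_PER_DAY = 24


def core_hours_per_month(samples, days_per_sample, cores_per_sample, months):
    """Return (total core-hours, core-hours per month) for a batch of samples.

    Hypothetical helper for illustration only.
    """
    # Total core-hours: every sample occupies `cores_per_sample` cores
    # for `days_per_sample` days.
    total = samples * days_per_sample * HOURS_PER_DAY * cores_per_sample
    # Spread the total evenly over the planned analysis period.
    return total, total / months


# Values from the example: 100 samples, one week each on a single core,
# analysed over 6 months.
total, monthly = core_hours_per_month(
    samples=100, days_per_sample=7, cores_per_sample=1, months=6
)
print(f"Total: {total} core-hours, about {monthly:.0f} core-hours/month")
```

If your pipeline uses several cores per sample, raise cores_per_sample accordingly; the total scales linearly with it.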

On Rackham, we have divided projects into two types: Compute Projects and Storage Projects. 

Compute Projects: SNIC SMALL, SNIC MEDIUM, SNIC LARGE. These are called "SNAC" projects while in the proposal phase. Each also comes with 128 GB of storage, which is enough for many projects.

  • SMALL: Anyone can apply. Default limit of 2000 core-hours/month, can go up to 5000 upon request.
  • MEDIUM: The PI must be a permanently employed researcher. Up to 100,000 core-hours/month. Subject to a technical evaluation.
  • LARGE: For very large projects involving large groups or multiple groups. Proposals are accepted twice per year. 

Storage Projects: For molecular biology/bioinformatics, you should use SciLifeLab Storage. This project type allocates space that is reserved for life science (esp. molecular biology) research. General research projects that fall outside this category should apply for our other storage project type.

Getting your projects

Below are step-by-step instructions for getting a project for data-intensive life science research. As described above, you'll need a compute project and probably a storage project. It is best to submit both proposals at once.

Submitting a proposal for a storage project

  1. Figure out how much raw data you're going to get, in GB.
    • If you're going to work from existing databases, this is relatively straightforward.
    • If an NGS platform is producing data for you, they can provide an estimate.
  2. Estimate the "expansion factor", i.e. how much additional data you'll produce when analyzing the raw data. This number is usually 1.5x-3x, sometimes more.
  3. Calculate a final estimate of your total storage needs. This is "GB of raw data" times "expansion factor".
  4. Go to SUPR. Create an account if you need one. Log in.
  5. Go to the SciLifeLab Storage round. Create a new proposal.
  6. Complete the proposal and submit.
    1. Project Title should be the topic of your activity.
    2. Edit Basic Information.
      1. Abstract should summarise your research plan.
      2. Resource Usage should describe the data you're going to store. Show how you estimated your projected needs.
    3. Add co-investigators (if any).
    4. If someone other than the PI needs control over the project, assign a co-investigator the role of proxy.
    5. Add the Crex resource to the proposal and set the Requested Capacity to your total storage needs. You may ignore the other fields.
    6. Submit the Proposal. 
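
Steps 1 to 3 above (raw data size times expansion factor) can be sketched in Python as follows. The function name is hypothetical and the 500 GB of raw data is an assumed figure chosen only for illustration; replace it with the estimate from your sequencing facility.

```python
def storage_request_gb(raw_gb, expansion_factor):
    """Estimate total storage needs: raw data size times expansion factor.

    Hypothetical helper for illustration only.
    """
    if expansion_factor < 1:
        # The total includes the raw data itself, so the factor cannot be below 1.
        raise ValueError("expansion_factor must be at least 1")
    return raw_gb * expansion_factor


raw_gb = 500  # assumed raw data size in GB, for illustration only

# Bracket the request with the usual range of expansion factors (1.5x to 3x).
low = storage_request_gb(raw_gb, 1.5)
high = storage_request_gb(raw_gb, 3.0)
print(f"Request between {low:.0f} GB and {high:.0f} GB")
```

Showing both ends of the range in the Resource Usage field is one way to document how the estimate was made.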

Submitting a proposal for a compute project

  1. Figure out how much computation time you'll need, in core-hours per month. 
    • A rule of thumb is that you consume on average 1000 core-hours per month for every TB of data (1 TB = 1000 GB) in your project.
    • If you've worked on similar data before, a more accurate estimate is possible. Calculate the total core-hour usage for analysing every sample, then divide by the number of months you expect to be running these analyses.
  2. Go to SUPR. Log in.
  3. Select a round.
    • If you need less than 10,000 core-hours per month, choose SNAC SMALL UPPMAX.
    • If you need between 10,000 and 100,000 core-hours per month, choose SNAC MEDIUM.
  4. Create a new proposal. Complete and submit.
    1. Project Title should be the topic of your activity.
    2. Edit Basic Information.
      1. Abstract should summarise your research plan.
      2. Resource Usage should describe the computations you're going to do, which software you will use, etc. Show how you estimated your projected needs.
    3. Add co-investigators (if any).
    4. If someone other than the PI needs control over the project, assign a co-investigator the role of proxy.
    5. (MEDIUM only) Add the Rackham resource to the proposal. Set the Requested Capacity to your compute needs. You may ignore the other fields. 128 GB of storage on Crex will be allocated for you automatically.
    6. Submit the Proposal. 
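
The rule of thumb in step 1 and the round selection in step 3 can be sketched in Python as below. The function names are hypothetical, the 1500 GB input is an assumed figure, and the thresholds are the ones given in step 3. The steps do not say what to do above 100,000 core-hours per month, so returning the LARGE round in that case is an assumption based on the background section.

```python
CORE_HOURS_PER_TB_PER_MONTH = 1000  # rule of thumb from step 1
GB_PER_TB = 1000                    # 1 TB = 1000 GB, as stated in step 1


def rule_of_thumb_core_hours(data_gb):
    """Rough monthly core-hour need for a given amount of project data in GB."""
    return CORE_HOURS_PER_TB_PER_MONTH * data_gb / GB_PER_TB


def suggest_round(core_hours_per_month):
    """Pick a round using the thresholds from step 3 (hypothetical helper)."""
    if core_hours_per_month < 10_000:
        return "SNAC SMALL UPPMAX"
    if core_hours_per_month <= 100_000:
        return "SNAC MEDIUM"
    # Assumption: the steps do not cover this case; the background section
    # describes LARGE as the type for very large projects.
    return "SNAC LARGE"


data_gb = 1500  # assumed total project data in GB, for illustration only
monthly = rule_of_thumb_core_hours(data_gb)
print(f"{monthly:.0f} core-hours/month -> {suggest_round(monthly)}")
```

If you have measured usage from similar data, pass that figure to suggest_round instead of the rule-of-thumb estimate.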

After you have submitted BOTH proposals, a decision is typically made within a few days to a week. Feel free to contact support@uppmax.uu.se with questions.