Writing nf-core modules and subworkflows

If you decide to upload a module to nf-core/modules, it will become available to all nf-core pipelines and to everyone within the Nextflow community! See modules/ for examples.

Writing a new module reference

Before you start

Please check that the module you wish to add isn’t already on nf-core/modules:

If the module doesn’t exist on nf-core/modules:

  • Please create a new issue before adding it
  • Set an appropriate subject for the issue e.g. new module: fastqc
  • Add yourself to the Assignees so we can track who is working on the module

New module workflow

We have implemented a number of commands in the nf-core/tools package to make it incredibly easy for you to create and contribute your own modules to nf-core/modules.

  1. Install any of Docker, Singularity or Conda
Single step conda installation

If you use the conda package manager, you can set up a new environment and install all dependencies for the new module workflow in one step with:

conda create -n nf-core -c bioconda "nextflow>=21.04.0" "nf-core>=2.7" nf-test
conda activate nf-core

and proceed with Step 5.

  2. Install Nextflow (>=21.04.0)

  3. Install the latest version of nf-core/tools (>=2.7)

  4. Install nf-test

  5. Fork and clone the nf-core/modules repo locally

  6. Set up pre-commit (it comes packaged with nf-core/tools; watch the pre-commit bytesize talk if you want to know more about it) to ensure that your code is linted and formatted correctly before you commit it to the repository

    pre-commit install
  7. Set up git on your computer by adding a new git remote of the main nf-core git repo called upstream

    git remote add upstream https://github.com/nf-core/modules.git

    Make a new branch for your module and check it out

    git checkout -b fastqc
  8. Create a module using the nf-core DSL2 module template:

    nf-core modules create

    All of the files required to add the module to nf-core/modules will be created/edited in the appropriate places. There are at most 3 files to modify:

    1. ./modules/nf-core/fastqc/main.nf

      This is the main script containing the process definition for the module. You will see an extensive number of TODO statements to help guide you to fill in the appropriate sections and to ensure that you adhere to the guidelines we have set for module submissions. A minimal, illustrative sketch of what such a process definition typically looks like is shown at the end of this list.

    2. ./modules/nf-core/fastqc/meta.yml

      This file will be used to store general information about the module and author details - the majority of which will already be auto-filled. However, you will need to add a brief description of the files defined in the input and output section of the main script, since these will be unique to each module. We check its formatting and validity against a JSON schema during linting (and in the pre-commit hook).

    3. ./modules/nf-core/fastqc/tests/main.nf.test

      Every module MUST have a test workflow. This file will define one or more Nextflow workflow definitions that will be used to unit test the output files created by the module. By default, one workflow definition will be added but please feel free to add as many as possible so we can ensure that the module works on different data types / parameters e.g. separate workflow for single-end and paired-end data.

      Minimal test data required for your module may already exist within the nf-core/modules repository, in which case you may just have to change a couple of paths in this file - see the Test data section for more info and guidelines for adding new standardised data if required.

      Refer to the section on writing nf-test tests for more information on how to write nf-tests.
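
      As a rough orientation only, a trimmed-down main.nf for a hypothetical fastqc module tends to follow the pattern sketched below. The container, tool options and output patterns here are illustrative placeholders; the template generated by nf-core modules create is the authoritative starting point.

      process FASTQC {
          tag "$meta.id"
          label 'process_medium'

          // Software environment: placeholders, normally filled in by the template
          conda "${moduleDir}/environment.yml"
          container "biocontainers/fastqc:0.12.1--hdfd78af_0"

          input:
          tuple val(meta), path(reads)

          output:
          tuple val(meta), path("*.html"), emit: html
          tuple val(meta), path("*.zip") , emit: zip
          path "versions.yml"            , emit: versions

          script:
          def args = task.ext.args ?: ''
          """
          fastqc $args --threads $task.cpus $reads

          cat <<-END_VERSIONS > versions.yml
          "${task.process}":
              fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
          END_VERSIONS
          """
      }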

  9. Create a snapshot of the tests with nf-core modules test (see the example command below the note)

    Note

    See the nf-test docs if you would like to run the tests manually.
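
    For example, assuming the fastqc module from the earlier steps, you would typically pass the module name to the nf-core command, or point nf-test directly at the module's test file (exact options can differ between tool versions):

    nf-core modules test fastqc

    nf-test test modules/nf-core/fastqc/tests/main.nf.test --profile docker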

  10. Check that the new module you’ve added follows the module specifications

  11. Lint the module locally to check that it adheres to nf-core guidelines before submission

    nf-core modules lint
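
    For example, to lint only the hypothetical fastqc module from the earlier steps rather than the whole repository:

    nf-core modules lint fastqc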

  12. Once ready, the code can be pushed and a pull request (PR) created

    You can pull upstream changes into this branch on a regular basis, and it is recommended to do so before pushing and creating a pull request. Rather than merging changes directly from upstream, the rebase strategy is recommended so that your changes are applied on top of the latest master branch from the nf-core repo. This can be performed as follows:

git pull --rebase upstream master

Once you are ready you can push the code and create a PR:

git push -u origin fastqc

Once the PR has been accepted you should delete the branch and check out master again.

git checkout master
git branch -d fastqc
New subworkflow workflow

The steps for contributing a new subworkflow mirror those for a module. Once the installation and setup steps above are complete:

  1. Set up git on your computer by adding a new git remote of the main nf-core git repo called upstream

    git remote add upstream https://github.com/nf-core/modules.git

    Make a new branch for your subworkflow and check it out

    git checkout -b bam_sort_stats_samtools
  2. Create a subworkflow using the nf-core DSL2 subworkflow template in the root of the clone of the nf-core/modules repository:

    nf-core subworkflows create

    All of the files required to add the subworkflow to nf-core/modules will be created/edited in the appropriate places. There are at most 3 files to modify:

    1. ./subworkflows/nf-core/bam_sort_stats_samtools/main.nf

      This is the main script containing the workflow definition for the subworkflow. You will see an extensive number of TODO statements to help guide you to fill in the appropriate sections and to ensure that you adhere to the guidelines we have set for subworkflow submissions. A minimal, illustrative sketch of a typical subworkflow definition is shown at the end of this list.

    2. ./subworkflows/nf-core/bam_sort_stats_samtools/meta.yml

      This file will be used to store general information about the subworkflow and author details. You will need to add a brief description of the files defined in the input and output section of the main script since these will be unique to each subworkflow.

    3. ./subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test

      Every subworkflow MUST have a test workflow. This file will define one or more Nextflow workflow definitions that will be used to unit test the output files created by the subworkflow. By default, one workflow definition will be added but please feel free to add as many as possible so we can ensure that the subworkflow works on different data types / parameters e.g. separate workflow for single-end and paired-end data.

      Minimal test data required for your subworkflow may already exist within the nf-core/modules repository, in which case you may just have to change a couple of paths in this file - see the Test data section for more info and guidelines for adding new standardised data if required.

      Refer to the section on writing nf-test tests for more information on how to write nf-tests.
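
      As a rough orientation only, a trimmed-down main.nf for a subworkflow wires existing nf-core modules together using take:, main: and emit: blocks and collects version information along the way. The module names, inputs and channels below are simplified placeholders; the template generated by nf-core subworkflows create is the authoritative starting point.

      include { SAMTOOLS_SORT  } from '../../../modules/nf-core/samtools/sort/main'
      include { SAMTOOLS_INDEX } from '../../../modules/nf-core/samtools/index/main'

      workflow BAM_SORT_STATS_SAMTOOLS {
          take:
          ch_bam // channel: [ val(meta), path(bam) ]

          main:
          ch_versions = Channel.empty()

          SAMTOOLS_SORT ( ch_bam ) // module inputs simplified for illustration
          ch_versions = ch_versions.mix(SAMTOOLS_SORT.out.versions.first())

          SAMTOOLS_INDEX ( SAMTOOLS_SORT.out.bam )
          ch_versions = ch_versions.mix(SAMTOOLS_INDEX.out.versions.first())

          emit:
          bam      = SAMTOOLS_SORT.out.bam  // channel: [ val(meta), path(bam) ]
          bai      = SAMTOOLS_INDEX.out.bai // channel: [ val(meta), path(bai) ]
          versions = ch_versions            // channel: [ path(versions.yml) ]
      }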

  3. Create a snapshot of the tests

    nf-core subworkflows test

    Note

    See the nf-test docs if you would like to run the tests manually.

  4. Check that the new subworkflow you’ve added follows the subworkflow specifications

  5. Lint the subworkflow locally to check that it adheres to nf-core guidelines before submission

    nf-core subworkflows lint

  6. Once ready, the code can be pushed and a pull request (PR) created

You can pull upstream changes into this branch on a regular basis, and it is recommended to do so before pushing and creating a pull request - see below. Rather than merging changes directly from upstream, the rebase strategy is recommended so that your changes are applied on top of the latest master branch from the nf-core repo. This can be performed as follows:

git pull --rebase upstream master

Once you are ready you can push the code and create a PR:

git push -u origin bam_sort_stats_samtools

Once the PR has been accepted you should delete the branch and check out master again.

git checkout master
git branch -d bam_sort_stats_samtools

Test data

In order to test that each component added to nf-core/modules is actually working, and to be able to track any changes to results files between component updates, we have set up a number of GitHub Actions CI tests that run each module on a minimal test dataset using Docker, Singularity and Conda.

Please adhere to the test-data specifications when adding new test data.

If a new test dataset is added to tests/config/test_data.config, check that the config name of the added file(s) follows the scheme of the entire file name with dots replaced with underscores.

For example, the nf-core/test-datasets file genomics/sarscov2/genome/genome.fasta is labelled genome_fasta, and genomics/sarscov2/genome/genome.fasta.fai is labelled genome_fasta_fai.
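
To illustrate the naming scheme, entries in tests/config/test_data.config are grouped per organism and data type, roughly as sketched below; the base URL variable and exact nesting are indicative, so check the existing file for the precise layout.

    params {
        test_data {
            'sarscov2' {
                'genome' {
                    genome_fasta     = "${params.test_data_base}/data/genomics/sarscov2/genome/genome.fasta"
                    genome_fasta_fai = "${params.test_data_base}/data/genomics/sarscov2/genome/genome.fasta.fai"
                }
            }
        }
    }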

Using a stub test when required test data is too big

If the module absolutely cannot run using tiny test data, it is possible to add a stub-run to the test.yml. In this case it is required to test the module using larger-scale data and to document how this is done. In addition, an extra script block labelled stub: must be added, and this block must create dummy versions of all expected output files as well as the versions.yml. An example is found in the ascat module, and a sketch of such a block is shown below.

In the test.yml, the -stub-run argument is specified, as well as the md5sums for each of the files created in the stub block. This causes the stub block to be activated when the unit test is run, for example:

nextflow run tests/modules/<nameofmodule> -entry test_<nameofmodule> -c tests/config/nextflow.config -stub-run
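
As a rough sketch, a stub: block for the hypothetical fastqc module used earlier might look like the following. The file names are placeholders and must match the files declared in the output: block of the process; a versions.yml must also be produced.

    stub:
    """
    # create empty dummy versions of every declared output file
    touch ${meta.id}.html
    touch ${meta.id}.zip

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
    END_VERSIONS
    """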

Using a stub test when required test data is too big

If the subworkflow absolutely cannot run using tiny test data, it is possible to add a stub-run to the test.yml. In this case it is required to test the subworkflow using larger-scale data and to document how this is done. In addition, an extra script block labelled stub: must be added, and this block must create dummy versions of all expected output files as well as the versions.yml. An example is found in the bam_sort_stats_samtools subworkflow.

In the test.yml, the -stub-run argument is specified, as well as the md5sums for each of the files created in the stub block. This causes the stub block to be activated when the unit test is run, for example:

nextflow run tests/subworkflows/nf-core/<nameofsubworkflow> -entry test_<nameofsubworkflow> -c tests/config/nextflow.config -stub-run

Uploading to nf-core/modules

When you are happy with your pull request, please select the Ready for Review label on the GitHub PR tab, and provided that everything adheres to the nf-core guidelines we will endeavour to approve your pull request as soon as possible. We also recommend requesting reviews from the nf-core/modules-team so that a core team of volunteers can try to review your PR as quickly as possible.

Once you are familiar with the module submission process, please consider joining the reviewing team by asking on the #modules Slack channel.

Writing tests

nf-core components are tested using nf-test. See the page on writing nf-test tests for more information and examples.
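
For orientation, a minimal tests/main.nf.test for the hypothetical fastqc module from earlier might look roughly as follows; the tags, test-data path and assertions are placeholders, so use the files generated by nf-core modules create and existing modules as the real reference.

    nextflow_process {

        name "Test Process FASTQC"
        script "../main.nf"
        process "FASTQC"

        tag "modules"
        tag "modules_nfcore"
        tag "fastqc"

        test("sarscov2 single-end reads") {

            when {
                process {
                    """
                    // placeholder input: [ meta map, list of reads ]
                    input[0] = [
                        [ id: 'test', single_end: true ],
                        [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ]
                    ]
                    """
                }
            }

            then {
                assertAll(
                    { assert process.success },
                    { assert snapshot(process.out.versions).match() }
                )
            }
        }
    }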

Publishing results

Results are published using Nextflow’s native publishDir directive, defined in the modules.config of a workflow (see here for an example).
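
For example, a pipeline's conf/modules.config typically contains entries along the following lines; the process selector and output path shown here are illustrative.

    process {
        withName: 'FASTQC' {
            publishDir = [
                path: { "${params.outdir}/fastqc" },
                mode: params.publish_dir_mode,
                saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
            ]
        }
    }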

Help

For further information or help, don’t hesitate to get in touch on the Slack #modules channel (you can join with this invite).