Create docker image¶
Docker helps us maintain our computational environments. Each of our Nextflow pipelines has a dedicated docker image in our lab, and all of the docker files should be available at dockerfile.
Dockerfile¶
To simplify image building, we can use conda to install most of the tools. We can collect the tools available on conda cloud into a conda.yml
file, which might look like this.
name: concordance-nf
channels:
- defaults
- bioconda
- r
- biobuilds
- conda-forge
dependencies:
- bwa=0.7.17
- sambamba=0.7.0
- samtools=1.9
- picard=2.20.6
- bcftools=1.9
- csvtk=0.18.2
- r=3.6.0
- r-ggplot2=3.1.1
- r-readr=1.3.1
- r-tidyverse=1.2.1
Then, write the Dockerfile
as below.
FROM continuumio/miniconda
MAINTAINER Katie Evans <kathrynevans2015@u.northwestern.edu>
COPY conda.yml .
RUN \
conda env update -n root -f conda.yml \
&& conda clean -a
# install other tools not available on conda cloud -- tidyverse sometimes needs to be installed here separately...
RUN apt-get update && apt-get install -y procps
RUN R -e "install.packages('roperators',dependencies=TRUE, repos='http://cran.us.r-project.org')"
Note
Put the conda.yml
and Dockerfile
under the same folder.
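For example, the folder for this image (the folder name concordance-nf-docker is just a placeholder) would contain:
concordance-nf-docker/
├── conda.yml
└── Dockerfile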
Build docker image¶
To build the docker image, you need Docker Desktop installed on your local machine. You should also sign up for dockerhub so that you can push the docker image to the cloud.
Go to the folder containing the conda.yml
and Dockerfile
, then run
docker build -t <dockerhub account>/<name of the image> . # don't forget the dot here
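For example, with a hypothetical dockerhub account mylab and image name concordance-nf, this would be:
docker build -t mylab/concordance-nf .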
You can use docker image ls
to list the images on your local machine.
Importantly, you have to check that the tools were installed successfully in your docker image. To test the docker image, run
docker run -ti <dockerhub account>/<name of the image> sh
The above command will drop you into a shell inside the container, where you can check the tools with their normal commands. Make sure all the tools you need have been installed appropriately.
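For example, inside the container you might spot-check a few of the tools listed in conda.yml by printing their versions:
samtools --version
bcftools --version
R --version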
Tag image with a version¶
There are sometimes issues with Nextflow thinking it has the latest docker image when it really doesn't. To avoid this issue, it is useful to tag each updated docker image with a version tag. Remember to update the docker call in Nextflow to use the new version!
docker image tag <dockerhub account>/<name of the image>:latest <dockerhub account>/<name of the image>:<version tag>
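For example, to tag the hypothetical mylab/concordance-nf image from above as version v1.0:
docker image tag mylab/concordance-nf:latest mylab/concordance-nf:v1.0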
Push docker image to dockerhub¶
After the image check, you are ready to push the image to dockerhub, which allows you to download the image whenever you need it.
docker push <dockerhub account>/<name of the image>:<version tag>
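If you have not yet logged in to dockerhub on this machine, run docker login first. Continuing with the hypothetical names from above:
docker login
docker push mylab/concordance-nf:v1.0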
Running Nextflow with docker¶
If running Nextflow locally, a docker container can be used with the following line (check out the documentation):
nextflow run <your script> -with-docker [docker image]
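For example, assuming the pipeline script is called main.nf and using the hypothetical versioned image from above:
nextflow run main.nf -with-docker mylab/concordance-nf:v1.0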
Alternatively, a docker container can be specified within the nextflow.config
file to avoid adding an extra parameter:
process.container = 'nextflow/examples:latest'
docker.enabled = true
// if on quest:
// singularity.enabled = true
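Following the versioning advice above, the same config with the hypothetical versioned image would look like:
process.container = 'mylab/concordance-nf:v1.0'
docker.enabled = true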
Important
When running Nextflow with a docker container on QUEST, it is necessary to replace docker
with singularity
in the configuration (although you still must build a docker container). You must also load singularity using module load singularity
before starting a run.
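A minimal sketch of a QUEST run, assuming singularity.enabled = true is set in nextflow.config and the same hypothetical script name as above:
module load singularity
nextflow run main.nf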
Caching singularity images on QUEST¶
To make the most out of using a shared cache directory for singularity on b1059, make sure to add this line to your ~/.bash_profile
before you run a pipeline for the first time (Note: this is not needed to USE a previously cached image, but only when you ADD a new one).
export SINGULARITY_CACHEDIR='/projects/b1059/singularity/'
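One way to add the line and reload your profile in the current session:
echo "export SINGULARITY_CACHEDIR='/projects/b1059/singularity/'" >> ~/.bash_profile
source ~/.bash_profile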