Combine GAFs on the Command Line

Container Technologies

Combine GAFs is provided as a Docker container.

A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.

There are two major containerization technologies: Docker and Singularity (Apptainer).

Docker containers can be run with either technology.

Combine GAFs using Docker

About Docker

  • Docker must be installed on the computer you wish to use for your analysis.
  • To run Docker you must have ‘root’ permissions (or use sudo).
  • By default, Docker runs all containers as ‘root’. For this reason most HPC systems do not allow Docker (see Singularity below).
  • Docker can be run on your local computer, a server, a cloud virtual machine, etc.
  • For more information on installing Docker on other systems see this tutorial: Installing Docker on your machine.

Getting the Combine GAFs container

The Combine GAFs tool is available as a Docker container on Docker Hub: Combine GAFs container

The container can be pulled with this command:

docker pull agbase/combine_gafs:1.1

Remember

You must have root permissions or use sudo, like so:

sudo docker pull agbase/combine_gafs:1.1

Running Combine GAFs with Data

Combine GAFs has three parameters:

-i InterProScan XML Parser GAF output
-g GOanna GAF output
-o output file basename
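
Before running the container, it can help to confirm that both input files exist and look like tab-delimited GAF files. The sketch below is a minimal check and is not part of Combine GAFs: `summarize_gaf` is a hypothetical helper name, and it assumes that header lines start with "!" as in the GAF 2.x layout (which defines 17 columns per annotation line).

```shell
# Hypothetical helper (not part of Combine GAFs): report how many
# tab-separated columns the annotation lines of a GAF-style file have.
summarize_gaf() {
    file="$1"
    # -s is true only when the file exists and is not empty
    if [ ! -s "$file" ]; then
        echo "ERROR: $file is missing or empty" >&2
        return 1
    fi
    # Skip "!" header/comment lines and blank lines, then tally the
    # number of columns found on each remaining line.
    awk -F'\t' -v name="$file" '
        !/^!/ && NF > 0 { count[NF]++ }
        END {
            for (n in count)
                printf "%s: %d line(s) with %d column(s)\n", name, count[n], n
        }' "$file"
}
```

For example, `summarize_gaf CFLO_1.fa_gaf.txt` prints one line per distinct column count. More than one line for a single file suggests that some rows are malformed.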

Example Command

sudo docker run \
--rm \
-v $(pwd):/work-dir \
agbase/combine_gafs:1.1 \
-i CFLO_1.fa_gaf.txt \
-g clfo1_v_insecta_goanna_gaf.tsv \
-o complete_gaf

Command Explained

sudo docker run: tells Docker to start a container from the image

--rm: removes the container when the analysis has finished. The image will remain for future use.

-v $(pwd):/work-dir: mounts the current working directory on the host machine to ‘/work-dir’ in the container, so the input files should be in that directory

agbase/combine_gafs:1.1: the name of the Docker image to use

Tip

All the options supplied after the image name are Combine GAFs options

-i CFLO_1.fa_gaf.txt: InterProScan XML Parser GAF output file.

-g clfo1_v_insecta_goanna_gaf.tsv: GOanna GAF output file.

-o complete_gaf: output file basename; a .tsv extension will be added
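
The example command above can be wrapped in a small function that checks the inputs first and can print the command instead of running it. This is only a sketch: the function name `run_combine_gafs` and the `DRY_RUN` variable are illustrative and are not part of Combine GAFs or Docker. It assumes, as in the example, that both input files are in the current working directory.

```shell
# Illustrative wrapper around the example Docker command.
# Usage: run_combine_gafs <interproscan_gaf> <goanna_gaf> <output_basename>
run_combine_gafs() {
    interpro_gaf="$1"
    goanna_gaf="$2"
    out_base="$3"
    # The container only sees the mounted directory, so both inputs
    # must be present in the current working directory on the host.
    for f in "$interpro_gaf" "$goanna_gaf"; do
        if [ ! -f "$f" ]; then
            echo "ERROR: $f not found in $(pwd)" >&2
            return 1
        fi
    done
    # Build the same command as in the example, one word per argument.
    set -- sudo docker run --rm -v "$(pwd):/work-dir" \
        agbase/combine_gafs:1.1 \
        -i "$interpro_gaf" -g "$goanna_gaf" -o "$out_base"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"   # print the command without running it
    else
        "$@"        # run the container
    fi
}
```

Calling `DRY_RUN=1 run_combine_gafs CFLO_1.fa_gaf.txt clfo1_v_insecta_goanna_gaf.tsv complete_gaf` prints the full command for review. Without `DRY_RUN=1`, the same call runs the container.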

Combine GAFs using Singularity (Apptainer)

About Singularity (Apptainer)

  • does not require ‘root’ permissions
  • runs all containers as the user that is logged into the host machine
  • HPC systems are likely to have Singularity installed; if not, system administrators are unlikely to object to installing it (no guarantees).
  • can be run on any machine where it is installed
  • more information about installing Singularity
  • This tool was tested using Singularity 3.10.2.

HPC Job Schedulers

Although Singularity can be installed on any computer, this documentation assumes it will be run on an HPC system. The tool was tested on a SLURM system, and the job submission scripts below reflect that. Submission scripts will need to be modified for use with other job schedulers.

Getting the Combine GAFs Container

The Combine GAFs tool is available as a Docker container on Docker Hub: Combine GAFs container

The container can be pulled with this command:

singularity pull docker://agbase/combine_gafs:1.1
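
The pull command writes a Singularity image file (.sif) to the current directory, and the run command needs that file name. The sketch below derives the expected name from the image reference. It assumes the default naming used elsewhere in this documentation (image name, underscore, tag, then `.sif`). The helper name `sif_name` is hypothetical.

```shell
# Hypothetical helper: derive the default .sif file name that
# "singularity pull" is expected to create for a Docker image reference.
sif_name() {
    ref="${1#docker://}"    # drop the URI scheme, e.g. agbase/combine_gafs:1.1
    ref="${ref##*/}"        # drop the namespace, e.g. combine_gafs:1.1
    name="${ref%%:*}"       # image name, e.g. combine_gafs
    tag="${ref#*:}"         # tag, e.g. 1.1
    if [ "$tag" = "$ref" ]; then
        tag="latest"        # assumption: untagged references use "latest"
    fi
    echo "${name}_${tag}.sif"
}
```

After pulling, a check such as `[ -f "$(sif_name docker://agbase/combine_gafs:1.1)" ] && echo "image found"` confirms that the file is where the job script expects it.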

Running Combine GAFs with Data

Combine GAFs has three parameters:

-i InterProScan XML Parser GAF output
-g GOanna GAF output
-o output file basename

Example SLURM Script

#!/bin/bash
#SBATCH --job-name=combine_gafs
#SBATCH --ntasks=8
#SBATCH --time=2:00:00
#SBATCH --partition=short
#SBATCH --account=nal_genomics

module load singularityCE

singularity run \
-B /directory/you/want/to/work/in:/work-dir \
combine_gafs_1.1.sif \
-i CFLO_1.fa_gaf.txt \
-g clfo1_v_insecta_goanna_gaf.tsv \
-o complete_gaf

Command Explained

singularity run: tells Singularity to run the container

-B /directory/you/want/to/work/in:/work-dir: mounts the given directory on the host machine (replace it with the directory that contains your input files) to ‘/work-dir’ in the container

combine_gafs_1.1.sif: the name of the Singularity image file to use

Tip

All the options supplied after the image name are Combine GAFs options

-i CFLO_1.fa_gaf.txt: InterProScan XML Parser GAF output file.

-g clfo1_v_insecta_goanna_gaf.tsv: GOanna GAF output file.

-o complete_gaf: output file basename; a .tsv extension will be added
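
Once the job has finished, a quick line count gives a rough check that the combined file was written and is not empty. The sketch below counts annotation lines while skipping blank lines and lines starting with "!". The helper name `count_annotations` is hypothetical, and the output name `complete_gaf.tsv` assumes the basename from the example with the .tsv extension added. It makes no assumption about how Combine GAFs merges the records.

```shell
# Hypothetical helper: count annotation lines in a GAF-style file,
# ignoring blank lines and "!" header/comment lines.
count_annotations() {
    awk '!/^!/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Print the count for each file that exists in the current directory.
report_gaf_counts() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            echo "$f: $(count_annotations "$f") annotation line(s)"
        else
            echo "$f: not found" >&2
        fi
    done
}
```

For example, `report_gaf_counts CFLO_1.fa_gaf.txt clfo1_v_insecta_goanna_gaf.tsv complete_gaf.tsv` lists the two inputs and the combined output side by side.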