Technically, this assignment is completed entirely in Python. It is due 2 weeks after precept.
This precept will introduce you to the Snakemake workflow management system. You will access the Adroit cluster, clone the repository, and run an example Snakemake pipeline.
Ensure you document the output you use to complete the exercises. Submitting a text file with the changes you make is sufficient.
Follow the guide for accessing the Adroit cluster.
- Change to your scratch directory: cd /scratch/network/[PUID]
- Clone your assignment repository: git clone [your assignment repository URL]
In this exercise, you will run an example Snakemake pipeline that converts FASTQ files to VCF format. The base and data for this Snakemake workflow can be found at /scratch/network/mw0425/fastq_2_VCF on Adroit.
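To give a feel for what a workflow manager such as Snakemake does, here is a minimal Python sketch of working backward from a requested output file to the steps needed to produce it. It is only an illustration of the idea, not Snakemake's implementation. The file names and the RULES table are hypothetical and are not taken from the course Snakefile.

```python
from typing import Dict, List, Set

# Hypothetical rules: each output file maps to the input files it needs.
# These names are illustrative only, not the contents of the course Snakefile.
RULES: Dict[str, List[str]] = {
    "sample.vcf": ["sample.bam", "ref.fa"],
    "sample.bam": ["sample.fastq", "ref.fa"],
}


def plan(target: str, rules: Dict[str, List[str]], existing: Set[str]) -> List[str]:
    """Return the outputs to build, in order, so that `target` can be made."""
    order: List[str] = []

    def visit(path: str) -> None:
        # Files that already exist, or are already planned, need no work.
        if path in existing or path in order:
            return
        if path not in rules:
            raise FileNotFoundError(f"no rule produces {path} and it does not exist")
        # Plan every input first, then the file itself.
        for dep in rules[path]:
            visit(dep)
        order.append(path)

    visit(target)
    return order


if __name__ == "__main__":
    # Only the raw reads and the reference exist at the start.
    print(plan("sample.vcf", RULES, existing={"sample.fastq", "ref.fa"}))
```

If an intermediate file is already present, it drops out of the plan. A real workflow manager also has to handle details this sketch ignores, such as cyclic rules and outdated files.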
- Load Anaconda (module load anaconda3/2024.2) and create an environment for Snakemake: conda env create -f /scratch/network/swwolf/fastq_2_VCF/bioinformatics.yml
- Activate the environment: conda activate bioinformatics
- Copy /scratch/network/swwolf/fastq_2_VCF to your home folder (cp -r /scratch/network/swwolf/fastq_2_VCF .). Don't forget the period at the end of the command.
- If you see warnings about metadata files, you can safely ignore them.
- Move into the copied folder (cd fastq_2_VCF).
- The folder should contain four items: 1. bioinformatics.yml, 2. Snakefile, 3. ref_genome, and 4. fastq
- Files can be deleted with the rm command, and folders can be deleted with the rm -r command.
- Open the Snakefile in a text editor: vim Snakefile or emacs Snakefile or nano Snakefile
- Run the pipeline: snakemake --cores [number_of_cores] (2-4 cores is reasonable; the run will take a few hours, so leave your computer on!)
After successfully running the example pipeline, make a minor modification to the workflow.
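Because a full run takes hours, it can help to preview what Snakemake would do before starting it. The sketch below builds the command line in Python and adds Snakemake's -n (dry-run) flag, which lists the planned jobs without executing them. The helper names snakemake_command and run are hypothetical, and the sketch assumes the bioinformatics environment is active and you are in the fastq_2_VCF folder.

```python
import shutil
import subprocess
from typing import List


def snakemake_command(cores: int, dry_run: bool = True) -> List[str]:
    """Build the snakemake command line; -n asks Snakemake for a dry run."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    cmd = ["snakemake", "--cores", str(cores)]
    if dry_run:
        cmd.append("-n")
    return cmd


def run(cmd: List[str]) -> int:
    """Run the command if its executable is on PATH; return the exit code."""
    if shutil.which(cmd[0]) is None:
        print(f"{cmd[0]} not found; activate the conda environment first")
        return 1
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # Print the command only; call run(...) yourself when you are ready.
    print(" ".join(snakemake_command(cores=2)))
```

Calling run(snakemake_command(cores=2)) would perform the dry run, and run(snakemake_command(cores=2, dry_run=False)) would start the real pipeline.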
- Rerun the pipeline: snakemake --cores [number_of_cores]
Remember to document all changes and any observations you make while completing these exercises. Your final submission will be a text file containing these documented changes. Submit this text file using git as usual, but remain on the Adroit remote server while doing so (this is where you cloned your repository anyway).
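One possible way to record a change for the submission text file is a unified diff between the original and modified versions of a file. The sketch below uses Python's standard difflib module. The function name describe_changes and the example rule text are hypothetical, and this is only one option for documenting your work, not a required format.

```python
import difflib


def describe_changes(original: str, modified: str, name: str = "Snakefile") -> str:
    """Return a unified diff of two versions of a file as one string."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile=f"{name} (original)",
        tofile=f"{name} (modified)",
    )
    return "".join(diff)


if __name__ == "__main__":
    # Hypothetical before/after contents, for illustration only.
    before = 'rule all:\n    input: "out.vcf"\n'
    after = 'rule all:\n    input: "out.filtered.vcf"\n'
    print(describe_changes(before, after))
```

In practice you would read the two versions from disk, for example the untouched copy under /scratch/network and your edited copy, and write the returned string into the text file you submit.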