Mugsy is a multiple whole genome aligner. Mugsy uses Nucmer for
pairwise alignment, a custom graph-based segmentation procedure for
identifying collinear regions, and the segment-based progressive
multiple alignment strategy from Seqan::TCoffee. Mugsy accepts draft
genomes in the form of multi-FASTA files and does not require a
reference genome.
Edit mugsyenv.sh and add the path to the installation area.
View or parse MAF output
The output of Mugsy is in Multiple Alignment Format (MAF).
To cite Mugsy, use:
Angiuoli SV and Salzberg SL. Mugsy: Fast multiple alignment of closely related whole genomes. Bioinformatics 2011 27(3):334-4
Installation
- Download mugsy from Sourceforge
cd /path/to/install
tar xvzf mugsy_x86-64-vNrN.tgz
The release bundle is compiled for 64-bit x86 (x86-64) Linux and is invoked from a Perl wrapper script. (A pre-compiled version for Mac OS X is not yet available.)
export MUGSY_INSTALL=/path/to/install/mugsy
mugsyenv.sh can then be sourced before invoking mugsy. Alternatively, you can add MUGSY_INSTALL to your environment.
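If you set MUGSY_INSTALL yourself (for example with an export line in a shell startup file such as ~/.bashrc, assuming bash), a quick check before running can catch a missing or mistyped path. The following is a minimal sketch; `check_mugsy_env` is a hypothetical helper, not something shipped with Mugsy.

```shell
# Hypothetical pre-flight check (not part of Mugsy): succeed only if
# MUGSY_INSTALL is set and points to an existing directory.
check_mugsy_env() {
    if [ -z "${MUGSY_INSTALL:-}" ]; then
        echo "MUGSY_INSTALL is not set" >&2
        return 1
    fi
    if [ ! -d "$MUGSY_INSTALL" ]; then
        echo "MUGSY_INSTALL is not a directory: $MUGSY_INSTALL" >&2
        return 1
    fi
    return 0
}
```

Calling `check_mugsy_env || exit 1` at the top of a pipeline script stops early with a readable message instead of failing later inside the run.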
Getting started
- Set up the environment
source /path/to/install/mugsyenv.sh
- Run mugsy
The input to Mugsy is DNA from two or more genomes in FASTA format. The genomes must be assembled into contigs or scaffolds. For draft genomes, a single multi-FASTA file containing all contigs for the genome should be provided.
mugsy --directory /data/output --prefix mygenomes genome1.fasta genome2.fasta genome3.fasta
This example will align three genomes and output a file /data/output/mygenomes.maf. The --directory setting is also used for storing temporary files during the run.
The prefix of each input filename will be used as the genome name in the output files (e.g. genome1 from genome1.fasta). Header lines in the FASTA files should not contain ':' or '-', to avoid parsing problems.
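Since ':' and '-' in header lines can cause parsing problems, it may help to check the input files before a run. The following is a minimal sketch, assuming a POSIX shell with grep and sed. The function names `check_fasta_headers` and `clean_fasta_headers` are hypothetical helpers, not part of Mugsy, and replacing the characters with '_' is only one possible convention.

```shell
# Hypothetical helper (not part of Mugsy): succeed only if no FASTA
# header line in the given file contains ':' or '-'.
# Offending lines are printed with their line numbers.
check_fasta_headers() {
    if grep -n '^>.*[-:]' "$1"; then
        echo "WARNING: $1 has header lines containing ':' or '-'" >&2
        return 1
    fi
    return 0
}

# Hypothetical helper: write a copy of the file to stdout with ':' and '-'
# in header lines replaced by '_'. Sequence lines are left untouched.
clean_fasta_headers() {
    sed '/^>/ s/[-:]/_/g' "$1"
}
```

For example, `clean_fasta_headers genome1.fasta > genome1.clean.fasta` writes a cleaned copy and leaves the original file unchanged. Note that the cleaned file's prefix (here genome1.clean) would then be the one used as the genome name.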
One option for browsing the MAF output is GMAJ, which provides a reference-based, stand-alone viewer for MAF files.
java -jar gmaj.jar mygenomes.maf
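MAF is plain text, so a quick summary is also possible with standard tools, without a viewer. The following is a minimal sketch, assuming the usual MAF layout in which each alignment block begins with a line whose first field is `a` and each aligned sequence is an `s` line whose second field is the source name. The function names are hypothetical, and the assumption that source names have the form genome.sequence should be checked against your own output.

```shell
# Hypothetical helper: count the alignment blocks in a MAF file
# (lines whose first field is "a").
maf_count_blocks() {
    awk '$1 == "a" { n++ } END { print n + 0 }' "$1"
}

# Hypothetical helper: count aligned rows ("s" lines) per genome.
# Assumes the source name in field 2 looks like genome.sequence,
# so the genome is the part before the first '.'.
maf_count_rows() {
    awk '$1 == "s" { split($2, parts, "."); n[parts[1]]++ }
         END { for (g in n) print g, n[g] }' "$1" | sort
}
```

For example, `maf_count_blocks /data/output/mygenomes.maf` prints the number of blocks, and `maf_count_rows /data/output/mygenomes.maf` prints one line per genome with its row count.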
Troubleshooting
If the pre-compiled binaries are not compatible with your machine, you may see errors like
ERROR: prenuc returned non-zero, please file a bug report
To compile Mugsy on your machine, run
svn co https://mugsy.svn.sourceforge.net/svnroot/mugsy/trunk mugsy_trunk
cd mugsy_trunk
make
make install
make dist
If this succeeds, it will create a file mugsy_x86-64-XXX.tgz containing a fresh set of executables.
Download
Mugsy 1.2.3 12/21/2011
Mugsy-Annotator 0.5, utilities for finding orthologs and evaluating annotation quality 02/28/2011