This MAKER tutorial was taught by Barry Moore as part of the 2011 GMOD Spring Training.
The first half of this page describes the basics of MAKER, an easy-to-use genome annotation pipeline.
MAKER is designed to be usable by small research groups with little bioinformatics experience; however, it is also designed to be scalable and is appropriate for projects of any size, including use by large sequencing centers. MAKER can be used for de novo annotation of newly sequenced genomes, for updating existing annotations to reflect new evidence, or just to combine annotations, evidence, and quality control statistics for use in other GMOD programs like GBrowse, JBrowse, Chado, and Apollo.
MAKER has been used in many genome annotation projects:
Annotations are descriptions of different features of the genome, and they can be structural or functional in nature.
Examples:
To use MAKER with MPI, you must have MPICH2 installed with the --enable-sharedlibs flag set during installation (see the MPICH2 Installer’s Guide). I have installed this for you, so let’s set up MPI_MAKER and run the example file that comes with MAKER.
cd ~/Documents/Software/maker/src
perl Build.PL
When prompted, accept the default to build with MPI support.
./Build install
You should now see the executable mpi_maker listed among the MAKER scripts (/maker/bin). Let’s run some example data to see if MPI_MAKER is working properly.
cd ~
mkdir ~/maker_run2
cd maker_run2
cp ~/Documents/Software/maker/data/dpp_* ~/maker_run2
maker -CTL
gedit maker_opts.ctl
Set the following values in the MAKER configuration file maker_opts.ctl:
genome=dpp_contig.fasta
est=dpp_est.fasta
protein=dpp_protein.fasta
snap=/home/gmod/Documents/Software/maker/exe/snap/HMM/fly
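The same values can also be set without opening an editor. The following is a minimal sketch, assuming each option in maker_opts.ctl sits on its own line in the form key=value followed by an optional # comment. set_ctl_opt is a hypothetical helper written for this example and is not part of MAKER.

```shell
# set_ctl_opt is a hypothetical helper, not a MAKER script.
# Assumption: control-file lines look like "key=value #comment".
set_ctl_opt() {
    ctl_file=$1
    key=$2
    value=$3
    tmp_file=$(mktemp) || return 1
    status=0
    # Replace the text between "key=" and the first "#" (or end of line).
    # "|" is the sed delimiter, so values must not contain "|", "&" or "\".
    # Write back through the original file so its permissions are kept.
    sed "s|^${key}=[^#]*|${key}=${value} |" "$ctl_file" > "$tmp_file" \
        && cat "$tmp_file" > "$ctl_file" || status=1
    rm -f "$tmp_file"
    return $status
}

# Apply the tutorial values only if the control file is in the current directory.
if [ -f maker_opts.ctl ]; then
    set_ctl_opt maker_opts.ctl genome dpp_contig.fasta
    set_ctl_opt maker_opts.ctl est dpp_est.fasta
    set_ctl_opt maker_opts.ctl protein dpp_protein.fasta
    set_ctl_opt maker_opts.ctl snap /home/gmod/Documents/Software/maker/exe/snap/HMM/fly
fi
```

Lines whose key does not match are left unchanged, so it is worth checking the result with grep before starting a run.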
We need to set up a few more things for MPI to work. Type mpd to see a list of instructions.
mpd
You should see the following.
configuration file /home/gmod/mpd.conf not found
A file named .mpd.conf file must be present in the user's home
directory (/etc/mpd.conf if root) with read and write access
only for the user, and must contain at least a line with:
MPD_SECRETWORD=<secretword>
One way to safely create this file is to do the following:
cd $HOME
touch .mpd.conf
chmod 600 .mpd.conf
and then use an editor to insert a line like
MPD_SECRETWORD=mr45-j9z
into the file. (Of course use some other secret word than mr45-j9z.)
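The steps in the mpd message can be scripted so that the secret word is never typed by hand. This is a sketch under stated assumptions: make_mpd_conf is a hypothetical helper (not part of MPICH2 or MAKER), and the secret word is taken from /dev/urandom rather than chosen by the user.

```shell
# make_mpd_conf is a hypothetical helper, not part of MPICH2 or MAKER.
# It writes MPD_SECRETWORD=<random hex> to the given file (default: ~/.mpd.conf).
make_mpd_conf() {
    conf_file=${1:-$HOME/.mpd.conf}
    if [ -e "$conf_file" ]; then
        echo "$conf_file already exists; leaving it untouched" >&2
        return 1
    fi
    # Create the empty file first and restrict it to the owner, as mpd requires.
    touch "$conf_file" || return 1
    chmod 600 "$conf_file" || return 1
    # Read 8 random bytes and keep their 16 hexadecimal characters.
    secret=$(od -An -N8 -tx1 /dev/urandom | tr -dc '0-9a-f')
    printf 'MPD_SECRETWORD=%s\n' "$secret" >> "$conf_file"
}
```

Calling make_mpd_conf with no argument targets ~/.mpd.conf. Passing another path first, for example a file in a scratch directory, lets you inspect the result before touching your home directory.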
Follow the instructions to set this file up, and start the MPI environment with mpdboot. Then run mpi_maker through the MPI manager mpiexec.
mpdboot
mpiexec -n 2 mpi_maker
mpiexec is a wrapper that handles the MPI environment. The -n 2 flag tells mpiexec to use 2 CPUs/nodes when running mpi_maker. For a large cluster, this could be set to something like 100. You should now know how to start a MAKER job via MPI.
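On a single machine, one reasonable value for -n is the number of CPUs that are online. The sketch below only prints the command line instead of running it. mpi_maker_cmd is a hypothetical helper, and using getconf _NPROCESSORS_ONLN to count CPUs is an assumption of this example, not something MAKER requires.

```shell
# mpi_maker_cmd is a hypothetical helper: it prints, but does not run,
# the mpiexec command for a given process count.
mpi_maker_cmd() {
    nprocs=$1
    case $nprocs in
        ''|*[!0-9]*|0)
            echo "process count must be a positive integer" >&2
            return 1
            ;;
    esac
    echo "mpiexec -n $nprocs mpi_maker"
}

# Assumption: use the number of online CPUs, falling back to 2 as in the tutorial.
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)
mpi_maker_cmd "$ncpu" || mpi_maker_cmd 2
```

Printing the command first makes it easy to confirm the process count before the job is submitted.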
This example did not work during class because of a conflict with the version of Apache that was installed. The issue has since been fixed. Before beginning the example, open a terminal and remove the following files; otherwise the Subversion update of MAKER fails.
rm ~/Documents/Software/maker/MWAS/bin/mwas_server
rm ~/Documents/Software/maker/MWAS/cgi-bin/tt_templates/apollo_webstart.tt
Then update MAKER via Subversion.
svn update ~/Documents/Software/maker/
The MWAS interface provides a very convenient method for running MAKER and viewing results; however, because compute resources are limited, users are only allowed to submit a maximum of 2 megabases of sequence per job. So while MWAS might be suitable for some analyses (e.g. annotating BACs and short preliminary assemblies), if you plan on annotating an entire genome you will need to install MAKER locally. But if you like the convenience of the MWAS user interface, you can optionally install the interface on top of a locally installed version of MAKER for use in your own lab.
Under the maker directory there is a subdirectory called MWAS, which contains all the files needed to build the MAKER web interface. The maker/MWAS/bin/mwas_server file is used to set up and run this web interface. Let’s configure that now. There are three steps to setting up the server: first create and edit a server configuration file, then load all other configuration files, and finally install all files to the appropriate web-accessible directory.
cd /home/gmod/Documents/Software/maker/MWAS/
bin/mwas_server PREP
This will create a file called server.ctl in /maker/MWAS/config/. We will need to edit this file before continuing.
gedit config/server.ctl
Set:
apache_user:www-data
web_address:http://localhost
cgi_dir:/usr/lib/cgi-bin/maker
cgi_web:/cgi-bin/maker
html_dir:/var/www/maker
html_web:/maker
data_dir:/var/www/maker/data
use_login:0
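After editing, it can help to read the values back and confirm they were saved as intended. The following is a minimal sketch, assuming every setting in server.ctl is a single line of the form key:value as listed above. get_server_opt is a hypothetical helper and is not part of MWAS.

```shell
# get_server_opt is a hypothetical helper, not part of MWAS.
# Assumption: each setting is one line of the form key:value.
get_server_opt() {
    # $1 = control file, $2 = key. Split at the first ":" only, so that
    # values such as http://localhost keep their own colons.
    awk -v key="$2" '
        {
            pos = index($0, ":")
            if (pos > 0 && substr($0, 1, pos - 1) == key) {
                print substr($0, pos + 1)
                found = 1
                exit
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Report the settings listed above if the file is present.
if [ -f config/server.ctl ]; then
    for key in apache_user web_address cgi_dir html_dir data_dir use_login; do
        printf '%s -> %s\n' "$key" "$(get_server_opt config/server.ctl "$key")"
    done
fi
```

Splitting on the first colon matters here because a naive field split on ":" would truncate web_address to "http".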
Now we need to generate other settings that depend on the values in server.ctl.
bin/mwas_server CONFIG
Several new configuration files should now be loaded in the config/
directory. These new files define default MAKER options for the server
and the location of files for the server dropdown menus.
maker_bopts.ctl
maker_exe.ctl
maker_opts.ctl
menus.ctl
We shouldn’t need to edit any of these files, so let’s copy the files to the appropriate web-accessible directories. This must be done as root or using sudo.
sudo bin/mwas_server SETUP
If you set APOLLO_ROOT in the server.ctl file, then you can now set up a special Java Web Start version of Apollo to view results directly from the web interface. Web Start will be described in more detail in the Apollo session. This must be done as root or using sudo.
sudo bin/mwas_server APOLLO
We can now run MAKER examples using this web interface, but first we need to launch a server to monitor for new job submissions.
sudo bin/mwas_server START
And then go to
MAKER comes with a number of accessory scripts that are meant to assist in manipulations of the MAKER input and output files.
Scripts:
add_utr_start_stop_gff <gff3_file>
add_utr_gff.pl <gff3_directory>
cegma2zff <cegma_gff> <genome_fasta>
chado2gff3 [OPTION] <database_name>
compare [OPTION] <database_name> <gff3_file>
cufflinks2gff3 <transcripts1.gtf> <transcripts2.gtf> ...
mpi_evaluator [options] <eval_opts> <eval_bopts> <eval_exe>
fasta_merge -d <datastore_index> -o <outfile>
genemark_gtf2gff3 <filename>
gff3_2_gtf - Converts MAKER GFF3 files to GTF format (run add_utr_start_stop_gff first to get UTR features)
gff3_2_gtf <gff3_file>
gff3_merge -d <datastore_index> -o <outfile>
gff3_preds2models <gff3 file> <pred list>
gff3_to_eval_gtf <maker_gff3_file>
iprscan2gff3 <iprscan_file> <gff3_fasta>
iprscan_batch <file_name> <cpus> <log_file>
ipr_update_gff <gff3_file> <iprscan_file>
maker2chado [OPTION] <database_name> <gff3file1> <gff3file2> ...
maker2zff.pl <gff3_file>
maker_functional_fasta <uniprot_fasta> <blast_output> <fasta1> <fasta2> <fasta3> ...
maker_functional_gff <uniprot_fasta> <blast_output> <gff3_1>
maker_map_ids --prefix PYU1_ --justify 6 genome.all.gff > genome.all.id.map
map2assembly <genome.fasta> <transcripts.fasta>
map_data_ids genome.all.id.map data.txt
map_fasta_ids <map_file> <fasta_file>
map_gff_ids <map_file> <gff3_file>
split_fasta [count] <input_fasta>
tophat2gff3 <junctions.bed>
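Several of these scripts are meant to be chained. As an illustration, the sketch below builds an ID map with maker_map_ids and then passes it to map_gff_ids and map_fasta_ids, using only the argument orders shown in the list above. rename_maker_ids is a hypothetical wrapper, the --justify 6 value is copied from the example line, and whether the map_* scripts rewrite their input in place or print to standard output is not stated here, so check each script’s usage message before relying on it.

```shell
# rename_maker_ids is a hypothetical wrapper around three MAKER accessory scripts.
# Arguments: <id_prefix> <gff3_file ending in .gff> <fasta_file>
rename_maker_ids() {
    prefix=$1
    gff=$2
    fasta=$3
    # Stop early if any of the accessory scripts is missing.
    for tool in maker_map_ids map_gff_ids map_fasta_ids; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "$tool not found on PATH" >&2
            return 127
        }
    done
    # genome.all.gff -> genome.all.id.map, as in the maker_map_ids example line.
    map_file="${gff%.gff}.id.map"
    maker_map_ids --prefix "$prefix" --justify 6 "$gff" > "$map_file" || return 1
    # Apply the same map to the annotations and to the matching sequences.
    map_gff_ids "$map_file" "$gff" || return 1
    map_fasta_ids "$map_file" "$fasta"
}
```

A call such as rename_maker_ids PYU1_ genome.all.gff followed by the matching FASTA file would then give both files the same new identifiers. Work on copies of the files until the behavior of each script is confirmed.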