GBrowse_syn Scripts

From GMOD
Revision as of 19:50, 12 December 2009 by Mckays

This page describes helper scripts for processing alignment data for loading into GBrowse_syn.

Parsing Multiple Sequence Alignment Data

The scripts in this section process multiple sequence alignment data in various formats and convert them to the tab-delimited format used to load the GBrowse_syn database.

aln2hit.pl

aln2hit.pl is a generic alignment data parser that converts alignment data to the GBrowse_syn database loading format.

Purpose
Use this script in cases where you have a single alignment file and want to convert it to the tab-delimited format that is used to load the GBrowse_syn alignment database.

Note
This script is deprecated. You can use load_alignments_msa.pl to load the database directly.

Example
perl aln2hit.pl -f clustalw -i my_alignments.aln >my_alignments.txt
Options
argument  default   description
f         clustalw  Specifies the alignment file format. Most common formats recognized by BioPerl's AlignIO parsers are supported. Use clustalw or fasta for best results.
i         (none)    Specifies the name of the input alignment file.
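
Because aln2hit.pl takes a single input file per run, converting several alignment files means calling it once per file. The sketch below wraps that in a shell function. The name convert_alignments is a hypothetical helper, not part of GBrowse_syn, and the sketch assumes that the output of separate runs can simply be concatenated into one tab-delimited file, which this page does not confirm. Only the -f and -i options shown in the example above are used.

```shell
# Hypothetical helper (not part of GBrowse_syn): run aln2hit.pl once per
# input file and collect all output in a single tab-delimited file.
# Usage: convert_alignments FORMAT OUTPUT_FILE INPUT_FILE...
convert_alignments() {
  format=$1
  out=$2
  shift 2
  : > "$out"                      # create or truncate the output file
  for aln in "$@"; do
    [ -f "$aln" ] || continue     # skip unmatched globs and missing files
    # Assumption: output from separate runs can be concatenated as-is.
    perl aln2hit.pl -f "$format" -i "$aln" >> "$out" || return 1
  done
}
```

For example, convert_alignments fasta my_alignments.txt *.fasta would process every FASTA alignment in the current directory. The function stops at the first failed conversion so that a partial output file is not mistaken for a complete one.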

clustal2hit.pl

clustal2hit.pl is a CLUSTALW format alignment data parser.

Purpose
Use this script in cases where you have one or more ClustalW alignment files and want to convert them to the tab-delimited format that is used to load the GBrowse_syn alignment database.
Note
  • This script is somewhat deprecated. The intermediate tab-delimited format is no longer required to load the database. You can use load_alignments_msa.pl to load the database directly.
  • If you want to process multiple files in another format, edit the FORMAT constant near the top of clustal2hit.pl.


Example
perl clustal2hit.pl *.aln >my_alignments.txt
Options

None.

mercatoraln_to_synhits.pl

mercatoraln_to_synhits.pl is a data parser for multiple sequence alignments generated by mercator.

Purpose
This script processes alignments generated by the MERCATOR pipeline.
Example
Usage example here
Options
argument  default     description
a         output.mfa  Specifies the name of the alignment file.
v         (none)      Prints progress reports while running.
f         fasta       Specifies the format of the input alignment files.
d         (none)      Specifies the directory containing the genome and map files.

Direct Database Loading Scripts

The scripts in this section load data from multiple sequence alignment files into the GBrowse_syn alignment database.

load_alignment_database.pl

Purpose
This script loads the alignment database from tab-delimited alignment data files (format described here).
Example
perl load_alignment_database.pl
argument default description

load_alignments_gff3.pl

argument default description


load_alignments_msa.pl

argument default description