Difference between revisions of "GBrowse syn Configuration"

From GMOD
(Species Configuration File)
(Temporary workflow for sample data)
  
  
=Temporary workflow for sample data=
 
  
The first thing we need to do is create a MySQL alignment database with the command below:
 
 
$ mysql -u root -e 'create database rice_synteny'
 
 
Then we will have a look at the input data:
 
 
<pre>
 
$ cd ~/data/gbrowse_syn/rice
 
$ more data/rice.aln
 
 
CLUSTAL W(1.81) multiple sequence alignment W(1.81)
 
 
 
rice-3(+)/16598648-16600199      ggaggccggccgtctgccatgcgtgagccagacggggcgggccggagacaggccacgtgg
 
wild_rice-3(+)/14467855-14469373 gggggccgg------------------------------------agacaggccacgtgg
 
                                ** ******                                    ***************
 
 
 
rice-3(+)/16598648-16600199      ccctgccccgggctgttgacccactggcacccctgtcccgggttgtcgccctcctttccc
 
wild_rice-3(+)/14467855-14469373 ccctgccccgggctgttgacccactggcacccctgtcccgggttgtcgccctcctttccc
 
                                ************************************************************
 
 
 
rice-3(+)/16598648-16600199      cgccatgctctaagtttgctcctcttctcgaacttctctctttgattcttcacgtcctct
 
wild_rice-3(+)/14467855-14469373 cgccatgctctaagtttgctcctcttctcgaacttctctctttgattcttcacgtcctct
 
                                ************************************************************
 
 
 
 
rice-3(+)/16598648-16600199      tggagcctccccttctagctcgatcacgctctgctcttccgcttggaggctggcaaaact
 
wild_rice-3(+)/14467855-14469373 tggagcctccccttctagctcgatcgcgctctgctcttccgcttggaggctggcaaaact
 
                                ************************* **********************************
 
</pre>
 
 
'''<font color=red>NOTE1:</font>''' These data are in ClustalW format.  The scripts used to process these data will recognize ClustalW and other commonly used formats supported by BioPerl's AlignIO parser.  ''This does not mean that ClustalW was used to generate the alignment data''.  These particular alignments were generated by BLASTZ and formatted with Compara pipeline components.  See [[WGA_data]] for more information on whole-genome alignment pipelines.
 
 
 
'''<font color=red>NOTE2:</font>''' The sequence ID in this ClustalW file is overloaded to contain information about the species, strand and coordinates.  This information is essential:
 
 
  rice-3(+)/16598648-16600199
 
  species-refseq(strand)/start-end
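
To make the overloaded ID layout concrete, here is a minimal Python sketch that splits an ID of the form shown above into its parts. It is not part of GBrowse_syn (the loading script does its own parsing); the names <code>AlignedSegment</code> and <code>parse_sequence_id</code> are hypothetical, and the sketch assumes that the species name contains no hyphen, so the first hyphen separates species from reference sequence.

```python
import re
from typing import NamedTuple


class AlignedSegment(NamedTuple):
    """Hypothetical container for the fields packed into a sequence ID."""
    species: str
    refseq: str
    strand: str
    start: int
    end: int


# species-refseq(strand)/start-end
# Assumption: the species name has no hyphen; the refseq name may.
_ID_PATTERN = re.compile(
    r"^(?P<species>[^-\s]+)-(?P<refseq>[^(\s]+)"
    r"\((?P<strand>[+-])\)/(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_sequence_id(seq_id: str) -> AlignedSegment:
    """Split an overloaded alignment ID into its five components."""
    match = _ID_PATTERN.match(seq_id.strip())
    if match is None:
        raise ValueError(f"ID does not follow species-refseq(strand)/start-end: {seq_id!r}")
    return AlignedSegment(
        species=match["species"],
        refseq=match["refseq"],
        strand=match["strand"],
        start=int(match["start"]),
        end=int(match["end"]),
    )


if __name__ == "__main__":
    print(parse_sequence_id("rice-3(+)/16598648-16600199"))
```

An ID that is missing any of the five fields raises a <code>ValueError</code>, which can be used to catch badly formed IDs before attempting to load a file.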
 
 
 
 
 
 
Then we will load the database.  This is time-consuming, so we will use a screen session to run it in the background while we turn our attention to downstream tasks.
 
 
$ screen
 
 
*When entering screen mode, hit 'space' to clear the first screen.
 
*If your backspace key does not work in screen mode, use ^H (ctrl key + H key).
 
 
<pre>
 
$ bin/load_alignments_msa.pl -u root -d rice_synteny --verbose data/rice.aln
 
Processing alignment file data/rice.aln...
 
Processing alignment 1
 
Mapping coordinates for alignment 1... Done!
 
Processed pair-wise alignment 1
 
Processing alignment 2
 
Mapping coordinates for alignment 2... Done!
 
Processed pair-wise alignment 2
 
Processing alignment 3
 
Mapping coordinates for alignment 3... Done!
 
Processed pair-wise alignment 3
 
Processing alignment 4
 
Mapping coordinates for alignment 4... Done!
 
Processed pair-wise alignment 4
 
Processing alignment 5
 
Mapping coordinates for alignment 5... Done!
 
Processed pair-wise alignment 5
 
Processing alignment 6
 
Mapping coordinates for alignment 6... Done!
 
Processed pair-wise alignment 6
 
Processing alignment 7
 
Mapping coordinates for alignment 7... Done!
 
Processed pair-wise alignment 7
 
Processing alignment 8
 
Mapping coordinates for alignment 8... Done!
 
Processed pair-wise alignment 8
 
Processing alignment 9
 
Mapping coordinates for alignment 9... Done!
 
Processed pair-wise alignment 9
 
Processing alignment 10
 
Mapping coordinates for alignment 10... Done!
 
Processed pair-wise alignment 10
 
</pre>
 
 
* This will go on for some time (there are 1800 alignments), so we will let the screen session run in the background and work on our other tasks.
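
To judge how far the loader has progressed, it can help to know how many alignments the input file holds. The Python sketch below counts them, assuming that every alignment block in the file starts with its own <code>CLUSTAL</code> header line, as in the excerpt shown earlier. The function name <code>count_clustal_alignments</code> is hypothetical and is not part of the GBrowse_syn scripts.

```python
from typing import Iterable


def count_clustal_alignments(lines: Iterable[str]) -> int:
    """Count alignment blocks by counting CLUSTAL header lines.

    Assumption: each alignment in a concatenated file begins with a
    line that starts with the word CLUSTAL.
    """
    return sum(1 for line in lines if line.startswith("CLUSTAL"))


if __name__ == "__main__":
    sample = [
        "CLUSTAL W(1.81) multiple sequence alignment",
        "",
        "rice-3(+)/16598648-16600199      ggaggccggccg",
        "wild_rice-3(+)/14467855-14469373 gggggccgg---",
        "",
        "CLUSTAL W(1.81) multiple sequence alignment",
        "",
        "rice-3(+)/16600200-16600300      ccctgccccggg",
    ]
    print(count_clustal_alignments(sample))
```

For the real data, pass an open file handle for <code>data/rice.aln</code> instead of the sample list; the result can then be compared with the last "Processing alignment N" line printed by the loader.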
 
  
 
[[Category:GBrowse syn]]
 

Revision as of 09:45, 3 August 2009

GBrowse_syn is a synteny viewer based on GBrowse. This page describes how to configure GBrowse_syn.

Main Configuration File

Purpose

The main configuration file specifies the alignment database, the species to be included and their corresponding configuration files and display options.

  • This file ends with the extension ".synconf".

Configurable Options

join

  • Required setting
  • The database source name (DSN) for the alignment database
#example
join        = dbi:mysql:database=pecan;host=localhost;user=nobody
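
The DSN packs the driver and the connection parameters into one string. As an illustration only, the Python sketch below splits a DSN of the <code>key=value</code> form shown in the example into its parts. The function name <code>parse_dsn</code> is hypothetical, and this is not how GBrowse_syn or Perl's DBI handles the string internally; other DSN spellings accepted by DBI drivers are deliberately not covered.

```python
def parse_dsn(dsn: str) -> dict:
    """Split 'dbi:<driver>:key=value;key=value' into a dictionary.

    Only the key=value form from the example above is handled.
    """
    try:
        scheme, driver, rest = dsn.strip().split(":", 2)
    except ValueError:
        raise ValueError(f"expected dbi:<driver>:<parameters>, got {dsn!r}") from None
    if scheme.lower() != "dbi":
        raise ValueError(f"DSN must start with 'dbi:', got {dsn!r}")

    params = {"driver": driver}
    for item in rest.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"parameter without '=': {item!r}")
        params[key.strip()] = value.strip()
    return params


if __name__ == "__main__":
    print(parse_dsn("dbi:mysql:database=pecan;host=localhost;user=nobody"))
```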

source_map

  • Required setting
  • This option maps the relationship between the species data sources, names and descriptions
# example:
#                 name         conf. file          description
source_map =     elegans      elegans_synteny     "C. elegans"
                 remanei      remanei_synteny     "C. remanei"
                 briggsae     briggsae_synteny    "C. briggsae"
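
Each source_map entry is a triple: a short name, the species configuration file name (without extension), and a quoted description. The Python sketch below reads a value laid out like the example into triples. It is only an illustration of the layout, not the parser GBrowse_syn uses, and the function name <code>parse_source_map</code> is hypothetical.

```python
import shlex
from typing import List, Tuple


def parse_source_map(value: str) -> List[Tuple[str, str, str]]:
    """Group a source_map value into (name, conf_file, description) triples.

    shlex.split honours the double quotes, so "C. elegans" stays one token
    even though it contains a space.
    """
    tokens = shlex.split(value)
    if len(tokens) % 3 != 0:
        raise ValueError("source_map entries must come in groups of three")
    return [
        (tokens[i], tokens[i + 1], tokens[i + 2])
        for i in range(0, len(tokens), 3)
    ]


if __name__ == "__main__":
    value = '''
        elegans      elegans_synteny     "C. elegans"
        remanei      remanei_synteny     "C. remanei"
        briggsae     briggsae_synteny    "C. briggsae"
    '''
    for name, conf_file, description in parse_source_map(value):
        print(name, conf_file, description)
```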

tmpimages

  • The URL for cached image and session data
# example
tmpimages   = /gbrowse/tmp

buttons

  • The URL for stock GBrowse images, etc
# example
buttons       = /gbrowse/images/buttons

stylesheet

  • default: /gbrowse/gbrowse.css
  • The URL for the stylesheet

examples

  • Example searches to show at the top of the page
#example
examples = elegans X:1050000..1150000
           elegans I:10762799..10789727
           briggsae chrX:620000..670000

zoom levels

  • The zoom levels that will be available in the navigation menu
zoom levels = 5000 10000 25000 50000 100000 200000 400000

config_extension

  • default: 'syn'
  • This specifies the extension of species-specific configuration files.
  • If GBrowse_syn is used with stand-alone GBrowse data sources, change this option to 'conf'.
  • To avoid confusing the configuration file parser, choose names for species-specific configuration files that do not resemble other file names. For example, do not use both elegans.conf (for GBrowse) and elegans.syn (for GBrowse_syn).

description

  • default: none
  • The description of the GBrowse_syn data source for public display

max_segment

  • default: 400_000
  • The maximum allowed segment size (sequence length) for the central reference panel
  • Take care not to set this value too high. Very large segments may cause significant network latency or even time out the web server
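
To see how a requested region relates to max_segment, the Python sketch below measures a search string written like the entries under "examples" and compares its length with the limit. The function names are hypothetical, the length is computed with inclusive coordinates (an assumption), and what GBrowse_syn actually does with an oversized request is not described here.

```python
import re

# "<species> <refseq>:<start>..<end>", as in the example searches
_REGION = re.compile(r"^(\S+)\s+(\S+):(\d+)\.\.(\d+)$")


def region_length(search: str) -> int:
    """Return the length of the requested segment (inclusive coordinates)."""
    match = _REGION.match(search.strip())
    if match is None:
        raise ValueError(f"not of the form 'species ref:start..end': {search!r}")
    start, end = int(match.group(3)), int(match.group(4))
    if end < start:
        raise ValueError("end coordinate is smaller than start coordinate")
    return end - start + 1


def fits_max_segment(search: str, max_segment: int = 400_000) -> bool:
    """True if the region is no longer than max_segment (default 400_000)."""
    return region_length(search) <= max_segment


if __name__ == "__main__":
    for query in ("elegans X:1050000..1150000", "briggsae chrX:620000..670000"):
        print(query, region_length(query), fits_max_segment(query))
```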

max_span

  • default: 0.3 (i.e., 30%)
  • This is an advanced option.
  • The maximum portion of the reference sequence size that will trigger merging of adjacent inset (aligned sequence) panels.

min_alignment_size

  • default: 0.01
  • The minimum alignment size, expressed as a fraction of the total reference sequence length, that will be used to create an inset panel.
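
Because min_alignment_size is a fraction rather than a length in base pairs, the effective cut-off changes with the size of the region on display. The Python sketch below works through that arithmetic. The function name <code>qualifies_for_inset</code> is hypothetical, and treating an alignment exactly at the threshold as large enough is an assumption, not documented behaviour.

```python
def qualifies_for_inset(aligned_length: int,
                        reference_length: int,
                        min_alignment_size: float = 0.01) -> bool:
    """Compare an alignment's share of the reference segment with the threshold.

    Assumption: an alignment exactly at the threshold counts as large enough.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return aligned_length / reference_length >= min_alignment_size


if __name__ == "__main__":
    # With a 100,000 bp reference segment and the default of 0.01,
    # the cut-off is 1,000 bp.
    print(qualifies_for_inset(1552, 100_000))  # 1552 / 100000 = 0.01552
    print(qualifies_for_inset(500, 100_000))   # 500 / 100000 = 0.005
```

Under these assumptions, the same 1,552 bp alignment would fall below the cut-off once the reference segment grows beyond 155,200 bp.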

imagewidth

  • default: 800
  • The width of the displayed sequence panels (pixels)

interimage_pad

  • default: 5
  • The space between inset panels (pixels)

vertical_pad

  • default: 5
  • The vertical space between panels (pixels)

align_height

  • default: 6
  • The height of the alignment syntenic block features (pixels)

max_gap

  • default: 200_000
  • This is an advanced option
  • The maximum gap allowed between chained alignment features

overview_ratio

  • default: 0.9
  • The relative width of the overview panel in relation to the width of the detailed display panel

overview bgcolor

  • default: gainsboro
  • The background color of the overview panel
  • Allowed values are named web colors or RGB hex codes (e.g., '#FFFFFF')