Difference between revisions of "GBrowse Volvox SAM Tutorial"
m (→Read pairs) |
(→Add a database stanza to the volvox config) |
||
Line 19: | Line 19: | ||
[samdb:database] | [samdb:database] | ||
db_adaptor = Bio::DB::Sam | db_adaptor = Bio::DB::Sam | ||
− | db_args = -fasta / | + | db_args = -fasta /var/lib/gbrowse2/databases/volvox/volvox.fa |
− | -bam / | + | -bam /var/lib/gbrowse2/databases/volvox/volvox.sort.bam |
search options = none | search options = none | ||
</pre> | </pre> |
Revision as of 13:55, 9 July 2012
This page or section needs to be edited. Please help by editing this page to add your revisions or additions.
This is a short tutorial intended to add on to the GBrowse tutorial that ships with it using the "volvox" data set. The SAM file that is linked to here was created by running wgsim, a read simulator to create FASTQ files and then using Galaxy to run Bowtie and produce a SAM file. In order to use this tutorial, SAMtools and Bio::DB::Sam must be installed. This is based on the NGS tutorial that Dave Clements wrote.
Contents
Create the BAM file
Start with a SAM file and save in the GBrowse database directory for the volvox tutorial. Also make sure that the volvox fasta file, volvox.fa, is in the directory as well. Now convert the SAM file to a BAM file and index it:
samtools faidx volvox.fa samtools import volvox.fa.fai volvox.sam volvox.bam samtools sort volvox.bam volvox.sort samtools index volvox.sort.bam
Add a database stanza to the volvox config
The database stanza goes just above the "TRACK DEFAULTS" stanza:
[samdb:database] db_adaptor = Bio::DB::Sam db_args = -fasta /var/lib/gbrowse2/databases/volvox/volvox.fa -bam /var/lib/gbrowse2/databases/volvox/volvox.sort.bam search options = none
It's not a bad idea to move the data adaptor information from the top of the conf file (that points at the volvox database directory) down here and add a database stanza for it as well; though if you do that, you must also add a database line to the TRACK DEFAULTS section that points at that database name so that the other tracks continue to work.
Add a coverage track
Now add a stanza that will create a coverage density track.
[CoverageDensity] feature = coverage glyph = wiggle_density database = samdb height = 20 bicolor_pivot = 15 pos_color = blue neg_color = red key = Coverage (density) category = Reads label = 0
Note that the value of the database is samdb, which is the name it was given in the database stanza above. The bicolor_pivot value is the coverage value below which the density plot will turn red (the neg_color).
Add individual reads
This stanza will show the individual reads as glyphs, though there will be a lot of them:
[Reads] feature = match glyph = segments draw_target = 1 show_mismatch = 1 mismatch_color = red database = samdb bgcolor = blue fgcolor = white height = 5 label density = 50 bump = fast key = Reads category = Reads
The draw_target option tells the glyph to put the sequence of the read in the glyph when zoomed in far enough to see it and the show_mismatch option highlights the base in red when it doesn't match the contig sequence.
Now, the problem with this track is that if you are at a zoom level higher than a few kb, rendering the track will take a long time (my laptop can't manage 4kb), so we need a way to turn this track on and off according to the zoom level. Fortunately, GBrowse has semantic zooming built in, and so what we'd like is to turn on the coverage glyph when zoomed out and individual reads when zoomed in.
Semantic zooming
To use semantic zooming, we give the coverage and read tracks the same name, and then we add to the coverage track name (the one we want to appear as we zoom out) the base pair level at which we want it to "turn on", like this:
[Reads:1500] feature = coverage glyph = wiggle_density database = samdb height = 20 bicolor_pivot = 15 pos_color = blue neg_color = red
so that if you look at a range that is 1499 bp wide, you'll see individual reads, then if you move out a little bit and look at a segment 1500 bp wide, it will switch to a coverage density plot.
Read pairs
Bio::DB::Sam tools will also show read pairs matched up. To do this you use the "read_pair" feature type along with the segments glyph, like this:
[Pairs] feature = read_pair glyph = segments draw_target = 1 show_mismatch = 1 mismatch_color = red database = samdb bgcolor = violet height = 5 label density = 50 bump = fast key = Read pairs category = Reads [Pairs:1000] feature = coverage glyph = wiggle_density database = samdb height = 20 bicolor_pivot = 15 pos_color = purple neg_color = red
Coloring reads according to quality
With real SAM data, you could color your reads according to the quality score of the read. Since this tutorial is using fake data with uniform quality, it is somewhat boring, but if you have your own data, you can do this for bgcolor:
bgcolor = sub { my $feature = shift; my $blueness = 255 - int($feature->qual * 2.40); my $colour = chr(35) . sprintf("%X", $blueness) . sprintf("%X", $blueness) . "FF"; return $colour; }
where the callback will make the feature background more blue the higher the quality. How do you suppose it does that?