GBrowse and JBrowse are excellent for visualizing our assembly and annotation. But what if you want to do some further analysis and exploration? You can manually browse the assembly, but then you won't get tenure.
To further analyze your data, you need tools like Galaxy, BioMart and InterMine. Since I work for Galaxy, we'll spend some time working through a simple example in it. We'll touch on BioMart or InterMine as time allows.
1. Get to Galaxy
We could run this analysis on the free public Galaxy server (http://usegalaxy.org), or on the Galaxy that has been installed on our VMware image. Let's run it on our local install.
Note: Please don't run this on the local install along with me. The public server might be able to support 120 people doing this simultaneously. The local install won't.
2. What have we got?
First, load the GFF that MAKER produced into Galaxy:
- Get Data → Upload File → ftp://ftp.gmod.org/pub/gmod/Meetings/2011/AGS/3263.maker.output/3263.all.gff → Execute
- This uploads the GFF file into Galaxy. Galaxy recognizes it as a GFF3 file.
Now, because of a bug in Galaxy (don't tell anyone), we need to convert it to BED to run a subsequent step.
- Convert Formats → GFF-to-BED → Execute
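To see what this conversion involves outside Galaxy, here is a minimal Python sketch. It assumes a six-column BED layout with the feature type placed in column 4 (which is how the later steps use c4) and a placeholder score of 0. The function name `gff_line_to_bed` and the sample contig name are hypothetical, and this is not Galaxy's own converter code.

```python
def gff_line_to_bed(line):
    """Convert one GFF3 feature line to a 6-column BED line (illustrative sketch).

    Returns None for comments, blank lines, and lines with fewer than 9 fields.
    """
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return None
    seqid, _source, ftype, start, end, _score, strand = fields[:7]
    # GFF3 coordinates are 1-based and inclusive; BED is 0-based and half-open,
    # so only the start coordinate shifts by one.
    bed_start = int(start) - 1
    # Assumption: feature type goes in BED column 4, score is a placeholder 0.
    return "\t".join([seqid, str(bed_start), end, ftype, "0", strand])


# Hypothetical example line; "ctg1" is not taken from the workshop data.
example = "ctg1\tmaker\texon\t100\t200\t.\t+\t.\tID=e1"
print(gff_line_to_bed(example))
```

The one thing to watch is the coordinate shift: the start moves from 1-based to 0-based, while the end stays the same.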
Now let's see what is in the annotation. Let's count the number of features of each type in the file.
- Join, Subtract and Group → Group → Group by Column: c4
This tells Galaxy to group the lines by the value in column 4, which is the SO type of the feature.
- Add new operation → Type: Count → Execute
This counts the number of lines that have each type.
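The group-and-count step is roughly equivalent to the following Python sketch, which tallies BED lines by the value in column 4. The function name `count_feature_types` and the sample lines are hypothetical; they are not taken from the MAKER output.

```python
from collections import Counter


def count_feature_types(bed_lines):
    """Count how many BED lines carry each value in column 4 (index 3)."""
    counts = Counter()
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4:  # skip malformed or header lines
            counts[fields[3]] += 1
    return counts


# Made-up sample data for illustration only.
sample = [
    "ctg1\t99\t200\texon\t0\t+",
    "ctg1\t99\t200\tCDS\t0\t+",
    "ctg1\t299\t400\texon\t0\t+",
]
for feature_type, n in sorted(count_feature_types(sample).items()):
    print(feature_type, n)
```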
Anything interesting? Hmmm. We've got one more exon than CDS. I wonder where that is?
3. Get just the Exons and CDSs
Just get the exons:
- Filter and Sort → Filter → Filter: GFF-to-BED on data
The SO type is in column 4 in BED.
- With following condition: c4=='exon' → Execute
Repeat with CDS.
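The same filter, applied once for exons and once for CDSs, can be sketched in Python as below. It mirrors the condition c4=='exon' by comparing column 4 to the requested type. The function name `filter_by_type` and the sample lines are hypothetical.

```python
def filter_by_type(bed_lines, feature_type):
    """Keep only BED lines whose column 4 (index 3) equals feature_type."""
    kept = []
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4 and fields[3] == feature_type:
            kept.append(line)
    return kept


# Made-up sample data for illustration only.
sample = [
    "ctg1\t99\t200\texon\t0\t+",
    "ctg1\t99\t200\tCDS\t0\t+",
    "ctg1\t299\t400\texon\t0\t+",
]
exons = filter_by_type(sample, "exon")
cdss = filter_by_type(sample, "CDS")
print(len(exons), len(cdss))
```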
4. See what is in the exon set that is not in the CDS set
- Operate on Genomic Intervals → Subtract
- Subtract CDSs from Exons → Execute
We have one exon left. Go visualize it in GBrowse or JBrowse.
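As a rough illustration of the subtraction, the Python sketch below keeps every exon that does not overlap any CDS on the same sequence. It assumes the tool is returning whole intervals with no overlap rather than trimmed pieces, and it uses 0-based, half-open (chrom, start, end) tuples as in BED. The function name `subtract_overlapping` and the sample intervals are hypothetical.

```python
def subtract_overlapping(exons, cdss):
    """Return the exons that overlap no CDS on the same sequence.

    Intervals are (chrom, start, end) tuples, 0-based and half-open.
    Assumption: an exon is dropped entirely if any CDS overlaps it.
    """
    kept = []
    for chrom, start, end in exons:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            c_chrom == chrom and c_start < end and start < c_end
            for c_chrom, c_start, c_end in cdss
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Made-up intervals for illustration only.
exons = [("ctg1", 0, 100), ("ctg1", 200, 300)]
cdss = [("ctg1", 50, 100)]
print(subtract_overlapping(exons, cdss))
```

This pairwise comparison is quadratic in the number of features, which is fine for a small annotation but would want sorting or an interval index for a larger one.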