JBrowse 2 Tutorial PAG 2022
This is very much a draft version of the PAG 2022 tutorial, using the JBrowse 1 tutorial as a template.
Prerequisites
JBrowse 2 is available as both a desktop application and a server (web) application. In this tutorial, we will focus on the desktop application to make our lives easier. The server application is also easy to set up and has only a few prerequisites (reminder: you don't need these for this tutorial):
- a web server like Apache or Nginx
- Node.js version 10 or later
That's really it for the server. Other tools that would likely help include GenomeTools for sorting GFF files, SAMtools for working with BAM and CRAM files, and tabix for indexing various file formats.
But, again, none of those things are needed today!
Download and install
While we've installed JBrowse 2 on the conference computers, if you'd like to follow along on your own computer, you can go to https://jbrowse.org/jb2/download/ and get the download for your platform and install it. It shouldn't take very long.
JBrowse Introduction
How and why JBrowse 2 is different from most other web-based genome browsers, including JBrowse and GBrowse.
Replace with current presentation!
Setting up JBrowse
Loading sequence
After installing JBrowse 2, open it using your operating system's preferred method. You'll be greeted with a splash screen that includes this dialog for opening a new sequence:
JBrowse supports sequence data in a variety of forms, including "vanilla" FASTA, but for this example we are going to use a bgzip-compressed, faidx-indexed FASTA file. To load it, we'll use the grape FASTA file and its indexes (i.e., the 'fai' and 'gzi' files):
- https://s3.amazonaws.com/jbrowse.org/genomes/grape/Vvinifera_145_Genoscope.12X.fa.gz
- https://s3.amazonaws.com/jbrowse.org/genomes/grape/Vvinifera_145_Genoscope.12X.fa.gz.fai
- https://s3.amazonaws.com/jbrowse.org/genomes/grape/Vvinifera_145_Genoscope.12X.fa.gz.gzi
In the Open Sequence dialog, give the assembly a name (something creative, like "grape"), select BgzipFastaAdapter from the "type" menu, and then copy and paste the above URLs into the matching text fields under the "type" menu (one each for the FASTA file and its 'fai' and 'gzi' indexes).
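The same three locations can also be written down as configuration rather than typed into the dialog, which may be handy if you later move to the server application. The following is a minimal sketch, not an official recipe: `bgzip_fasta_assembly` is a hypothetical helper, the `trackId` value is illustrative, and the field names (`fastaLocation`, `faiLocation`, `gziLocation`) are our understanding of the JBrowse 2 configuration format and should be checked against the current documentation. It also assumes the index files are named by appending `.fai` and `.gzi` to the FASTA URL, as they are for the grape URLs above.

```python
import json


def bgzip_fasta_assembly(name, fasta_url):
    """Build an assembly entry for a bgzipped, faidx-indexed FASTA file.

    Assumption: the .fai and .gzi indexes sit next to the FASTA file and
    are named by appending the extension to the FASTA URL.
    """
    return {
        "name": name,
        "sequence": {
            "type": "ReferenceSequenceTrack",
            "trackId": f"{name}-ReferenceSequenceTrack",  # illustrative ID
            "adapter": {
                "type": "BgzipFastaAdapter",
                "fastaLocation": {"uri": fasta_url},
                "faiLocation": {"uri": fasta_url + ".fai"},
                "gziLocation": {"uri": fasta_url + ".gzi"},
            },
        },
    }


GRAPE_FASTA = (
    "https://s3.amazonaws.com/jbrowse.org/genomes/grape/"
    "Vvinifera_145_Genoscope.12X.fa.gz"
)

grape = bgzip_fasta_assembly("grape", GRAPE_FASTA)
print(json.dumps(grape, indent=2))
```

Deriving the index URLs from the FASTA URL keeps the three locations from drifting apart when a file is renamed.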
If we were creating a "normal" genome browser, we'd be done adding sequence, but since we'd like to compare genomes, we will also add the bgzipped and indexed FASTA file for peach. When we clicked on the "open sequence" button before, we were presented with a menu asking us what type of view we'd like; before choosing one, we have to add a second genome. What we need is in the Tools menu. Select "Open assembly manager," where you'll get a dialog that is very similar to the one we used for grape. This time, we'll load the peach genome, so do the same things as before, using these URLs:
- https://s3.amazonaws.com/jbrowse.org/genomes/peach/Ppersica_298_v2.0.fa.gz
- https://s3.amazonaws.com/jbrowse.org/genomes/peach/Ppersica_298_v2.0.fa.gz.fai
- https://s3.amazonaws.com/jbrowse.org/genomes/peach/Ppersica_298_v2.0.fa.gz.gzi
After adding the peach genome, we'll get a dialog that shows us that we have both genomes:
Creating a comparative view
Now we'd like to create a comparison view. JBrowse 2 supports a few comparative views, but we'll start with a whole genome dotplot. For showing areas of synteny, we have a PAF file that looks like this:
Pp01 47851208 1388059 1391133 + chr8 22385789 1539799 1542834 703 3099 21 tp:A:P cm:i:73 s1:i:686 s2:i:439 dv:f:0.1377 rl:i:921840
Pp01 47851208 19134590 19135964 - chr15 20304914 6572992 6574378 659 1387 1 tp:A:P cm:i:85 s1:i:657 s2:i:638 dv:f:0.0768 rl:i:921840
Pp01 47851208 19134614 19135805 + chr17 17126926 16801080 16802270 638 1192 0 tp:A:S cm:i:79 s1:i:638 dv:f:0.0727 rl:i:921840
Pp01 47851208 43719774 43728648 - chr18 29360087 6242566 6251482 642 8964 54 tp:A:P cm:i:55 s1:i:620 s2:i:40 dv:f:0.2275 rl:i:921840
Pp01 47851208 40987755 40994103 + chr18 29360087 2664522 2670983 639 6461 51 tp:A:P cm:i:64 s1:i:620 s2:i:77 dv:f:0.1931 rl:i:921840
Pp01 47851208 19134590 19135968 - chr5 25021643 19591018 19592393 572 1379 0 tp:A:S cm:i:69 s1:i:572 dv:f:0.0910 rl:i:921840
PAF (URL) is a fairly simple file format in which each line relates a region of one genome to a region of another, both given in genome coordinates. For more information on doing genome comparisons and generating PAF files, see THIS TUTORIAL. To load the peach-grape PAF, select "DotplotView" from the "Select a view to launch" menu.
In the resulting dialog box, select Peach and then Grape as the assemblies to view. IMPORTANT: order matters here! Because the PAF file lists the peach coordinates first, you have to select peach first in this dialog box. After selecting the two assemblies, copy and paste this URL for the PAF file into the optional PAF URL text field:
https://s3.amazonaws.com/jbrowse.org/genomes/synteny/peach_grape.paf
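To see why the order matters, it can help to pull one PAF line apart. The sketch below is a minimal illustration and not JBrowse's own parser: `PafRecord` and `parse_paf_line` are hypothetical names, and only the twelve mandatory PAF columns (query name, length, start, end; strand; target name, length, start, end; matches; alignment block length; mapping quality) are read. The optional tags such as `tp:A:P` are ignored.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The 12 mandatory columns of one PAF line."""
    query_name: str
    query_length: int
    query_start: int
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_start: int
    target_end: int
    matches: int
    block_length: int
    mapq: int


def parse_paf_line(line):
    """Parse the mandatory PAF columns; optional tags are ignored.

    Real PAF files are tab-delimited; split() with no argument also
    accepts the space-separated sample shown in this tutorial.
    """
    f = line.split()
    if len(f) < 12:
        raise ValueError("a PAF line needs at least 12 columns")
    return PafRecord(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
        f[5], int(f[6]), int(f[7]), int(f[8]),
        int(f[9]), int(f[10]), int(f[11]),
    )


sample = (
    "Pp01 47851208 1388059 1391133 + chr8 22385789 1539799 1542834 "
    "703 3099 21 tp:A:P cm:i:73 s1:i:686 s2:i:439 dv:f:0.1377 rl:i:921840"
)
rec = parse_paf_line(sample)
print("query:", rec.query_name, rec.query_start, rec.query_end)
print("target:", rec.target_name, rec.target_start, rec.target_end)
```

For this sample line the query is the peach chromosome `Pp01` and the target is the grape chromosome `chr8`, which matches the peach-then-grape order selected in the dialog.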
After clicking "Open", you get a dotplot that looks like this:
And of course, this isn't just an image; it is a browsable interface in which you can click and drag to zoom into any region you like, even across multiple chromosomes.
Creating the synteny view
When we click and drag to make a rectangle, we get a popup menu asking whether we want to zoom in or open a synteny view. We can use this functionality to zoom in on a region we are interested in and then when we're happy with the region, we can click on the Open linear syntenic view option.
The resulting syntenic view looks like this:
Adding gene annotations
This is nice: it shows lines or trapezoids of synteny, but it is perhaps not as informative as it could be. The individual genome frames in the synteny view support adding other tracks (though if you add a lot, you had better have a tall monitor), so we can add gene annotations. As it happens, we have gene annotation track data from a JBrowse 1 instance for both peach and grape (originally used for the GBrowse_syn tutorial), so we can add those. Note that this procedure will work for just about any sort of data file that we might want to map onto a genome (BAM, CRAM, BigWig, BigBed, indexed VCF, indexed GFF); JBrowse 2 generally does a pretty good job of guessing what sort of data file you want to add based on its extension.
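Guessing a track type from a file extension can be pictured as a simple suffix lookup. The sketch below is only an illustration of the idea and is not how JBrowse 2 actually implements it: `guess_adapter` and the suffix table are hypothetical, the adapter names are the ones we believe JBrowse 2 uses for these formats, and the example file names are made up.

```python
# Hypothetical suffix table; adapter names are assumed, not authoritative.
ADAPTER_BY_SUFFIX = [
    (".bam", "BamAdapter"),
    (".cram", "CramAdapter"),
    (".bw", "BigWigAdapter"),
    (".bigwig", "BigWigAdapter"),
    (".bb", "BigBedAdapter"),
    (".bigbed", "BigBedAdapter"),
    (".vcf.gz", "VcfTabixAdapter"),
    (".gff3.gz", "Gff3TabixAdapter"),
    (".gff.gz", "Gff3TabixAdapter"),
    ("trackdata.json", "NCListAdapter"),
]


def guess_adapter(location):
    """Return an adapter name for a file name or URL, or None if unknown."""
    lowered = location.lower()
    for suffix, adapter in ADAPTER_BY_SUFFIX:
        if lowered.endswith(suffix):
            return adapter
    return None


for example in ["reads.bam", "calls.vcf.gz", "coverage.bw", "notes.txt"]:
    print(example, "->", guess_adapter(example))
```

A file with an unrecognized extension returns `None` here, which corresponds to the case where you would have to pick the track type yourself.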
First, click either genome's "Open track selector" button; this will cause a new frame to open on the right side of the window, titled "Available tracks." Under that text is a "hamburger menu" icon (three horizontal lines). Click on that to get the "Add track" option.
We'll need the URLs for the JBrowse 1 data, generally referred to as NCList data (for Nested Containment List), the data format commonly used for JBrowse 1 annotation tracks. These are:
- Grape: https://jbrowse.org/genomes/synteny/grape_gene/{refseq}/trackData.json
- Peach: https://jbrowse.org/genomes/synteny/peach_gene/{refseq}/trackData.json
Note that these are only "sort of" URLs: if you click on either of these links, you'll get a 404 (Not Found) message. JBrowse does some magic with the {refseq} part of the URL to substitute in the name of the chromosome.
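The substitution can be sketched in a couple of lines. This is an illustration of the idea rather than JBrowse's actual code: `expand_refseq` is a hypothetical helper, and `Pp01` is a peach chromosome name borrowed from the PAF sample earlier in this tutorial; whether the server directory uses exactly that name is an assumption.

```python
PEACH_TEMPLATE = (
    "https://jbrowse.org/genomes/synteny/peach_gene/{refseq}/trackData.json"
)


def expand_refseq(template, refseq):
    """Replace the {refseq} placeholder with a reference sequence name."""
    # str.replace is used instead of str.format so that any other braces
    # in a URL template are left untouched.
    return template.replace("{refseq}", refseq)


url = expand_refseq(PEACH_TEMPLATE, "Pp01")  # "Pp01" is an assumed name
print(url)
```

The expanded URL, unlike the template, points at a single chromosome's data.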
For whichever genome you are adding annotations to, copy and paste the corresponding URL above into the "Main file" text field. In this instance there is no index field, but other formats (like BAM, CRAM, and VCF) require one. Then click the "Next" button.
JBrowse 2 will correctly guess that you are adding NCList data, so it will already have selected that option in the "Confirm track type" dialog, but one thing we will want to change here is the name of the track. JBrowse typically uses the name of the file to name the track, but having two tracks named "trackData.json" won't be very informative, so change the trackName entry to something useful like "Peach Genes" (unless, of course, you're adding a Grape gene track). Also, double-check that the "assemblyName" entry is what you expect. Now click the "Add" button, and repeat this whole procedure to add a gene track for the other species.
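What the "Add track" dialog collects (a name, an assembly, and a location) corresponds roughly to a track entry in a JBrowse 2 configuration. The sketch below shows what such an entry might look like for the two gene tracks. It is an assumption-laden illustration: `nclist_track` is a hypothetical helper, the `trackId` values are made up, the assembly names assume you called the assemblies "peach" and "grape", and the field names (`assemblyNames`, `rootUrlTemplate`) are our understanding of the configuration format and should be verified against the JBrowse 2 documentation.

```python
import json


def nclist_track(track_id, name, assembly_name, url_template):
    """Build a feature track entry backed by JBrowse 1 NCList data."""
    return {
        "type": "FeatureTrack",
        "trackId": track_id,  # illustrative; must be unique per track
        "name": name,         # the human-readable name shown in the UI
        "assemblyNames": [assembly_name],
        "adapter": {
            "type": "NCListAdapter",
            "rootUrlTemplate": {"uri": url_template},
        },
    }


tracks = [
    nclist_track(
        "peach_genes", "Peach Genes", "peach",
        "https://jbrowse.org/genomes/synteny/peach_gene/{refseq}/trackData.json",
    ),
    nclist_track(
        "grape_genes", "Grape Genes", "grape",
        "https://jbrowse.org/genomes/synteny/grape_gene/{refseq}/trackData.json",
    ),
]
print(json.dumps(tracks, indent=2))
```

Giving each track its own name and assembly here mirrors the two things the dialog asks you to double-check.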
Depending on the zoom level of the synteny view, you will probably get a message saying that the gene tracks will not be loaded unless you either zoom in or force the tracks to load, which may make the application slow.
You can zoom in and out either by using the magnifying glass icons in the upper right of each genome's frame or by clicking and dragging in the genome's "number line," or coordinate, region.
Generally, you can navigate in the synteny view the way you would expect: by clicking and dragging anywhere in a genome's area other than the coordinate region (because clicking and dragging there will trigger the context menu that lets you zoom in). By default, this will cause only the genome that you're interacting with to move. This default can be changed by clicking on the "Toggle linked scrolls" icon in the upper left hand corner of the window (the oval with a line through it). Note that the other two icons next to the linked scroll icon don't actually do anything yet--we are planning implementation for those soon.
Changing the way tracks look
Would like to add simple color changing and then writing simple javascript (maybe to modify mouseover labels)
Using Plugins
Or not--It's not clear to me that there is a suitable plugin I could demo