Difference between revisions of "JBrowse2 Tutorial PAG 2022"

From GMOD
(Creating trix index)
(Replaced content with "Moved to JBrowse2_Tutorial_PAG_2023 ")
 
Line 1: Line 1:
== Prerequisites ==
+
Moved to [[JBrowse2_Tutorial_PAG_2023]]
 
+
* NodeJS
+
Installed using the instructions on [https://nodejs.org/en/download/package-manager/#debian-and-ubuntu-based-linux-distributions Nodejs.org]:
+
 
+
<span class="enter">
+
  curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash - && sudo apt-get install -y nodejs
+
</span>
+
* A web server (Apache2 in this instance, but any will do). I enabled the "userdir" mod so we could all use the same machine for the tutorial:
+
 
+
<span class="enter">
+
  sudo a2enmod userdir
+
  sudo /etc/init.d/apache2 restart
+
</span>
+
 
+
===Things done just for this tutorial===
+
 
+
* A script to create several users with <code>public_html</code> directories (link for when it exists)
+
* Already installed the JBrowse command line interface (CLI) via the [https://jbrowse.org/jb2/docs/quickstart_cli/ directions] (i.e., <code>sudo npm install -g @jbrowse/cli</code>)
+
* Installed bgzip, tabix, samtools and minimap2 via apt: <code>sudo apt-get install samtools tabix minimap2</code>.
+
* Created bgzipped and samtools faidx-indexed FASTA files for ''C. elegans'' and ''C. brenneri''.
+
* Created a "Genes only" ''C. elegans'' GFF file (<code>gzip -dc c_elegans.PRJNA13758.WS286.annotations.gff3.gz | grep -P "\tWormBase\t" > c_elegans.genes.gff3</code>)
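
The "Genes only" filter in the last bullet can also be sketched with awk, which compares the source column (column 2) exactly rather than matching a pattern. The three sample records are made up for illustration; only the <code>WormBase</code> source value comes from the command above.

```shell
# Made-up sample records (tab-separated); real input would be the WormBase GFF3.
sample_gff=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  I WormBase gene 100 200 . + . ID=Gene:g1 \
  I BLAT_EST_BEST match 150 180 . + . ID=m1 \
  II WormBase mRNA 300 400 . - . ID=Transcript:t1)

# Keep only records whose source column is exactly "WormBase".
genes_only=$(printf '%s\n' "$sample_gff" | awk -F'\t' '$2 == "WormBase"')

printf '%s\n' "$genes_only"
```

On the real file, the same awk expression could take the place of the grep step in the pipeline.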
+
 
+
==Initializing JBrowse==
+
 
+
First, use ssh to connect to the instance we have set up for this tutorial, tutorialpag30.jbrowse.org. Do this with the user name and password you got from one of us (we have 50 users configured--hopefully that will be enough!):
+
 
+
<span class="enter">
+
  ssh username@tutorialpag30.jbrowse.org
+
</span>
+
 
+
and supply the password. When you log in, you'll be in your user's home directory, where there is nothing but a public_html directory. That directory is also currently empty, so we'll use the JBrowse CLI to initialize a new JBrowse instance:
+
 
+
<span class="enter">
+
  jbrowse create public_html
+
</span>
+
Now change to that directory (<code>cd public_html</code>) and list the files to make sure it looks right:
+
 
+
+
[[File:Public_html_listing.png]]
+
 
+
This is all of the software required to run JBrowse. If you now navigate to the tutorial machine's website, using the port supplied on your username/password slip, you should see a page indicating that JBrowse was installed but not configured: http://tutorialpag30.jbrowse.org:XXXX/.
+
 
+
+
[[File:new_jbrowse_page.png]]
+
 
+
To make sure it really works, we can click on the ''Volvox'' (not really Volvox) data set.
+
 
+
To get started creating our JBrowse instance, we'll run the JBrowse admin-server, which looks just like JBrowse proper, but has an extra admin menu. '''Important note''': The admin server is NOT meant to be left running; it is not particularly secure, so if you leave it up, somebody might start messing with your site.  To start the admin server, we change to the directory where JBrowse will be served from (<code>public_html</code>) and run the <code>jbrowse</code> command to start it:
+
 
+
<span class="enter">
+
  jbrowse admin-server -p YYYY
+
</span>
+
 
+
When we execute that command, a message in the terminal reports that the server has started and lists some URLs for accessing it. It will look something like this:
+
 
+
[[File:Admin-server.png|1000px]]
+
 
+
The part we need is the adminKey. In a browser window, enter a URL that looks like this: http://tutorialpag30.jbrowse.org:YYYY?adminKey=yourkey
+
 
+
==Adding a reference sequence==
+
 
+
The first thing we need to do is add a reference sequence. One is already prepared for ''C. elegans'' and available on the web server at:
+
 
+
  http://tutorialpag30.jbrowse.org/c_elegans.PRJNA13758.WS286.genomic.fa.gz
+
  http://tutorialpag30.jbrowse.org/c_elegans.PRJNA13758.WS286.genomic.fa.gz.fai
+
  http://tutorialpag30.jbrowse.org/c_elegans.PRJNA13758.WS286.genomic.fa.gz.gzi
+
 
+
To create this indexed reference sequence, the FASTA was downloaded from the WormBase FTP site, uncompressed, bgzipped, and then indexed with samtools:
+
 
+
<span class="enter">
+
  bgzip c_elegans.PRJNA13758.WS286.genomic.fa
+
  samtools faidx c_elegans.PRJNA13758.WS286.genomic.fa.gz
+
</span>
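
Indexing a bgzipped FASTA with <code>samtools faidx</code> leaves a <code>.fai</code> and a <code>.gzi</code> file next to it, matching the three URLs listed above. A minimal sketch for confirming that all three are present before using them; the function name and the placeholder files are hypothetical and exist only for this demonstration:

```shell
# Report whether a bgzipped FASTA has both index files next to it.
check_indexed_fasta() {
  fasta=$1
  status=0
  for f in "$fasta" "$fasta.fai" "$fasta.gzi"; do
    if [ -s "$f" ]; then
      echo "ok       $f"
    else
      echo "missing  $f"
      status=1
    fi
  done
  return "$status"
}

# Demonstration with placeholder files in a scratch directory (no .gzi yet).
demo_dir=$(mktemp -d)
printf 'x' > "$demo_dir/genome.fa.gz"
printf 'x' > "$demo_dir/genome.fa.gz.fai"
check_indexed_fasta "$demo_dir/genome.fa.gz" || echo "index incomplete"
```

Because the <code>.gzi</code> placeholder is deliberately left out, the demonstration reports one missing file.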
+
 
+
To add this as a reference sequence to JBrowse, click "Start a new session" and then, on the resulting page, select "Open assembly manager" from the Admin menu. In the dialog that opens, click the "Add new assembly" button. Finally, in the add assembly dialog, put something useful in the "Assembly Name" field and then select "BgzipFastaAdapter" from the "Type" menu. At that point, the dialog will change slightly to give you places to put in the above three URLs:
+
 
+
[[File:add_assembly_dialog.png|500px]]
+
 
+
Copy and paste those URLs into the appropriate fields and then click "Save new assembly."
+
 
+
<pre class="dont">
+
  Note: this is one place where the web version of JBrowse with the admin server is slightly
+
  different from the Desktop version: if we were using the desktop version, the above dialog
+
  would have also given the option for finding the files on a local hard drive rather than
+
  only allowing URLs.
+
 
+
  Another note: In order for the above URLs to work with a web instance of JBrowse that
+
  isn't on the "same" server (where different ports == a different server), CORS (cross
+
  origin resource sharing) had to be enabled for the web server (in this case apache).
+
  If you want to do the same thing for a server you control, google "enable CORS <your
+
  server software name>" to find directions.
+
</pre>
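
One way to see whether a server has CORS enabled is to look for an <code>Access-Control-Allow-Origin</code> header in its responses. The sketch below only inspects header text on standard input; the function name is hypothetical, and canned headers stand in for a real response.

```shell
# Succeeds if the HTTP response headers on stdin include Access-Control-Allow-Origin.
has_cors_header() {
  grep -i -q '^access-control-allow-origin:'
}

# Canned headers stand in for a real server response here.
printf 'HTTP/1.1 200 OK\r\nAccess-Control-Allow-Origin: *\r\n\r\n' | has_cors_header \
  && echo "CORS header present"
```

Against a live server, the headers could be fetched with <code>curl -sI</code> and piped into the same function.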
+
 
+
==Adding a gene track from tabix-indexed GFF==
+
 
+
Magic incantation for sorting a GFF3 file and then bgzipping it:
+
 
+
<span class="enter">
+
  sort -t"`printf '\t'`" -k1,1 -k4,4n c_elegans.genes.gff3 | bgzip > c_elegans.genes.sorted.gff3.gz
+
</span>
+
 
+
and then tabix indexing it:
+
 
+
  tabix c_elegans.genes.sorted.gff3.gz
+
 
+
  http://tutorialpag30.jbrowse.org/c_elegans.genes.sorted.gff3.gz
+
  http://tutorialpag30.jbrowse.org/c_elegans.genes.sorted.gff3.gz.tbi
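
The sort step orders features by sequence name (column 1) and then numerically by start coordinate (column 4), which is the order tabix needs. The sketch below shows a common variant that also keeps <code>#</code> header lines at the top. This is an assumption about input that still has headers; the genes file above was filtered and may have none. The sample file is made up, and bgzip and tabix are left out so the ordering can be inspected directly.

```shell
# Made-up, deliberately unsorted GFF3 (tab-separated) for illustration.
demo_gff=$(mktemp)
{
  printf '##gff-version 3\n'
  printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    II WormBase gene 500 900 . + . ID=g3 \
    I WormBase gene 2000 2500 . - . ID=g2 \
    I WormBase gene 100 400 . + . ID=g1
} > "$demo_gff"

# Header lines first, then features ordered by sequence name and numeric start.
tab=$(printf '\t')
sorted_gff=$( (grep '^#' "$demo_gff"; grep -v '^#' "$demo_gff" | sort -t"$tab" -k1,1 -k4,4n) )

printf '%s\n' "$sorted_gff"
```

For real data, the sorted stream would be piped to <code>bgzip</code> and the result indexed with <code>tabix -p gff</code>.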
+
 
+
[[File:add_track_dialog.png|400px]]
+
 
+
 
+
[[File:genes_track.png|800px]]
+
 
+
==Adding a gene track from a JBrowse (NCList) track==
+
 
+
Protein coding genes from WormBase's JBrowse 1 instance
+
 
+
  https://s3.amazonaws.com/agrjbrowse/MOD-jbrowses/WormBase/WS286/c_elegans_PRJNA13758/tracks/Curated Genes (protein coding)/{refseq}/trackData.jsonz
+
 
+
[[File:protein_coding_genes.png|800px]]
+
 
+
====Side note: finding JBrowse 1 data====
+
 
+
CORS
+
 
+
Difference between web and desktop
+
 
+
==Adding variant data from a tabix-indexed VCF==
+
 
+
  https://storage.googleapis.com/elegansvariation.org/releases/current/WI.current.soft-filtered.vcf.gz
+
 
+
[[File:cendr_vcf_track.png|800px]]
+
 
+
==Adding quantitative data from a BigWig==
+
 
+
  https://data.broadinstitute.org/compbio1/PhyloCSFtracks/ce11/latest/PhyloCSF+1.bw
+
 
+
==Using JEXL to modify the display==
+
 
+
===Dynamically changing the color===
+
 
+
[[File:jexl_change_glyph_color.png|700px]]
+
 
+
===Dynamically changing the mouseover text===
+
 
+
[[File:jexl_change_mouseover.png|700px]]
+
 
+
==Synteny==
+
 
+
 
+
===Getting the data===
+
 
+
To compare two genomes, first we need a second genome. Fortunately, WormBase.org provides several assemblies for species related to ''C. elegans''. For this tutorial, we'll use ''C. brenneri''. As before, we create a new assembly in JBrowse with the indexed fasta files provided on the tutorial machine (Admin menu -> open assembly manager):
+
 
+
  http://tutorialpag30.jbrowse.org/c_brenneri.PRJNA20035.WS287.genomic.fa.gz
+
  http://tutorialpag30.jbrowse.org/c_brenneri.PRJNA20035.WS287.genomic.fa.gz.fai
+
  http://tutorialpag30.jbrowse.org/c_brenneri.PRJNA20035.WS287.genomic.fa.gz.gzi
+
 
+
<span class="enter">
+
  minimap2 c_elegans.PRJNA13758.WS286.genomic.fa.gz c_brenneri.PRJNA20035.WS287.genomic.fa.gz > c_elegans.c_brenneri.paf
+
</span>
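
minimap2 writes its alignments in PAF, a tab-separated format with 12 mandatory columns. Column 1 is the query name, column 6 the target name, and column 11 the alignment block length. With the command above, ''C. elegans'' is the target and ''C. brenneri'' the query. A small sketch for sanity-checking a PAF file by totalling aligned length per target sequence; the two records and the contig names are made up:

```shell
# Two made-up PAF records (12 mandatory tab-separated columns each).
sample_paf=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  Cbre_Contig1 50000 100 1100 + I 15072434 2000 3000 800 1000 60 \
  Cbre_Contig2 40000 500 2500 - I 15072434 9000 11000 1500 2000 60)

# Sum the alignment block length (column 11) per target sequence (column 6).
per_target=$(printf '%s\n' "$sample_paf" |
  awk -F'\t' '{ len[$6] += $11 } END { for (t in len) print t "\t" len[t] }')

printf '%s\n' "$per_target"
```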
+
 
+
===Configuring with jbrowse admin===
+
 
+
  http://tutorialpag30.jbrowse.org/c_elegans.c_brenneri.paf
+
 
+
 
+
 
+
[[File:dotplot_config.png|800px]]
+
 
+
 
+
[[File:elegans_brenneri_dotplot.png|1000px]]
+
 
+
===Using dotplot and synteny views===
+
 
+
[[File:elegans_brenneri_synteny.png|1200px]]
+
 
+
 
+
[[File:synteny_horizontal_flip.png|800px]]
+
 
+
[[File:open_synteny_from_lgv.png|800px]]
+
 
+
==Adding text search indexes==
+
 
+
===Creating trix index===
+
 
+
The <code>jbrowse</code> CLI provides a tool to create text indexes for many of the data sources we used, such as tabix-indexed files. Note that it does not index JBrowse 1 (NCList) data; we'll do that below. We can create a searchable index for the Genes track we created, but we should exclude the VCF because it's very big and indexing it wouldn't help our users. To do that, we first have to find the <code>trackId</code> of the Genes track. Click on the ... after the name of the Genes track, select "About track", and copy the value of config.trackId. It will look something like <code>blah blah blah</code>. Now, on the command line in the public_html directory, run the command
+
 
+
  jbrowse text-index --tracks=<genes trackId>
+
 
+
'''WARNING:''' if you don't exclude the VCF file, it will take a very long time to run, as it will fetch the large VCF file from Google and index it.
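
As an alternative to the "About track" dialog, the track IDs can be read from the command line. This sketch assumes the instance keeps its configuration in a <code>config.json</code> file in the <code>public_html</code> directory. The fragment and both track IDs are made up, and the text matching is crude compared with a real JSON parser.

```shell
# Made-up config.json fragment standing in for the real file in public_html.
demo_config=$(mktemp)
cat > "$demo_config" <<'EOF'
{
  "tracks": [
    { "type": "FeatureTrack", "trackId": "genes_example", "name": "Genes" },
    { "type": "VariantTrack", "trackId": "vcf_example", "name": "Variants" }
  ]
}
EOF

# Pull out every trackId value; crude text matching, not a JSON parser.
track_ids=$(grep -o '"trackId": *"[^"]*"' "$demo_config" | sed 's/.*"\([^"]*\)"$/\1/')
printf '%s\n' "$track_ids"

# Print (not run) the indexing command for the hypothetical Genes track only.
echo "jbrowse text-index --tracks=genes_example"
```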
+
 
+
===Adding a JBrowse 1 name index===
+

Latest revision as of 18:40, 30 November 2022

Moved to JBrowse2_Tutorial_PAG_2023