Difference between revisions of "JBrowse"

<!-- to alter this page, please edit the raw data, which is stored at http://gmod.org/wiki/JBrowse/tool_data -->
{{SessionHead}}
{| class="tutorialheader"
| {{TutorialTitleLine|[[gmod:JBrowse|JBrowse]]}}<br />
[[2011 GMOD Spring Training]]<br />
8-12 March 2011<br />
[[User:MitchSkinner|Mitch Skinner]]
| align="right" | [[Image:GMODAmericas2011Logo.png|200px]]
|}
  
{{ :JBrowse/tool_data | template = Template:ToolDisplay }}
 
  
[[Category:GMOD Components]]
__TOC__
[[Category:AJAX]]
 
[[Category:JBrowse]]
== Prerequisites ==

These have <b>already been set up</b> on the VM image.

Perl:
* [[gmod:BioPerl|BioPerl 1.6]]
* {{CPAN|JSON}}
* {{CPAN|JSON::XS}} (optional, for speed)
* {{CPAN|PerlIO::gzip}}
* {{CPAN|Heap::Simple}}
* {{CPAN|Heap::Simple::XS}}
* {{CPAN|Devel::Size}}


System packages:
* libpng12-0
* libpng12-dev

Optional, for BAM files:
* samtools, and its dependency libncurses5-dev
* Perl module: {{CPAN|Bio::DB::Sam}}

<div class="dont">
And this is how they were installed: <b>(don't do this)</b>
<pre class="dont">
$ sudo apt-get install git-core libpng12-0 libpng12-dev libncurses5-dev
$ cd ~/Documents/Software
$ wget http://sourceforge.net/projects/samtools/files/samtools/0.1.7/samtools-0.1.7a.tar.bz2
$ tar xjf samtools-0.1.7a.tar.bz2
$ cd samtools-0.1.7a/
$ make
$ sudo cpan
cpan[1]> install Bio::DB::Das::Chado Bio::DB::Sam JSON JSON::XS PerlIO::gzip Heap::Simple Heap::Simple::XS Devel::Size
</pre>
</div>

Also: make sure you can copy and paste from the wiki.

Shell tricks:
* Tab completion
* History
* History search
<br><br><br>

== JBrowse Introduction ==

This part of the session covers how and why [[JBrowse]] differs from most other web-based genome browsers, including [[GBrowse]].

More detail: [http://genome.cshlp.org/content/19/9/1630.full paper]
<br><br>

[[Media:JBrowse_GMOD_Meeting_2011.pdf]]

== JBrowse arch ==
[[Image:Jbrowse_arch.png|||600px]]

== Setting up JBrowse ==

=== Getting JBrowse ===

<div class="dont">
* Install git '''(This has already been done in the VMware image.)'''

<pre class="dont">$ sudo apt-get install git-core
</pre>

</div>

* Prepare a directory for JBrowse:

$ <span class="enter">cd /var/www</span>
$ <span class="enter">sudo mkdir jbrowse</span>
$ <span class="enter">sudo chown gmod:gmod jbrowse</span>

* Download it from GitHub:

$ <span class="enter">git clone git://github.com/jbrowse/jbrowse.git jbrowse</span>

(or alternatively)

$ <span class="enter">git clone https://github.com/jbrowse/jbrowse.git jbrowse</span>

=== Starting Point ===

Visit in web browser: http://localhost/jbrowse/


You should see just a blank white page.

=== Basic Steps ===

Setting up a JBrowse instance with feature data takes three basic steps:

# Specify reference sequences
# Load feature data
# Collect feature names
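
As a rough sketch of the order of operations, the three steps map onto the three scripts used in the rest of this session. The Python helper below only builds the command lines and never runs them. The function name <tt>jbrowse_setup_commands</tt> is made up for illustration; the script names and flags are the ones that appear elsewhere on this page.

```python
def jbrowse_setup_commands(conf_path, refseq_name):
    """Return the three setup commands in the order they should be run.

    Hypothetical helper: it only assembles argument lists and does not
    execute anything.
    """
    return [
        # 1. Specify reference sequences (this also sets up the "DNA" track).
        ["bin/prepare-refseqs.pl", "--conf", conf_path, "--refs", refseq_name],
        # 2. Load feature data from the database named in the config file.
        ["bin/biodb-to-json.pl", "--conf", conf_path],
        # 3. Collect feature names so that features can be searched.
        ["bin/generate-names.pl", "-v"],
    ]


if __name__ == "__main__":
    commands = jbrowse_setup_commands(
        "~/Documents/Data/jbrowse/first-config.json", "scf1117875582023"
    )
    for command in commands:
        print(" ".join(command))
```

Keeping the commands in one ordered list makes the dependency visible: names can only be collected after feature data has been loaded, and feature data needs the reference sequences to exist first.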

<!--
=== If you didn't follow along in the chado session ===

We'll be using the chado database from the chado session; if you didn't follow along exactly, re-load the database like so:

<pre>
$ dropdb chado
$ createdb chado
$ bzip2 -cd ~/Documents/Software/schema/chado/complete_db.bz2 | psql chado
</pre>
-->

=== Data from a database ===

Here, we'll use the [[Chado]] adapter; other common database adapters are {{CPAN|Bio::DB::SeqFeature::Store}} and {{CPAN|Bio::DB::GFF}}.

Starting config file:
<tt>~/Documents/Data/jbrowse/first-config.json</tt>
<pre>
{
  "description": "Pythium",
  "db_adaptor": "Bio::DB::Das::Chado",
  "db_args": { "-dsn": "dbi:Pg:dbname=chado",
              "-user": "gmod",
              "-pass": ""},

...
</pre>
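
The <tt>-dsn</tt> value follows the usual Perl DBI layout of <tt>dbi:</tt>, then a driver name, then driver-specific settings. The Python sketch below splits such a string so that the parts are easier to see. It is illustrative only: <tt>parse_dbi_dsn</tt> is a made-up name, and the sketch assumes the settings are <tt>key=value</tt> pairs separated by semicolons, which is the common form for the Pg driver.

```python
def parse_dbi_dsn(dsn):
    """Split a DBI-style data source name into (driver, settings).

    Assumes the form "dbi:Driver:key=value;key=value". Driver attribute
    blocks and other DBI extras are not handled in this sketch.
    """
    scheme, driver, rest = dsn.split(":", 2)
    if scheme.lower() != "dbi":
        raise ValueError("not a DBI data source name: %r" % dsn)
    settings = {}
    for pair in rest.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            settings[key] = value
    return driver, settings


if __name__ == "__main__":
    # The DSN from the config file above: PostgreSQL driver, database "chado".
    print(parse_dbi_dsn("dbi:Pg:dbname=chado"))
```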

==== Specify reference sequences ====

The first script to run is <tt>bin/prepare-refseqs.pl</tt>, which tells JBrowse what your reference sequences are.  Running <tt>bin/prepare-refseqs.pl</tt> also sets up the "DNA" track.

Run this from within the <tt>/var/www/jbrowse</tt> directory (you could run it elsewhere, but you'd have to explicitly specify the location of the data directory on the command line).

$ <span class="enter">cd /var/www/jbrowse</span>
$ <span class="enter">bin/prepare-refseqs.pl --conf ~/Documents/Data/jbrowse/first-config.json \
    --refs scf1117875582023</span>

Visit in web browser: you should now see the JBrowse UI (and if you zoom all the way in, some sequence).

==== Load Feature Data ====

Next, we'll use <tt>biodb-to-json.pl</tt> to get feature data out of the database and turn it into [[gmod:Glossary#JSON|JSON]] data that the web browser can use.

Add a basic track definition; this will tell <tt>biodb-to-json.pl</tt> what features to put into the track, and how the track should look:

<javascript>...

  "TRACK DEFAULTS": {
    "class": "feature"
  },

  "tracks": [
    {
      "track": "gene",
      "key": "Gene",
      "feature": ["gene"],
      "autocomplete": "all",
      "class": "feature2",
      "urlTemplate": "http://www.google.com/search?q={name}"
    }
  ]
}</javascript>
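
The config above sets <tt>"class"</tt> twice: once under <tt>"TRACK DEFAULTS"</tt> and once on the gene track. The Python sketch below shows one plausible reading, in which each track starts from the defaults and its own settings win. This page does not spell out how <tt>biodb-to-json.pl</tt> merges the two, so treat the merge rule as an assumption; <tt>effective_tracks</tt> and the second track (<tt>hypothetical_track</tt>) are invented for the example.

```python
import json

# A complete, valid JSON document modelled on the fragment above.
# The second track is hypothetical and exists only to show a default applying.
CONFIG_TEXT = """
{
  "TRACK DEFAULTS": {"class": "feature"},
  "tracks": [
    {"track": "gene", "key": "Gene", "feature": ["gene"],
     "autocomplete": "all", "class": "feature2",
     "urlTemplate": "http://www.google.com/search?q={name}"},
    {"track": "hypothetical_track", "key": "No class set", "feature": ["match"]}
  ]
}
"""


def effective_tracks(config):
    """Return each track with the defaults filled in (assumed merge rule).

    Settings given on the track itself override the defaults.
    """
    defaults = config.get("TRACK DEFAULTS", {})
    return [{**defaults, **track} for track in config.get("tracks", [])]


if __name__ == "__main__":
    for track in effective_tracks(json.loads(CONFIG_TEXT)):
        print(track["track"], "->", track["class"])
```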

<tt>track</tt> specifies the track identifier (a unique name for the track, for the software to use).  This should contain only letters, numbers, <tt>-</tt>, and <tt>_</tt>; other characters make things less convenient.

<tt>key</tt> specifies a human-friendly name for the track, which can use any characters you want.

<tt>feature</tt> gives a list of feature types to include in the track.

<tt>autocomplete</tt>, when included, makes the features in the track searchable.

<tt>urlTemplate</tt> specifies a URL pattern that you can use to link genomic features to specific web pages.

<tt>class</tt> specifies the [[gmod:Glossary#CSS|CSS]] class that describes how the feature should look.  The classes are specified in the <tt>genome.css</tt> file:
$ <span class="enter">less genome.css</span>
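
Two of the settings described above are easy to illustrate in a few lines of Python. The first function checks that a track identifier uses only letters, numbers, <tt>-</tt> and <tt>_</tt>. The second fills a <tt>{name}</tt> placeholder in a URL pattern from a feature's attributes. Both function names are made up for this sketch, and the substitution and escaping rules are an assumption; JBrowse's own handling of URL templates may differ.

```python
import re
from urllib.parse import quote

# Letters, numbers, "-" and "_" only, as recommended for track identifiers.
SAFE_LABEL = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_track_label(label):
    """True if the label contains only the recommended characters."""
    return SAFE_LABEL.fullmatch(label) is not None


def expand_url_template(template, feature):
    """Replace {field} placeholders with URL-escaped feature attributes.

    Placeholders that have no matching attribute are left unchanged.
    """
    def substitute(match):
        field = match.group(1)
        if field not in feature:
            return match.group(0)
        return quote(str(feature[field]), safe="")

    return re.sub(r"\{(\w+)\}", substitute, template)


if __name__ == "__main__":
    print(is_safe_track_label("gene"))        # identifier used for the gene track
    print(is_safe_track_label("GFF match"))   # contains a space
    print(expand_url_template(
        "http://www.google.com/search?q={name}",
        {"name": "maker-scf1117875582023-snap-gene-0.3"},
    ))
```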

For this particular track, I've specified the <tt>"feature2"</tt> class, which looks like this in the CSS file:

<javascript>.plus-feature2,
.minus-feature2 {
    position:absolute;
    height: 15px;
    background-repeat: repeat-x;
    cursor: pointer;
    min-width: 1px;
    z-index: 10;
}

.plus-feature2 { background-image: url('img/plus-herringbone16.png'); }

.minus-feature2 { background-image: url('img/minus-herringbone16.png'); }</javascript>
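
Note that the config says <tt>"feature2"</tt> while the selectors are <tt>.plus-feature2</tt> and <tt>.minus-feature2</tt>. Judging from those selectors, the browser appears to add a strand prefix to the configured class name. The Python sketch below spells out that naming rule. It is an inference from the CSS shown here rather than a description of JBrowse's code, and <tt>css_class_for</tt> is a made-up name.

```python
def css_class_for(track_class, strand):
    """Build the CSS class name for a feature on the given strand.

    strand is +1 for the forward strand and -1 for the reverse strand.
    The "plus-" / "minus-" prefixes are taken from the selectors in
    genome.css; how unstranded features are handled is not covered here.
    """
    if strand == 1:
        return "plus-" + track_class
    if strand == -1:
        return "minus-" + track_class
    raise ValueError("strand must be +1 or -1, got %r" % (strand,))


if __name__ == "__main__":
    print(css_class_for("feature2", 1))
    print(css_class_for("feature2", -1))
```

Under this reading, a new look for a track needs a pair of rules in <tt>genome.css</tt>, one per strand, sharing the same base name.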

Run the <tt>bin/biodb-to-json.pl</tt> script with this config file to set up this track:

$ <span class="enter">bin/biodb-to-json.pl --conf ~/Documents/Data/jbrowse/first-config.json</span>

(visit in web browser: you should see a new gene track)

==== More complex track ====

Now we'll add a second track; this one will have subfeatures.  This snippet is from:
<tt>~/Documents/Data/jbrowse/second-config.json</tt>

<javascript>...

    {
      "track": "match",
      "key": "Matches",
      "feature": ["match"],
      "autocomplete": "all",
      "subfeatures": true,
      "class": "generic_parent",
      "subfeature_classes": {
          "match_part": "match_part"
      },
      "clientConfig": {
          "subfeatureScale": 20
      }
    }

...</javascript>

$ <span class="enter">bin/biodb-to-json.pl --conf ~/Documents/Data/jbrowse/second-config.json</span>

(visit in web browser: you should see a new track, which has subfeatures if you're zoomed in far enough)

==== Collect feature names ====

When you generate JSON for a track with <tt>"autocomplete"</tt> specified, a listing of all of the names/IDs from that track (along with the locations of the corresponding features) is also generated.

The <tt>bin/generate-names.pl</tt> script collects those lists of names from all the tracks and combines them into one big tree that the client uses to search.

$ <span class="enter">bin/generate-names.pl -v</span>

Visit in web browser, search for feature name: e.g.,

: '''maker-scf1117875582023-snap-gene-0.3'''
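
To give a feel for why a tree is a good fit for name search, here is a minimal prefix tree (trie) in Python. It is only a sketch of the general idea: the class name <tt>NameTrie</tt>, the in-memory layout and the coordinates in the usage example are all made up, and the files that <tt>generate-names.pl</tt> actually writes are not described on this page.

```python
class NameTrie:
    """Toy prefix tree mapping feature names to locations."""

    def __init__(self):
        self.root = {}

    def add(self, name, location):
        """Store a name (case-insensitively) together with its location."""
        node = self.root
        for char in name.lower():
            node = node.setdefault(char, {})
        # The empty-string key marks "a name ends here".
        node.setdefault("", []).append((name, location))

    def complete(self, prefix, limit=10):
        """Return up to `limit` (name, location) pairs starting with prefix."""
        node = self.root
        for char in prefix.lower():
            if char not in node:
                return []
            node = node[char]
        results = []
        stack = [node]
        while stack and len(results) < limit:
            current = stack.pop()
            for key, child in sorted(current.items(), reverse=True):
                if key == "":
                    results.extend(child)
                else:
                    stack.append(child)
        return results[:limit]


if __name__ == "__main__":
    trie = NameTrie()
    # Coordinates below are invented for the example.
    trie.add("maker-scf1117875582023-snap-gene-0.3", ("scf1117875582023", 100, 200))
    trie.add("maker-scf1117875582023-snap-gene-0.4", ("scf1117875582023", 300, 400))
    for name, location in trie.complete("maker-scf"):
        print(name, location)
```

Looking up a prefix only walks as many nodes as the prefix has characters, so the client can suggest matches while the user is still typing.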

=== Data from flat files ===

We're going to recreate a JBrowse instance from a different data source: flat files.

First, wipe the slate clean by removing the <tt>data</tt> directory:

$ <span class="enter">rm -r data</span>

If you visit your JBrowse instance in a web browser, you'll see a blank screen again.

==== Sequences ====

To import sequence data from a FASTA file into a JBrowse instance, use <tt>prepare-refseqs.pl</tt> with the <tt>--fasta</tt> argument:

$ <span class="enter">bin/prepare-refseqs.pl --fasta ~/Documents/Data/jbrowse/scf1117875582023.fasta</span>

Visit in web browser; you should see a second reference sequence.
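
Before running <tt>prepare-refseqs.pl</tt> it can help to know which sequences a FASTA file contains. The Python sketch below lists each sequence name and its length. It is a generic FASTA reader written for this example (the function name <tt>fasta_lengths</tt> is made up) and is not part of JBrowse.

```python
def fasta_lengths(lines):
    """Return {sequence name: length} for FASTA-formatted lines.

    The name is the first word after ">" on each header line; everything
    up to the next header counts toward that sequence's length.
    """
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


if __name__ == "__main__":
    # Tiny made-up example; pass an open file object to read a real file.
    example = [">seq1 an example sequence", "ACGTACGT", "ACGT", ">seq2", "GGCC"]
    for name, length in fasta_lengths(example).items():
        print(name, length)
```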

==== Features ====

To get feature data from flat files into JBrowse, use <tt>flatfile-to-json.pl</tt>.  We'll use some more of the data from the [[MAKER]] session:

$ <span class="enter">bin/flatfile-to-json.pl \
    --gff /home/gmod/Documents/Data/maker/example2_pyu/finished.maker.output/gff/scf1117875582023.gff \
    --type match --getSubs --tracklabel "gff_match" --key "GFF match" \
    --cssclass generic_parent --subfeatureClasses '{"match_part": "generic_part_a"}'</span>

Visit in web browser; you should see a new "GFF match" track.
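
The <tt>--subfeatureClasses</tt> argument is a small JSON object passed on the command line, so both the JSON syntax and the shell quoting have to be right. If you generate such commands from a script, a sketch like the following can take care of both. The function name is made up; the flags are the ones used in the command above.

```python
import json
import shlex


def flatfile_to_json_command(gff_path, feature_type, track_label, key,
                             css_class, subfeature_classes):
    """Build a shell-safe flatfile-to-json.pl command line as one string.

    subfeature_classes is a dict; it is serialized with json.dumps so the
    result is always valid JSON, then quoted for the shell.
    """
    args = [
        "bin/flatfile-to-json.pl",
        "--gff", gff_path,
        "--type", feature_type,
        "--getSubs",
        "--tracklabel", track_label,
        "--key", key,
        "--cssclass", css_class,
        "--subfeatureClasses", json.dumps(subfeature_classes),
    ]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(flatfile_to_json_command(
        "/home/gmod/Documents/Data/maker/example2_pyu/finished.maker.output/gff/scf1117875582023.gff",
        "match", "gff_match", "GFF match", "generic_parent",
        {"match_part": "generic_part_a"},
    ))
```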

==== BAM data ====

To incorporate data from a BAM source:

$ <span class="enter">bin/bam-to-json.pl \
    --bam ~/Documents/Data/jbrowse/simulated-sorted.bam \
    --tracklabel BAM_data --key "BAM Data"</span>

=== Quantitative data ===

JBrowse can also display quantitative data in the wiggle format.  JBrowse processes wiggle files with a C++ program, which you have to compile:

$ <span class="enter">make</span>

Now you can process the wiggle file:

$ <span class="enter">bin/wig-to-json.pl --wig ~/Documents/Data/jbrowse/pyu.wig \
    --tracklabel "coverage_wig" --key "Wiggle Coverage" --min 0 --max 50</span>

Visit in web browser; you should see a new "Wiggle Coverage" track.
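
For readers who have not seen a wiggle file, the Python sketch below reads the <tt>fixedStep</tt> flavour of the format and scales each value to a bar height. It assumes that <tt>--min</tt> and <tt>--max</tt> set the bottom and top of the plotted range and that values outside the range are clipped; this page does not say exactly how JBrowse's own C++ program treats them. The function names and the sample data are made up.

```python
def parse_fixed_step(lines):
    """Yield (chrom, position, value) tuples from fixedStep wiggle lines."""
    chrom = None
    position = None
    step = 1
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith("fixedStep"):
            fields = dict(field.split("=", 1) for field in line.split()[1:])
            chrom = fields["chrom"]
            position = int(fields["start"])
            step = int(fields.get("step", 1))
        elif line.startswith("variableStep"):
            raise ValueError("variableStep is not handled in this sketch")
        else:
            if chrom is None:
                raise ValueError("data line before any fixedStep declaration")
            yield chrom, position, float(line)
            position += step


def scale_to_height(value, vmin, vmax, height):
    """Clip value to [vmin, vmax] and map it onto 0..height pixels."""
    clipped = min(max(value, vmin), vmax)
    return round((clipped - vmin) / (vmax - vmin) * height)


if __name__ == "__main__":
    sample = ["fixedStep chrom=example_seq start=1 step=10", "5", "25", "75"]
    for chrom, position, value in parse_fixed_step(sample):
        print(chrom, position, value, scale_to_height(value, 0, 50, 100))
```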

<br><br>

== Common Problems ==

* JSON syntax errors
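
A quick way to track down a JSON syntax error is to parse the config file with any strict JSON parser before handing it to the JBrowse scripts. The Python sketch below uses the standard <tt>json</tt> module and reports the line and column of the first problem. The function name <tt>check_json</tt> is made up, and the snippets on this page that contain <tt>...</tt> are abbreviated and will not parse as shown; run the check on the full config file.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return "line %d, column %d: %s" % (err.lineno, err.colno, err.msg)
    return None


if __name__ == "__main__":
    import sys

    # Usage: python check_json.py path/to/config.json
    for path in sys.argv[1:]:
        with open(path) as handle:
            problem = check_json(handle.read())
        print(path, "OK" if problem is None else problem)
```

Typical culprits are a trailing comma after the last item in a list or object, a missing comma between two track definitions, and single quotes used in place of double quotes.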

<br><br><br>

== Other links ==

* Config file ref: http://jbrowse.org/code/jbrowse-master/docs/config.html
* DIV test: http://jbrowse.org/test/boatdiv/boat.html


<!--
== Advanced JBrowse ==

* Feature CSS

== How it works ==
* HTML elements
* JSON
* NCLists
** named vs. positional data formats ?
* Static files
** No server-side code needed for browsing
*** Easier deployment
*** Easy parallel installation
** HTTP caching
*** Traditional: cut down on transfer by only downloading region of interest
*** Complement: cut down on transfer by only downloading when there's a change
**** Underutilized
**** HTTP servers handle this for you automatically with static files
-->

== Evaluation ==

{{Feedback}}

{{NextSession|MAKER|MAKER}}

Revision as of 18:38, 9 March 2011

