JBrowse Configuration Guide
At the most basic level, setting up JBrowse consists of:
- Placing a copy of the JBrowse directory somewhere in the web-servable part of your server's file system (often /var/www by default)
- Running the JBrowse setup script to install a few server-side dependencies
- Running one or more server-side scripts to create a directory containing a JBrowse-formatted copy of your data.
Both the JBrowse code and these data files must be in a location where the web server can serve them to users. Then, a user pointing their web browser at the appropriate URL for the index.html file in the JBrowse directory will see the JBrowse interface, including sequence and feature tracks reflecting the data source.
Reference sequence data should be added first (using prepare-refseqs.pl), followed by annotation data. Once all of the annotation data has been added, use generate-names.pl to make the feature names searchable.
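The ordering described above can be sketched as a small Node.js helper that assembles the command lines in the required sequence. This is only an illustration: the `buildCommands` helper, the file names (`genome.fasta`, `genes.gff3`), and the track label are hypothetical, and only the script names and the `--fasta`, `--gff`, and `--tracklabel` options are taken from this guide.

```javascript
// Hypothetical helper: assembles JBrowse setup commands in the order the guide
// requires (reference sequences, then annotation tracks, then name indexing).
function buildCommands(fastaFile, gffTracks) {
  var commands = ['bin/prepare-refseqs.pl --fasta ' + fastaFile];
  gffTracks.forEach(function (t) {
    commands.push('bin/flatfile-to-json.pl --gff ' + t.file + ' --tracklabel ' + t.label);
  });
  // generate-names.pl runs last so that every loaded track is indexed.
  commands.push('bin/generate-names.pl');
  return commands;
}

// Example input; the file names and label are placeholders.
var commands = buildCommands('genome.fasta', [{ file: 'genes.gff3', label: 'Genes' }]);
console.log(commands.join('\n'));
```

The resulting lines can be pasted into a shell from the JBrowse directory, one after another.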
Contents
- 1 User Interface
- 2 Reference Sequences
- 3 Feature Tracks
- 4 Wiggle Tracks
- 5 Image Tracks
- 6 Name Searching and Autocompletion
- 7 Removing Tracks
- 8 Compressing data on the server
- 9 URL Control
- 10 Data Export
- 11 Faceted Track Selection
- 12 Anonymous Usage Statistics
- 13 Advanced Topics
- 14 See also
- 15 External Links
User Interface
Reference Sequences
The reference sequences are the sequences whose annotations the browser will view, and which therefore provide a co-ordinate system for all other tracks. At a close enough zoom level, the sequence data itself is visible as a special track; this track is hidden once the individual sequence characters become too small to distinguish.
The exact interpretation of "reference sequence" will depend on how you are using JBrowse; but for a model organism genome database, each reference sequence would typically represent a chromosome (in a perfect assembly) or at least a contig. Before any feature or image tracks can be input to JBrowse, the reference sequence must be taken into consideration. This is handled by the prepare-refseqs.pl script.
prepare-refseqs.pl
This script is used to input sequence data into JBrowse, and must be run prior to the addition of feature tracks or image tracks. The simplest way to use it is with the --fasta option, which uses a single sequence or set of reference sequences from a FASTA file:
bin/prepare-refseqs.pl --fasta <fasta file> [options]
If the file has multiple sequences (e.g. multiple chromosomes), each sequence will become a reference sequence by default. You may switch between these sequences by selecting the sequence of interest via the pull-down menu to the right of the large "zoom in" button.
You may use any alphabet you wish for your sequences (i.e., you are not restricted to the nucleotides A, T, C, and G; any alphanumeric character, as well as several other characters, may be used). Hence, it is possible to browse RNA and protein in addition to DNA. However, some characters should be avoided, because they will cause the sequence to "split": part of the sequence will be cut off and continue on the next line. These characters are the hyphen and question mark. Unfortunately, this prevents the use of hyphens to represent gaps in a reference sequence.
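Before running prepare-refseqs.pl, it can be useful to check a sequence for the two characters mentioned above. The following is a minimal sketch of such a check; the `findSplitCharacters` helper is hypothetical and is not part of JBrowse.

```javascript
// Characters the guide says will cause a sequence to "split" in the display.
var SPLIT_CHARS = /[-?]/g;

// Returns the 0-based positions of hyphens and question marks in a sequence.
function findSplitCharacters(sequence) {
  var positions = [];
  var match;
  SPLIT_CHARS.lastIndex = 0; // reset the global regex before each scan
  while ((match = SPLIT_CHARS.exec(sequence)) !== null) {
    positions.push(match.index);
  }
  return positions;
}

console.log(findSplitCharacters('ACGT-ACG?T')); // positions 4 and 8
console.log(findSplitCharacters('ACGU'));       // empty list: nothing to fix
```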
In addition to reading from a fasta file, prepare-refseqs.pl can read sequences from a gff file or a database. In order to read fasta sequences from a database, a config file must be used.
Syntax used to import sequences from gff files:
bin/prepare-refseqs.pl --gff <gff file with sequence information> [options]
Syntax used to import sequences with a config file:
bin/prepare-refseqs.pl --conf <config file that references a database with sequence information> --[refs|refids] <reference sequences> [options]
Option | Value |
---|---|
fasta, gff, or conf | Path to the file that JBrowse will use to import sequences. With the fasta and gff options, the sequence information is imported directly from the specified file. With the conf option, the specified config file includes the details necessary to access a database that contains the sequence information. Exactly one of these three options must be used. |
out | A path to the output directory (default is 'data' in the current directory) |
seqdir | The directory where the reference sequences are stored (default: <output directory>/seq) |
noseq | Causes no reference sequence track to be created. This is useful for reducing disk usage. |
refs | A comma-delimited list of the names of sequences to be imported as reference sequences. This option (or refids) is required when using the conf option. It is not required when the fasta or gff options are used, but it can be useful with these options, since it can be used to select which sequences JBrowse will import. |
refids | A comma-delimited list of the database identifiers of sequences to be imported as reference sequences. This option is useful when working with a Chado database that contains data from multiple different species, and those species have at least one chromosome with the same name (e.g. chrX). In this case, the desired chromosome cannot be uniquely identified by name, so it is instead identified by ID. This ID can be found in the 'feature_id' column of the 'feature' table in a Chado database. |
Feature Tracks
Feature tracks can be used to visualize localized annotations on a sequence, such as gene models, transcript alignments, SNPs and so forth. JBrowse has several different methods of importing annotation data into feature tracks:
- flatfile-to-json.pl - import GFF3 and BED files (recommended for new users)
- biodb-to-json.pl - import from a Bio::DB::SeqFeature::Store database (recommended for users with existing databases)
- bam-to-json.pl - import BAM files
- ucsc-to-json.pl - import UCSC database dumps (.sql and .txt.gz)
Data Formatting
flatfile-to-json.pl
This script inputs a single track into JBrowse. To format multiple tracks for JBrowse, execute the script once for each track.
Terminology: A flat file is a database that exists entirely in a single file. For this script, the flat file must be a GFF3, GFF2, or BED file.
Basic syntax:
bin/flatfile-to-json.pl --[gff|gff2|bed] <flat file> --tracklabel <track name> [options]
Hint: flatfile-to-json.pl simplifies the process of inputting a small number of tracks into JBrowse, since it does not use a config file. If you have many tracks, you will probably want to use a config file, because its structure will make the task of editing tracks easier. In that case, the appropriate script will be biodb-to-json.pl.
Option | Value |
---|---|
gff, gff2, or bed | The name of the file that contains the feature data. The names of these options correspond to the file types, with the exception of gff, which uses a GFF3 file instead of a GFF file. Exactly one of these three options must be used. |
tracklabel | The internal name that JBrowse will give to this feature track. This option requires a value. |
key | The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel. |
autocomplete | Dictates what the features of the track will be searchable by after running generate-names.pl. This option can be used with the arguments "label", "alias", "all", or "none". By default, "none" is used. |
out | A path to the output directory (default is 'data' in the current directory). |
cssClass | The css class that will be used to create the feature track. This option makes it possible to choose how the feature track will look by selecting a template class from genome.css. The default css class is 'feature'. |
getType | Causes the 'type' to be included in the output JSON file. The type is the feature that has been predicted (e.g. promoter site, gene). If a gff file is being used, the type will be in column 3. |
getPhase | Causes the 'phase' to be included in the output JSON file. The phase describes the reading frame of a DNA (or messenger RNA) sequence. If the phase is relevant, it can have the values 0, 1, or 2; otherwise, the value associated with the phase is '.'. If a gff file is being used, the phase will be in column 8. |
getSubs | Causes subfeature data to be included in the output JSON file. |
getLabel | Causes the 'Name' attribute associated with each feature to be included in the output JSON file. This will cause a textual name to appear below the features in the track. If a gff3 file is being used, the 'Name' attribute will be in column 9 when it is defined. |
urlTemplate | A url that your browser will visit when you click on a feature in this track. This is especially useful if you want to link a feature to a page with more information about that feature. |
arrowheadClass | When this option is used, directional features will be given an arrowhead. The presence and orientation of the arrowhead for each individual feature will depend on data in the input file. Arrowhead classes are defined in genome.css. There is only one that comes with JBrowse (transcript-arrowhead). |
subfeatureClasses | The css class(es) that will be used for the subfeatures of a feature track. This option makes it possible to choose how the subfeatures will appear. Any of the classes in genome.css can be used for the subfeatures. This option must be used with getSubs in order for subfeatures to appear. |
clientConfig | Any visual additions or edits for the main features of the track (not for subfeatures). These edits must be specified in JSON syntax. |
type | The type of feature that will appear in the feature track. This option is useful when the input file contains features of several different types, and you are interested in only having one type of feature (e.g. only having features that are genes) in the feature track. In gff3 files, the type is in the third column. |
extraData | Use additional information from the input file to create variations in the appearance or behavior of individual features. This option is meant to be used in conjunction with other options. For each feature in the track, a perl subroutine is used to extract additional information, which is then associated with a variable. The value of this variable can be different for each feature. When the name of this variable is surrounded by curly braces and used in the argument for a different option, such as urlTemplate, the feature-specific data is used. |
nclChunk | The NCList chunk size. This option should not be used unless an error such as "json or perl structure exceeds maximum nesting level" is encountered. If this error does occur, lower the chunk size (the default is 50000). |
bam-to-json.pl
This script inputs a track into JBrowse using a BAM file. Tracks added with this script are similar in appearance to tracks added by flatfile-to-json.pl.
Special dependencies: SAMtools, Bio::DB::SAM
Basic syntax:
bin/bam-to-json.pl --bam <bam file> --tracklabel <track name> [options]
Option | Value |
---|---|
bam | The name of the bam file that contains the feature data. This option requires a value. |
tracklabel | The internal name that JBrowse will give to this feature track. This option requires a value. |
key | The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel. |
out | A path to the output directory (default is 'data' in the current directory). |
cssClass | The css class that will be used to create the feature track. This option makes it possible to choose how the feature track will look by selecting a template class from genome.css. The default css class is 'feature'. |
clientConfig | Any visual additions or edits for the main features of the track (not for subfeatures). These edits must be specified in JSON syntax. |
nclChunk | The NCList chunk size in bytes. This option should not be used unless an error such as "json or perl structure exceeds maximum nesting level" is encountered. If this error does occur, lower the chunk size (the default is 50000 bytes). |
compress | This option causes the output JSON files for the track (trackData.json and hist-*.json) to be compressed with gzip. |
biodb-to-json.pl
This script uses a config file to produce a set of feature tracks in JBrowse. It can be used to obtain information from any database with an appropriate schema, or from flat files. Because it can produce several feature tracks in a single execution, it is useful for large-scale feature data entry into JBrowse.
Basic syntax:
bin/biodb-to-json.pl --conf <config file> [options]
Option | Value |
---|---|
conf | The name of the JSON configuration file that will be used. This option must be specified. |
out | A path to the output directory (default is 'data' in the current directory). |
track | The identifier of a single track that will be updated or added to JBrowse. In the list of key-value pairs comprising an individual track definition in the config file, the identifier will be the value associated with "track". |
ref | A comma-delimited list of reference sequence names, used to limit database queries to a subset of JBrowse reference sequences. By default, the database is queried for all reference sequences in JBrowse. |
refid | A comma-delimited list of reference sequence IDs from a Chado database, used to limit database queries to a subset of JBrowse reference sequences. By default, the database is queried for all reference sequences in JBrowse. |
compress | This option causes the output JSON files for the track (trackData.json and hist-*.json) to be compressed with gzip. |
ucsc-to-json.pl
This script uses data from the UCSC genome annotation database. To reach this data, go to hgdownload.cse.ucsc.edu and click the link for the genome of interest. Next, click the "Annotation Database" link. The data relevant to ucsc-to-json.pl (*.sql and *.txt.gz files) can be downloaded from either this page or the FTP server described on this page.
Together, a *.sql and *.txt.gz pair of files (such as cytoBandIdeo.txt.gz and cytoBandIdeo.sql) constitute a database table. The ucsc-to-json.pl script uses the *.sql file to get the column labels, and it uses the *.txt.gz file to get the data for each row of the table. For the example pair of files above, the name of the database table is "cytoBandIdeo". This will become the name of the JBrowse track that is produced from the data in the table.
In addition to all of the feature-containing tables that you want to use as JBrowse tracks, you will also need to download the trackDb.sql and trackDb.txt.gz files for the organism of interest.
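As a rough aid for checking a download directory, the sketch below lists the table names that have both halves of the pair and confirms that trackDb is among them. The `completeTables` helper and the example file list are hypothetical; only the `.sql` and `.txt.gz` naming convention comes from the description above.

```javascript
// Hypothetical helper: given the file names in a UCSC download directory,
// list the tables that have both a .sql and a .txt.gz file.
function completeTables(fileNames) {
  var seen = {};
  fileNames.forEach(function (name) {
    var m = /^(.+)\.(sql|txt\.gz)$/.exec(name);
    if (!m) { return; } // ignore unrelated files
    seen[m[1]] = seen[m[1]] || {};
    seen[m[1]][m[2]] = true;
  });
  return Object.keys(seen).filter(function (table) {
    return seen[table]['sql'] && seen[table]['txt.gz'];
  }).sort();
}

// Illustrative file list; knownGene is missing its .txt.gz half.
var files = ['cytoBandIdeo.sql', 'cytoBandIdeo.txt.gz',
             'trackDb.sql', 'trackDb.txt.gz', 'knownGene.sql'];
var tables = completeTables(files);
console.log(tables);
console.log('trackDb present:', tables.indexOf('trackDb') !== -1);
```

Each name in the resulting list is a candidate value for the track option.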
Basic syntax:
bin/ucsc-to-json.pl --in <directory with files from UCSC> --track <database table name> [options]
Hint: If you're using this approach, it might be convenient to also download the sequence(s) from UCSC. These are usually available from the "Data set by chromosome" link for the particular genome or from the FTP server.
Option | Value |
---|---|
in | A directory containing all of the *.sql and *.txt.gz data from UCSC. This directory must contain the trackDb.sql and trackDb.txt.gz files for the organism of interest, as well as all of the feature-containing tables that you wish to use as JBrowse tracks. |
track | The name of the database table. If you leave off the .sql or .txt.gz extensions of the table files you wish to use, you will have this value. |
out | A path to the output directory (default is 'data' in the current directory). |
cssClass | The css class that will be used to create the feature track. This option makes it possible to choose how the feature track will look by selecting a template class from genome.css. The default css class is 'feature'. |
arrowheadClass | When this option is used, directional features will be given an arrowhead. The presence and orientation of the arrowhead for each individual feature will depend on data in the input file. Arrowhead classes are defined in genome.css. There is only one that comes with JBrowse (transcript-arrowhead). |
subfeatureClasses | The css class(es) that will be used for the subfeatures of a feature track. This option makes it possible to choose how the subfeatures will appear. Any of the classes in genome.css can be used for the subfeatures. |
clientConfig | Any visual additions or edits for the main features of the track (not for subfeatures). These edits must be specified in JSON syntax. |
nclChunk | The NCList chunk size in bytes. This option should not be used unless an error such as "json or perl structure exceeds maximum nesting level" is encountered. If this error does occur, lower the chunk size (the default is 50000 bytes). |
compress | This option causes some of the output JSON files (trackData.json and hist-*.json) to be compressed with gzip. |
sortMem | The maximum amount of RAM (in bytes) to use for sorting the features. The default value is 536870912 bytes (512MiB). |
Configuration Options
Left-click Behavior
Beginning with JBrowse 1.5.0, the left-clicking behavior of feature tracks is highly configurable. To make something happen when left-clicking features on a track, add an onClick option to the feature track's configuration. In the example configuration below, left-clicks on features will open an embedded popup window showing the results of searching for that feature's name in NCBI's global search, and "search at NCBI" will show in a tooltip when the user hovers over a feature with the mouse:
{
   "feature" : [ "mRNA" ],
   "track" : "ReadingFrame",
   "category" : "Genes",
   "class" : "dblhelix",
   "key" : "Frame usage",
   "onClick" : {
      "label": "search at NCBI",
      "url": "http://www.ncbi.nlm.nih.gov/gquery/?term={name}"
   }
}
For details on all the options supported by onClick, see #Click Configuration Options.
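The `{name}` placeholder in the url above is filled in from the clicked feature's data. As an illustration of the idea only (this is a hypothetical sketch, not JBrowse's actual implementation, and the `fillTemplate` helper and the example feature are assumptions), the substitution can be pictured like this:

```javascript
// Hypothetical sketch of {field} interpolation; not JBrowse's actual code.
// `fields` stands in for a feature's data (name, start, end, ...).
function fillTemplate(template, fields) {
  return template.replace(/\{(\w+)\}/g, function (placeholder, key) {
    // Leave unknown placeholders untouched so mistakes are easy to spot.
    return Object.prototype.hasOwnProperty.call(fields, key)
      ? String(fields[key])
      : placeholder;
  });
}

var url = fillTemplate(
  'http://www.ncbi.nlm.nih.gov/gquery/?term={name}',
  { name: 'geneA', start: 100, end: 200 } // example feature data
);
console.log(url);
```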
Right-click Context Menus
In addition, feature tracks can be configured to display a menu of options when a user right-clicks a feature. Here is an example of a track configured with a multi-level right-click context menu:
{
   "feature" : [ "match" ],
   "track" : "Alignments",
   "category" : "Alignments",
   "class" : "feature4",
   "key" : "Example alignments",
   "hooks": {
      "modify": "function( track, feature, div ) { div.style.height = (Math.random()*10+8)+'px'; div.style.backgroundColor = ['green','blue','red','orange','purple'][Math.floor(Math.random()*5)];}"
   },
   "menuTemplate" : [
      {
         "label" : "Item with submenu",  # hello this is a comment
         "children" : [
            {
               "label" : "Check gene on databases",
               "children" : [
                  {
                     "label" : "Query trin for {name}",
                     "iconClass" : "dijitIconBookmark",
                     "action": "newWindow",
                     "url" : "http://wiki.trin.org.au/{name}-{start}-{end}"
                  },
                  {
                     "label" : "Query example.com for {name}",
                     "iconClass" : "dijitIconSearch",
                     "url" : "http://example.com/{name}-{start}-{end}"
                  }
               ]
            },
            { "label" : "2nd child of demo" },
            { "label" : "3rd child: this is a track" }
         ]
      },
      {
         "label" : "Open example.com in an iframe popup",
         "title" : "The magnificent example.com (feature {name})",
         "iconClass" : "dijitIconDatabase",
         "action": "iframeDialog",
         "url" : "http://www.example.com?featurename={name}"
      },
      {
         "label" : "Open popup with XHR HTML snippet (btw this is feature {name})",
         "title": "function(track,feature,div) { return 'Random XHR HTML '+Math.random()+' title!'; }",
         "iconClass" : "dijitIconDatabase",
         "action": "xhrDialog",
         "url" : "sample_data/test_snippet.html?featurename={name}:{start}-{end}"
      },
      {
         "label" : "Open popup with content snippet (btw this is feature {name})",
         "title": "function(track,feature,div) { return 'Random content snippet '+Math.random()+' title!'; }",
         "iconClass" : "dijitIconDatabase",
         "action": "contentDialog",
         "content" : "function(track,feature,div) { return '<h2>{name}</h2><p>This is some test content about feature {name}!</p><p>This message brought to you by the number <span style=\"font-size: 300%\">'+Math.round(Math.random()*100)+'</span>.</p>'; }"
      },
      {
         "label" : "function(track,feature,div) { return 'Run a JS callback '+Math.random()+' title!'; }",
         "iconClass" : "dijitIconDatabase",
         "action": "function( evt ){ alert('Hi there! Ran the callback on feature '+this.feature.get('name')); }"
      }
   ]
}
This configuration results in a multi-level context menu. For details on what each of the options supported by menu items does, see #Click Configuration Options.
Click Configuration Options
A click action can be a JavaScript callback given as a string, like:
"function() { alert('Run any JavaScript you want here!'); }"
Or a structure containing options like:
{
   "iconClass" : "dijitIconDatabase",
   "action" : "iframeDialog",
   "url" : "http://www.ncbi.nlm.nih.gov/gquery/?term={name}",
   "label" : "Search for {name} at NCBI",
   "title" : "function(track,feature,div) { return 'Searching for '+feature.get('name')+' at NCBI'; }"
}
The available options for a click action are:
- iconClass
- Used only for click actions in context menus. Usually, you will want to specify one of the Dijit icon classes here. Although they are not well documented, a list of available icon classes can be seen at https://github.com/dojo/dijit/blob/1.7.2/icons/commonIcons.css.
- action
- Either a JavaScript function to run in response to the click (e.g. "function(){..}"), or one of the following special strings:
  - "iframeDialog" - the default - causes the given url to be opened in a popup dialog box within JBrowse, in an iframe element.
  - "newWindow" - causes the given url to be opened in a new browser window.
  - "contentDialog" - causes the string or JavaScript callback set in the content option to be displayed in the dialog box.
  - "xhrDialog" - causes a popup dialog to be opened containing the HTML fetched from the given url option.
  The difference between "iframeDialog" and "xhrDialog" is that an iframeDialog's URL should point to a complete web page, while an xhrDialog's URL should point to a URL on the same server (or one that supports CORS) that contains just a snippet of HTML (not a complete web page). For those familiar with GBrowse, the xhrDialog is similar to GBrowse popup balloons that use a url:... target, while the contentDialog is similar to a GBrowse popup balloon with a normal target. GBrowse does not have an equivalent of the iframeDialog.
- content
- string (interpolated with feature fields) or JS callback that returns a string. Used only by a contentDialog.
- url
- URL used by newWindow, xhrDialog, or iframeDialog actions.
- label
- descriptive label for the link. In a right-click context menu, this will be the text in the menu item.
- title
- title used for the popup window, or for the mouseover link
Using callbacks to customize feature tracks
JBrowse feature tracks, and individual JBrowse features, can be customized using JavaScript functions you write yourself. These functions are called every time a feature in a track is drawn, and allow you to customize virtually anything about the feature's display. What's more, all of the feature's data is accessible to your customization function, so you can even customize individual features' looks based on their data.
As of JBrowse 1.3.0, feature callbacks are added by directly editing your trackList.json file with a text editor. Unfortunately, due to the limitations of the JSON format currently used for JBrowse configuration, the function must appear as a quoted (and JSON-escaped) string, on a single line. This will be improved in JBrowse 2.0.
Here is an example feature callback, in context in the trackList.json file, that changes a feature's background CSS property (which controls the feature's color) as a function of the feature's name. If the feature's name contains a number that is odd, it gives the feature's HTML div element a red background. Otherwise, it gives it a blue background.
{
   "autocomplete" : "all",
   "track" : "ExampleFeatures",
   "style" : { "className" : "feature2" },
   "key" : "Example Features",
   "feature" : [ "remark" ],
   "urlTemplate" : "tracks/ExampleFeatures/{refseq}/trackData.json",
   "hooks": {
      "modify": "function( track, f, fdiv ) { var nums = f.get('name').match(/\\d+/); if( nums && nums[0] % 2 ) { fdiv.style.background = 'red'; } else { fdiv.style.background = 'blue'; } }"
   },
   "compress" : 0,
   "label" : "ExampleFeatures",
   "type" : "FeatureTrack"
},
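Because the callback must be stored as a quoted, JSON-escaped, single-line string, it can be easier to write it as an ordinary function first and let JavaScript do the escaping. The sketch below shows one possible way to do that in Node.js or a browser console; the `colorByName` variable name is hypothetical, and whether a given JBrowse version accepts the escaped newlines that this produces is an assumption worth testing on your own installation.

```javascript
// Write the callback as an ordinary function so it can be read and tested.
var colorByName = function (track, f, fdiv) {
  var nums = f.get('name').match(/\d+/);
  fdiv.style.background = (nums && nums[0] % 2) ? 'red' : 'blue';
};

// JSON.stringify escapes quotes, backslashes, and newlines, producing a
// single-line JSON string literal that can be pasted as the "modify" value.
var hookJson = JSON.stringify(colorByName.toString());
console.log(hookJson);
```

Note how the regular expression's `\d` becomes `\\d` in the output, matching the escaping used in the configuration above.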
Wiggle Tracks
Introduced in JBrowse 1.5.0, Wiggle tracks require that the user's browser support HTML <canvas> elements. Currently, Wiggle tracks are only capable of displaying data from BigWig files, and BigWig support requires a recent web browser with support for HTML5 typed arrays.
Example BigWig-based Wiggle XY-Plot Track Configuration
Here is an example track configuration stanza for a Wiggle track displaying data directly from a BigWig file. Note that the URL in urlTemplate is relative to the directory where the configuration file is located.
{
   "label" : "rnaseq",
   "key" : "RNA-Seq Coverage",
   "storeClass" : "BigWig",
   "urlTemplate" : "../tests/data/SL2.40_all_rna_seq.v1.bigwig",
   "type" : "Wiggle",
   "variance_band" : true,
   "min_score": -1000,
   "max_score": 2000,
   "style": {
      "pos_color": "#FFA600",
      "neg_color": "#005EFF",
      "clip_marker_color": "red",
      "height": 100
   }
}
Note: numerical values do not appear in quotes.
Wiggle track configuration options
Option | Value |
---|---|
scale | linear or log, default linear. Graphing scale, either linear or logarithmic. |
min_score | Number. The minimum value to be graphed. Calculated according to autoscale if not provided. |
max_score | Number. The maximum value to be graphed. Calculated according to autoscale if not provided. |
autoscale | global, z_score, or clipped_global. If one or both of the min_score and max_score options are absent, the missing values will be calculated automatically. The autoscale option controls how the calculation is done. A value of global will use chromosome-wide statistics for the entire wiggle or dense file to find min and max values. z_score will use either ±z_score_bound if it is set, or will use ±4 otherwise. clipped_global is similar to global, except the bounds will be limited to ±z_score_bound, or ±4 if z_score_bound is not set. |
variance_band | 1 or 0. If 1, draw a yellow line showing the mean, and two shaded bands showing ±1 and ±2 standard deviations from the mean. |
z_score_bound | For z-score based graphs, the bounds to use. |
data_offset | Number, default zero. If set, will offset the data display by the given value. For example, a data_offset of -100 would make a data value of 123 be displayed as 23, and a data_offset of 100 would make 123 be displayed as 223. |
bicolor_pivot | "mean", "zero", or a numeric value. Where to change from pos_color to neg_color when drawing bicolor plots. |
style→pos_color | CSS color, default "blue". When drawing bicolor plots, the fill color to use for values that are above the pivot point. |
style→neg_color | CSS color, default "red". When drawing bicolor plots, the fill color to use for values that are below the pivot point. |
disable_clip_markers | Boolean, default false. If true, disables clip markers, which are 2-pixel colored regions at the edge of the graph that indicate when the data value lies outside the displayed range. |
style→clip_marker_color | CSS color, defaults to neg_color when in the positive bicolor regime (see bicolor_pivot) and pos_color in the negative bicolor regime. |
style→height | Height, in pixels, of the track. Defaults to 100 for XYPlot tracks, and 32 for Density tracks. |
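To make two of the options in the table above more concrete, here is a minimal sketch of how data_offset and bicolor_pivot could be applied to a single data value. The `applyOffset` and `pickColor` helpers are hypothetical and are not JBrowse's drawing code; treating an absent bicolor_pivot as "zero", and coloring a value that sits exactly on the pivot with pos_color, are assumptions made for the example.

```javascript
// Hypothetical helpers illustrating two Wiggle options; not JBrowse's code.

// data_offset: shift a data value before display (default offset is zero).
function applyOffset(value, config) {
  return value + (config.data_offset || 0);
}

// bicolor_pivot: pick pos_color above the pivot and neg_color below it.
// `mean` is only consulted when the pivot is the string "mean".
// Assumptions: a missing pivot means "zero"; a value equal to the pivot
// is drawn with pos_color.
function pickColor(value, config, mean) {
  var style = config.style || {};
  var pivot = config.bicolor_pivot;
  if (pivot === undefined || pivot === 'zero') {
    pivot = 0;
  } else if (pivot === 'mean') {
    pivot = mean;
  }
  return value >= pivot ? (style.pos_color || 'blue') : (style.neg_color || 'red');
}

var cfg = { style: { pos_color: '#FFA600', neg_color: '#005EFF' } };
console.log(applyOffset(123, { data_offset: -100 })); // 23, as in the table
console.log(applyOffset(123, { data_offset: 100 }));  // 223, as in the table
console.log(pickColor(5, cfg), pickColor(-5, cfg));
```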
Image Tracks
In addition to HTML-based feature tracks, JBrowse supports tracks based on pre-generated PNG or JPEG images that are tiled along the reference sequence. Currently, JBrowse ships with two different image track generators: wig-to-json.pl, which generates images showing simple quantitative (wiggle) data, and draw-basepair-track.pl, which draws arcs to show the base pairing structure of RNAs.
wig-to-json.pl
Using a wiggle file, this script creates a single Image track that displays data from the wiggle file. Beginning with JBrowse 1.5, this is no longer the recommended method of displaying wiggle data: it has largely been replaced by the direct-access BigWig data store coupled with the next-generation Wiggle track type. See Wiggle Tracks.
In wiggle data, a numeric value is associated with each nucleotide position in the reference sequence. This is represented in JBrowse as a track that looks like a histogram, where the horizontal axis is for each nucleotide position, and the vertical axis is for the number associated with that position. The vertical axis currently does not have a scale; rather, the heights for each position are relative to each other.
Special dependencies: libpng
In order to use wig-to-json.pl, the code for wig2png must be compiled. Normally, this is done automatically by setup.sh, but it can be done manually if necessary. See the Quick Start Tutorial packaged with JBrowse for details.
Basic usage
bin/wig-to-json.pl --wig <wig file> --tracklabel <track name> [options]
Hint: If you are using this type of track to plot a measure of a prediction's quality, where the range of possible quality scores is from some lower bound to some upper bound (for instance, between 0 and 1), you can specify these bounds with the max and min options.
Option | Value |
---|---|
wig | The name of the wig file that will be used. This option must be specified. |
tracklabel | The internal name that JBrowse will give to this feature track. This option requires a value. |
key | The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel. |
out | A path to the output directory (default is 'data' in the current directory). |
tile | The directory where the tiles, or images corresponding to each zoom level of the track, are stored. Defaults to data/tiles. |
bgcolor | The color of the track background. Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "255,255,255". |
fgcolor | The color of the track foreground (i.e. the vertical bars of the wiggle track). Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "105,155,111". |
width | The width in pixels of each tile. The default value is 2000. |
height | The height in pixels of each tile. Changing this parameter will cause a corresponding change in the top-to-bottom height of the track in JBrowse. The default value is 100. |
min | The lower bound to use for the track. By default, this is the lowest value in the wiggle file. |
max | The upper bound to use for the track. By default, this is the highest value in the wiggle file. |
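The bgcolor and fgcolor options take colors as "RED,GREEN,BLUE" decimal triples rather than the hex notation common in CSS. If your colors are in hex form, a small conversion like the following sketch may help; the `hexToRgbOption` helper is hypothetical and is not part of JBrowse.

```javascript
// Convert a 6-digit CSS hex color (e.g. "#699b6f") to the "RED,GREEN,BLUE"
// decimal form described for the bgcolor and fgcolor options.
function hexToRgbOption(hex) {
  var m = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) {
    throw new Error('Expected a 6-digit hex color, got: ' + hex);
  }
  // m[1..3] hold the red, green, and blue hex pairs.
  return m.slice(1).map(function (pair) {
    return parseInt(pair, 16);
  }).join(',');
}

console.log(hexToRgbOption('#699b6f')); // "105,155,111", the default fgcolor
console.log(hexToRgbOption('#ffffff')); // "255,255,255", the default bgcolor
```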
draw-basepair-track.pl
This script inputs a single base pairing track into JBrowse. A base pairing track is a distinctive track type that represents base pairing between nucleotides as arcs. In addition, it is intended to demonstrate the Perl API for writing your own image track generators.
Basic usage
bin/draw-basepair-track.pl --gff <gff file> --tracklabel <track name> [options]
Option | Value |
---|---|
gff | The name of the gff file that will be used. This option must be specified. |
tracklabel | The internal name that JBrowse will give to this feature track. This option requires a value. |
key | The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel. |
out | A path to the output directory (default is 'data' in the current directory). |
tile | The directory where the tiles, or images corresponding to each zoom level of the track, are stored. Defaults to data/tiles. |
bgcolor | The color of the track background. Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "255,255,255". |
fgcolor | The color of the track foreground (i.e. the base pairing arcs). Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "0,255,0". |
width | The width in pixels of each tile. The default value is 2000. |
height | The height in pixels of each tile. Changing this parameter will cause a corresponding change in the top-to-bottom height of the track in JBrowse. The default value is 100. |
thickness | The thickness of the base pairing arcs in the track. The default value is 2. |
nolinks | Disables use of file system links to compress duplicate image files. |
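The bgcolor and fgcolor values use a simple "RED,GREEN,BLUE" convention. The following is a minimal sketch of how such a string could be validated before being passed to the script; the helper name parse_rgb is hypothetical and is not part of JBrowse.

```python
def parse_rgb(value):
    """Parse a "RED,GREEN,BLUE" string into a tuple of three ints in 0..255."""
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated numbers, got {value!r}")
    channels = tuple(int(part.strip()) for part in parts)
    for channel in channels:
        if not 0 <= channel <= 255:
            raise ValueError(f"channel {channel} is outside 0..255")
    return channels


# The documented defaults: white background, green arcs.
default_bg = parse_rgb("255,255,255")
default_fg = parse_rgb("0,255,0")
```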
Name Searching and Autocompletion
The JBrowse search box auto-completes the names of features and reference sequences that are typed into it. After loading all feature and reference sequence data into a JBrowse instance (with prepare-refseqs.pl, flatfile-to-json.pl, etc.), generate-names.pl must be run to build the indexes used for name searching and autocompletion.
Autocompletion Configuration
Several settings are available to customize the behavior of autocompletion. Most users will not need to configure any of these variables.
Option | Value |
---|---|
autocomplete→stopPrefixes | Array of word-prefixes for which autocomplete will be disabled. For example, a value of ['foo'] will prevent autocompletion when the user has typed 'f', 'fo', or 'foo', but autocompletion will resume when the user types any additional characters. |
autocomplete→resultLimit | Maximum number of autocompletion results to display. Defaults to 15. |
autocomplete→tooManyMatchesMessage | Message displayed in the autocompletion dropdown when there are more than autocomplete→resultLimit matches. Defaults to 'too many matches to display'. |
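The stopPrefixes rule can be read as "autocompletion stays off while the typed text is still a prefix of a stop word". Below is a rough Python sketch of that rule only; it is not the JBrowse implementation, the function name is hypothetical, and the case-sensitive comparison is an assumption.

```python
def autocomplete_enabled(typed, stop_prefixes):
    """Return False while `typed` is a non-empty prefix of any stop prefix."""
    if not typed:
        return False  # nothing typed yet, so there is nothing to complete
    # Assumption: comparison is case-sensitive.
    return not any(stop.startswith(typed) for stop in stop_prefixes)


stop = ["foo"]
states = {text: autocomplete_enabled(text, stop) for text in ["f", "fo", "foo", "foob", "bar"]}
```

With a stop list of ['foo'], the entries for 'f', 'fo', and 'foo' are disabled, while 'foob' and 'bar' are enabled, matching the description in the table.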
generate-names.pl
This script builds indexes of features by label (the visible name below a feature in JBrowse) and/or by alias (a secondary name that is not visible in the web browser, but may be present in the JSON used by JBrowse).
To search for a term, type it in the autocompleting text box at the top of the JBrowse window.
Basic syntax:
bin/generate-names.pl [options]
Note that generate-names.pl does not require any arguments. However, some options are available:
Option | Value |
---|---|
out | A path to the output directory (default is 'data/' in the current directory). |
thresh | A lower-bound on the Patricia trie chunk size. Specifically, the lowest possible chunk size is (thresh + 1). The default value is 200. In this context, a chunk is a group of connected Patricia trie nodes that can be visualized as a single entity, and the chunk size is the total number of genomic features contained in a chunk. The lower the value of thresh, the more chunks there will be. |
verbose | This setting causes information about the division of nodes into chunks to be printed to the screen. |
Removing Tracks
JBrowse has a script to remove individual tracks: remove-track.pl. Run it with the --help option to see a comprehensive usage message:
bin/remove-track.pl --help
Compressing data on the server
Starting with JBrowse 1.3.0, server-side data-formatting scripts support a --compress option to compress (gzip) feature and sequence data to conserve server disk space and reduce server CPU load even further. Using this option requires some additional web server configuration.
- For Apache
  - AllowOverride FileInfo (or AllowOverride All) must be set for the JBrowse data directories in order to use the .htaccess files generated by the formatting scripts.
  - mod_headers must be installed and enabled, and if the web server is using mod_gzip or mod_deflate, mod_setenvif must also be installed and enabled.
- For nginx
  - A configuration snippet like the following should be included in the configuration:
location ~* "\.(json|txt)z$" {
    add_header Content-Encoding gzip;
    gzip off;
    types { application/json jsonz; }
}
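To illustrate why the server must send a Content-Encoding header, the sketch below mimics both sides in Python: a file stored gzipped on disk, and a client that decompresses it before parsing. The file names and payload are made up for illustration; the regular expression is the same one used in the nginx snippet above.

```python
import gzip
import json
import re

# Same pattern as the nginx location block; re.IGNORECASE mirrors "~*".
COMPRESSED_NAME = re.compile(r"\.(json|txt)z$", re.IGNORECASE)


def needs_gzip_header(filename):
    """True for names the server should label with Content-Encoding: gzip."""
    return COMPRESSED_NAME.search(filename) is not None


payload = {"featureCount": 2}  # hypothetical file content

# What sits on disk after formatting with --compress: gzipped JSON bytes.
stored = gzip.compress(json.dumps(payload).encode("utf-8"))

# What a browser effectively does once it sees Content-Encoding: gzip.
received = json.loads(gzip.decompress(stored).decode("utf-8"))
```

Without the header, the client would try to parse the raw gzip bytes as JSON and fail, which is why the extra web server configuration is needed.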
URL Control
JBrowse provides a number of options for changing the current view in the browser by adding query parameters to the URL, some of which contain genomic locations.
Basic syntax:
http://<server>/<path to jbrowse>?loc=<location string>&tracks=<tracks to show>
loc
The loc parameter specifies the genomic region that will be visible in the viewing field. Possible input formats are:
"Chromosome"+":"+ start point + ".." + end point
A chromosome name/ID followed by “:”, the start position, “..”, and the end position of the region to be viewed in the browser. The chromosome ID can be either a string or a mix of letters and numbers. The “CHR” prefix to indicate a chromosome may or may not be used. Strings are not case-sensitive. If the chromosome ID is found among the reference sequences (RefSeq), the chromosome will be shown from the start position to the end position given in the URL.
example) ctgA:100..200
Chromosome ctgA will be displayed from position 100 to 200.
OR start point + ".." + end point
Two numerical values separated by “..” are given with the loc option. JBrowse navigates within the currently selected chromosome from the first value (the start point) to the second value (the end point).
example) 200..600
OR center base
If only one numerical value is given, JBrowse treats it as the center position. An arbitrary region of the currently selected chromosome is then displayed in the viewing field with the given position as the center base.
example) 200
OR feature name/ID
If a string or a mix of letters and numbers is entered, JBrowse treats the input as a feature name/ID of a gene. If the ID exists in the database RefSeq, JBrowse displays an arbitrary region of the feature from position 0 (the starting position of the gene) to a certain end point.
example) ctgA
tracks
The tracks parameter is a comma-delimited string of track names, each of which should correspond to the "label" element of a track information dictionary. The named tracks are the ones shown in the viewing field. Names for the tracks can be found in data/trackInfo.js in the jbrowse-1.2.1 folder.
example) DNA,knownGene,ccdsGene,snp131,pgWatson,simpleRepeat
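The four loc forms and the URL layout described above can be sketched in Python as follows. This is only an illustration of the syntax, not the parser JBrowse uses; the function names, the returned dictionary shape, and the example base URL are all assumptions.

```python
import re
from urllib.parse import urlencode


def parse_loc(loc):
    """Classify a loc string into one of the four documented forms."""
    loc = loc.strip()
    match = re.fullmatch(r"([^:]+):(\d+)\.\.(\d+)", loc)
    if match:  # "Chromosome" + ":" + start + ".." + end
        return {"kind": "region", "ref": match.group(1),
                "start": int(match.group(2)), "end": int(match.group(3))}
    match = re.fullmatch(r"(\d+)\.\.(\d+)", loc)
    if match:  # start + ".." + end on the current chromosome
        return {"kind": "range", "start": int(match.group(1)), "end": int(match.group(2))}
    if re.fullmatch(r"\d+", loc):  # a single number is the center base
        return {"kind": "center", "center": int(loc)}
    return {"kind": "name", "name": loc}  # anything else: feature name/ID


def build_url(base, loc, tracks):
    """Build a JBrowse URL with loc and a comma-delimited tracks list."""
    return base + "?" + urlencode({"loc": loc, "tracks": ",".join(tracks)})


# Hypothetical server and path, for illustration only.
url = build_url("http://example.org/jbrowse/index.html", "ctgA:100..200", ["DNA", "knownGene"])
```

Note that urlencode percent-encodes the ":" and "," characters; browsers decode them again before JBrowse reads the parameters.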
Embedded mode
JBrowse's included index.html file supports three URL query arguments that can turn off the JBrowse track list, navigation bar, and overview bar, respectively. When all three of these are turned off, the only things visible are the displayed tracks themselves, and JBrowse could be said to be running in a kind of "embedded mode".
The three parameters used for this are tracklist, nav, and overview. If any of these are set to 0, that part of the JBrowse interface is hidden.
For example, you could put the embedded-mode JBrowse in an iframe, like this:
<html>
  <head>
    <title>JBrowse Embedded</title>
  </head>
  <body>
    <h1>Embedded Volvox JBrowse</h1>
    <div style="width: 400px; margin: 0 auto;">
      <iframe style="border: 1px solid black"
              src="../../index.html?data=sample_data/json/volvox&tracklist=0&nav=0&overview=0&tracks=DNA%2CExampleFeatures%2CNameTest%2CMotifs%2CAlignments%2CGenes%2CReadingFrame%2CCDS%2CTranscript%2CClones%2CEST"
              width="300" height="300">
      </iframe>
    </div>
  </body>
</html>
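If the iframe src has to be generated rather than written by hand, a small helper along these lines could assemble it. This is a sketch under the assumption that the page is produced by a Python script; embedded_src is a hypothetical name, and only the data, tracklist, nav, overview, and tracks parameters from the example above are used.

```python
from urllib.parse import urlencode


def embedded_src(index_url, data, tracks, hide=("tracklist", "nav", "overview")):
    """Build an index.html URL with the selected interface parts set to 0."""
    params = {"data": data}
    for part in hide:
        params[part] = 0  # 0 hides that part of the interface
    params["tracks"] = ",".join(tracks)
    # safe="/" keeps the data directory path readable in the URL.
    return index_url + "?" + urlencode(params, safe="/")


src = embedded_src("../../index.html", "sample_data/json/volvox", ["DNA", "ExampleFeatures"])
```

Passing a shorter hide tuple, such as ("tracklist",), would hide only the track list and leave the navigation and overview bars visible.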
Data Export
Starting with version 1.7.0, JBrowse users can export track data in a variety of formats, either for specific regions or for entire reference sequences. Export functionality can also be limited or disabled on a per-track basis using the configuration variables listed below.
Data Formats
Currently supported export data formats are:
- FASTA (sequence tracks)
- GFF3 (all tracks)
- BED (feature and alignment tracks)
- bedGraph (wiggle tracks)
- Wiggle (wiggle tracks)
Configuration
Each track in JBrowse that can export data supports the following configuration variables.
Option | Value |
---|---|
noExport | If true, disable all data export functionality for this track. Default false. |
maxExportSpan | Maximum size of a region, in bp, that can be exported from this track. Default 500 Kb. |
maxExportFeatures | Maximum number of features that can be exported from this track. Not set by default. |
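How the three variables interact can be sketched as a single check. The code below is an illustration in Python, not JBrowse code: the function name is hypothetical, 500 Kb is taken to mean 500,000 bp, and the limits are assumed to be inclusive.

```python
def export_allowed(track_conf, span_bp, feature_count):
    """Decide whether an export request fits a track's export settings."""
    if track_conf.get("noExport", False):
        return False  # export is switched off entirely for this track
    # Assumption: the documented default of 500 Kb means 500,000 bp.
    max_span = track_conf.get("maxExportSpan", 500_000)
    if span_bp > max_span:
        return False
    max_features = track_conf.get("maxExportFeatures")  # not set by default
    if max_features is not None and feature_count > max_features:
        return False
    return True


default_ok = export_allowed({}, 500_000, 10)
too_wide = export_allowed({}, 500_001, 10)
```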
Faceted Track Selection
Starting with version 1.4.0, JBrowse has an advanced "faceted" track selector tailored for sites with hundreds or thousands of tracks in a single JBrowse instance. This track selector allows users to interactively search for the tracks they are interested in using metadata that is associated with each track.
An example of a faceted track selector in action with about 1,800 tracks can be seen here. This is an example installation containing a snapshot of modENCODE track metadata. Note that the track data and reference sequences in this example are not real (they are actually all just copies of the same volvox test track); the installation is only meant to demonstrate the faceted track selector.
The Track Selector
The Faceted track selector takes all sources of track metadata, aggregates them, and makes the tracks searchable using this metadata. By default, tracks only have a few default metadata facets that come from the track configuration itself. After initially turning on the faceted track selector, most users will want to add their own metadata for the tracks: see Adding Track Metadata below. To enable the faceted track selector in the JBrowse configuration, set trackSelector.type to Faceted.
There are some other configuration variables that can be used to customize the display of the track selector. Most users will want to set displayColumns and renameFacets to customize the columns and facets shown in the track selector.
Option | Value |
---|---|
trackSelector.displayColumns | Array of which facets should be displayed as columns in the track list. Columns are displayed in the order given. If not provided, all facets will be displayed as columns, in lexical order. |
trackSelector.renameFacets | Object containing "display names" for some or all of the facet names. For example, setting this to { 'developmental-stage': 'Conditions' } would display "Conditions" as the name of the developmental-stage facet. |
trackSelector.escapeHTMLInData | Beginning in JBrowse 1.6.5, setting this to true or 1 prevents HTML code that may be present in the track metadata from being rendered. Instead, the HTML code itself will be shown. |
Example
trackSelector: {
    type: 'Faceted',
    displayColumns: [
        'key', 'organism', 'technique', 'target', 'factor',
        'developmental-stage', 'principal_investigator', 'submission'
    ],
    renameFacets: {
        'developmental-stage': 'Conditions',
        submission: 'Submission ID'
    }
}
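The effect of displayColumns and renameFacets on the column headers can be sketched as below. This is an illustrative Python model of the documented rules, not the selector's actual code; in particular, showing the raw facet name when no rename is given is an assumption.

```python
def column_headers(facets, selector_conf):
    """Return (facet, display name) pairs in the order the columns appear."""
    # Without displayColumns, every facet becomes a column, in lexical order.
    columns = selector_conf.get("displayColumns") or sorted(facets)
    rename = selector_conf.get("renameFacets", {})
    # Assumption: facets without a rename entry are shown under their own name.
    return [(name, rename.get(name, name)) for name in columns]


conf = {
    "displayColumns": ["key", "developmental-stage", "submission"],
    "renameFacets": {"developmental-stage": "Conditions", "submission": "Submission ID"},
}
configured = column_headers({"key", "organism", "developmental-stage", "submission"}, conf)
fallback = column_headers({"target", "key", "organism"}, {})
```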
Adding Track Metadata
To add your own track metadata to JBrowse, add a trackMetadata section to the JBrowse configuration.
JBrowse currently supports track metadata in Excel-compatible comma-separated-value (CSV) format, but additional track metadata backends are relatively easy to add. Write to the JBrowse mailing list if you have a strong need for another format for track metadata.
Option | Value |
---|---|
trackMetadata.sources | Array of source definitions, each of which takes the form { type: 'csv', url: '/path/to/file' }. The url is interpreted as relative to the url of the page containing JBrowse (index.html in default installations). Source definitions can also contain a class to explicitly specify the JavaScript backend used to handle this source. |
trackMetadata.filterFacets | Optional array of facet names that should be the only ones made searchable. This can be used to improve the speed and memory footprint of JBrowse on the client by not indexing unused metadata facets. |
Example
Configuration in trackList.json:
trackMetadata: {
    filterFacets: [
        'category', 'organism', 'target', 'technique', 'principal_investigator',
        'factor', 'developmental-stage', 'strain', 'cell-line', 'tissue', 'compound',
        'temperature'
    ],
    sources: [
        { type: 'csv', url: 'myTrackMetaData.csv' }
    ]
}
Track metadata CSV:
label | technique | factor | target | principal_investigator | submission | category | type | Developmental-Stage |
---|---|---|---|---|---|---|---|---|
fly/White_INSULATORS_WIG/BEAF32 | ChIP-chip | BEAF-32 | Non TF Chromatin binding factor | White, K. | 21 | Other chromatin binding sites | data set | Embryos 0-12 hr |
fly/White_INSULATORS_WIG/CP190 | ChIP-chip | CP190 | Non TF Chromatin binding factor | White, K. | 22 | Other chromatin binding sites | data set | Embryos 0-12 hr |
fly/White_INSULATORS_WIG/GAF | ChIP-chip | GAF | Non TF Chromatin binding factor | White, K. | 23 | Other chromatin binding sites | data set | Embryos 0-12 hr |
... | ... | ... | ... | ... | ... | ... | ... | ... |
Note that the label for each track metadata row must correspond to the label
in the track configuration for the track it describes.
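Before deploying a metadata CSV, it can be useful to check that every label in it matches a configured track. The following Python sketch does that with the standard csv module; the CSV excerpt is abridged from the table above, and the function names are hypothetical. Note that a value such as "White, K." contains a comma and therefore has to be quoted in the CSV file.

```python
import csv
import io

CSV_TEXT = '''label,technique,factor,principal_investigator,submission
fly/White_INSULATORS_WIG/BEAF32,ChIP-chip,BEAF-32,"White, K.",21
fly/White_INSULATORS_WIG/CP190,ChIP-chip,CP190,"White, K.",22
'''


def load_metadata(text, filter_facets=None):
    """Read metadata rows keyed by track label, optionally keeping only some facets."""
    by_label = {}
    for row in csv.DictReader(io.StringIO(text)):
        label = row.pop("label")
        if filter_facets is not None:
            row = {name: value for name, value in row.items() if name in filter_facets}
        by_label[label] = row
    return by_label


def unmatched_labels(metadata, track_labels):
    """Labels present in the CSV that no configured track uses."""
    return sorted(set(metadata) - set(track_labels))


metadata = load_metadata(CSV_TEXT, filter_facets={"technique", "principal_investigator"})
orphans = unmatched_labels(metadata, ["fly/White_INSULATORS_WIG/BEAF32"])
```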
Anonymous Usage Statistics
JBrowse instances report usage statistics to the JBrowse developers. This data is very important to the JBrowse project, since it is used to make the case to grant agencies for continuing to fund JBrowse development. No research data is transmitted; the data collected is limited to standard Google Analytics, along with a count of how many tracks the JBrowse instance has, how many reference sequences are present, their average length, and what types of tracks (wiggle, feature, etc.) are present. Users can disable usage statistics by setting suppressUsageStatistics: true in the JBrowse configuration.
Advanced Topics
Data Format Specification: LazyNCList Feature Store
JBrowse uses lazily-loaded nested containment lists (LazyNCLists) as an efficient format for storing feature data in pre-generated static files. A nested containment list is a tree data structure in which the nodes of the tree are intervals (the features themselves), and edges connect features that lie within the bounds of (but are not subfeatures of) another feature. It has some similarities to an R tree. For more on NCLists, see the Alekseyenko paper.
This data format is currently used in JBrowse 1.3 for tracks of type FeatureTrack, and the code that actually reads this format is in SeqFeatureStore/NCList.js and ArrayRepr.js.
The LazyNCList format can be broken down into two distinct subformats: the LazyNCList itself, and the array-based JSON representation of the features themselves.
Array Representation (ArrayRepr)
For speed and memory efficiency, JBrowse feature JSON represents features as arrays instead of objects. This is because the array representation is much more compact (saving a lot of disk space), and many browsers significantly optimize JavaScript Array objects over more general objects.
Each feature is represented as an array of the form [ class, data, data, ... ], where the class is an integer index into the store's classes array (more on that in the next section). Each of the elements in the classes array is an array representation that defines the meaning of each of the elements in the feature array.
An array representation specification is encoded in JSON as (comments added):
{
    "attributes": [                  // array of attribute names for this representation
        "AttributeNameForIndex1",
        "AttributeNameForIndex2",
        ...
    ],
    "isArrayAttr": {                 // list of which attributes are themselves arrays
        "AttributeNameForIndexN": 1,
        ...
    }
}
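Decoding a feature array back into named attributes follows directly from this layout: element 0 selects the class, and element i (for i of 1 or more) is named by attributes[i - 1]. The Python sketch below shows the idea; it is not a port of ArrayRepr.js, and the example class and feature values are made up.

```python
def decode_feature(feature, classes):
    """Turn [class, data, data, ...] into a dict of attribute name -> value."""
    class_index, *values = feature
    attributes = classes[class_index]["attributes"]
    # zip stops at the shorter sequence, so trailing attributes may be absent.
    return dict(zip(attributes, values))


# Hypothetical classes array with a single array representation.
classes = [
    {"attributes": ["Start", "End", "Strand", "Name"], "isArrayAttr": {}},
]
feature = [0, 1049, 9000, 1, "EDEN"]
decoded = decode_feature(feature, classes)
```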
Lazy Nested-Containment Lists (LazyNCList)
A JBrowse LazyNCList is a nested containment list tree structure stored as one JSON file that contains the root node of the tree, plus zero or more "lazy" JSON files that contain subtrees of the main tree. These subtree files are lazily fetched: that is, they are only fetched by JBrowse when they are needed to display a certain genomic region.
On disk, the files in a LazyNCList feature store look like this:
# stats, metadata, and nclist root node
data/tracks/<track_label>/<refseq_name>/trackData.json
# lazily-loaded nclist subtrees
data/tracks/<track_label>/<refseq_name>/lf-<chunk_number>.json
# precalculated feature densities
data/tracks/<track_label>/<refseq_name>/hist-<bin_size>.json
...
Where the trackData.json file is formatted as (comments added):
{
   "featureCount" : 4293,   // total number of features in this store
   "histograms" : {         // information about precalculated feature-frequency histograms
      "meta" : [
         {                  // description of each available bin-size for precalculated feature frequencies
            "basesPerBin" : "100000",
            "arrayParams" : {
               "length" : 904,
               "chunkSize" : 10000,
               "urlTemplate" : "hist-100000-{Chunk}.json"
            }
         },
         ...                // and so on for each bin size
      ],
      "stats" : [
         {                  // stats about each precalculated set of binned feature frequencies
            "basesPerBin" : "100000",    // bin size in bp
            "max" : 51,                  // max features per bin
            "mean" : 4.93030973451327    // mean features per bin
         },
         ...
      ]
   },
   "intervals" : {
      "classes" : [         // classes: array representations used in this feature data (see ArrayRepr section above)
         {
            "isArrayAttr" : { "Subfeatures" : 1 },
            "attributes" : [ "Start", "End", "Strand", "Source", "Phase", "Type", "Id", "Name", "Subfeatures" ]
         },
         ...
         {                  // the last arrayrepr class is the "lazyClass": fake features that point to other files
            "isArrayAttr" : { "Sublist" : 1 },
            "attributes" : [ "Start", "End", "Chunk" ]
         }
      ],
      "nclist" : [
         [
            2,        // arrayrepr class 2
            12962,    // "Start" minimum coord of features in this subtree
            221730,   // "End" maximum coord of features in this subtree
            1         // "Chunk" (indicates this subtree is in lf-1.json)
         ],
         [
            2,        // arrayrepr class 2
            220579,   // "Start" minimum coord of features in this subtree
            454457,   // "End" maximum coord of features in this subtree
            2         // "Chunk" (indicates this subtree is in lf-2.json)
         ],
         ...
      ],
      "lazyClass" : 2,                     // index of arrayrepr class that points to a subtree
      "maxEnd" : 90303842,                 // maximum coordinate of features in this store
      "urlTemplate" : "lf-{Chunk}.json",   // format for lazily-fetched subtree files
      "minStart" : 12962                   // minimum coordinate of features in this store
   },
   "formatVersion" : 1
}
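A range query over this structure can be sketched as a recursive walk that only loads a chunk when a lazy entry overlaps the requested region. The Python below is a simplified illustration, not a port of NCList.js: it scans linearly instead of using binary search, assumes Start and End sit at array indices 1 and 2 in every class, assumes half-open coordinates, and assumes a Sublist attribute (when present) holds a nested list of entries. The load_chunk callback and the example data are hypothetical.

```python
def chunk_url(url_template, chunk):
    """Fill the {Chunk} placeholder of a urlTemplate such as "lf-{Chunk}.json"."""
    return url_template.replace("{Chunk}", str(chunk))


def overlapping(nclist, start, end, classes, lazy_class, load_chunk):
    """Yield feature arrays that overlap the half-open range [start, end)."""
    for entry in nclist:
        if entry[2] <= start or entry[1] >= end:
            continue  # no overlap, so nothing nested below can overlap either
        if entry[0] == lazy_class:
            # Fake feature: fetch its subtree only now that it is needed.
            subtree = load_chunk(entry[3])
            yield from overlapping(subtree, start, end, classes, lazy_class, load_chunk)
            continue
        yield entry
        attributes = classes[entry[0]]["attributes"]
        if "Sublist" in attributes:
            index = attributes.index("Sublist") + 1
            if index < len(entry) and entry[index]:
                yield from overlapping(entry[index], start, end, classes, lazy_class, load_chunk)


# Hypothetical miniature store: class 0 is a real feature, class 1 the lazy class.
classes = [
    {"attributes": ["Start", "End", "Name", "Sublist"], "isArrayAttr": {"Sublist": 1}},
    {"attributes": ["Start", "End", "Chunk"], "isArrayAttr": {}},
]
root = [
    [0, 100, 500, "a", [[0, 150, 200, "a1"]]],
    [1, 1000, 5000, 1],
]
chunks = {1: [[0, 1000, 2000, "b"], [0, 3000, 5000, "c"]]}
fetched = []


def load_chunk(number):
    fetched.append(number)  # stands in for fetching chunk_url("lf-{Chunk}.json", number)
    return chunks[number]


near = [f[3] for f in overlapping(root, 120, 160, classes, 1, load_chunk)]
fetched_after_near = list(fetched)
far = [f[3] for f in overlapping(root, 2500, 3500, classes, 1, load_chunk)]
```

In this example the first query touches only the root node, so no chunk is fetched; the second query overlaps the lazy entry and triggers a single chunk load.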
Data Format Specification: Fixed-Resolution Tiled Image Store
JBrowse can display tracks composed of precalculated image tiles, stretching the tile images horizontally when necessary. The JBrowse Volvox example data has a wiggle data track that is converted to image tiles using the included wig2png program, but any sort of image tiles can be displayed if they are laid out in this format.
The files for a tiled image track are structured by default like this:
data/tracks/<track_label>/<refseq_name>/trackData.json
data/tracks/<track_label>/<refseq_name>/<zoom_level_urlPrefix>/<index>.png
... (and so on, for many more PNG image files)
Where the PNG files are the image tiles themselves, and trackData.json contains metadata about the track in JSON format, including available zoom levels, the width and height of the image tiles, their base resolution (number of reference sequence base pairs per image tile), and statistics about the data (such as the global minimum and maximum of wiggle data).
The structure of the trackData.json file is:
{
   "tileWidth": 2000,        // width of all image tiles, in pixels
   "stats" : {               // any statistics about the data being represented
      "global_min": 100,
      "global_max": 899
   },
   "zoomLevels" : [          // array describing what resolution levels are available
      {                      // in the precalculated image tiles
         "urlPrefix" : "1/",
         "height" : 100,
         "basesPerTile" : 2000
      },
      ... (and so on, for zoom levels in order of decreasing resolution / increasing bases per tile )
   ]
}
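Given this metadata, the tiles needed for a region can be worked out with integer division. The Python sketch below illustrates the arithmetic only; it assumes that tile indices count from 0 at the start of the reference sequence and that the region is half-open, neither of which is stated above, and the function name is hypothetical.

```python
def tiles_for_region(track_data, zoom_index, start, end):
    """List tile paths (relative to the track directory) covering [start, end)."""
    level = track_data["zoomLevels"][zoom_index]
    bases_per_tile = level["basesPerTile"]
    # Assumption: tile 0 starts at base 0 of the reference sequence.
    first = start // bases_per_tile
    last = (end - 1) // bases_per_tile
    return [f'{level["urlPrefix"]}{index}.png' for index in range(first, last + 1)]


track_data = {
    "tileWidth": 2000,
    "zoomLevels": [{"urlPrefix": "1/", "height": 100, "basesPerTile": 2000}],
}
paths = tiles_for_region(track_data, 0, 1500, 4100)

# Bases drawn per pixel at this zoom level.
bases_per_pixel = track_data["zoomLevels"][0]["basesPerTile"] / track_data["tileWidth"]
```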
To see a working example of this in action, see the contents of sample_data/json/volvox/tracks/volvox_microarray.wig/ctgA after the Volvox wiggle sample data has been formatted.
The code for working with this tiled image format in JBrowse 1.3 is in TiledImageStore/Fixed.js.
Miscellaneous Global Configuration
JBrowse supports some miscellaneous configuration variables that can change the overall behavior of the browser.
Option | Description |
---|---|
locationBoxLength | The desired size, in characters, of the location search box. If not set, the size of the location box is calculated to fit the largest location string that is likely to be produced, based on the length of the reference sequences and the length of their names. Added in JBrowse 1.7.0. |
css | Used to add additional CSS code to the browser at runtime. Can be an array of either strings containing CSS statements, or URLs for CSS stylesheets to load (as {url: "/path/to/my.css"}). CSS can of course also be added outside of JBrowse, at the level of the HTML page where JBrowse runs. Added in JBrowse 1.6.2. |
theme | Allows changing the graphical theme from the default Dijit "tundra" theme. Added in JBrowse 1.7.0. |
See also
- Using a database with JBrowse
- Using configuration files
- JBrowse Tutorial from GMOD Summer School 2010