Difference between revisions of "JBrowseDev/Main"

From GMOD
Redirect page
m (generate-names.pl: small rewording)
(Redirected page to JBrowse)
 
(28 intermediate revisions by one other user not shown)
Line 1: Line 1:
JBrowse is an interactive web application that can be used to visualize data about a single genomic sequence or a set of chromosomes. It is capable of displaying large genomes and numerous feature tracks describing those genomes (see human genome [http://jbrowse.org/ucsc/hg19/?loc=chr1:208062427..208063742&tracks=DNA,knownGene,bam_sim,snp131,pgWatson,simpleRepeat,omimGene hg19]).
+
#REDIRECT [[JBrowse]]
 
+
Apart from enhanced graphics, JBrowse's major improvement over [[GBrowse]] is its use of [[Glossary#AJAX|Asynchronous JavaScript and XML]] (AJAX), hence the name "JBrowse" for "JavaScript Genome Browser". Through the use of AJAX web development methods, data can be obtained from a server without requiring a page reload. This implementation method allows JBrowse to offload a significant portion of computational effort from the machine that is serving JBrowse to the client machines, allowing the server to process requests from clients at a faster rate, and thus serve more clients, than a genome browser that is not implemented using AJAX.
+
 
+
=Installation=
+
 
+
==Prerequisites==
+
 
+
'''1. JBrowse requires Perl v5.10.0 or later.''' An installer can be downloaded from the [http://www.perl.org Perl Home Page]. If this does not work, you can download the source code and install perl through the command line.
+
 
+
If you had to download a new version of Perl, you will want this version to be used by default. First, find the path to the directory that contains the new perl executable. If you didn't specify this directory when you were installing perl, you should be able to find the default installation directory by looking at the documentation. Next, create (or open) the shell configuration file in your home directory:
+
 
+
pico $HOME/.bash_profile
+
<span style="font-size:8pt">''Note: This is specific to Mac OS X. If you are using bash on a platform that is not OS X, add the 'export' line below to both .bash_profile and .bashrc. If you do not use bash, you will need to edit the configuration file(s) appropriate for your shell.''</span>
+
 
+
Insert this line into that file, substituting '<path to the newest version of perl>' with the path to the new perl executable:
+
 
+
export PATH=<path to the newest version of perl>:$PATH
+
<span style="font-size:8pt">''Note: This is the command that is used in bash. Depending on your shell, you may have to use setenv instead of export.''</span>
+
 
+
Save this file, then close and reopen your terminal. To make sure that the newest version of perl is now being used by default, use this command:
+
 
+
perl --version
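The version requirement can also be checked in a script rather than by eye. The following is a minimal sketch, not part of JBrowse: `version_ge` is a hypothetical helper that compares dotted version strings numerically, field by field, so that 5.8.8 is correctly treated as older than 5.10.0.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            # Missing fields count as 0; "+ 0" forces a numeric comparison.
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Only query perl if it is on the PATH.
if command -v perl >/dev/null 2>&1; then
    current=$(perl -e 'printf "%vd", $^V')
    if version_ge "$current" 5.10.0; then
        echo "perl $current meets the JBrowse requirement"
    else
        echo "perl $current is too old for JBrowse"
    fi
fi
```

If the wrong version is reported, the PATH change has most likely not taken effect in the current terminal.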
+
 
+
'''2. JBrowse must be unpacked into a directory that is served by Apache.''' In other words, the directory that contains JBrowse must be accessible via your web browser. This is very simple to do on OS X. First, navigate to System Preferences > Sharing and make sure that Web Sharing is on. Then, move the JBrowse directory to your Sites folder. You should now be able to view JBrowse at the URL:
+
 
+
localhost/~<your account name>/<name of JBrowse directory>
+
 
+
As an example, the url might look like:
+
 
+
localhost/~stephen/jbrowse-1.2.1
+
 
+
For Linux, the default directory served by Apache is normally '/var/www'.
+
 
+
'''3. JBrowse requires installation of a number of perl modules.''' This should be done through the [[Glossary#CPAN|Comprehensive Perl Archive Network]] (CPAN), a large repository for shared perl modules. CPAN has an associated shell program that will allow you to download and install JBrowse's dependencies.
+
 
+
To access this shell program, open up a terminal and type 'perl -MCPAN -e shell'. If you have never used CPAN before, you will be prompted to configure it. During the configuration process, you will be asked a series of questions about your preferences. If you do not know what an option means, you can usually accept the provided default.
+
 
+
If you have sufficient privileges, reopen the CPAN shell as the root user by typing 'sudo perl -MCPAN -e shell' in a terminal. You will be prompted for the root password. Once you have opened the CPAN shell as root, install BioPerl by typing 'install CJFields/BioPerl-1.6.1.tar.gz'. CPAN will automatically install BioPerl on your computer. After this is complete, install the other dependencies in the same way. Here is a complete list of what you will need to download and install from CPAN:
+
 
+
* CJFields/BioPerl-1.6.1.tar.gz
+
* JSON
+
* JSON::XS (optional, for speed)
+
* Heap::Simple
+
* Heap::Simple::Perl
+
* Heap::Simple::XS
+
* PerlIO::gzip
+
* Devel::Size
+
 
+
==Common Issues==
+
 
+
'''Restricted Permissions'''
+
 
+
If you do not have administrator privileges, you will need to install the JBrowse dependencies locally, that is, for your account only. This should be straightforward for the perl installation (simply change the install location to one that you have read and write access to), but is a bit more involved for the CPAN-installed perl modules.
+
 
+
To reconfigure cpan, open the CPAN shell and type 'o conf init'. You will need to provide an answer to the "Parameters for the './Build' command?" prompt. One possible answer would be:
+
 
+
--extra-linker-flags -L<path to home directory>/site_perl
+
 
+
This will cause all modules to be installed in a directory called 'site_perl' that is in your home directory. Of critical importance is that your user account has read and write access to this directory.
+
 
+
There is one more change to be made. Perl needs to be aware of the new installation directory for your perl modules. This involves editing the PERL5LIB variable. Open your shell configuration file with a text editor, e.g. with the command:
+
 
+
pico $HOME/.bash_profile
+
<span style="font-size:8pt">''Note: This is specific to Mac OS X. If you are using bash on a platform that is not OS X, add the 'export' line below to both .bash_profile and .bashrc. If you do not use bash, you will need to edit the configuration file(s) appropriate for your shell.''</span>
+
 
+
and insert the line:
+
 
+
export PERL5LIB=$PERL5LIB:<path to home directory>/site_perl
+
<span style="font-size:8pt">''Note: This is the command that is used in bash. Depending on your shell, you may have to use setenv instead of export.''</span>
+
 
+
Changes to the PERL5LIB variable will take effect in any new shell that you open. To check whether the PERL5LIB variable has been successfully changed, use the command:
+
 
+
echo $PERL5LIB
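When PERL5LIB is not set beforehand, a plain append leaves a leading colon in the value. The sketch below shows one way to avoid that, assuming a bash-compatible shell; the `site_perl` location is only an example and should be replaced with your own directory.

```shell
# Hypothetical install location; substitute your own directory.
site_perl="$HOME/site_perl"

# Prefix the old value and a colon only when PERL5LIB is already set and non-empty.
export PERL5LIB="${PERL5LIB:+$PERL5LIB:}$site_perl"
echo "$PERL5LIB"

# List the directories perl will actually search (skipped if perl is not on the PATH).
if command -v perl >/dev/null 2>&1; then
    perl -e 'print join("\n", @INC), "\n"'
fi
```

The new directory should appear in the list printed by the last command.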
+
 
+
'''Missing Module or Loadable Object'''
+
 
+
You might encounter an error of the type:
+
 
+
Can't locate some/perl/module.pm in @INC (INC contains <list of paths>)
+
 
+
If you see this error, it normally means one of two things. Either (1) you have not installed the perl module, or (2) you have installed the perl module, but it is not in a directory where perl expects it to be. If the absolute path to the module's directory (try 'locate your/perl/module.pm') is not a member of the list of paths in @INC, you can either move the module and its associated files and directories to another directory that is a member of @INC, or you can add the module's current directory to @INC by appending its absolute path to the PERL5LIB variable in ~/.bash_profile (See the previous topic for instructions).
+
 
+
You might also encounter an error of this type:
+
 
+
Can't locate loadable object for some/perl/module.pm in @INC (INC contains <list of paths>)
+
 
+
If you encounter this error, it means that perl found the module it was looking for, but it didn't find all of the files that were associated with that module. This error often occurs when there is XS code that does not get compiled, or whose compiled object file is not in a directory that is visible to perl (i.e. in a directory that isn't listed in @INC). If you moved or copied the perl module and its associated files to a different directory, be sure that there aren't any object files that you missed. If you don't find any object files in the directory that you copied the perl module from, try reinstalling the module through the CPAN shell. Look carefully through the CPAN log, first for errors in the compilation of the XS code, and then for possible reasons for any errors (e.g. a missing library that cannot be installed through CPAN).
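Both kinds of error can be reproduced outside of JBrowse by asking perl to load each module directly. The following sketch uses the standard `-M` switch; `missing_modules` is a hypothetical helper name, and the module list is taken from the prerequisites above.

```shell
# Hypothetical helper: print each named module that the current perl cannot load.
# A module whose compiled XS object file cannot be found fails here as well.
missing_modules() {
    for module in "$@"; do
        perl -M"$module" -e 1 >/dev/null 2>&1 || echo "$module"
    done
}

missing_modules JSON Heap::Simple Heap::Simple::Perl Heap::Simple::XS PerlIO::gzip Devel::Size
```

To see the full error message for a single module, run the same command without the redirection, e.g. 'perl -MHeap::Simple::XS -e 1'.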
+
 
+
'''Installing Modules for an Older Version of Perl'''
+
 
+
It might be possible to type 'cpan' from your shell to open CPAN. However, it is recommended that you open CPAN using the perl interpreter ('perl -MCPAN -e shell'), because the shell associated with the 'cpan' command might be intended for a version of perl other than the version that you are using with JBrowse (e.g., it might install version 5.8.8 modules when the version of perl that you are actually using is 5.10.0). Modules for a different version of perl will be installed in the wrong library directory with respect to your current version of perl, and even if you move them to the correct directory, they might not be compatible with your current version of perl. If 'perl --version' and 'cpan --version' indicate the same perl version, it is fine to use 'cpan' to access the CPAN shell. Otherwise, use 'perl -MCPAN -e shell'.
+
 
+
=Usage=
+
 
+
JBrowse consists of a set of scripts that use external data sources (e.g. files on your computer) to produce additional files. If these JBrowse-generated files are present when you open JBrowse with your web browser, you will automatically see the sequence and feature tracks produced from the data they contain.
+
 
+
There is a particular order that should be followed when adding data to JBrowse. Reference sequences should be added first, followed by feature tracks. Once all of the tracks have been added, it is possible to make the names of each feature searchable. While there is some flexibility in this order of events (it is possible to add additional reference sequences after feature tracks have been added, for example), the first step will always be to specify a sequence or set of sequences, and the last step will always be to make the named features searchable (assuming it is desired that all feature names are searchable).
+
 
+
==User Interface==
+
 
+
[[File:JBrowseUI.png|800px|center|thumb|
+
'''1. Location Marker:''' Click and drag to move to a different genomic position.<br>
+
'''2. Scroll Buttons:''' Click to scroll by a fixed amount at a given zoom level.<br>
+
'''3. Viewing Field:''' Drag a track to this area to make it visible. Depending on the track, some zooming may be necessary.<br>
+
'''4. Zoom Buttons:''' Click to zoom. Per click, the larger buttons zoom more than the smaller buttons.<br>
+
'''5. Search Bar:''' Browse to a certain region by searching for a location or feature name.<br>
+
'''6. Chromosome Selector:''' Choose which chromosome to view.<br>
+
'''7. Hidden Tracks:''' Drag a track to this area to hide it.<br>
+
'''8. Window Slider:''' Resize the viewing field.
+
]]
+
 
+
==Reference Sequences==
+
 
+
The reference sequence is a sequence that is representative of the feature data. It might be a consensus sequence from an alignment, or simply a sequence of interest. Before any feature tracks can be input to JBrowse, the reference sequence must be taken into consideration. This is handled by the prepare-refseqs.pl script.
+
 
+
===prepare-refseqs.pl===
+
 
+
This script must be run prior to the addition of feature tracks. The simplest way to use it would be to use the --fasta option, which uses a single sequence or set of reference sequences from a [[Glossary#FASTA|FASTA]] file:
+
 
+
bin/prepare-refseqs.pl --fasta <fasta file> [options]
+
 
+
If the file has multiple sequences, each sequence will become a reference sequence by default. You may switch between these sequences by selecting the sequence of interest via the pull-down menu to the right of the large "zoom in" button.
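As an illustration, the sketch below writes a small two-sequence FASTA file and imports it. The sequence names (chrA, chrB) and the file name are made up for the example, and the import step only runs when the script is found, i.e. when the commands are executed from the JBrowse directory.

```shell
# Work in a scratch directory so nothing in the JBrowse tree is overwritten.
workdir=$(mktemp -d)
cat > "$workdir/example.fa" <<'EOF'
>chrA
ACGTACGTACGTACGTACGT
>chrB
TTGACCATGGCATTGACCAT
EOF

# Each header line (starting with '>') becomes one reference sequence.
grep -c '^>' "$workdir/example.fa"

# Run the import only when the script is present.
if [ -x bin/prepare-refseqs.pl ]; then
    bin/prepare-refseqs.pl --fasta "$workdir/example.fa"
fi
```

With this input, both chrA and chrB should be selectable in the pull-down menu.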
+
 
+
You may use any alphabet you wish for your sequences (i.e., you are not restricted to the nucleotides A, T, C, and G; any alphanumeric character, as well as several other characters, may be used). Hence, it is possible to browse RNA and protein in addition to DNA. However, some characters should be avoided, because they will cause the sequence to "split" - part of the sequence will be cut off and continue on the next line. These characters are the ''hyphen'' and ''question mark''. Unfortunately, this prevents the use of hyphens to represent gaps in a reference sequence.
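A FASTA file can be screened for these two characters before it is imported. The following is a sketch using standard grep; `find_split_chars` is a hypothetical helper name. Header lines are excluded, since only the sequence itself is affected.

```shell
# Hypothetical helper: print sequence lines (not headers) that contain '-' or '?',
# each prefixed with its line number.
find_split_chars() {
    grep -n '[-?]' "$1" | grep -v '^[0-9]*:>'
}

# Small demonstration file: the header contains a hyphen, which is ignored.
demo=$(mktemp)
printf '>seq-1\nACGT\nAC-GT\nACG?T\n' > "$demo"
find_split_chars "$demo"
```

Any line reported by the helper should be edited before running prepare-refseqs.pl.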
+
 
+
In addition to reading from a fasta file, prepare-refseqs.pl can read sequences from a gff3 file or a database (e.g. PostgreSQL, MySQL). In order to read fasta sequences from a database, a config file must be used.
+
 
+
Syntax used to import sequences from gff files:
+
bin/prepare-refseqs.pl --gff <gff3 file with sequence information> [options]
+
 
+
Syntax used to import sequences with a config file:
+
bin/prepare-refseqs.pl --conf <config file that references a database with sequence information> --[refs|refids] <reference sequences> [options]
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| fasta, gff, or conf
+
| Path to the file that JBrowse will use to import sequences. With the fasta and gff options, the sequence information is imported directly from the specified file. With the conf option, the specified config file includes the details necessary to access a database that contains the sequence information. Exactly one of these three options must be used.
+
|-
+
| out
+
| A path to the output directory (default is 'data' in the current directory)
+
|-
+
| seqdir
+
| The directory where the reference sequences are stored (default: <output directory>/seq)
+
|-
+
| noseq
+
| Causes no reference sequence track to be created. This is useful for reducing disk usage.
+
|-
+
| refs
+
| A comma-delimited list of the names of sequences to be imported as reference sequences. This option (or refids) is required when using the conf option. It is not required when the fasta or gff options are used, but it can be useful with these options, since it can be used to select which sequences JBrowse will import.
+
|-
+
| refids
+
| A comma-delimited list of the database identifiers of sequences to be imported as reference sequences. This option is useful when working with a Chado database that contains data from multiple different species, and those species have at least one chromosome with the same name (e.g. chrX). In this case, the desired chromosome cannot be uniquely identified by name, so it is instead identified by ID. This ID can be found in the 'feature_id' column of the 'feature' table in a Chado database.
+
|}
+
 
+
==Feature Tracks==
+
 
+
The feature tracks are the most important components of JBrowse. They can be used to visualize information about a sequence, such as sequence conservation, RNA base pairing, and the locations of transposons. There are a number of scripts that can be used to input various types of feature tracks into JBrowse:
+
 
+
* flatfile-to-json.pl
+
* bam-to-json.pl
+
* biodb-to-json.pl
+
* ucsc-to-json.pl
+
* draw-basepair-track.pl
+
* wig-to-json.pl
+
 
+
===flatfile-to-json.pl===
+
 
+
This script inputs a single track into JBrowse. To put multiple tracks into JBrowse, it must be executed repeatedly.
+
 
+
Terminology: A ''flat file'' is a database that exists entirely in a single file. In this case, the flat file must be a [[GFF3]], [[GFF2]], or [http://www.ensembl.org/info/website/upload/bed.html BED] file.
+
 
+
Basic syntax:
+
bin/flatfile-to-json.pl --[gff|gff2|bed] <flat file> --tracklabel <track name> [options]
+
 
+
Hint: flatfile-to-json.pl simplifies the process of inputting a small number of tracks into JBrowse, since it does not use a config file. If you have many tracks, you will probably want to use a config file, because its structure will make the task of editing tracks easier. In that case, the appropriate script will be biodb-to-json.pl.
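A minimal run might look like the sketch below. The GFF3 content, file name, and track names are invented for the example, and it is assumed that the sequence name in the first column (chrA) matches a reference sequence that has already been imported. The script is only invoked when it is found in bin/.

```shell
workdir=$(mktemp -d)

# GFF3 columns are tab-separated:
# seqid, source, type, start, end, score, strand, phase, attributes.
{
    printf '##gff-version 3\n'
    printf 'chrA\texample\tgene\t3\t15\t.\t+\t.\tID=gene1;Name=geneOne\n'
} > "$workdir/example.gff3"

# One execution adds one track; repeat with a different file and label for more tracks.
if [ -x bin/flatfile-to-json.pl ]; then
    bin/flatfile-to-json.pl --gff "$workdir/example.gff3" \
        --tracklabel example_genes --key "Example genes"
fi
```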
+
 
+
[[File:Flatfile-options.png|600px|thumb|center|Summary of flatfile-to-json.pl options.]]
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| gff, gff2, or bed
+
| The name of the file that contains the feature data. The names of these options correspond to the file types, with the exception of gff, which uses a [[GFF3]] file instead of a [[GFF]] file. Exactly one of these three options must be used.
+
|-
+
| tracklabel
+
| The internal name that JBrowse will give to this feature track. This option requires a value.
+
|-
+
| key
+
| The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel.
+
|-
+
| autocomplete
+
| Make the features of the track searchable. This option can be used with the arguments "label", "alias", or "all".<br>
+
&nbsp;&nbsp;&nbsp;'''label:''' Make the features searchable by the viewable name that they are associated with in JBrowse. In a gff3 file, this will be the "Name" in the attributes column.<br>
+
&nbsp;&nbsp;&nbsp;'''alias:''' Make the features searchable by an alternate name defined in the input file. In a gff3 file, this will be the "Alias" in the attributes column.<br>
+
&nbsp;&nbsp;&nbsp;'''all:''' Make the features searchable by both their label and their alias.<br>
+
|-
+
| out
+
| A path to the output directory (default is 'data' in the current directory).
+
|-
+
| [[JBrowseDev/Options/CssClass|cssClass]]
+
| The css class that will be used to create the feature track. This option makes it possible to choose how the feature track will look by selecting a template from a list of track types defined in genome.css. Click [http://jbrowse.org/code/jbrowse-master/docs/featureglyphs.html here] to view some of the feature track types that come with JBrowse. The default feature track type is "feature".
+
|-
+
| getType
+
| Causes the 'type' to be included in the output JSON file. The type is the feature that has been predicted (e.g. promoter site, gene). If a gff file is being used, the type will be in column 3.
+
|-
+
| getPhase
+
| Causes the 'phase' to be included in the output JSON file. The phase describes the reading frame of a DNA (or messenger RNA) sequence. If the phase is relevant, it can have the values 0, 1, or 2; otherwise, the value associated with the phase is '.'. If a gff file is being used, the phase will be in column 8.
+
|-
+
| getSubs
+
| If subfeatures have been specified for any features in the track, setting this option will cause them to appear. Otherwise, subfeatures will not appear.
+
|-
+
| getLabel
+
| Causes the Name attribute associated with each feature to be included in the track. If a gff3 file is being used, the Name will be in column 9 when it is defined.
+
|-
+
| [[JBrowseDev/Options/UrlTemplate|urlTemplate]]
+
| A url that your browser will visit when you click on a feature in this track. This is especially useful if you want to link a feature to a page with more information about that feature.
+
|-
+
| arrowheadClass
+
| When this option is used, directional features will be given an arrowhead. The presence and orientation of the arrowhead for each individual feature will depend on data in the input file. Arrowhead classes are defined in genome.css. There is only one that comes with JBrowse (transcript-arrowhead).
+
|-
+
| [[JBrowseDev/Options/SubfeatureClasses|subfeatureClasses]]
+
| The css class(es) that will be used for the subfeatures of a feature track. This option makes it possible to choose how the subfeatures will appear. Any of the classes in genome.css can be used for the subfeatures. The argument must be specified as a JSON association list (e.g. { "subfeature1": "cssclass1", "subfeature2": "cssclass2" }). This option must be used with getSubs in order for subfeatures to appear.
+
|-
+
| [[JBrowseDev/Options/ClientConfig|clientConfig]]
+
| Any additions or edits to the CSS class being used for the main features of the track (not for subfeatures). These edits must be specified in [[Glossary#JSON|JSON]] syntax, and any changes to the CSS style are associated with "featureCss", e.g. '{ "featureCss": "cssoption1: value1; cssoption2: value2", "histscale": 2}'.
+
|-
+
| type
+
| The type of feature that will appear in the feature track. This option is useful when the input file contains features of several different types, and you are interested in only having one type of feature (e.g. only having features that are genes) in the feature track. In gff3 files, the type is in the third column.
+
|-
+
| [[JBrowseDev/Options/ExtraData|extraData]]
+
| Use additional information from the input file to create variations in the appearance or behavior of individual features. This option is meant to be used in conjunction with other options. For each feature in the track, a perl subroutine is used to extract additional information, which is then associated with a variable. The value of this variable can be different for each feature. When the name of this variable is surrounded by curly braces and used in the argument for a different option, such as urlTemplate, the feature-specific data is used.
+
|-
+
| nclChunk
+
| The NCList chunk size in bytes. This option should not be used unless an error such as "json or perl structure exceeds maximum nesting level" is encountered. If this error does occur, lower the chunk size (the default is 50000 bytes).
+
|}
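Options that take JSON arguments (subfeatureClasses, clientConfig) need careful quoting on the command line. The sketch below keeps each JSON value in a single-quoted shell variable. The CSS class names for the subfeatures and the CSS properties are assumptions made for illustration; check genome.css for the classes that are actually available. The input file name is also hypothetical.

```shell
# Single quotes stop the shell from interpreting the braces and double quotes.
# The class names on the right-hand side are assumed, not confirmed by this page.
subfeature_classes='{ "CDS": "transcript-CDS", "UTR": "transcript-UTR" }'
client_config='{ "featureCss": "background-color: #668; height: 8px;" }'

if [ -x bin/flatfile-to-json.pl ]; then
    # getSubs is required for the subfeature classes to have any visible effect.
    bin/flatfile-to-json.pl --gff example.gff3 --tracklabel example_genes \
        --getSubs --subfeatureClasses "$subfeature_classes" \
        --arrowheadClass transcript-arrowhead \
        --clientConfig "$client_config"
fi
```

Passing the variables in double quotes hands each JSON string to the script as a single argument.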
+
 
+
===bam-to-json.pl===
+
 
+
This script is very similar to flatfile-to-json.pl, but it specifically uses [[Glossary#BAM|BAM]] files as input.
+
 
+
Basic syntax:
+
bin/bam-to-json.pl --bam <bam file> --tracklabel <track name> [options]
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| bam
+
| The name of the bam file that contains the feature data. This option requires a value.
+
|-
+
| tracklabel
+
| The internal name that JBrowse will give to this feature track. This option requires a value.
+
|-
+
| key
+
| The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel.
+
|-
+
| out
+
| A path to the output directory (default is 'data' in the current directory).
+
|-
+
| [[JBrowseDev/Options/CssClass|cssClass]]
+
| The css class that will be used to create the feature track. This option makes it possible to choose how the feature track will look by selecting a template from a list of track types defined in genome.css. Click [http://jbrowse.org/code/jbrowse-master/docs/featureglyphs.html here] to view some of the feature track types that come with JBrowse. The default feature track type is "feature".
+
|-
+
| [[JBrowseDev/Options/ClientConfig|clientConfig]]
+
| Any additions or edits to the CSS class being used for the main features of the track (not for subfeatures). These edits must be specified in JSON syntax, and any changes to the CSS style are associated with "featureCss", e.g. '{ "featureCss": "cssoption1: value1; cssoption2: value2", "histscale": 2}'.
+
|-
+
| nclChunk
+
| The NCList chunk size in bytes. This option should not be used unless an error such as "json or perl structure exceeds maximum nesting level" is encountered. If this error does occur, lower the chunk size (the default is 50000 bytes).
+
|-
+
| compress
+
| This option causes the output JSON files for the track (trackData.json and hist-*.json) to be compressed with gzip.
+
|}
+
 
+
===biodb-to-json.pl===
+
 
+
This script uses a config file to produce a set of feature tracks in JBrowse. It can be used to obtain information from any database with appropriate schema, or from flat files. Because it can produce several feature tracks in a single execution, it is useful for large-scale feature data entry into JBrowse.
+
 
+
Basic syntax:
+
bin/biodb-to-json.pl --conf <config file> [options]
+
 
+
For more details about the structure of a config file, see Using Config Files.
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| conf
+
| The name of the JSON configuration file that will be used. This option must be specified.
+
|-
+
| out
+
| A path to the output directory (default is 'data' in the current directory).
+
|-
+
| track
+
| The identifier of a single track that will be updated or added to JBrowse. In the list of key-value pairs comprising an individual track definition in the config file, the identifier will be the value associated with "track".
+
|-
+
| ref
+
| A comma-delimited list of reference sequence names, used to limit database queries to a subset of JBrowse reference sequences. By default, the database is queried for all reference sequences in JBrowse.
+
|-
+
| refid
+
| A comma-delimited list of reference sequence IDs from a Chado database, used to limit database queries to a subset of JBrowse reference sequences. By default, the database is queried for all reference sequences in JBrowse.
+
|-
+
| compress
+
| This option causes the output JSON files for the track (trackData.json and hist-*.json) to be compressed with gzip.
+
|}
+
 
+
===ucsc-to-json.pl===
+
 
+
This script uses data from the UCSC genome annotation database. To reach this data, go to [http://hgdownload.cse.ucsc.edu/downloads.html hgdownload.cse.ucsc.edu] and click the link for the genome of interest. Next, click the "Annotation Database" link. The data relevant to ucsc-to-json.pl (*.sql and *.txt.gz files) can be downloaded from either this page or the FTP server described on this page.
+
 
+
Together, a *.sql and *.txt.gz pair of files (such as cytoBandIdeo.txt.gz and cytoBandIdeo.sql) constitute a database table. The ucsc-to-json.pl script uses the *.sql file to get the column labels, and it uses the *.txt.gz file to get the data for each row of the table. For the example pair of files above, the name of the database table is "cytoBandIdeo". This will become the name of the JBrowse track that is produced from the data in the table.
+
 
+
In addition to all of the feature-containing tables that you want to use as JBrowse tracks, you will also need to download the trackDb.sql and trackDb.txt.gz files for the organism of interest.
+
 
+
Basic syntax:
+
bin/ucsc-to-json.pl --in <directory with files from UCSC> --track <database table name> [options]
+
 
+
Hint: If you're using this approach, it might be convenient to also download the sequence(s) from UCSC. These are usually available from the "Data set by chromosome" link for the particular genome or from the FTP server.
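The file pairs for several tables can be listed with a small helper before downloading them. This is only a sketch: the base URL and directory layout are assumptions about the UCSC download server, `ucsc_table_urls` is a hypothetical helper name, and hg19 is used as an example build.

```shell
# Assumed layout of the UCSC download server; verify it for your genome of interest.
ucsc_base="http://hgdownload.cse.ucsc.edu/goldenPath"

# Hypothetical helper: print the two URLs that make up one table of a genome build.
ucsc_table_urls() {
    build=$1
    table=$2
    echo "$ucsc_base/$build/database/$table.sql"
    echo "$ucsc_base/$build/database/$table.txt.gz"
}

# trackDb is always needed, plus every table that should become a track.
for table in trackDb cytoBandIdeo; do
    ucsc_table_urls hg19 "$table"
done
```

The printed URLs can then be fetched into a single directory with the download tool of your choice, and that directory passed to the in option.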
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| in
+
| A directory containing all of the *.sql and *.txt.gz data from UCSC. This directory ''must'' contain the trackDb.sql and trackDb.txt.gz files for the organism of interest, as well as all of the feature-containing tables that you wish to use as JBrowse tracks.
+
|-
+
| track
+
| The name of the database table. If you leave off the .sql or .txt.gz extensions of the table files you wish to use, you will have this value.
+
|-
+
| out
+
| A path to the output directory (default is 'data' in the current directory).
+
|-
+
| [[JBrowseDev/Options/CssClass|cssClass]]
+
| The css class that will be used to create the feature track. This option makes it possible to choose how the feature track will look by selecting a template from a list of track types defined in genome.css. Click [http://jbrowse.org/code/jbrowse-master/docs/featureglyphs.html here] to view some of the feature track types that come with JBrowse. The default feature track type is "feature".
+
|-
+
| arrowheadClass
+
| When this option is used, directional features will be given an arrowhead. The presence and orientation of the arrowhead for each individual feature will depend on data in the input file. Arrowhead classes are defined in genome.css. There is only one that comes with JBrowse (transcript-arrowhead).
+
|-
+
| [[JBrowseDev/Options/SubfeatureClasses|subfeatureClasses]]
+
| The css class(es) that will be used for the subfeatures of a feature track. This option makes it possible to choose how the subfeatures will appear. Any of the classes in genome.css can be used for the subfeatures. The argument must be specified as a JSON association list (e.g. { "subfeature1": "cssclass1", "subfeature2": "cssclass2" }).
+
|-
+
| [[JBrowseDev/Options/ClientConfig|clientConfig]]
+
| Any additions or edits to the CSS class being used for the main features of the track (not for subfeatures). These edits must be specified in JSON syntax, and any changes to the CSS style are associated with "featureCss", e.g. '{ "featureCss": "cssoption1: value1; cssoption2: value2", "histscale": 2}'.
+
|-
+
| nclChunk
+
| The NCList chunk size in bytes. This option should not be used unless an error such as "json or perl structure exceeds maximum nesting level" is encountered. If this error does occur, lower the chunk size (the default is 50000 bytes).
+
|-
+
| compress
+
| This option causes some of the output JSON files (trackData.json and hist-*.json) to be compressed with gzip.
+
|-
+
| sortMem
+
| The maximum amount of RAM (in bytes) to use for sorting the features. The default value is 536870912 bytes (512MiB).
+
|}
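Because sortMem is given in bytes, it is convenient to let the shell do the arithmetic. The sketch below reproduces the default value and passes a larger, hypothetical budget of 1 GiB; the input directory name is invented for the example.

```shell
# 512 MiB expressed in bytes, which matches the documented default.
default_sort_mem=$((512 * 1024 * 1024))
echo "$default_sort_mem"

# Hypothetical larger budget of 1 GiB.
sort_mem=$((1024 * 1024 * 1024))

if [ -x bin/ucsc-to-json.pl ]; then
    bin/ucsc-to-json.pl --in ucsc_data --track cytoBandIdeo --sortMem "$sort_mem"
fi
```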
+
 
+
===draw-basepair-track.pl===
+
 
+
This script inputs a single base pairing track into JBrowse. A base pairing track is a distinctive track type that represents base pairing between nucleotides as arcs.
+
 
+
Terminology: In JBrowse jargon, a ''tile'' is a png image that is used as an entire track. When draw-basepair-track.pl is executed, a tile is created for each zoom level, and the set of generated tiles is used to display the track at all possible zoom levels. This is also the case for wig-to-json.pl.
+
 
+
Basic syntax:
+
bin/draw-basepair-track.pl --gff <gff file> --tracklabel <track name> [options]
+
 
+
[[File:Basepair-options.png|600px|center|thumb|Summary of draw-basepair-track.pl options.]]
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| gff
+
| The name of the gff file that will be used. This option must be specified.
+
|-
+
| tracklabel
+
| The internal name that JBrowse will give to this feature track. This option requires a value.
+
|-
+
| key
+
| The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel.
+
|-
+
| out
+
| A path to the output directory (default is 'data' in the current directory).
+
|-
+
| tile
+
| The directory where the tiles, or images corresponding to each zoom level of the track, are stored. Defaults to data/tiles.
+
|-
+
| bgcolor
+
| The color of the track background. Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "255,255,255".
+
|-
+
| fgcolor
+
| The color of the track foreground (i.e. the base pairing arcs). Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "0,255,0".
+
|-
+
| width
+
| The width in pixels of each tile. The default value is 2000.
+
|-
+
| height
+
| The height in pixels of each tile. Changing this parameter will cause a corresponding change in the top-to-bottom height of the track in JBrowse. The default value is 100.
+
|-
+
| thickness
+
| The thickness of the base pairing arcs in the track. The default value is 2.
+
|-
+
| nolinks
+
| Disables use of file system links to compress duplicate image files.
+
|}
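The color options expect three comma-separated decimal values between 0 and 255. The following sketch validates such a string before it is used; `valid_rgb` is a hypothetical helper, and the file and track names are examples only.

```shell
# Hypothetical helper: succeed only if $1 has the form "R,G,B"
# with each component an integer between 0 and 255.
valid_rgb() {
    echo "$1" | awk -F ',' '
        NF != 3 { exit 1 }
        {
            for (i = 1; i <= 3; i++)
                if ($i !~ /^[0-9]+$/ || $i + 0 > 255) exit 1
        }'
}

bg="255,255,255"   # white background (the documented default)
fg="0,0,255"       # blue arcs instead of the default green

if valid_rgb "$bg" && valid_rgb "$fg" && [ -x bin/draw-basepair-track.pl ]; then
    bin/draw-basepair-track.pl --gff pairs.gff --tracklabel basepairs \
        --bgcolor "$bg" --fgcolor "$fg" --thickness 1
fi
```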
+
 
+
===wig-to-json.pl===
+
 
+
Using a [http://genome.ucsc.edu/goldenPath/help/wiggle.html WIG] file, this script inputs a single wiggle track into JBrowse. In a wiggle track, a numeric value is associated with each nucleotide position in the reference sequence. This is represented in JBrowse as a track that looks like a bar graph, where the horizontal axis is for each nucleotide position, and the vertical axis is for the number associated with that position. The vertical axis currently does not have a scale; rather, the heights for each position are relative to each other.
+
 
+
Special Dependencies: libpng
+
 
+
In order to use wig-to-json.pl, the code for wig2png must be compiled. This can be done with the following command:
+
 
+
make
+
 
+
'''Note:''' If you are using Mac OS X, it might be necessary to execute 'make' in the following way:
+
 
+
make GCC_LIB_ARGS=-L/usr/X11/lib GCC_INC_ARGS=-I/usr/X11/include
+
 
+
----
+
 
+
Terminology: In JBrowse jargon, a ''tile'' is a PNG image that is used as an entire track. When wig-to-json.pl is executed, a tile is created for each zoom level, and the set of generated tiles is used to display the track at all possible zoom levels. This is also the case for draw-basepair-track.pl.
+
 
+
Basic syntax:
+
bin/wig-to-json.pl --wig <wig file> --tracklabel <track name> [options]
+
 
+
Hint: If you are using this type of track to plot a measure of a prediction's quality, where the possible scores fall between a known lower bound and upper bound (for instance, between 0 and 1), you can specify these bounds with the min and max options.
+
 
+
[[File:Wiggle-options.png|600px|center|thumb|Summary of wig-to-json.pl options.]]
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| wig
+
| The name of the wig file that will be used. This option must be specified.
+
|-
+
| tracklabel
+
| The internal name that JBrowse will give to this feature track. This option requires a value.
+
|-
+
| key
+
| The external, human-readable label seen on the feature track when it is viewed in JBrowse. The value of key defaults to the value of tracklabel.
+
|-
+
| out
+
| A path to the output directory (default is 'data' in the current directory).
+
|-
+
| tile
+
| The directory where the tiles, or images corresponding to each zoom level of the track, are stored. Defaults to data/tiles.
+
|-
+
| bgcolor
+
| The color of the track background. Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "255,255,255".
+
|-
+
| fgcolor
+
| The color of the track foreground (i.e. the vertical bars of the wiggle track). Specified as "RED,GREEN,BLUE" in base ten numbers between 0 and 255. Defaults to "105,155,111".
+
|-
+
| width
+
| The width in pixels of each tile. The default value is 2000.
+
|-
+
| height
+
| The height in pixels of each tile. Changing this parameter will cause a corresponding change in the top-to-bottom height of the track in JBrowse. The default value is 100.
+
|-
+
| min
+
| The lower bound to use for the track. By default, this is the lowest value in the wiggle file.
+
|-
+
| max
+
| The upper bound to use for the track. By default, this is the highest value in the wiggle file.
+
|}
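As a rough sketch of how the options in the table combine, the snippet below assembles a wig-to-json.pl command for a hypothetical file of quality scores bounded between 0 and 1. The file name and track label are placeholders, and the options are assumed to be passed as double-dash flags in the same way as --wig and --tracklabel in the basic syntax. The command is printed rather than executed so that it can be checked first.

```shell
# Placeholder inputs: substitute your own wiggle file and track label.
WIG_FILE="quality.wig"
TRACK_LABEL="prediction_quality"

# --min and --max pin the vertical range to the known score bounds (0 to 1)
# instead of the lowest and highest values found in the wiggle file.
CMD="bin/wig-to-json.pl --wig $WIG_FILE --tracklabel $TRACK_LABEL --min 0 --max 1"

# Print the command for review before running it.
echo "$CMD"
```

Fixing the bounds this way should keep tracks that share the same score range visually comparable, since each one is drawn against the same range rather than against its own extremes.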
+
 
+
==Naming==
+
 
+
===generate-names.pl===
+
 
+
This script is only important if your feature tracks are annotated with names (e.g. the name of a gene in a track containing genes). If the 'autocomplete' option was used when inputting a track, running this script will make the locations of that track's features searchable via the small search text box (next to the "Go" button). Clicking on Go after entering a search term will take you to the annotation element that you searched for.
+
 
+
Basic syntax:
+
bin/generate-names.pl
+
 
+
Note that generate-names.pl does not require any arguments. However, some options are available:
+
 
+
{| class="wikitable"
+
|-
+
! Option
+
! Value
+
|-
+
| dir
+
| A path to the output directory (default is 'data/names' in the current directory).
+
|-
+
| thresh
+
| A lower bound on the Patricia trie chunk size. Specifically, the lowest possible chunk size is (thresh + 1). The default value is 200. In this context, a chunk is a group of connected Patricia trie nodes that can be visualized as a single entity, and the chunk size is the total number of genomic features contained in a chunk. The lower the value of thresh, the more chunks there will be.
+
|-
+
| verbose
+
| This setting causes information about the division of nodes into chunks to be printed to the screen.
+
|}
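The following is a minimal sketch of a generate-names.pl call that uses all three options. The thresh value of 100 is only an illustration, and the options are assumed to be passed as double-dash flags like those of the other scripts. The snippet also works out the smallest chunk size implied by the chosen thresh, following the (thresh + 1) rule from the table.

```shell
# Illustrative values; 'data/names' is the documented default directory.
NAMES_DIR="data/names"
THRESH=100

# The smallest possible chunk size is thresh + 1.
MIN_CHUNK=$((THRESH + 1))

CMD="bin/generate-names.pl --dir $NAMES_DIR --thresh $THRESH --verbose"

# Print the command and the implied minimum chunk size for review.
echo "$CMD"
echo "smallest possible chunk size: $MIN_CHUNK"
```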
+
 
+
==Removing Tracks==
+
 
+
JBrowse does not provide a script that removes individual tracks, but there are several ways to change or remove a track:
+
 
+
'''1. Overwrite the unwanted track with a new track.''' This is useful when a mistake was made in preparing a track and you only want to remove it so that you can replace it with a correct track that has the same tracklabel (the 'tracklabel' is a track's internal name). To do this, load the new track using the same value for the tracklabel option.
+
 
+
'''2. Remove the entire data directory.''' This is useful when you want to completely remove a track or set of tracks, rather than replacing them with different tracks. This is perhaps the fastest way to remove a track, but it has the obvious pitfall that you might also remove tracks that you wanted to keep. This option works well if you don't have very many feature tracks, or if biodb-to-json.pl is being used to generate most of the feature tracks (in which case most of the tracks can be recovered with a single execution of biodb-to-json.pl).
+
 
+
'''3. Remove the information about the specific tracks from the data directory.''' This allows you to remove a track without removing every track, combining the advantages of the previous two methods for removing a set of tracks. The disadvantage is that you must manually remove an entry from a file that is interpreted by JBrowse. The important part to remove will be in trackInfo.js if you want to remove a feature track or refSeqs.js if you want to remove a sequence track.
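Because the third method involves editing a file by hand, it may be worth keeping a copy of the file first. The sketch below shows one way to do that. It works on a stand-in data directory created in a scratch location so that it can be run safely; in a real installation DATA_DIR would point at JBrowse's own data directory, and the same steps would apply to refSeqs.js.

```shell
# Scratch directory with a stand-in file, so the sketch is safe to run as-is.
DATA_DIR="$(mktemp -d)/data"
mkdir -p "$DATA_DIR"
printf 'stand-in for the real file contents\n' > "$DATA_DIR/trackInfo.js"

# Keep a copy before hand-editing, so a broken edit can be undone.
cp "$DATA_DIR/trackInfo.js" "$DATA_DIR/trackInfo.js.bak"

# ... remove the unwanted track's entry from trackInfo.js here ...

# If JBrowse no longer loads its tracks after the edit, restore the copy.
cp "$DATA_DIR/trackInfo.js.bak" "$DATA_DIR/trackInfo.js"
```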
+
 
+
=Additional Information=
+
 
+
==Using Config Files==
+
 
+
In the context of JBrowse, a config file is a set of instructions in JSON syntax that first indicates the location of the feature data, and then specifies a list of JBrowse tracks that can use the referenced feature data. The options for the feature tracks are virtually the same in the config file as they are in flatfile-to-json.pl. The difference is that, instead of inputting the feature tracks one at a time with flatfile-to-json.pl, the tracks are specified all at once in a file. This greatly reduces the amount of typing needed to change a track, especially for tracks that use several options. It also makes it easier to manage a large number of tracks, since the options used for those tracks are all recorded in a human-readable way.
+
 
+
 
+
Here is a sample config file with each line explained. Note that, in order for this config file to work, it would be necessary to remove the grey comments (since JSON does not support them).
+
 
+
{
+
  <span style="color:#888888">This is the header. It contains information about the database.</span>
+
  <span style="color:#888888">description: a brief textual description of the data source.</span>
+
  "description": "D. melanogaster (release 5.37)",
+
  <span style="color:#888888">db_adaptor: a perl module with methods for opening databases and extracting<br>  information. This will normally be either Bio::DB::SeqFeature::Store,<br>  Bio::DB::Das::Chado, or Bio::DB::GFF.</span>
+
  "db_adaptor": "Bio::DB::SeqFeature::Store",
+
  <span style="color:#888888">db_args: arguments required to produce an instance of the db_adaptor. The<br>  required arguments can be found by searching for the db_adaptor on the CPAN<br>  website.</span>
+
  "db_args": {
+
              <span style="color:#888888">adaptor: With Bio::DB::SeqFeature::Store, a value of "memory"<br>              for the adaptor indicates that the data is stored somewhere in<br>              the file system. Alternatively, it might have been stored in a<br>              database such as MySQL or BerkeleyDB.</span>
+
              "-adaptor": "memory",
+
              <span style="color:#888888">dir: given the "memory" argument for the adaptor, this is the<br>              file system path to the directory that holds the data files.<br>              Data will automatically be extracted from any *.gff<br>              or *.gff3 files in this directory.</span>
+
              "-dir": "/Users/stephen/Downloads/dmel_r5.37"
+
            },
+
  <span style="color:#888888">This is the body. It contains information about the feature tracks.</span>
+
  <span style="color:#888888">TRACK DEFAULTS: The default options for every track.</span>
+
  "TRACK DEFAULTS": {
+
    <span style="color:#888888">class: same as 'cssClass' in flatfile-to-json.pl.</span>
+
    "class": "feature"
+
  },
+
  <span style="color:#888888">tracks: information about each individual track.</span>
+
  "tracks": [
+
    <span style="color:#888888">Information about the first track.</span>
+
    {
+
      <span style="color:#888888">track: same as 'tracklabel' in flatfile-to-json.pl.</span>
+
      "track": "gene",
+
      <span style="color:#888888">key: same meaning as in flatfile-to-json.pl.</span>
+
      "key": "Gene Span",
+
      <span style="color:#888888">feature: an array of the feature types that will be used for the track.<br>      Similar to 'type' in flatfile-to-json.pl.</span>
+
      "feature": ["gene"],
+
      <span style="color:#888888">autocomplete: same meaning as in flatfile-to-json.pl.</span>
+
      "autocomplete": "all",
+
      "class": "feature2",
+
      <span style="color:#888888">urlTemplate: same meaning as in flatfile-to-json.pl. Note how <br>      urlTemplate is being used with a variable called "feature_id" defined<br>      in extraData. In this way, different features in the same track can<br>      be linked to different pages on FlyBase.</span>
+
      "urlTemplate": "http://flybase.org/cgi-bin/fbidq.html?{feature_id}",
+
      <span style="color:#888888">extraData: same as in flatfile-to-json.pl.</span>
+
      "extraData": {"feature_id": "sub {shift-&gt;attributes(\"load_id\");}"}
+
    },
+
    <span style="color:#888888">Information about the second track.</span>
+
    {
+
      "track": "mRNA",
+
      "feature": ["mRNA"],
+
      "autocomplete": "alias",
+
      <span style="color:#888888">subfeatures: similar to 'getSubs' in flatfile-to-json.pl.</span>
+
      "subfeatures": true,
+
      "key": "mRNA",
+
      "class": "transcript",
+
      <span style="color:#888888">subfeature_classes: same as 'subfeatureClasses' in flatfile-to-json.pl.</span>
+
      "subfeature_classes": {
+
        "CDS": "transcript-CDS",
+
        "five_prime_UTR": "transcript-five_prime_UTR",
+
        "three_prime_UTR": "transcript-three_prime_UTR"
+
      },
+
      <span style="color:#888888">arrowheadClass: same meaning as in flatfile-to-json.pl.</span>
+
      "arrowheadClass": "transcript-arrowhead",
+
      <span style="color:#888888">clientConfig: same meaning as in flatfile-to-json.pl.</span>
+
      "clientConfig": {
+
        "histScale":5
+
      },
+
      "urlTemplate": "http://flybase.org/cgi-bin/fbidq.html?{feature_id}",
+
      "extraData": {"feature_id": "sub {shift-&gt;attributes(\"load_id\");}"}
+
    }
+
  ]
+
}
+
 
+
 
+
Note how the config file is divided into two parts, a header section that contains information about the database, and a body section that contains information about the feature tracks.
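Since the comments have to be removed before the file is valid JSON, it can help to check the syntax of a config file before passing it to a JBrowse script. The sketch below writes a trimmed, comment-free version of the sample to a temporary file and checks it with Python's json.tool module. This assumes python3 is installed, and it is not a JBrowse tool. The -dir path is a placeholder.

```shell
# Write a comment-free config to a temporary file (placeholder -dir path).
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
{
  "description": "D. melanogaster (release 5.37)",
  "db_adaptor": "Bio::DB::SeqFeature::Store",
  "db_args": { "-adaptor": "memory", "-dir": "/path/to/gff/dir" },
  "TRACK DEFAULTS": { "class": "feature" },
  "tracks": [
    { "track": "gene", "key": "Gene Span", "feature": ["gene"], "class": "feature2" }
  ]
}
EOF

# json.tool exits with a non-zero status if the file is not valid JSON.
if python3 -m json.tool "$CONF" > /dev/null 2>&1; then
  RESULT="valid"
else
  RESULT="invalid"
fi
echo "$RESULT"
```

This only checks JSON syntax; it does not check that the option names or values are ones JBrowse understands.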
+
 
+
==Using a Database Backend==
+
 
+
'''Giving JBrowse Access to a Database'''
+
 
+
JBrowse is capable of extracting sequence and feature information from databases such as PostgreSQL, MySQL, BerkeleyDB, and Oracle. This is done by using prepare-refseqs.pl or biodb-to-json.pl with a config file whose header section contains information about the database.
+
 
+
For a PostgreSQL database with the Chado schema, the config file header would look something like this:
+
 
+
{
+
  "description": "D. melanogaster (release 5.37)",
+
  "db_adaptor": "Bio::DB::Das::Chado",
+
  "db_args": { "-dsn": "dbi:Pg:dbname=fruitfly;host=localhost;port=5432",
+
                "-user": "yourusername",
+
                "-pass": "yourpassword"
+
              },
+
  ...
+
}
+
 
+
In the database source name (dsn) argument, 'dbi:Pg' indicates that you are using PostgreSQL, and the dbname, host, and port were specified when the database was created with PostgreSQL's createdb command. The user and pass arguments were specified when the PostgreSQL user account was created with the createuser command. Collectively, these arguments identify the database and give the Bio::DB::Das::Chado object access to it.
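To make the structure of the dsn argument concrete, the sketch below builds the same string from its three parts. The values are the ones used in the example header. The commented psql line is an optional, separate connectivity check that assumes the psql client is installed; it is not part of JBrowse.

```shell
# Connection details, as chosen when the database was created.
DBNAME="fruitfly"
DBHOST="localhost"
DBPORT="5432"

# 'dbi:Pg' selects PostgreSQL; the rest identifies the database.
DSN="dbi:Pg:dbname=${DBNAME};host=${DBHOST};port=${DBPORT}"
echo "$DSN"

# Optional check that the same details work outside JBrowse:
# psql -h "$DBHOST" -p "$DBPORT" -U yourusername -d "$DBNAME" -c 'SELECT 1;'
```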
+
 
+
Assuming that you already have access to an existing database with the Chado schema and the feature data you're interested in, this is all you need in order to use JBrowse with the database.
+
 
+
'''Preparing a Database From Scratch'''
+
 
+
The way to set up the database is as follows:<br>
+
1. Install the database management system (DBMS).<br>
+
2. Import the appropriate schema into the database.<br>
+
3. Import the sequence and feature data into the database.
+
 
+
As an example, try to prepare a PostgreSQL database with the Chado schema. Chado-1.11 can be downloaded [http://sourceforge.net/projects/gmod/files/gmod/chado-1.11/ here] and most of the information you need to know about Chado installation can be found in [http://gmod.svn.sourceforge.net/viewvc/gmod/schema/trunk/chado/INSTALL.Chado INSTALL.Chado]. If you choose to install the latest stable version of PostgreSQL (9.0.4 as of this date), you might encounter a few quirks:
+
 
+
When you create new users, you might have to explicitly request a password prompt with the '--pwprompt' option. If you were able to create a user account ''without'' specifying a password for that user, no password will work for that account.
+
 
+
When you are running 'make load_schema', you might get an error message about a failure to drop a database that does not exist. If this error is encountered, open bin/test_load.sh and comment out this line:
+
 
+
dropdb -h $DBHOST -p $DBPORT -U $DBUSER $DBNAME;
+
 
+
The resulting line should look like this:
+
 
+
# dropdb -h $DBHOST -p $DBPORT -U $DBUSER $DBNAME;
+
 
+
When you are running 'make ontologies', you will be given a list of ontologies that you can install. At the very least be sure to install the Relationship Ontology, Sequence Ontology, Gene Ontology, and Chado Feature Properties.
+
 
+
There are two GMOD scripts that are used to insert data from a fasta or gff file into the database:
+
 
+
1. '''gmod_gff3_preprocessor.pl''' standardizes the gff file, sorting the feature data and moving any fasta sequences to a separate file.
+
 
+
Basic syntax:
+
gmod_gff3_preprocessor.pl --gfffile <gff file>
+
 
+
2. '''gmod_bulk_load_gff3.pl''' uses the output of '''gmod_gff3_preprocessor.pl''' to input data into the database.
+
 
+
Fasta syntax:
+
gmod_bulk_load_gff3.pl --organism <common name> --fastafile <fasta-formatted sequence file>
+
 
+
GFF syntax:
+
gmod_bulk_load_gff3.pl --organism <common name> --gfffile <processed gff file>
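Put together, the commands above might look like the following sketch. All file names and the organism name are hypothetical placeholders. In particular, PROCESSED_GFF stands for whatever file the preprocessor actually writes, which is not specified here. The commands are listed in the same order as the syntax lines above and are printed rather than executed.

```shell
# Hypothetical placeholders; substitute your own values.
ORGANISM="fruitfly"
GFF_FILE="features.gff"
FASTA_FILE="sequences.fasta"
PROCESSED_GFF="features.processed.gff"   # stand-in for the preprocessor's output

STEP1="gmod_gff3_preprocessor.pl --gfffile $GFF_FILE"
STEP2="gmod_bulk_load_gff3.pl --organism $ORGANISM --fastafile $FASTA_FILE"
STEP3="gmod_bulk_load_gff3.pl --organism $ORGANISM --gfffile $PROCESSED_GFF"

# Print the three commands, one per line, for review.
printf '%s\n' "$STEP1" "$STEP2" "$STEP3"
```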
+
 
+
After loading this data into the database, JBrowse should be able to access it using a config file with a header like the one at the beginning of this section.
+
 
+
=See also=
+
 
+
* [[JBrowse Tutorial]]
+
 
+
=External Links=
+
 
+
* [http://genome.cshlp.org/content/19/9/1630.full JBrowse: A Next Generation Genome Browser]
+
* [http://jbrowse.org/code/jbrowse-master/docs/ Documentation from the JBrowse Package]
+
* [http://biowiki.org/view/JBrowse/QuickTutorial Quick JBrowse Tutorial from BioWiki]
+
  
 
[[Category:JBrowse]]
 

Latest revision as of 17:25, 29 March 2013