Using Existing Databases With JBrowse

Revision as of 21:48, 28 July 2011

This page explains the configuration necessary to allow JBrowse server-side scripts to use information from a database.

Giving JBrowse Access to a Database

JBrowse can extract sequence and feature information from databases managed by a database management system (DBMS) such as PostgreSQL, MySQL, or Oracle. To do so, run prepare-refseqs.pl or biodb-to-json.pl with a config file whose header section describes the database connection.

For example, the config file header for a PostgreSQL database with the Chado schema will look something like this:

{
  "description": "D. melanogaster (release 5.37)",
  "db_adaptor": "Bio::DB::Das::Chado",
  "db_args": { "-dsn": "dbi:Pg:dbname=fruitfly;host=localhost;port=5432",
               "-user": "yourusername",
               "-pass": "yourpassword"
             },
  ...
}

In the database source name (dsn) argument, 'dbi:Pg' indicates that you are using PostgreSQL. The dbname is the name the database was given when it was created (for example, with PostgreSQL's createdb command), and host and port identify the PostgreSQL server on which it runs. The user and pass arguments are the credentials of the PostgreSQL user account (created, for example, with the createuser command). Collectively, these arguments identify the database and give the Bio::DB::Das::Chado object access to it. Other adaptors (Bio::DB::SeqFeature::Store, Bio::DB::GFF, etc.) will require similar information.
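To make the structure of the dsn string concrete, here is a minimal Python sketch that splits it into its driver and its key/value parameters. The function name parse_dbi_dsn is hypothetical and is not part of JBrowse or DBI; it assumes the simple 'dbi:Driver:key=value;key=value' layout used in the example above.

```python
def parse_dbi_dsn(dsn):
    """Split a DSN such as 'dbi:Pg:dbname=x;host=y' into (driver, params).

    Hypothetical helper for illustration only; it handles just the simple
    'dbi:Driver:key=value;key=value' form shown in the config example.
    """
    # The first two colons separate the 'dbi' prefix, the driver, and the rest.
    scheme, driver, rest = dsn.split(":", 2)
    if scheme.lower() != "dbi":
        raise ValueError(f"not a DBI data source name: {dsn!r}")

    params = {}
    for pair in rest.split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        params[key.strip()] = value.strip()
    return driver, params


driver, params = parse_dbi_dsn("dbi:Pg:dbname=fruitfly;host=localhost;port=5432")
print(driver)            # Pg
print(params["dbname"])  # fruitfly
print(params["port"])    # 5432 (kept as a string)
```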

Assuming that you already have access to an existing database with an appropriate schema (Chado, in this example), this is all you will need in order to use JBrowse with that database.
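If you maintain several databases, the header can be generated rather than typed by hand. The following is a sketch only: build_config_header is a hypothetical helper, it covers just the PostgreSQL/Chado case from the example, and it produces only the header fields. The rest of the config file (elided with "..." in the example above) still has to be supplied separately.

```python
import json


def build_config_header(description, dbname, host, port, user, password):
    """Return the header fields of a config file for a Chado database.

    Hypothetical helper: it mirrors the example header on this page and
    assumes PostgreSQL ('dbi:Pg') with the Bio::DB::Das::Chado adaptor.
    """
    dsn = f"dbi:Pg:dbname={dbname};host={host};port={port}"
    return {
        "description": description,
        "db_adaptor": "Bio::DB::Das::Chado",
        "db_args": {
            "-dsn": dsn,
            "-user": user,
            "-pass": password,
        },
    }


header = build_config_header(
    description="D. melanogaster (release 5.37)",
    dbname="fruitfly",
    host="localhost",
    port=5432,
    user="yourusername",
    password="yourpassword",
)
# Print the header as JSON; the remaining config sections are not generated here.
print(json.dumps(header, indent=2))
```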

Preparing a Database That JBrowse Can Use

As a demonstration, this section describes the process of preparing a PostgreSQL database with the Chado schema.

1. Install the DBMS.

If you already have PostgreSQL 8.1 or higher, proceed to the next step. Otherwise, the latest stable version of PostgreSQL can be downloaded from the PostgreSQL Downloads Page.

2. Import the appropriate schema.

Chado can be downloaded from the GMOD SourceForge repository. Most of what you need to know about installing Chado can be found in the INSTALL.Chado file.

When running 'make ontologies' for Chado, you will be given a list of ontologies that you can install. At a minimum, install the Relationship Ontology, Sequence Ontology, Gene Ontology, and Chado Feature Properties.

3. Import the sequence and feature data.

Two GMOD scripts are used to load data from a FASTA or GFF3 file into a database with the Chado schema:

1. gmod_gff3_preprocessor.pl standardizes the GFF3 file, sorting the feature data and moving any FASTA sequences to a separate file.

Basic syntax:

gmod_gff3_preprocessor.pl --gfffile <gff file>

2. gmod_bulk_load_gff3.pl uses the output of gmod_gff3_preprocessor.pl to load data into the database.

FASTA syntax:

gmod_bulk_load_gff3.pl --organism <common name> --fastafile <fasta-formatted sequence file>

GFF syntax:

gmod_bulk_load_gff3.pl --organism <common name> --gfffile <processed gff file>
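The two scripts can be strung together as in the sketch below, which only assembles and prints the command lines and does not run anything. The helper name chado_load_commands and all file names are hypothetical; only the flags shown in the syntax lines above are used. The preprocessor comes first because the bulk loader consumes its output; the FASTA load is listed before the GFF load simply because that is the order in which this page presents them, which may not be the order your data requires. The names of the files written by the preprocessor are not assumed here, so the caller passes them in.

```python
import shlex


def chado_load_commands(organism, gff_file, processed_gff, fasta_file=None):
    """Return the load commands as argument lists, in the order to run them.

    Hypothetical helper. 'processed_gff' and 'fasta_file' are the files
    produced by gmod_gff3_preprocessor.pl; their names are supplied by the
    caller rather than guessed.
    """
    commands = [["gmod_gff3_preprocessor.pl", "--gfffile", gff_file]]
    if fasta_file is not None:
        commands.append(
            ["gmod_bulk_load_gff3.pl", "--organism", organism,
             "--fastafile", fasta_file]
        )
    commands.append(
        ["gmod_bulk_load_gff3.pl", "--organism", organism,
         "--gfffile", processed_gff]
    )
    return commands


# Example with made-up file names; shlex.join (Python 3.8+) quotes arguments
# so each printed line can be pasted into a shell.
for argv in chado_load_commands(
    organism="fruitfly",
    gff_file="fruitfly.gff3",
    processed_gff="fruitfly.processed.gff3",
    fasta_file="fruitfly.fasta",
):
    print(shlex.join(argv))
```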

After this data has been loaded into the database, JBrowse should be able to access it using a config file with a header like the one shown at the beginning of this page.

See also