Difference between revisions of "GBrowse Adaptors"

Latest revision as of 16:22, 7 August 2012

GBrowse has a flexible adaptor system (yes, it is spelled that way, not "adapter") that lets it run on various types of databases and data sources. A common question is "Which adaptor should I be using?" This page attempts to answer that question.

Adaptor	Other required software	Roughly how many users	Pros	Cons
Bio::DB::SeqFeature::Store (use bp_seqfeature_load.pl)	MySQL, PostgreSQL, SQLite, BerkeleyDB	Many and growing fast.	Roughly 4X faster than Bio::DB::GFF for the same data; designed to work with GFF3	Developed for use with GFF3; about 2X slower than Bio::DB::GFF to load a database
Bio::DB::GFF (use bp_load_gff.pl, bp_bulk_load_gff.pl, bp_fast_load_gff.pl)	A relational database server: MySQL, PostgreSQL, Oracle, or BerkeleyDB	Lots! (Especially MySQL)	Quite fast; large user base; required if your data is in the (now deprecated) GFF2 format	Does not work well with GFF3-formatted data
Bio::DB::Sam (available from CPAN)	SAMtools	Growing (particularly with GBrowse2)	Very fast access to NextGen sequencing data	Difficult to use with GBrowse 1.70
Bio::DB::BigWig and Bio::DB::BigWigSet (available from CPAN)	UCSC Formats	Growing (particularly with GBrowse2)	Very fast access to data in bigWig format	Difficult to use with GBrowse 1.70
Bio::DB::BigBed (available from CPAN)	UCSC Formats	Growing (particularly with GBrowse2)	Very fast access to data in bigBed format	Difficult to use with GBrowse 1.70
Bio::DB::Das::Chado (available from CPAN)	PostgreSQL and a Chado schema	Relatively few due to the specialized nature of Chado	Allows 'live' viewing of the features in a Chado database	Slow compared to Bio::DB::GFF
Bio::DB::Das::BioSQL (available from CPAN)	MySQL and a BioSQL schema	Relatively few due to the small number of BioSQL users	Allows 'live' viewing of the features in a BioSQL database	Slow compared to Bio::DB::GFF
Memory (i.e., a flat-file database using either Bio::DB::GFF or Bio::DB::SeqFeature::Store)	None	For real servers, none	Easy for rapid development and testing	Very slow for more than a few thousand features
LuceGene	Lucene (searches indexed flat files)	Relatively few
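
As a rough illustration of where the adaptor choice ends up, a GBrowse 2 data source configuration names the adaptor in a database stanza. The sketch below assumes a Bio::DB::SeqFeature::Store database held in MySQL; the stanza name `volvox`, the DSN, and the user name are placeholders chosen for the example, not values taken from this page.

```
# Hypothetical database stanza in a GBrowse 2 data source .conf file.
# "volvox", the DSN, and the user name are illustrative placeholders.
[volvox:database]
db_adaptor = Bio::DB::SeqFeature::Store
db_args    = -adaptor DBI::mysql
             -dsn     dbi:mysql:volvox
             -user    nobody
```

Switching to a different adaptor from the table generally means changing `db_adaptor` and supplying whatever `db_args` that adaptor expects; check the adaptor's own documentation for the exact arguments.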

Email Threads

There have been some useful email threads on adaptor choices and tradeoffs.

Memory Database, 2010/06

@@ Line 9: / Line 9: @@
 ! Cons
 |-
-| {{BPM|Bio::DB::SeqFeature::Store}}
+| {{BPM|Bio::DB::SeqFeature::Store}} (use bp_seqfeature_load.pl)
 | [[MySQL]], [[PostgreSQL]], SQLite, BerkeleyDB
 | Many and growing fast.
@@ Line 15: / Line 15: @@
 | Developed for use with [[GFF3]]; about 2X slower than Bio::DB::GFF to load a database
 |-
-| {{BPM|Bio::DB::GFF}}
+| {{BPM|Bio::DB::GFF}} (use bp_load_gff.pl, bp_bulk_load_gff.pl, bp_fast_load_gff.pl)
 | A [[Glossary#Database Management System|relational database server]]: [[MySQL]], [[PostgreSQL]], Oracle, or BerkeleyDB
 | Lots! (Especially [[MySQL]])
@@ Line 21: / Line 21: @@
 | Does not work well with [[GFF3]] formatted data
 |-
-| Bio::DB::Sam (available from CPAN)
+| {{CPAN|Bio::DB::Sam}} (available from CPAN)
 | [http://samtools.sourceforge.net/ SAMtools]
 | Growing (particularly with GBrowse2)
@@ Line 27: / Line 27: @@
 | Difficult to use with GBrowse 1.70
 |-
-| Bio::DB::Das::Chado (available from CPAN)
+| {{CPAN|Bio::DB::BigWig}} and {{CPAN|Bio::DB::BigWigSet}} (available from CPAN)
+| [http://genome.ucsc.edu/FAQ/FAQformat.html UCSC Formats]
+| Growing (particularly with GBrowse2)
+| Very fast access to data in [http://genome.ucsc.edu/FAQ/FAQformat.html#format6.1 bigWig] format
+| Difficult to use with GBrowse 1.70
+|-
+| {{CPAN|Bio::DB::BigBed}} (available from CPAN)
+| [http://genome.ucsc.edu/FAQ/FAQformat.html UCSC Formats]
+| Growing (particularly with GBrowse2)
+| Very fast access to data in [http://genome.ucsc.edu/FAQ/FAQformat.html#format1.5 bigBed] format
+| Difficult to use with GBrowse 1.70
+|-
+| {{CPAN|Bio::DB::Das::Chado}} (available from CPAN)
 | [[PostgreSQL]] and a [[Chado]] [[Glossary#Database Schema|schema]]
 | Relatively few due to the specialized nature of Chado
@@ Line 33: / Line 45: @@
 | Slow compared to Bio::DB::GFF
 |-
-| Bio::DB::Das::BioSQL (available from CPAN)
+| {{CPAN|Bio::DB::Das::BioSQL}} (available from CPAN)
 | [[MySQL]] and a [[BioSQL]] schema
 | Relatively few due to the small number of BioSQL users
@@ Line 55: / Line 67: @@
 There have been some useful email threads on adaptor choices and tradeoffs.
-* {{NabbleThreadLink|Memory-Database-td862590.html#a862590|Memory Database}}, 2010/06
+* {{NabbleThreadLink|Memory-Database-td862590.html|Memory Database}}, 2010/06
 [[Category:GBrowse]]
