Difference between revisions of "TIGR-Workflow / Ergatis"

From GMOD
Revision as of 16:28, 6 November 2007

What is it?

Screenshot of the Ergatis pipeline monitor

(From http://ergatis.sourceforge.net):

Ergatis is a web-based utility used to create, run, and monitor reusable computational analysis pipelines. It contains pre-built components for common bioinformatics analysis tasks. These components can be arranged graphically to form highly configurable pipelines. Each analysis component supports multiple output formats, including the Bioinformatic Sequence Markup Language (BSML). The current implementation includes support for loading data into project databases that follow the Chado schema, a highly normalized, community-supported schema for storage of biological annotation data.

Ergatis uses the Workflow engine to process its work on a compute grid. Workflow provides an XML language and processing engine for specifying the steps of a computational pipeline. It provides detailed execution status and logging for process auditing, facilitates error recovery from the point of failure, and is highly scalable, with support for distributed computing environments. The XML format allows commands to be run serially, in parallel, and in any combination or level of nesting.
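
The idea of nesting serial and parallel command groups can be illustrated with a small Python sketch. This is not the Workflow engine or its XML format; the class names (Command, CommandSet), the run function, and the step names are all hypothetical and exist only to show how a serial group can contain a parallel group that must finish before the next serial step starts.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Command:
    """A single step (hypothetical stand-in for one pipeline command)."""
    name: str
    action: Callable[[], None]


@dataclass
class CommandSet:
    """A group of steps run either one after another or concurrently."""
    mode: str  # "serial" or "parallel"
    children: List[Union["CommandSet", Command]] = field(default_factory=list)


def run(node: Union[CommandSet, Command], log: List[str]) -> None:
    """Execute a node and append each finished command name to log."""
    if isinstance(node, Command):
        node.action()
        log.append(node.name)
        return
    if node.mode == "serial":
        # Each child starts only after the previous one has finished.
        for child in node.children:
            run(child, log)
    elif node.mode == "parallel":
        # Children run concurrently; the group ends when all have finished.
        workers = max(1, len(node.children))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(run, child, log) for child in node.children]
            for future in futures:
                future.result()  # re-raise any exception from a child
    else:
        raise ValueError(f"unknown mode: {node.mode!r}")


def build_example() -> CommandSet:
    """Serial pipeline with a nested parallel group (step names are made up)."""
    def noop() -> None:
        pass

    return CommandSet("serial", [
        Command("split_input", noop),
        CommandSet("parallel", [
            Command("search_chunk_1", noop),
            Command("search_chunk_2", noop),
        ]),
        Command("merge_results", noop),
    ])


log: List[str] = []
run(build_example(), log)
```

After the run, log starts with split_input and ends with merge_results; the two search steps appear in between, in whichever order they happened to finish. A real engine would additionally persist per-step status so that a failed run could resume from the point of failure, which this sketch does not attempt.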

This framework has been employed in the annotation of several large eukaryotic organisms, including Aedes aegypti and Trichomonas vaginalis.

More information is available at:

http://ergatis.sourceforge.net

How is it a part of GMOD?

Screenshot of the Ergatis pipeline builder

Currently, only loosely. As described above, Ergatis has been used at TIGR (now part of JCVI) for the majority of annotation and comparative genomics computation, and the output of many of its components (such as BLAST, gene prediction, and clustering) can be loaded automatically into a Chado database instance. Because Ergatis was used primarily at TIGR/JCVI, database support was previously limited to Sybase, though flat files could also be generated. Development is now underway to add support for PostgreSQL and Oracle.

Joshua Orvis is the lead developer of Ergatis and is currently at the Institute for Genome Sciences at the University of Maryland School of Medicine.