Genome grid

From GMOD
Revision as of 04:50, 5 September 2007 by Dongilbert (Talk | contribs)


Genome Grid Aims

This project aims to create a usable package for genome data analysis on cyberinfrastructure (methods, protocols, and documentation) suited to genome informaticians.

The thrust of this work is parallelizing genome data, not software: running as many separate 1-CPU jobs as suit the task and resources. It focuses on data management: transporting data to and from compute sites, indexing, transparently splitting data from several source data sets across those sites, and collating results to return to the scientist.
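As an illustration of the data-splitting idea, here is a minimal Python sketch that divides FASTA records into roughly equal chunks, one per 1-CPU job. It is not taken from the genogrid package; the function names `read_fasta` and `split_records` and the greedy size-balancing strategy are assumptions made for this example.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)

def read_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, seq = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)

def split_records(records: Iterable[Record], n_chunks: int) -> List[List[Record]]:
    """Assign records, longest first, to whichever chunk is currently smallest.

    Hypothetical balancing rule: chunk size is measured in sequence residues,
    so each job gets a similar amount of work. Empty chunks are dropped.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[Record]] = [[] for _ in range(n_chunks)]
    sizes = [0] * n_chunks
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        i = sizes.index(min(sizes))  # index of the smallest chunk so far
        chunks[i].append((header, seq))
        sizes[i] += len(seq)
    return [c for c in chunks if c]
```

Each resulting chunk could then be written to its own file and shipped to a compute site as the input of one independent job.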

The poster-child task is a gene homology BLAST analysis of any genome, but use of several other genomics programs (gene predictors, EST assemblers, phylogeny analyses, etc.) is part of the project goal. Most of these work fine on any size of data set, and results from subsets can be combined.
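Combining subset results can be sketched as follows, assuming each job wrote BLAST hits in the standard 12-column tab-delimited format (query ID in column 1, bit score in column 12) and that the queries, not the database, were split across jobs. The function name `collate_hits` is hypothetical and not part of the project code.

```python
from typing import Iterable, List

def collate_hits(chunk_outputs: Iterable[str]) -> List[List[str]]:
    """Merge tabular BLAST output from several jobs into one sorted hit list.

    Assumes 12 tab-separated columns per line. Blank lines and "#" comment
    lines are skipped. Hits are ordered by query ID, then by descending
    bit score.
    """
    rows: List[List[str]] = []
    for text in chunk_outputs:
        for line in text.splitlines():
            if not line.strip() or line.startswith("#"):
                continue
            rows.append(line.split("\t"))
    rows.sort(key=lambda r: (r[0], -float(r[11])))
    return rows
```

If the database rather than the query set were split, simple concatenation would not be enough, because BLAST E-values depend on the size of the database searched.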

One way to do this is as a kind of TeraGrid science gateway project, where authentication, administration, and grid resource discovery are handled by the gateway components. The parts that the user genomicist sees are for data and analysis tool selection. Many desired genome tools are available at some TeraGrid sites, but methods to transparently copy and parallelize data sets are not.

Find more background at http://iubio.bio.indiana.edu/biogrid/genome-on-teragrid-poster.html , or search Google for: genome teragrid

Package contents will be provided through the GMOD umbrella. See the starter project at http://gmod.cvs.sourceforge.net/gmod/genogrid/