Standard URL

Revision as of 22:44, 15 September 2009 by Clements (Talk | contribs)

In order to simplify the retrieval of common datasets, the Generic Model Organisms Database (GMOD) community has recommended a series of standard URLs, or a common download URL. Each participating MOD has an index page like the ones below, describing the species and datasets that are available.

MOD Standard URL

Genome datasets available through the GMOD common URL
MOD Standard URL Description
WormBase Caenorhabditis elegans and related nematodes
wFleaBase Daphnia pulex and related crustaceans
DroSpeGe Twelve Drosophila insect species genomes

About GMOD Standard URL

This standard specifies the following URLs, all located under a common base on each MOD's site. The top-level index is an HTML-formatted page that contains links to each of the species available through common URLs. See also Todd Harris's PowerPoint presentation given at the Spring 2005 GMOD meeting. The uses for these common URLs are twofold:

  • Keep it simple for scientists to guess where to find a genome, when they may be unfamiliar with the MOD website.
  • Keep it standard for programmers, who can code against a long-lasting, machine-parsable data URL with defined data formats and no guesswork about spelling.

Standard URL Description
/genome/Binomial_name An index page for species "Binomial_name". This will be an HTML-format page containing links to each of the genome releases.
/genome/Binomial_name/release Leads to index for the named release. It should be an HTML-format page containing links to each of the data sets described below.
/genome/Binomial_name/current Leads to an index of the most current release, symbolic link style.
/genome/Binomial_name/current/dna Returns a FASTA file containing big DNA fragments (e.g. chromosomes). MIME type is application/x-fasta.
/genome/Binomial_name/current/mrna Returns a FASTA file containing spliced mRNA transcript sequences. MIME type is application/x-fasta.
/genome/Binomial_name/current/ncrna Returns a FASTA file containing non-coding RNA sequences. MIME type is application/x-fasta.
/genome/Binomial_name/current/protein Returns a FASTA file containing all the protein sequences known to be encoded by the genome. MIME type is application/x-fasta.
/genome/Binomial_name/current/feature Returns a GFF3 file describing genome annotations. MIME type is application/x-gff3.
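
The URL scheme in the table above can be assembled mechanically. The following is a minimal Python sketch, not part of the GMOD recommendation itself: the function name `standard_url`, the `DATASET_MIME_TYPES` mapping, and the base URL `http://mod.example.org` are all hypothetical and used only for illustration. The dataset names and MIME types are taken from the table.

```python
from typing import Optional

# Dataset names and MIME types as listed in the table above.
DATASET_MIME_TYPES = {
    "dna": "application/x-fasta",
    "mrna": "application/x-fasta",
    "ncrna": "application/x-fasta",
    "protein": "application/x-fasta",
    "feature": "application/x-gff3",
}


def standard_url(base: str, binomial_name: str,
                 release: str = "current",
                 dataset: Optional[str] = None) -> str:
    """Build a URL of the form base/genome/Binomial_name/release[/dataset]."""
    if dataset is not None and dataset not in DATASET_MIME_TYPES:
        raise ValueError(f"unknown dataset: {dataset!r}")
    # The standard writes the binomial name with an underscore, not a space.
    species = binomial_name.strip().replace(" ", "_")
    parts = [base.rstrip("/"), "genome", species, release]
    if dataset is not None:
        parts.append(dataset)
    return "/".join(parts)


# Hypothetical base URL, for illustration only.
print(standard_url("http://mod.example.org", "Caenorhabditis elegans", dataset="dna"))
```

A client that wants to check the response could compare the returned Content-Type header against `DATASET_MIME_TYPES[dataset]`, assuming the server sets the MIME types the standard lists.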

Other names for this: Common download URL, Common URL, Standard URL

Note: MODs may optionally provide URLs in the short form G_species (e.g. C_elegans) as a convenience for users. This should be supplied in addition to the full Binomial_name form, not in place of it.
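
As a rough illustration of the short form, the sketch below derives it from a full binomial name. It assumes, based only on the C_elegans example, that the short form is the genus initial, an underscore, and the species epithet. The function name `short_form` is hypothetical.

```python
def short_form(binomial_name: str) -> str:
    """Derive a G_species short form such as C_elegans from a binomial name."""
    # Accept either "Genus species" or "Genus_species" as input.
    parts = binomial_name.replace("_", " ").split()
    if len(parts) < 2:
        raise ValueError(f"expected genus and species, got: {binomial_name!r}")
    genus, species = parts[0], parts[1]
    # Assumption: genus initial in upper case, species epithet in lower case.
    return f"{genus[0].upper()}_{species.lower()}"


print(short_form("Caenorhabditis_elegans"))  # C_elegans
```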

Common /genome pages

These projects provide data at /genome/, though not yet in the common formats described above.

Common /genome/ data pages
MOD Common URL Description
Medicago Medicago truncatula plant genome
MaizeGDB Maize corn genome
Neurospora Neurospora crassa
Human vector insects genomes

MOD Non-Standard URL

For genome projects that have not yet standardized their URLs, the following site lists what is available:

Reference Genomes

See also