Chado Mage Module

From GMOD
Revision as of 15:41, 21 February 2007 by 72.89.239.54

Introduction

The Rad module is designed to store data from microarray experiments. It is based on the RAD database but has been substantially modified to contain the necessary foreign keys and satisfy the Chado naming conventions.

Rad and Expression

The Rad module and the Expression module overlap but are complementary. The Rad module can store data taken directly from experimental results, whereas the Expression module is typically used to store summary data taken from the biological literature or extracted from the microarray data stored in Rad. The Rad module handles details about experiments that the Expression module does not, whereas the Expression module can be thought of as a simpler set of tables designed to tie ontologies concerned with expression to sequence features.

Usage

  1. Assume that forebrain is a record in the cvterm table from an anatomy ontology.
  2. Create a biomaterial record for the forebrain sample in which the expression was observed. In this example the organism_id would be that of Drosophila melanogaster.
  3. Create a biomaterialprop record to link records from 1 and 2.
  4. Create or use an arraydesign record for the assay platform. This could be something like Drosophila2 (an Affymetrix platform), or even a string like features if we just want to report expression or lack thereof for all genes in the assayed sample.
  5. Create an assay record to represent the event where the forebrain sample was measured. It depends on the record in 4.
  6. Link records from 2 and 5 in assay_biomaterial. The relationship here is many-to-many between assays and biomaterials because of multichannel and multiplexed assay technology.
  7. Create an acquisition record that depends on 5. This is how the assay's results were digitized, typically using a digital camera or scanner, but it can refer to any data acquired from the assay in general.
  8. Create an analysis record. This is the algorithm that is used to process the data from 7.
  9. Create a quantification record. It depends on 7 and 8, and represents data from 7 processed using 8.
  10. Create element records, one per gene that is assayable using 4. Each element record has a nullable attribute where it can point back to feature records to associate elements directly with genomic features.
  11. Create elementresult records, one for each record created in 10 and pointing back to 9 which ultimately links back to the sample. Experimental result data is stored here.
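
The chain of records in steps 1 to 11 can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the Chado DDL: the tables below carry only the columns needed to show the links, the column names are assumptions based on the steps above, and the real schema has more columns and constraints. The gene names, signal values, organism_id and analysis program name are made-up placeholders.

```python
import sqlite3

# Simplified stand-in schema. Column names are illustrative assumptions,
# NOT the real Chado definitions, which are considerably richer.
SCHEMA = """
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE biomaterial (biomaterial_id INTEGER PRIMARY KEY,
                          organism_id INTEGER, name TEXT);
CREATE TABLE biomaterialprop (biomaterialprop_id INTEGER PRIMARY KEY,
                              biomaterial_id INTEGER, type_id INTEGER);
CREATE TABLE arraydesign (arraydesign_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE assay (assay_id INTEGER PRIMARY KEY,
                    arraydesign_id INTEGER, name TEXT);
CREATE TABLE assay_biomaterial (assay_id INTEGER, biomaterial_id INTEGER);
CREATE TABLE acquisition (acquisition_id INTEGER PRIMARY KEY, assay_id INTEGER);
CREATE TABLE analysis (analysis_id INTEGER PRIMARY KEY, program TEXT);
CREATE TABLE quantification (quantification_id INTEGER PRIMARY KEY,
                             acquisition_id INTEGER, analysis_id INTEGER);
CREATE TABLE element (element_id INTEGER PRIMARY KEY,
                      arraydesign_id INTEGER, feature_id INTEGER);
CREATE TABLE elementresult (elementresult_id INTEGER PRIMARY KEY,
                            element_id INTEGER, quantification_id INTEGER,
                            signal REAL);
"""

def insert(conn, table, **cols):
    """Insert one row and return its rowid (the integer primary key, if any)."""
    names = ", ".join(cols)
    marks = ", ".join("?" for _ in cols)
    cur = conn.execute(
        f"INSERT INTO {table} ({names}) VALUES ({marks})", tuple(cols.values())
    )
    return cur.lastrowid

def load_example(conn):
    """Create the records from steps 1-11 for one forebrain sample."""
    conn.executescript(SCHEMA)
    forebrain = insert(conn, "cvterm", name="forebrain")                        # 1
    sample = insert(conn, "biomaterial", organism_id=1, name="forebrain sample")  # 2
    insert(conn, "biomaterialprop", biomaterial_id=sample, type_id=forebrain)   # 3
    design = insert(conn, "arraydesign", name="Drosophila2")                    # 4
    assay = insert(conn, "assay", arraydesign_id=design, name="forebrain assay")  # 5
    insert(conn, "assay_biomaterial", assay_id=assay, biomaterial_id=sample)    # 6
    acq = insert(conn, "acquisition", assay_id=assay)                           # 7
    analysis = insert(conn, "analysis", program="hypothetical-normalizer")      # 8
    quant = insert(conn, "quantification",
                   acquisition_id=acq, analysis_id=analysis)                    # 9
    # Placeholder genes and signals, invented purely for illustration.
    for gene, signal in [("geneA", 812.0), ("geneB", 3.5)]:
        feat = insert(conn, "feature", name=gene)
        elem = insert(conn, "element",
                      arraydesign_id=design, feature_id=feat)                   # 10
        insert(conn, "elementresult", element_id=elem,
               quantification_id=quant, signal=signal)                          # 11

# Walk back from elementresult to the anatomy term attached to the sample.
QUERY = """
SELECT f.name, er.signal
FROM elementresult er
JOIN element e          ON e.element_id = er.element_id
JOIN feature f          ON f.feature_id = e.feature_id
JOIN quantification q   ON q.quantification_id = er.quantification_id
JOIN acquisition ac     ON ac.acquisition_id = q.acquisition_id
JOIN assay_biomaterial ab ON ab.assay_id = ac.assay_id
JOIN biomaterialprop bp ON bp.biomaterial_id = ab.biomaterial_id
JOIN cvterm t           ON t.cvterm_id = bp.type_id
WHERE t.name = ?
ORDER BY f.name
"""

def results_for_tissue(conn, tissue):
    """Return (feature name, signal) pairs measured in samples tagged with tissue."""
    return conn.execute(QUERY, (tissue,)).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_example(conn)
    print(results_for_tissue(conn, "forebrain"))
```

The query follows the same path the steps describe, in reverse: from elementresult through quantification, acquisition and assay to the biomaterial and its anatomy term.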

To answer the question of what is expressed in forebrain, you can either store a boolean for expressed/not expressed, or store the quantitative data and have some algorithm determine from those data what is or is not expressed. The latter is less lossy but also less straightforward for the casual observer to interpret.
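
As a rough sketch of the second option, the function below derives expressed/not expressed calls from stored quantitative values using a fixed cutoff. The function name, the cutoff and the sample values are all hypothetical. A simple threshold is only one possible algorithm and is not something the Rad module prescribes.

```python
def call_expressed(signals, threshold):
    """Map each element name to True if its signal meets the cutoff, else False.

    `signals` is a dict of {element name: quantitative signal}; `threshold`
    is an arbitrary cutoff chosen by whoever interprets the data.
    """
    return {name: value >= threshold for name, value in signals.items()}


if __name__ == "__main__":
    # Placeholder values, invented for illustration only.
    stored = {"geneA": 812.0, "geneB": 3.5}
    print(call_expressed(stored, threshold=100.0))
```

Because the quantitative values stay in the database, the cutoff can be changed later without reloading anything, which is what makes this option less lossy than storing only the boolean.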


Tables