Difference between revisions of "Chado Expression Module"

From GMOD

Revision as of 15:42, 21 February 2007

Introduction

This module describes how curated expression data is stored in Chado. It is totally dependent on the sequence module: objects in the genetic module cannot connect to expression data except via the sequence module. We assume that there will always be a controlled vocabulary for expression data.

Here is an example of a simple case of the sort of data that FlyBase curates. The dpp transcript is expressed in embryonic stage 13-15 in the cephalic segment as reported in a paper by Blackman et al. in 1991. This would be implemented in the expression module by linking the dpp transcript feature to expression via feature_expression (we would add a pub_id column to feature_expression to link to the publication in the pub table). We would then link the following cvterms to the expression using expression_cvterm:

  • embryonic stage 13 where the cvterm_type would be stage and the rank=0
  • embryonic stage 14 where the cvterm_type would be stage and the rank=1
  • embryonic stage 15 where the cvterm_type would be stage and the rank=1
  • cephalic segment where the cvterm_type would be anatomy and the rank=0
  • in situ hybridization where the cvterm_type would be assay and the rank=0
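
The dpp example can be sketched with an in-memory SQLite database. This is only an illustration under stated assumptions: the tables are heavily simplified stand-ins for the Chado tables named above, feature_expression already carries the proposed pub_id column, and the feature name, publication label, and expression uniquename are hypothetical. The term names, cvterm_type values, and ranks are copied from the list.

```python
import sqlite3

# Simplified stand-ins for the Chado tables named in the text; the real
# tables carry more columns and constraints than this sketch does.
SCHEMA = """
CREATE TABLE pub (pub_id INTEGER PRIMARY KEY, uniquename TEXT NOT NULL);
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE expression (expression_id INTEGER PRIMARY KEY, uniquename TEXT NOT NULL);
CREATE TABLE feature_expression (
    feature_expression_id INTEGER PRIMARY KEY,
    feature_id INTEGER NOT NULL REFERENCES feature (feature_id),
    expression_id INTEGER NOT NULL REFERENCES expression (expression_id),
    pub_id INTEGER NOT NULL REFERENCES pub (pub_id)  -- the proposed pub link
);
CREATE TABLE expression_cvterm (
    expression_cvterm_id INTEGER PRIMARY KEY,
    expression_id INTEGER NOT NULL REFERENCES expression (expression_id),
    cvterm_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    cvterm_type TEXT NOT NULL,
    rank INTEGER NOT NULL DEFAULT 0
);
"""

# (term name, cvterm_type, rank), copied from the list in the text.
DPP_TERMS = [
    ("embryonic stage 13", "stage", 0),
    ("embryonic stage 14", "stage", 1),
    ("embryonic stage 15", "stage", 1),
    ("cephalic segment", "anatomy", 0),
    ("in situ hybridization", "assay", 0),
]


def build_dpp_example():
    """Create the sketch schema and load the dpp expression statement."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Hypothetical labels for the publication, feature, and expression rows.
    pub_id = conn.execute(
        "INSERT INTO pub (uniquename) VALUES (?)", ("Blackman et al. 1991",)
    ).lastrowid
    feature_id = conn.execute(
        "INSERT INTO feature (name) VALUES (?)", ("dpp",)
    ).lastrowid
    expression_id = conn.execute(
        "INSERT INTO expression (uniquename) VALUES (?)", ("dpp_expression_1",)
    ).lastrowid
    conn.execute(
        "INSERT INTO feature_expression (feature_id, expression_id, pub_id) "
        "VALUES (?, ?, ?)",
        (feature_id, expression_id, pub_id),
    )
    for name, cvterm_type, rank in DPP_TERMS:
        cvterm_id = conn.execute(
            "INSERT INTO cvterm (name) VALUES (?)", (name,)
        ).lastrowid
        conn.execute(
            "INSERT INTO expression_cvterm "
            "(expression_id, cvterm_id, cvterm_type, rank) VALUES (?, ?, ?, ?)",
            (expression_id, cvterm_id, cvterm_type, rank),
        )
    conn.commit()
    return conn


def terms_for_feature(conn, feature_name):
    """Return (cvterm_type, term name, rank) rows linked to a feature."""
    return conn.execute(
        """
        SELECT ec.cvterm_type, t.name, ec.rank
        FROM feature f
        JOIN feature_expression fe ON fe.feature_id = f.feature_id
        JOIN expression_cvterm ec ON ec.expression_id = fe.expression_id
        JOIN cvterm t ON t.cvterm_id = ec.cvterm_id
        WHERE f.name = ?
        ORDER BY ec.cvterm_type, ec.rank, t.name
        """,
        (feature_name,),
    ).fetchall()
```

Calling terms_for_feature(build_dpp_example(), "dpp") walks from the feature through feature_expression to the attached cvterms, which is the path the paragraph above describes.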

Note that we would change the cvterm_type column to cvterm_type_id and use a cvterm_id for a particular expression slot (i.e. stage, anatomy, assay, 'subcellular location'), and that cvterms from different OBO ontologies can share the same cvterm_type.
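
One way to picture the proposed change is to make cvterm_type_id a foreign key into cvterm, with the slot names kept in a controlled vocabulary of their own. The sketch below assumes this layout. The cv names ("expression slots", "ontology A", "ontology B") and the placeholder term names are hypothetical, and the tables are simplified.

```python
import sqlite3


def build_slot_example():
    """Attach terms from two ontologies to one expression in the same slot."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE cvterm (
            cvterm_id INTEGER PRIMARY KEY,
            cv_id INTEGER NOT NULL REFERENCES cv (cv_id),
            name TEXT NOT NULL,
            UNIQUE (cv_id, name)
        );
        CREATE TABLE expression_cvterm (
            expression_cvterm_id INTEGER PRIMARY KEY,
            expression_id INTEGER NOT NULL,
            cvterm_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
            -- the slot is itself a cvterm rather than free text
            cvterm_type_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
            rank INTEGER NOT NULL DEFAULT 0
        );
        """
    )

    def add_term(cv_name, term_name):
        # Create the vocabulary on first use, then add the term to it.
        conn.execute("INSERT OR IGNORE INTO cv (name) VALUES (?)", (cv_name,))
        (cv_id,) = conn.execute(
            "SELECT cv_id FROM cv WHERE name = ?", (cv_name,)
        ).fetchone()
        return conn.execute(
            "INSERT INTO cvterm (cv_id, name) VALUES (?, ?)", (cv_id, term_name)
        ).lastrowid

    # Hypothetical vocabulary and term names, for illustration only.
    anatomy_slot = add_term("expression slots", "anatomy")
    term_a = add_term("ontology A", "anatomical term from A")
    term_b = add_term("ontology B", "anatomical term from B")
    for rank, term_id in enumerate((term_a, term_b)):
        conn.execute(
            "INSERT INTO expression_cvterm "
            "(expression_id, cvterm_id, cvterm_type_id, rank) VALUES (1, ?, ?, ?)",
            (term_id, anatomy_slot, rank),
        )
    conn.commit()
    return conn


def slots_by_ontology(conn):
    """Return (source ontology, slot name) for every attached term."""
    return conn.execute(
        """
        SELECT cv.name, slot.name
        FROM expression_cvterm ec
        JOIN cvterm term ON term.cvterm_id = ec.cvterm_id
        JOIN cv ON cv.cv_id = term.cv_id
        JOIN cvterm slot ON slot.cvterm_id = ec.cvterm_type_id
        ORDER BY cv.name
        """
    ).fetchall()
```

Both rows resolve to the same slot term even though the attached terms come from different vocabularies.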

Rad and Expression

The Rad module and the Expression module can be considered overlapping but complementary. The Rad module can store data taken directly from the experimental results, whereas the Expression module is typically used to store summary data taken from the biological literature, or extracted from the microarray data stored in Rad. The Rad module handles details about experiments that the Expression module does not, whereas the Expression module can be thought of as a simpler set of tables designed to tie ontologies concerned with expression to sequence features.


Tables

expression_cvterm

WARNING open question

What are the possibilities of combination when more than one cvterm is used in a field?

For example (in here): <t> E | early <a> <p> anterior & dorsal

If the two terms used in a particular field are co-equal (both from the same CV), is the relation always "&"? May we find "or"?

Obviously another case is when a bodypart term and a bodypart qualifier term are used in a specific field, e.g.: <t> L | third instar <a> larval antennal segment sensilla | subset <p>

WRT the three-part <t><a><p> statements, are the values in the different parts *always* from different vocabularies in proforma.CV? If not, we'll need to have some kind of type qualifier telling us whether the cvterm used is <t>, <a>, or <p>.

Yes, we should have a type qualifier, as a cv term can come from different vocabularies: e.g. blastoderm can be both a body part term and a stage term in the Drosophila anatomy. However, cvterm_type_id needs to refer to a cv instead of being a free-text type.
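
The blastoderm case can be sketched as follows, assuming cvterm_type_id is a foreign key into cvterm. The same term is attached once as a stage and once as anatomy, and a type that is not a known cvterm is rejected. The table layout, the ids, and the uniqueness constraint are assumptions made for this illustration, not the actual Chado definition.

```python
import sqlite3


def build_typed_example():
    """Create a minimal schema where the slot type must be a known cvterm."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        PRAGMA foreign_keys = ON;  -- SQLite leaves FK checks off by default
        CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE expression_cvterm (
            expression_id INTEGER NOT NULL,
            cvterm_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
            cvterm_type_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
            -- assumed constraint: one row per (expression, term, slot)
            UNIQUE (expression_id, cvterm_id, cvterm_type_id)
        );
        INSERT INTO cvterm (cvterm_id, name)
        VALUES (1, 'stage'), (2, 'anatomy'), (3, 'blastoderm');
        """
    )
    return conn


def try_attach(conn, expression_id, cvterm_id, cvterm_type_id):
    """Attach a term in a slot; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO expression_cvterm VALUES (?, ?, ?)",
                (expression_id, cvterm_id, cvterm_type_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout, try_attach(conn, 1, 3, 1) and try_attach(conn, 1, 3, 2) both succeed, recording blastoderm as a stage and as anatomy, while a type id with no matching cvterm row fails the foreign-key check.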