Chado Tables
From GMOD
Table: db
A database authority. Typical databases in bioinformatics are FlyBase, GO, UniProt, NCBI, MGI, etc. The authority is generally known by this shortened form, which is unique within the bioinformatics and biomedical realm. To do: add support for URIs and URNs (e.g. LSIDs). This could be done by treating the URL as a URI; however, some applications may expect a URI to be resolvable, so this is still to be decided.
| Name | Type | Description |
|---|---|---|
| db_id | serial | PRIMARY KEY | |
| name | character varying(255) | UNIQUE NOT NULL | |
| description | character varying(255) | ||
| urlprefix | character varying(255) | ||
| url | character varying(255) |
Tables referencing this one via Foreign Key Constraints:
Table: dbxref
A unique, global, public, stable identifier. It is not necessarily an external reference: it can also reference data items inside the particular Chado instance being used. Typically a row in a table can be uniquely identified with a primary identifier (called dbxref_id); a table may also have secondary identifiers (in a linking table <T>_dbxref). A dbxref is generally written as <DB>:<ACCESSION> or as <DB>:<ACCESSION>:<VERSION>.
| Name | Type | Description |
|---|---|---|
| dbxref_id | serial | PRIMARY KEY | |
| db_id | integer | UNIQUE#1 NOT NULL | |
| accession | character varying(255) | UNIQUE#1 NOT NULL The local part of the identifier. Guaranteed by the db authority to be unique for that db. | |
| version | character varying(255) | UNIQUE#1 NOT NULL DEFAULT ''::character varying | |
| description | text |
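
As a minimal sketch of the <DB>:<ACCESSION> and <DB>:<ACCESSION>:<VERSION> notation described above, the following Python splits a written dbxref into the values that would be stored in db.name, dbxref.accession and dbxref.version. The `Dbxref` type and both function names are hypothetical, and the sketch assumes that neither the db name nor the accession contains a colon, which the schema itself does not enforce.

```python
from typing import NamedTuple


class Dbxref(NamedTuple):
    db: str         # value of db.name, e.g. "GO"
    accession: str  # value of dbxref.accession
    version: str    # value of dbxref.version; '' when absent, like the column default


def parse_dbxref(text: str) -> Dbxref:
    """Split '<DB>:<ACCESSION>' or '<DB>:<ACCESSION>:<VERSION>' into its parts.

    Assumption: no part contains a colon.
    """
    parts = text.split(":")
    if len(parts) == 2:
        db, accession = parts
        version = ""
    elif len(parts) == 3:
        db, accession, version = parts
    else:
        raise ValueError(f"not a dbxref: {text!r}")
    if not db or not accession:
        raise ValueError(f"empty db or accession in {text!r}")
    return Dbxref(db, accession, version)


def format_dbxref(ref: Dbxref) -> str:
    """Write a Dbxref back in its usual textual form."""
    base = f"{ref.db}:{ref.accession}"
    return f"{base}:{ref.version}" if ref.version else base


print(parse_dbxref("GO:0008150"))
print(format_dbxref(Dbxref("EXAMPLEDB", "AB000001", "2")))  # hypothetical db name
```

Storing an empty string rather than NULL for a missing version mirrors the DEFAULT '' on dbxref.version, so the (db_id, accession, version) uniqueness key always compares concrete values.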
Tables referencing this one via Foreign Key Constraints:
Table: project
| Name | Type | Description |
|---|---|---|
| project_id | serial | PRIMARY KEY | |
| name | character varying(255) | UNIQUE NOT NULL | |
| description | character varying(255) | NOT NULL |
Tables referencing this one via Foreign Key Constraints:
Table: tableinfo
| Name | Type | Description |
|---|---|---|
| tableinfo_id | serial | PRIMARY KEY | |
| name | character varying(30) | UNIQUE NOT NULL | |
| primary_key_column | character varying(30) | ||
| is_view | integer | NOT NULL | |
| view_on_table_id | integer | ||
| superclass_table_id | integer | ||
| is_updateable | integer | NOT NULL DEFAULT 1 | |
| modification_date | date | NOT NULL DEFAULT now() |
Tables referencing this one via Foreign Key Constraints:
Table: cv
A controlled vocabulary or ontology. A cv is composed of cvterms (AKA terms, classes, types, universals - relations and properties are also stored in cvterm) and the relationships between them.
| Name | Type | Description |
|---|---|---|
| cv_id | serial | PRIMARY KEY | |
| name | character varying(255) | UNIQUE NOT NULL The name of the ontology, which uniquely identifies the cv. In OBO file format, cv.name corresponds to the namespace. | |
| definition | text | A text description of the criteria for membership of this ontology. |
Tables referencing this one via Foreign Key Constraints:
Table: cvterm
A term, class, universal or type within an ontology or controlled vocabulary. This table is also used for relations and properties. cvterms constitute nodes in the graph defined by the collection of cvterms and cvterm_relationships.
| Name | Type | Description |
|---|---|---|
| cvterm_id | serial | PRIMARY KEY | |
| cv_id | integer | UNIQUE#1 NOT NULL The cv or ontology or namespace to which this cvterm belongs. | |
| name | character varying(1024) | UNIQUE#1 NOT NULL A concise human-readable name or label for the cvterm. Uniquely identifies a cvterm within a cv. | |
| definition | text | A human-readable text definition. | |
| dbxref_id | integer | UNIQUE NOT NULL Primary identifier dbxref - The unique global OBO identifier for this cvterm. Note that a cvterm may have multiple secondary dbxrefs - see also table: cvterm_dbxref. | |
| is_obsolete | integer | UNIQUE#1 NOT NULL Boolean 0=false,1=true; see GO documentation for details of obsoletion. Note that two terms with different primary dbxrefs may exist if one is obsolete. | |
| is_relationshiptype | integer | NOT NULL Boolean 0=false,1=true relations or relationship types (also known as Typedefs in OBO format, or as properties or slots) form a cv/ontology in themselves. We use this flag to indicate whether this cvterm is an actual term/class/universal or a relation. Relations may be drawn from the OBO Relations ontology, but are not exclusively drawn from there. |
Tables referencing this one via Foreign Key Constraints:
Table: cvterm_dbxref
In addition to the primary identifier (cvterm.dbxref_id) a cvterm can have zero or more secondary identifiers/dbxrefs, which may refer to records in external databases. The exact semantics of cvterm_dbxref are not fixed. For example, the dbxref could be a PubMed ID that is pertinent to the cvterm, or it could be an equivalent or similar term in another ontology. GO cvterms, for instance, are typically linked to InterPro IDs, even though the nature of the relationship between them is largely one of statistical association. The dbxref may have data records attached in the same database instance, or it could be a "hanging" dbxref pointing to some external database. NOTE: If the desired objective is to link two cvterms together, and the nature of the relation is known and holds for all instances of the subject cvterm, then consider instead using cvterm_relationship together with a well-defined relation.
| Name | Type | Description |
|---|---|---|
| cvterm_dbxref_id | serial | PRIMARY KEY | |
| cvterm_id | integer | UNIQUE#1 NOT NULL | |
| dbxref_id | integer | UNIQUE#1 NOT NULL | |
| is_for_definition | integer | NOT NULL A cvterm.definition should be supported by one or more references. If this column is true, the dbxref is not for a term in an external database - it is a dbxref for provenance information for the definition. |
Table: cvterm_relationship
A relationship linking two cvterms. Each cvterm_relationship constitutes an edge in the graph defined by the collection of cvterms and cvterm_relationships. The meaning of the cvterm_relationship depends on the definition of the cvterm R referred to by type_id. However, in general the definitions are such that the statement "all SUBJs REL some OBJ" is true. The cvterm_relationship statement is about the subject, not the object. For example "insect wing part_of thorax".
| Name | Type | Description |
|---|---|---|
| cvterm_relationship_id | serial | PRIMARY KEY | |
| type_id | integer | UNIQUE#1 NOT NULL The nature of the relationship between subject and object. Note that relations are also housed in the cvterm table, typically from the OBO relationship ontology, although other relationship types are allowed. | |
| subject_id | integer | UNIQUE#1 NOT NULL The subject of the subj-predicate-obj sentence. The cvterm_relationship is about the subject. In a graph, this typically corresponds to the child node. | |
| object_id | integer | UNIQUE#1 NOT NULL The object of the subj-predicate-obj sentence. The cvterm_relationship refers to the object. In a graph, this typically corresponds to the parent node. |
Table: cvtermpath
The reflexive transitive closure of the cvterm_relationship relation.
| Name | Type | Description |
|---|---|---|
| cvtermpath_id | serial | PRIMARY KEY | |
| type_id | integer | UNIQUE#1 The relationship type that this is a closure over. If null, then this is a closure over ALL relationship types. If non-null, then this references a relationship cvterm - note that the closure will apply to both this relationship AND the OBO_REL:is_a (subclass) relationship. | |
| subject_id | integer | UNIQUE#1 NOT NULL | |
| object_id | integer | UNIQUE#1 NOT NULL | |
| cv_id | integer | NOT NULL Closures will mostly be within one cv. If the closure of a relationship traverses a cv, then this refers to the cv of the object_id cvterm. | |
| pathdistance | integer | UNIQUE#1 The number of steps required to get from the subject cvterm to the object cvterm, counting from zero (reflexive relationship). |
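
To make "reflexive transitive closure" concrete, here is a small Python sketch that derives (subject_id, object_id, pathdistance) rows from a set of cvterm_relationship edges of a single relationship type. It is an illustration only, not the procedure Chado tools use to populate cvtermpath: the function name and the integer ids are hypothetical, the graph is assumed to be acyclic, and cvtermpath_id, type_id and cv_id are left out.

```python
from collections import defaultdict


def closure_rows(edges):
    """Return sorted (subject_id, object_id, pathdistance) rows.

    edges: iterable of (subject_id, object_id) pairs, i.e. child -> parent.
    Every distinct path length between two terms yields its own row.
    """
    parents = defaultdict(set)
    nodes = set()
    for subject_id, object_id in edges:
        parents[subject_id].add(object_id)
        nodes.update((subject_id, object_id))

    rows = set()
    for start in nodes:
        frontier = {start}
        distance = 0  # distance 0 is the reflexive row (start, start, 0)
        # An acyclic graph has no path longer than len(nodes) - 1 edges, so
        # this bound also stops the loop if the input accidentally has a cycle.
        while frontier and distance < len(nodes):
            rows.update((start, obj, distance) for obj in frontier)
            frontier = {p for node in frontier for p in parents[node]}
            distance += 1
    return sorted(rows)


# Hypothetical terms: 1 -> 2 -> 3, plus a direct edge 1 -> 3.
for row in closure_rows([(1, 2), (2, 3), (1, 3)]):
    print(row)
```

Term 1 reaches term 3 by two routes of different length, so the sketch emits both (1, 3, 1) and (1, 3, 2); this is consistent with pathdistance being part of the UNIQUE#1 key in the table above.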
Table: cvtermprop
Additional extensible properties can be attached to a cvterm using this table. Corresponds to -AnnotationProperty- in W3C OWL format.
| Name | Type | Description |
|---|---|---|
| cvtermprop_id | serial | PRIMARY KEY | |
| cvterm_id | integer | UNIQUE#1 NOT NULL | |
| type_id | integer | UNIQUE#1 NOT NULL The name of the property or slot is a cvterm. The meaning of the property is defined in that cvterm. | |
| value | text | UNIQUE#1 NOT NULL DEFAULT ''::text The value of the property, represented as text. Numeric values are converted to their text representation. | |
| rank | integer | UNIQUE#1 NOT NULL Property-Value ordering. Any cvterm can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used. |
Table: cvtermsynonym
A cvterm represents a distinct class or concept. A concept can be referred to by different phrases or names. In addition to the primary name (cvterm.name) there can be a number of alternative aliases or synonyms. For example, "T cell" as a synonym for "T lymphocyte".
| Name | Type | Description |
|---|---|---|
| cvtermsynonym_id | serial | PRIMARY KEY | |
| cvterm_id | integer | UNIQUE#1 NOT NULL | |
| synonym | character varying(1024) | UNIQUE#1 NOT NULL | |
| type_id | integer | The type of synonym: a synonym can be exact, narrower than, or broader than the primary name. |
Table: dbxrefprop
Metadata about a dbxref. Note that this is not defined in the dbxref module, as it depends on the cvterm table. This table has a structure analogous to cvtermprop.
| Name | Type | Description |
|---|---|---|
| dbxrefprop_id | serial | PRIMARY KEY | |
| dbxref_id | integer | UNIQUE#1 NOT NULL | |
| type_id | integer | UNIQUE#1 NOT NULL | |
| value | text | NOT NULL DEFAULT ''::text | |
| rank | integer | UNIQUE#1 NOT NULL |
Table: wwwuser
Keep track of WWW users. This may also be useful in an audit module at some point.
| Name | Type | Description |
|---|---|---|
| wwwuser_id | serial | PRIMARY KEY | |
| username | character varying(32) | UNIQUE NOT NULL | |
| password | character varying(32) | NOT NULL | |
| email | character varying(128) | NOT NULL | |
| profile | text |
Tables referencing this one via Foreign Key Constraints:
Table: wwwuser_cvterm
Track wwwuser interest in cvterms.
| Name | Type | Description |
|---|---|---|
| wwwuser_cvterm_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| cvterm_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuser_expression
Track wwwuser interest in expressions.
| Name | Type | Description |
|---|---|---|
| wwwuser_expression_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| expression_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuser_feature
Track wwwuser interest in features.
| Name | Type | Description |
|---|---|---|
| wwwuser_feature_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| feature_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuser_genotype
Track wwwuser interest in genotypes.
| Name | Type | Description |
|---|---|---|
| wwwuser_genotype_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| genotype_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuser_organism
Track wwwuser interest in organisms.
| Name | Type | Description |
|---|---|---|
| wwwuser_organism_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| organism_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuser_phenotype
Track wwwuser interest in phenotypes.
| Name | Type | Description |
|---|---|---|
| wwwuser_phenotype_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| phenotype_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuser_project
Link wwwuser accounts to projects.
| Name | Type | Description |
|---|---|---|
| wwwuser_project_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| project_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuser_pub
Track wwwuser interest in publications.
| Name | Type | Description |
|---|---|---|
| wwwuser_pub_id | serial | PRIMARY KEY | |
| wwwuser_id | integer | UNIQUE#1 NOT NULL | |
| pub_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Table: wwwuserrelationship
Track wwwuser interest in other wwwusers.
| Name | Type | Description |
|---|---|---|
| wwwuserrelationship_id | serial | PRIMARY KEY | |
| objwwwuser_id | integer | UNIQUE#1 NOT NULL | |
| subjwwwuser_id | integer | UNIQUE#1 NOT NULL | |
| world_read | smallint | NOT NULL DEFAULT 1 |
Generated by PostgreSQL Autodoc
Table: feature
A feature is a biological sequence or a section of a biological sequence, or a collection of such sections. Examples include genes, exons, transcripts, regulatory regions, polypeptides, protein domains, chromosome sequences, sequence variations, cross-genome match regions such as hits and HSPs and so on; see the Sequence Ontology for more.
| Name | Type | Description |
|---|---|---|
| feature_id | serial | PRIMARY KEY | |
| dbxref_id | integer | An optional primary public stable identifier for this feature. Secondary identifiers and external dbxrefs go in the table feature_dbxref. | |
| organism_id | integer | UNIQUE#1 NOT NULL The organism to which this feature belongs. This column is mandatory. | |
| name | character varying(255) | The optional human-readable common name for a feature, for display purposes. | |
| uniquename | text | UNIQUE#1 NOT NULL The unique name for a feature; it need not be particularly human-readable, although this is preferred. This name must be unique for this type of feature within this organism. | |
| residues | text | A sequence of alphabetic characters representing biological residues (nucleic acids, amino acids). This column does not need to be manifested for all features; it is optional for features such as exons where the residues can be derived from the featureloc. It is recommended that the value for this column be manifested for features which may have non-contiguous sublocations (e.g. transcripts), since derivation at query time is non-trivial. For expressed sequence, the DNA sequence should be used rather than the RNA sequence. | |
| seqlen | integer | The length of the residue feature. See column:residues. This column is partially redundant with the residues column, and also with featureloc. This column is required because the location may be unknown and the residue sequence may not be manifested, yet it may be desirable to store and query the length of the feature. The seqlen should always be manifested where the length of the sequence is known. | |
| md5checksum | character(32) | The 32-character checksum of the sequence, calculated using the MD5 algorithm. This is practically guaranteed to be unique for any feature. This column thus acts as a unique identifier on the mathematical sequence. | |
| type_id | integer | UNIQUE#1 NOT NULL A required reference to a table:cvterm giving the feature type. This will typically be a Sequence Ontology identifier. This column is thus used to subclass the feature table. | |
| is_analysis | boolean | NOT NULL DEFAULT false Boolean indicating whether this feature is annotated or the result of an automated analysis. Analysis results also use the companalysis module. Note that the dividing line between analysis and annotation may be fuzzy, this should be determined on a per-project basis in a consistent manner. One requirement is that there should only be one non-analysis version of each wild-type gene feature in a genome, whereas the same gene feature can be predicted multiple times in different analyses. | |
| is_obsolete | boolean | NOT NULL DEFAULT false Boolean indicating whether this feature has been obsoleted. Some chado instances may choose to simply remove the feature altogether, others may choose to keep an obsolete row in the table. | |
| timeaccessioned | timestamp without time zone | NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado. | |
| timelastmodified | timestamp without time zone | NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado. |
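
The relationship between the residues, seqlen and md5checksum columns can be sketched in a few lines of Python. This is a hedged illustration rather than the loader logic of any Chado tool: the function name is hypothetical, and it assumes the checksum is taken over the residue string exactly as stored, with no case folding or whitespace stripping.

```python
import hashlib


def sequence_columns(residues: str) -> dict:
    """Derive the seqlen and md5checksum values for a residue string."""
    digest = hashlib.md5(residues.encode("ascii")).hexdigest()
    return {
        "residues": residues,
        "seqlen": len(residues),
        "md5checksum": digest,  # always 32 hexadecimal characters
    }


cols = sequence_columns("ATGGCCTAA")
print(cols["seqlen"], cols["md5checksum"])
```

Because the checksum depends only on the residues, two feature rows that store identical sequences receive the same md5checksum; it identifies the sequence rather than the feature row.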
Tables referencing this one via Foreign Key Constraints:
Table: feature_cvterm
Associate a term from a cv with a feature, for example, GO annotation.
| Name | Type | Description |
|---|---|---|
| feature_cvterm_id | serial | PRIMARY KEY | |
| feature_id | integer | UNIQUE#1 NOT NULL | |
| cvterm_id | integer | UNIQUE#1 NOT NULL | |
| pub_id | integer | UNIQUE#1 NOT NULL Provenance for the annotation. Each annotation should have a single primary publication (which may be of the appropriate type for computational analyses) where more details can be found. Additional provenance dbxrefs can be attached using feature_cvterm_dbxref. | |
| is_not | boolean | NOT NULL DEFAULT false If this is set to true, then this annotation is interpreted as a NEGATIVE annotation - i.e. the feature does NOT have the specified function, process, component, part, etc. See GO docs for more details. |
Tables referencing this one via Foreign Key Constraints:
Table: feature_cvterm_dbxref
Additional dbxrefs for an association. Rows in the feature_cvterm table may be backed up by dbxrefs. For example, a feature_cvterm association that was inferred via a protein-protein interaction may be backed up by referring to the dbxref for the alternate protein. Corresponds to the WITH column in a GO gene association file (but can also be used for other analogous associations). See http://www.geneontology.org/doc/GO.annotation.shtml#file for more details.
| Name | Type | Description |
|---|---|---|
| feature_cvterm_dbxref_id | serial | PRIMARY KEY | |
| feature_cvterm_id | integer | UNIQUE#1 NOT NULL | |
| dbxref_id | integer | UNIQUE#1 NOT NULL |
Table: feature_cvterm_pub
Secondary pubs for an association. Each feature_cvterm association is supported by a single primary publication. Additional secondary pubs can be added using this linking table (in a GO gene association file, these correspond to any IDs after the pipe symbol in the publications column).
| Name | Type | Description |
|---|---|---|
| feature_cvterm_pub_id | serial | PRIMARY KEY | |
| feature_cvterm_id | integer | UNIQUE#1 NOT NULL | |
| pub_id | integer | UNIQUE#1 NOT NULL |
Table: feature_cvtermprop
Extensible properties for feature to cvterm associations. Examples: GO evidence codes; qualifiers; metadata such as the date on which the entry was curated and the source of the association. See the featureprop table for meanings of type_id, value and rank.
| Name | Type | Description |
|---|---|---|
| feature_cvtermprop_id | serial | PRIMARY KEY | |
| feature_cvterm_id | integer | UNIQUE#1 NOT NULL | |
| type_id | integer | UNIQUE#1 NOT NULL The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. cvterms may come from the OBO evidence code cv. | |
| value | text | The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query. | |
| rank | integer | UNIQUE#1 NOT NULL Property-Value ordering. Any feature_cvterm can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used. |
Table: feature_dbxref
Links a feature to dbxrefs. This is for secondary identifiers; primary identifiers should use feature.dbxref_id.
| Name | Type | Description |
|---|---|---|
| feature_dbxref_id | serial | PRIMARY KEY | |
| feature_id | integer | UNIQUE#1 NOT NULL | |
| dbxref_id | integer | UNIQUE#1 NOT NULL | |
| is_current | boolean | NOT NULL DEFAULT true The is_current boolean indicates whether the linked dbxref is the current -official- dbxref for the linked feature. |
Table: feature_pub
Provenance. Linking table between features and publications that mention them.
| Name | Type | Description |
|---|---|---|
| feature_pub_id | serial | PRIMARY KEY | |
| feature_id | integer | UNIQUE#1 NOT NULL | |
| pub_id | integer | UNIQUE#1 NOT NULL |
Tables referencing this one via Foreign Key Constraints:
Table: feature_pubprop
Property or attribute of a feature_pub link.
| Name | Type | Description |
|---|---|---|
| feature_pubprop_id | serial | PRIMARY KEY | |
| feature_pub_id | integer | UNIQUE#1 NOT NULL | |
| type_id | integer | UNIQUE#1 NOT NULL | |
| value | text | ||
| rank | integer | UNIQUE#1 NOT NULL |
Table: feature_relationship
Features can be arranged in graphs, e.g. "exon part_of transcript part_of gene". If type is thought of as a verb, then each arc or edge makes a statement [Subject Verb Object]. The object can also be thought of as parent (containing feature), and subject as child (contained feature or subfeature). We include the relationship rank/order because, even though most of the time we can order things implicitly by sequence coordinates, we cannot always do this, e.g. for trans-spliced genes. It is also useful for quickly getting implicit introns.
| Name | Type | Description |
|---|---|---|
| feature_relationship_id | serial | PRIMARY KEY | |
| subject_id | integer | UNIQUE#1 NOT NULL The subject of the subj-predicate-obj sentence. This is typically the subfeature. | |
| object_id | integer | UNIQUE#1 NOT NULL The object of the subj-predicate-obj sentence. This is typically the container feature. | |
| type_id | integer | UNIQUE#1 NOT NULL Relationship type between subject and object. This is a cvterm, typically from the OBO relationship ontology, although other relationship types are allowed. The most common relationship type is OBO_REL:part_of. Valid relationship types are constrained by the Sequence Ontology. | |
| value | text | Additional notes or comments. | |
| rank | integer | UNIQUE#1 NOT NULL The ordering of subject features with respect to the object feature may be important (for example, exon ordering on a transcript - not always derivable if you take trans spliced genes into consideration). Rank is used to order these; starts from zero. |
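
The remark about implicit introns can be illustrated with a short Python sketch: given the exons of one transcript, ordered by feature_relationship.rank, the introns are the gaps between consecutive exons. The function name and the coordinates are hypothetical, and the sketch assumes that all exons are located on the same source feature and strand with interbase (fmin, fmax) coordinates, so it does not cover trans-spliced genes.

```python
def implicit_introns(exons):
    """Return intron (fmin, fmax) pairs in rank order.

    exons: iterable of (rank, fmin, fmax) tuples in interbase coordinates.
    """
    ordered = sorted(exons)  # tuples sort by rank first
    introns = []
    for (_, a_min, a_max), (_, b_min, b_max) in zip(ordered, ordered[1:]):
        # Taking min/max makes the gap correct whether rank follows
        # ascending coordinates (forward strand) or descending (reverse).
        lo, hi = min(a_max, b_max), max(a_min, b_min)
        if lo < hi:  # abutting or overlapping exons leave no intron
            introns.append((lo, hi))
    return introns


# Three exons on the forward strand, supplied out of rank order.
print(implicit_introns([(1, 300, 400), (0, 100, 200), (2, 500, 650)]))
```

With interbase coordinates the intron boundaries are simply the neighbouring exon boundaries; no plus or minus one adjustment is needed.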
Tables referencing this one via Foreign Key Constraints:
Table: feature_relationship_pub
Provenance. Attach optional evidence to a feature_relationship in the form of a publication.
| Name | Type | Description |
|---|---|---|
| feature_relationship_pub_id | serial | PRIMARY KEY | |
| feature_relationship_id | integer | UNIQUE#1 NOT NULL | |
| pub_id | integer | UNIQUE#1 NOT NULL |
Table: feature_relationshipprop
Extensible properties for feature_relationships. Analogous in structure to featureprop. This table is largely optional and is not used frequently. A typical scenario is attaching additional data to a feature_relationship, for example to say that the feature_relationship is only true in certain contexts.
| Name | Type | Description |
|---|---|---|
| feature_relationshipprop_id | serial | PRIMARY KEY | |
| feature_relationship_id | integer | UNIQUE#1 NOT NULL | |
| type_id | integer | UNIQUE#1 NOT NULL The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Currently there is no standard ontology for feature_relationship property types. | |
| value | text | The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query. | |
| rank | integer | UNIQUE#1 NOT NULL Property-Value ordering. Any feature_relationship can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used. |
Tables referencing this one via Foreign Key Constraints:
Table: feature_relationshipprop_pub
Provenance for feature_relationshipprop.
| Name | Type | Description |
|---|---|---|
| feature_relationshipprop_pub_id | serial | PRIMARY KEY | |
| feature_relationshipprop_id | integer | UNIQUE#1 NOT NULL | |
| pub_id | integer | UNIQUE#1 NOT NULL |
Table: feature_synonym
Linking table between feature and synonym.
| Name | Type | Description |
|---|---|---|
| feature_synonym_id | serial | PRIMARY KEY | |
| synonym_id | integer | UNIQUE#1 NOT NULL | |
| feature_id | integer | UNIQUE#1 NOT NULL | |
| pub_id | integer | UNIQUE#1 NOT NULL The pub_id link is for relating the usage of a given synonym to the publication in which it was used. | |
| is_current | boolean | NOT NULL DEFAULT true The is_current boolean indicates whether the linked synonym is the current -official- symbol for the linked feature. | |
| is_internal | boolean | NOT NULL DEFAULT false Typically a synonym exists so that somebody querying the db with an obsolete name can find the object they're looking for (under its current name). If the synonym has been used publicly and deliberately (e.g. in a paper), it may also be listed in reports as a synonym. If the synonym was not used deliberately (e.g. there was a typo which went public), then the is_internal boolean may be set to -true- so that it is known that the synonym is -internal- and should be queryable but should not be listed in reports as a valid synonym. |
Table: featureloc
The location of a feature relative to another feature. Important: interbase coordinates are used. This is vital as it allows us to represent zero-length features (e.g. splice sites, insertion points) without an awkward fuzzy system. Features typically have exactly ONE location, but this need not be the case. Some features may not be localized (e.g. a gene that has been characterized genetically but for which no sequence or molecular information is available). Note on multiple locations: Each feature can have 0 or more locations. Multiple locations do NOT indicate non-contiguous locations (if a feature such as a transcript has a non-contiguous location, then the subfeatures such as exons should always be manifested). Instead, multiple featurelocs for a feature designate alternate locations or grouped locations; for instance, a feature designating a BLAST hit or HSP will have two locations, one on the query feature, one on the subject feature. Features representing sequence variation could have alternate locations instantiated on a feature on the mutant strain. The column:rank is used to differentiate these different locations. Reflexive locations should never be stored - this is for -proper- (i.e. non-self) locations only; nothing should be located relative to itself.
| Name | Type | Description |
|---|---|---|
| featureloc_id | serial | PRIMARY KEY | |
| feature_id | integer | UNIQUE#1 NOT NULL The feature that is being located. Any feature can have zero or more featurelocs. | |
| srcfeature_id | integer | The source feature which this location is relative to. Every location is relative to another feature (however, this column is nullable, because the srcfeature may not be known). All locations are -proper- that is, nothing should be located relative to itself. No cycles are allowed in the featureloc graph. | |
| fmin | integer | The leftmost/minimal boundary in the linear range represented by the featureloc. Sometimes (e.g. in Bioperl) this is called -start- although this is confusing because it does not necessarily represent the 5-prime coordinate. Important: This is space-based (interbase) coordinates, counting from zero. To convert this to the leftmost position in a base-oriented system (eg GFF, Bioperl), add 1 to fmin. | |
| is_fmin_partial | boolean | NOT NULL DEFAULT false This is typically false, but may be true if the value for column:fmin is inaccurate or the leftmost part of the range is unknown/unbounded. | |
| fmax | integer | The rightmost/maximal boundary in the linear range represented by the featureloc. Sometimes (e.g. in bioperl) this is called -end- although this is confusing because it does not necessarily represent the 3-prime coordinate. Important: This is space-based (interbase) coordinates, counting from zero. No conversion is required to go from fmax to the rightmost coordinate in a base-oriented system that counts from 1 (e.g. GFF, Bioperl). | |
| is_fmax_partial | boolean | NOT NULL DEFAULT false This is typically false, but may be true if the value for column:fmax is inaccurate or the rightmost part of the range is unknown/unbounded. | |
| strand | smallint | The orientation/directionality of the location. Should be 0, -1 or +1. | |
| phase | integer | Phase of translation with respect to srcfeature_id. Values are 0, 1, 2. It may not be possible to manifest this column for some features such as exons, because the phase is dependent on the spliceform (the same exon can appear in multiple spliceforms). This column is mostly useful for predicted exons and CDSs. | |
| residue_info | text | Alternative residues, when these differ from feature.residues. For instance, a SNP feature located on a wild-type and a mutant protein would have different alternative residues. For alignment/similarity features, the alternative residues are used to represent the alignment string (CIGAR format). Note on variation features: even if we do not want to instantiate a mutant chromosome/contig feature, we can still represent a SNP etc. with 2 locations: one (rank 0) on the genome, and the other (rank 1) with most fields null except for the alternative residues. | |
| locgroup | integer | UNIQUE#1 NOT NULL This is used to manifest redundant, derivable extra locations for a feature. The default locgroup=0 is used for the DIRECT location of a feature. Important: most Chado users may never use featurelocs with locgroup > 0. Transitively derived locations are indicated with locgroup > 0. For example, the position of an exon on a BAC and in global chromosome coordinates. This column is used to differentiate these groupings of locations. The default locgroup 0 is used for the main or primary location, from which the others can be derived via coordinate transformations. Another example of redundant locations is storing ORF coordinates relative to both transcript and genome. Redundant locations open the possibility of the database getting into inconsistent states; this schema gives us the flexibility of both warehouse instantiations with redundant locations (easier for querying) and management instantiations with no redundant locations. An example of using both locgroup and rank: imagine a feature indicating a conserved region between the chromosomes of two different species. We may want to keep redundant locations on both contigs and chromosomes. We would thus have 4 locations for the single conserved region feature - two distinct locgroups (contig level and chromosome level) and two distinct ranks (for the two species). | |
| rank | integer | UNIQUE#1 NOT NULL Used when a feature has >1 location, otherwise the default rank 0 is used. Some features (e.g. blast hits and HSPs) have two locations - one on the query and one on the subject. Rank is used to differentiate these. Rank=0 is always used for the query, Rank=1 for the subject. For multiple alignments, assignment of rank is arbitrary. Rank is also used for sequence_variant features, such as SNPs. Rank=0 indicates the wildtype (or baseline) feature, Rank=1 indicates the mutant (or compared) feature. |
| Name | Constraint |
|---|---|
| featureloc_c2 | CHECK ((fmin <= fmax)) |
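The locgroup and rank example described above (one conserved region, two species, contig and chromosome coordinates) can be sketched with an in-memory SQLite table. This is a minimal illustration, not the Chado DDL: the table keeps only a handful of featureloc columns, the UNIQUE constraint on (feature_id, locgroup, rank) is assumed from the UNIQUE#1 markers, and all IDs and coordinates are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for featureloc (assumption: only the columns needed here).
conn.execute("""
    CREATE TABLE featureloc (
        featureloc_id INTEGER PRIMARY KEY,
        feature_id    INTEGER NOT NULL,
        srcfeature_id INTEGER,
        fmin          INTEGER,
        fmax          INTEGER,
        locgroup      INTEGER NOT NULL DEFAULT 0,
        rank          INTEGER NOT NULL DEFAULT 0,
        UNIQUE (feature_id, locgroup, rank),
        CHECK (fmin <= fmax)
    )
""")

insert_sql = """
    INSERT INTO featureloc (feature_id, srcfeature_id, fmin, fmax, locgroup, rank)
    VALUES (?, ?, ?, ?, ?, ?)
"""

# Feature 1 is the conserved region. Source feature ids are hypothetical:
# 10/20 are contigs and 11/21 are chromosomes of species A and B.
conn.executemany(insert_sql, [
    (1, 10, 100, 500, 0, 0),       # locgroup 0 (contig level), rank 0 (species A)
    (1, 20, 300, 700, 0, 1),       # locgroup 0 (contig level), rank 1 (species B)
    (1, 11, 10100, 10500, 1, 0),   # locgroup 1 (chromosome level), rank 0
    (1, 21, 20300, 20700, 1, 1),   # locgroup 1 (chromosome level), rank 1
])

# Direct locations only (locgroup 0), one row per rank.
direct = conn.execute(
    "SELECT srcfeature_id, rank FROM featureloc "
    "WHERE feature_id = 1 AND locgroup = 0 ORDER BY rank"
).fetchall()

# Number of locations stored in each locgroup.
by_group = dict(conn.execute(
    "SELECT locgroup, COUNT(*) FROM featureloc "
    "WHERE feature_id = 1 GROUP BY locgroup"
).fetchall())

# The CHECK (fmin <= fmax) constraint rejects an inverted interval.
try:
    conn.execute(insert_sql, (2, 10, 50, 40, 0, 0))
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True
```

Filtering on locgroup = 0 is how a query would ignore the redundant, derivable locations and read only the primary ones.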
Tables referencing this one via Foreign Key Constraints:
Table: featureloc_pub
Provenance of featureloc. Linking table between featurelocs and publications that mention them.
| FK | Name | Type | Description |
|---|---|---|---|
| | featureloc_pub_id | serial | PRIMARY KEY |
| | featureloc_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL |
Table: featureprop
A feature can have any number of slot-value property tags attached to it. This is an alternative to hardcoding a list of columns in the relational schema, and is completely extensible.
| FK | Name | Type | Description |
|---|---|---|---|
| | featureprop_id | serial | PRIMARY KEY |
| | feature_id | integer | UNIQUE#1 NOT NULL |
| | type_id | integer | UNIQUE#1 NOT NULL The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Certain property types will only apply to certain feature types (e.g. the anticodon property will only apply to tRNA features); the types here come from the sequence feature property ontology. |
| | value | text | The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query. |
| | rank | integer | UNIQUE#1 NOT NULL Property-Value ordering. Any feature can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used. |
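As a rough sketch of the slot-value pattern described above, the following uses in-memory SQLite tables with only the columns needed. The property names "note" and "score" are hypothetical placeholders rather than terms taken from the sequence feature property ontology, and the helper get_props is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE featureprop (
    featureprop_id INTEGER PRIMARY KEY,
    feature_id     INTEGER NOT NULL,
    type_id        INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    value          TEXT,
    rank           INTEGER NOT NULL DEFAULT 0,
    UNIQUE (feature_id, type_id, rank)
);
""")

# Hypothetical property types.
conn.executemany("INSERT INTO cvterm VALUES (?, ?)", [(1, "note"), (2, "score")])

# Feature 42 has a multi-valued property (ranks 0 and 1) and a
# single-valued numeric property stored as text at the default rank 0.
conn.executemany(
    "INSERT INTO featureprop (feature_id, type_id, value, rank) VALUES (?, ?, ?, ?)",
    [
        (42, 1, "first note", 0),
        (42, 1, "second note", 1),
        (42, 2, "0.95", 0),
    ],
)


def get_props(conn, feature_id, prop_name):
    """Return all values of one property type for a feature, ordered by rank."""
    rows = conn.execute(
        """
        SELECT fp.value
        FROM featureprop AS fp
        JOIN cvterm AS t ON t.cvterm_id = fp.type_id
        WHERE fp.feature_id = ? AND t.name = ?
        ORDER BY fp.rank
        """,
        (feature_id, prop_name),
    ).fetchall()
    return [value for (value,) in rows]


notes = get_props(conn, 42, "note")
score = float(get_props(conn, 42, "score")[0])  # caller converts text back to a number
```

Because value is text, any numeric interpretation is left to the caller, which is the trade-off the value column description mentions.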
Tables referencing this one via Foreign Key Constraints:
Table: featureprop_pub
Provenance. Any featureprop assignment can optionally be supported by a publication.
| FK | Name | Type | Description |
|---|---|---|---|
| | featureprop_pub_id | serial | PRIMARY KEY |
| | featureprop_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL |
Table: synonym
A synonym for a feature. One feature can have multiple synonyms, and the same synonym can apply to multiple features.
| FK | Name | Type | Description |
|---|---|---|---|
| | synonym_id | serial | PRIMARY KEY |
| | name | character varying(255) | UNIQUE#1 NOT NULL The synonym itself. Should be human-readable, machine-searchable ASCII text. |
| | type_id | integer | UNIQUE#1 NOT NULL Types would be symbol and fullname for now. |
| | synonym_sgml | character varying(255) | NOT NULL The fully specified synonym, with any non-ASCII characters encoded in SGML. |
Tables referencing this one via Foreign Key Constraints:
Table: phylonode
This is the most pervasive element in the phylogeny module, cataloging the "phylonodes" of tree graphs. Edges are implied by the parent_phylonode_id reflexive closure. For all nodes in a nested set implementation, the left and right indexes will be *between* the parent's left and right indexes.
| FK | Name | Type | Description |
|---|---|---|---|
| | phylonode_id | serial | PRIMARY KEY |
| | phylotree_id | integer | UNIQUE#1 UNIQUE#2 NOT NULL |
| | parent_phylonode_id | integer | Root phylonode can have null parent_phylonode_id value. |
| | left_idx | integer | UNIQUE#1 NOT NULL |
| | right_idx | integer | UNIQUE#2 NOT NULL |
| | type_id | integer | Type: e.g. root, interior, leaf. |
| | feature_id | integer | Phylonodes can have optional features attached to them, e.g. a protein or nucleotide sequence, usually attached to a leaf of the phylotree. For non-leaf nodes, the feature may be an instance of SO:match; this feature is the alignment of all leaf features beneath it. |
| | label | character varying(255) | |
| | distance | double precision | |
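The nested set property mentioned above (a node's left_idx and right_idx fall between those of its parent) can be illustrated with a small sketch. The toy tree, its labels, and the helper names assign_indexes and descendants are all hypothetical, and the SQLite table is a cut-down stand-in for phylonode rather than the real definition.

```python
import sqlite3

# Hypothetical toy tree, given as parent label -> child labels.
CHILDREN = {"root": ["A", "B"], "A": ["A1", "A2"], "B": [], "A1": [], "A2": []}


def assign_indexes(node, counter, out):
    """Depth-first walk that records (left_idx, right_idx) for each node."""
    left = counter
    counter += 1
    for child in CHILDREN[node]:
        counter = assign_indexes(child, counter, out)
    out[node] = (left, counter)
    return counter + 1


indexes = {}
assign_indexes("root", 1, indexes)

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE phylonode (
        phylonode_id        INTEGER PRIMARY KEY,
        phylotree_id        INTEGER NOT NULL,
        parent_phylonode_id INTEGER REFERENCES phylonode (phylonode_id),
        left_idx            INTEGER NOT NULL,
        right_idx           INTEGER NOT NULL,
        label               TEXT,
        UNIQUE (phylotree_id, left_idx),
        UNIQUE (phylotree_id, right_idx)
    )
""")

ids = {label: i for i, label in enumerate(CHILDREN, start=1)}
parent_of = {child: parent for parent, kids in CHILDREN.items() for child in kids}
for label, (left, right) in indexes.items():
    parent_id = ids.get(parent_of.get(label))  # None for the root
    conn.execute(
        "INSERT INTO phylonode VALUES (?, 1, ?, ?, ?, ?)",
        (ids[label], parent_id, left, right, label),
    )


def descendants(conn, label):
    """All nodes whose indexes lie strictly inside the named node's interval."""
    rows = conn.execute(
        """
        SELECT child.label
        FROM phylonode AS child
        JOIN phylonode AS anc
          ON child.phylotree_id = anc.phylotree_id
         AND child.left_idx  > anc.left_idx
         AND child.right_idx < anc.right_idx
        WHERE anc.label = ?
        ORDER BY child.left_idx
        """,
        (label,),
    ).fetchall()
    return [name for (name,) in rows]
```

With this layout a whole subtree is fetched by one range comparison, without walking parent_phylonode_id recursively.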
Tables referencing this one via Foreign Key Constraints:
Table: phylonode_dbxref
For example, for orthology, paralogy group identifiers; could also be used for NCBI taxonomy; for sequences, refer to phylonode_feature, feature associated dbxrefs.
| FK | Name | Type | Description |
|---|---|---|---|
| | phylonode_dbxref_id | serial | PRIMARY KEY |
| | phylonode_id | integer | UNIQUE#1 NOT NULL |
| | dbxref_id | integer | UNIQUE#1 NOT NULL |
Table: phylonode_organism
This linking table should only be used for nodes in taxonomy trees; it provides a mapping between the node and an organism. One node can have zero or one organisms, one organism can have zero or more nodes (although typically it should only have one in the standard NCBI taxonomy tree).
| FK | Name | Type | Description |
|---|---|---|---|
| | phylonode_organism_id | serial | PRIMARY KEY |
| | phylonode_id | integer | UNIQUE NOT NULL One phylonode cannot refer to >1 organism. |
| | organism_id | integer | NOT NULL |
Table: phylonode_pub
| FK | Name | Type | Description |
|---|---|---|---|
| | phylonode_pub_id | serial | PRIMARY KEY |
| | phylonode_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL |
Table: phylonode_relationship
This is for relationships that are not strictly hierarchical; for example, horizontal gene transfer. Most phylogenetic trees are strictly hierarchical; nevertheless, this table is here for completeness.
| FK | Name | Type | Description |
|---|---|---|---|
| | phylonode_relationship_id | serial | PRIMARY KEY |
| | subject_id | integer | UNIQUE#1 NOT NULL |
| | object_id | integer | UNIQUE#1 NOT NULL |
| | type_id | integer | UNIQUE#1 NOT NULL |
| | rank | integer | |
| | phylotree_id | integer | NOT NULL |
Table: phylonodeprop
| FK | Name | Type | Description |
|---|---|---|---|
| | phylonodeprop_id | serial | PRIMARY KEY |
| | phylonode_id | integer | UNIQUE#1 NOT NULL |
| | type_id | integer | UNIQUE#1 NOT NULL type_id could designate phylonode hierarchy relationships, for example: species taxonomy (kingdom, order, family, genus, species), "ortholog/paralog", "fold/superfold", etc. |
| | value | text | UNIQUE#1 NOT NULL DEFAULT ''::text |
| | rank | integer | UNIQUE#1 NOT NULL |
Table: phylotree
Global anchor for phylogenetic tree.
| FK | Name | Type | Description |
|---|---|---|---|
| | phylotree_id | serial | PRIMARY KEY |
| | dbxref_id | integer | NOT NULL |
| | name | character varying(255) | |
| | type_id | integer | Type: protein, nucleotide, taxonomy, for example. The type should be any SO type, or "taxonomy". |
| | analysis_id | integer | |
| | comment | text | |
Tables referencing this one via Foreign Key Constraints:
Table: phylotree_pub
Tracks citations global to the tree, e.g. a multiple sequence alignment supporting tree construction.
| FK | Name | Type | Description |
|---|---|---|---|
| | phylotree_pub_id | serial | PRIMARY KEY |
| | phylotree_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL |
Table: library
| FK | Name | Type | Description |
|---|---|---|---|
| | library_id | serial | PRIMARY KEY |
| | organism_id | integer | UNIQUE#1 NOT NULL |
| | name | character varying(255) | |
| | uniquename | text | UNIQUE#1 NOT NULL |
| | type_id | integer | UNIQUE#1 NOT NULL The type_id foreign key links to a controlled vocabulary of library types. Examples of this would be: "cDNA_library" or "genomic_library". |
Tables referencing this one via Foreign Key Constraints:
Table: library_cvterm
The table library_cvterm links a library to controlled vocabularies which describe the library. For instance, there might be a link to the anatomy cv for "head" or "testes" for a head or testes library.
| FK | Name | Type | Description |
|---|---|---|---|
| | library_cvterm_id | serial | PRIMARY KEY |
| | library_id | integer | UNIQUE#1 NOT NULL |
| | cvterm_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL |
Table: library_feature
library_feature links a library to the clones which are contained in the library. Examples of such linked features might be "cDNA_clone" or "genomic_clone".
| FK | Name | Type | Description |
|---|---|---|---|
| | library_feature_id | serial | PRIMARY KEY |
| | library_id | integer | UNIQUE#1 NOT NULL |
| | feature_id | integer | UNIQUE#1 NOT NULL |
Table: library_pub
| FK | Name | Type | Description |
|---|---|---|---|
| | library_pub_id | serial | PRIMARY KEY |
| | library_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL |
Table: library_synonym
| FK | Name | Type | Description |
|---|---|---|---|
| | library_synonym_id | serial | PRIMARY KEY |
| | synonym_id | integer | UNIQUE#1 NOT NULL |
| | library_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL The pub_id link is for relating the usage of a given synonym to the publication in which it was used. |
| | is_current | boolean | NOT NULL DEFAULT true The is_current bit indicates whether the linked synonym is the current (official) symbol for the linked library. |
| | is_internal | boolean | NOT NULL DEFAULT false Typically a synonym exists so that somebody querying the database with an obsolete name can find the object they are looking for under its current name. If the synonym has been used publicly and deliberately (e.g. in a paper), it may also be listed in reports as a synonym. If the synonym was not used deliberately (e.g. there was a typo which went public), then the is_internal bit may be set to "true" so that it is known that the synonym is "internal" and should be queryable but should not be listed in reports as a valid synonym. |
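To make the is_current and is_internal distinction concrete, here is a small sketch under stated assumptions: SQLite stands in for PostgreSQL (booleans become 0/1), the synonym table is reduced to two columns, and the library names, IDs, and the helpers find_library and reportable_synonyms are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE synonym (
    synonym_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE library_synonym (
    library_synonym_id INTEGER PRIMARY KEY,
    synonym_id  INTEGER NOT NULL REFERENCES synonym (synonym_id),
    library_id  INTEGER NOT NULL,
    pub_id      INTEGER NOT NULL,
    is_current  INTEGER NOT NULL DEFAULT 1,
    is_internal INTEGER NOT NULL DEFAULT 0,
    UNIQUE (synonym_id, library_id, pub_id)
);
INSERT INTO synonym VALUES
    (1, 'head_cDNA'),      -- current official symbol
    (2, 'head_lib_old'),   -- obsolete name that was used publicly
    (3, 'haed_cDNA');      -- typo that went public
INSERT INTO library_synonym
    (synonym_id, library_id, pub_id, is_current, is_internal) VALUES
    (1, 7, 100, 1, 0),
    (2, 7, 100, 0, 0),
    (3, 7, 101, 0, 1);
""")


def find_library(conn, name):
    """Search by any synonym, including obsolete and internal ones."""
    rows = conn.execute(
        """
        SELECT DISTINCT ls.library_id
        FROM library_synonym AS ls
        JOIN synonym AS s ON s.synonym_id = ls.synonym_id
        WHERE s.name = ?
        """,
        (name,),
    ).fetchall()
    return [library_id for (library_id,) in rows]


def reportable_synonyms(conn, library_id):
    """Synonyms that may appear in reports: internal ones are left out."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM library_synonym AS ls
        JOIN synonym AS s ON s.synonym_id = ls.synonym_id
        WHERE ls.library_id = ? AND ls.is_internal = 0
        ORDER BY ls.is_current DESC, s.name
        """,
        (library_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

The typo remains searchable through find_library but is filtered out of the report listing, which matches the intent described for is_internal.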
Table: libraryprop
| FK | Name | Type | Description |
|---|---|---|---|
| | libraryprop_id | serial | PRIMARY KEY |
| | library_id | integer | UNIQUE#1 NOT NULL |
| | type_id | integer | UNIQUE#1 NOT NULL |
| | value | text | |
| | rank | integer | UNIQUE#1 NOT NULL |
Table: contact
Model persons, institutes, groups, organizations, etc.
| FK | Name | Type | Description |
|---|---|---|---|
| | contact_id | serial | PRIMARY KEY |
| | type_id | integer | What type of contact is this? E.g. "person", "lab". |
| | name | character varying(255) | UNIQUE NOT NULL |
| | description | character varying(255) | |
Tables referencing this one via Foreign Key Constraints:
Table: contact_relationship
Model relationships between contacts.
| FK | Name | Type | Description |
|---|---|---|---|
| | contact_relationship_id | serial | PRIMARY KEY |
| | type_id | integer | UNIQUE#1 NOT NULL Relationship type between subject and object. This is a cvterm, typically from the OBO relationship ontology, although other relationship types are allowed. |
| | subject_id | integer | UNIQUE#1 NOT NULL The subject of the subj-predicate-obj sentence. In a DAG, this corresponds to the child node. |
| | object_id | integer | UNIQUE#1 NOT NULL The object of the subj-predicate-obj sentence. In a DAG, this corresponds to the parent node. |
Table: stock
Any stock can be globally identified by the combination of organism, uniquename and stock type. Stocks are the physical entities, either living or preserved, held by collections. Stocks belong to a collection; they have IDs, type, organism, description and may have a genotype.
| FK | Name | Type | Description |
|---|---|---|---|
| | stock_id | serial | PRIMARY KEY |
| | dbxref_id | integer | The dbxref_id is an optional primary stable identifier for this stock. Secondary identifiers and external dbxrefs go in table: stock_dbxref. |
| | organism_id | integer | UNIQUE#1 NOT NULL The organism_id is the organism to which the stock belongs. This column is mandatory. |
| | name | character varying(255) | The name is a human-readable local name for a stock. |
| | uniquename | text | UNIQUE#1 NOT NULL |
| | description | text | The description is the genetic description provided in the stock list. |
| | type_id | integer | UNIQUE#1 NOT NULL The type_id foreign key links to a controlled vocabulary of stock types. These would include living stock, genomic DNA, preserved specimen. Secondary cvterms for stocks would go in stock_cvterm. |
| | is_obsolete | boolean | NOT NULL DEFAULT false |
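The identification rule stated above (organism, uniquename and stock type together identify a stock) corresponds to the UNIQUE#1 columns. A minimal sketch of how that constraint behaves, using SQLite in place of PostgreSQL; the cvterm IDs, the uniquename "S-0001", and the helper add_stock are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down stand-in for the stock table (assumption: dbxref_id and
# description are omitted because they do not take part in the unique key).
conn.execute("""
    CREATE TABLE stock (
        stock_id    INTEGER PRIMARY KEY,
        organism_id INTEGER NOT NULL,
        name        TEXT,
        uniquename  TEXT NOT NULL,
        type_id     INTEGER NOT NULL,
        is_obsolete INTEGER NOT NULL DEFAULT 0,
        UNIQUE (organism_id, uniquename, type_id)
    )
""")


def add_stock(conn, organism_id, uniquename, type_id):
    """Insert a stock; return False if the unique key already exists."""
    try:
        conn.execute(
            "INSERT INTO stock (organism_id, uniquename, type_id) VALUES (?, ?, ?)",
            (organism_id, uniquename, type_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


LIVING_STOCK, GENOMIC_DNA = 1, 2  # hypothetical cvterm ids for stock types

results = [
    add_stock(conn, 1, "S-0001", LIVING_STOCK),  # new key: accepted
    add_stock(conn, 1, "S-0001", GENOMIC_DNA),   # same name, other type: accepted
    add_stock(conn, 1, "S-0001", LIVING_STOCK),  # exact duplicate: rejected
]
stock_count = conn.execute("SELECT COUNT(*) FROM stock").fetchone()[0]
```

The same uniquename can therefore be reused for, say, a living stock and its genomic DNA sample, as long as the type differs.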
Tables referencing this one via Foreign Key Constraints:
Table: stock_cvterm
stock_cvterm links a stock to cvterms. This is for secondary cvterms; primary cvterms should use stock.type_id.
| FK | Name | Type | Description |
|---|---|---|---|
| | stock_cvterm_id | serial | PRIMARY KEY |
| | stock_id | integer | UNIQUE#1 NOT NULL |
| | cvterm_id | integer | UNIQUE#1 NOT NULL |
| | pub_id | integer | UNIQUE#1 NOT NULL |
Table: stock_dbxref
stock_dbxref links a stock to dbxrefs. This is for secondary identifiers; primary identifiers should use stock.dbxref_id.
| FK | Name | Type | Description |
|---|---|---|---|
| | stock_dbxref_id | serial | PRIMARY KEY |
| | stock_id | integer | UNIQUE#1 NOT NULL |
| | dbxref_id | integer | UNIQUE#1 NOT NULL |
| | is_current | boolean | NOT NULL DEFAULT true The is_current boolean indicates whether the linked dbxref is the current (official) dbxref for the linked stock. |
Table: stock_genotype
Simple table linking a stock to a genotype. Features with genotypes can be linked to stocks through feature_genotype -> genotype -> stock_genotype -> stock.
| FK | Name | Type | Description |
|---|---|---|---|