GMOD

Chado New Users

This page and its associated discussion page follow the learning curve of new Chado users learning the system at CSHL.

Getting an empty Chado PostgreSQL on our machines

Installation Notes

If the easy way fails, the old documentation outside the wiki can be pretty confusing.

Loading the Ontologies

Loading works via make ontologies. We have not yet worked out how to update ontologies that are already loaded.

Getting the Sequence Module Working

We think of GFF3 as a denormalized view of Chado: each GFF3 line flattens information that Chado stores across the Sequence module and the CV module.
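To make the "denormalized view" idea concrete, here is a minimal sketch of how one GFF3 line could be assembled from a feature row, its featureloc row, and the cvterm name behind type_id. The dataclasses and the to_gff3_line function are hypothetical names of ours, not part of Chado or any GMOD tool, and only a few columns are modelled. The sketch assumes Chado's interbase (0-based) fmin/fmax, so the 1-based GFF3 start is fmin + 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    """A small subset of a Chado feature row (hypothetical helper type)."""
    uniquename: str
    type_name: str  # name of the cvterm that feature.type_id points to


@dataclass
class FeatureLoc:
    """A small subset of a Chado featureloc row (hypothetical helper type)."""
    srcfeature: str  # uniquename of the feature behind srcfeature_id
    fmin: int        # interbase (0-based) start
    fmax: int        # interbase end
    strand: int      # 1, -1, or 0
    phase: Optional[int] = None


def to_gff3_line(feature: Feature, loc: FeatureLoc,
                 parent: Optional[str] = None, source: str = ".") -> str:
    """Flatten one feature and its location into a single GFF3 line."""
    strand = {1: "+", -1: "-"}.get(loc.strand, ".")
    phase = "." if loc.phase is None else str(loc.phase)
    attributes = [f"ID={feature.uniquename}"]
    if parent is not None:
        # In Chado this link would live in feature_relationship.
        attributes.append(f"Parent={parent}")
    columns = [
        loc.srcfeature,
        source,
        feature.type_name,
        str(loc.fmin + 1),  # interbase to 1-based start
        str(loc.fmax),
        ".",                # score is not modelled in this sketch
        strand,
        phase,
        ";".join(attributes),
    ]
    return "\t".join(columns)


gene = Feature(uniquename="gene0001", type_name="gene")
gene_loc = FeatureLoc(srcfeature="chr1", fmin=999, fmax=2000, strand=1)
line = to_gff3_line(gene, gene_loc)
print(line)
```

The feature type, the coordinates, and the Parent link each come from a different table in Chado, which is why we describe the single GFF3 line as denormalized.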

Migration from other databases

Sample Data

To understand Chado Best Practices, where the documentation is sometimes incomplete, we’ve tried to get some samples of Chado data in use. Things we’ve looked at so far, and comments on them:

Understanding how things are represented in Chado

Central Dogma

Chado Best Practices describes some of the representations. Unfortunately it’s somewhat incomplete at present.

Gene

Chado uses a eukaryotic-centric gene definition which is based on monocistronic mRNAs. In this view, the gene includes information in the genomic DNA outside of the part that codes for the mRNA. To represent a gene, there needs to be:

Completing the representation of the gene seems to require additional features of types ‘mRNA’ and ‘exon’ (and ‘polypeptide’ if it’s protein coding). What happens if software tries to write a feature record as a gene without creating these? Presumably the gene feature has to be entered first in order to have an object_id for feature_relationship.
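As a rough illustration of the ordering issue, the sketch below uses an in-memory SQLite database with a heavily simplified stand-in for the Chado feature, feature_relationship, and cvterm tables. Real Chado runs on PostgreSQL and these tables have many more columns (organism_id, cv_id, rank, and so on), so this is not the actual schema. The helper names term, add_feature, and relate, and the uniquenames, are hypothetical. We use derives_from here for the polypeptide relationship, which is the spelling we believe the relationship ontology uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    uniquename TEXT NOT NULL,
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
CREATE TABLE feature_relationship (
    feature_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES feature(feature_id),
    object_id  INTEGER NOT NULL REFERENCES feature(feature_id),
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
""")


def term(name):
    """Return the cvterm_id for a term name, creating the row if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def add_feature(uniquename, type_name):
    """Insert a feature row and return its feature_id."""
    cur = conn.execute(
        "INSERT INTO feature (uniquename, type_id) VALUES (?, ?)",
        (uniquename, term(type_name)),
    )
    return cur.lastrowid


def relate(subject_id, relationship, object_id):
    """Record 'subject <relationship> object' in feature_relationship."""
    conn.execute(
        "INSERT INTO feature_relationship (subject_id, object_id, type_id) "
        "VALUES (?, ?, ?)",
        (subject_id, object_id, term(relationship)),
    )


# The gene goes in first so that its feature_id can be used as object_id.
gene = add_feature("geneA", "gene")
mrna = add_feature("geneA-RA", "mRNA")
exon = add_feature("geneA:1", "exon")
pep = add_feature("geneA-PA", "polypeptide")

relate(mrna, "part_of", gene)
relate(exon, "part_of", mrna)
relate(pep, "derives_from", mrna)

rows = conn.execute("""
    SELECT s.uniquename, r.name, o.uniquename
    FROM feature_relationship fr
    JOIN feature s ON s.feature_id = fr.subject_id
    JOIN cvterm  r ON r.cvterm_id  = fr.type_id
    JOIN feature o ON o.feature_id = fr.object_id
    ORDER BY fr.feature_relationship_id
""").fetchall()

for subject, relationship, obj in rows:
    print(subject, relationship, obj)
```

Nothing in this simplified schema stops software from inserting the gene row alone, without any mRNA or exon rows. Whether a real Chado instance would complain is a question we have not yet answered.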

mRNA

mRNA features are entered with part_of relationships to genes. This is straightforward when the mRNA is derived from a high-quality, full-length cDNA (but what exactly is the feature_relationship type?). Does an mRNA have to have a featureloc? What if the CDS is known but the precise ends of the UTRs are not?

Polycistronic transcription units

As of this writing, the description of handling dicistronic genes is not very clear. Based on the GFF3 spec:

Other RNAs

tRNAs, rRNAs, snRNAs, etc. have similar relationships to genes. Note that even in eukaryotes, rRNAs and tRNAs are often transcribed as polycistronic transcripts!

Polypeptides

Polypeptides have a derives_from relationship to mRNAs.

Proteins

Note that proteins ≠ polypeptides. Hemoglobin is a heterotetramer of two α and two β subunits. Is there a feature type that represents this?
