GMOD

Chado Update via GFF

There has frequently been interest in updating a Chado database from a GFF file, and I’ve finally gotten around to trying to implement it. My initial efforts centered on converting GFF to Chado XML using Bio::SeqIO::chadoxml, but I was never completely satisfied with the result, and I was unable to load it with XORT or DBIx::DBStag. So I’ve decided to extend the GFF3 bulk loader, gmod_bulk_load_gff3.pl, to handle updates and deletes as well. Accordingly, I’ve identified these cases that should be addressed:

Updating properties

Perhaps the simplest case is updating feature properties (for purposes of this discussion, ‘feature properties’ encompasses items in the featureprop, feature_cvterm and feature_dbxref tables). Nevertheless, it poses some possible hang-ups. For instance:

Updating feature locations

If name, type and srcfeature are the same, should featureloc updates be allowed?
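
One way to picture the matching rule is the sketch below. It is not how gmod_bulk_load_gff3.pl works; it uses Python with an in-memory SQLite database and a heavily simplified stand-in for the Chado feature, cvterm and featureloc tables (the real schema runs on PostgreSQL and has many more columns). The function name `update_location` is hypothetical. The update is applied only when exactly one feature matches on name, type and srcfeature.

```python
import sqlite3

# Simplified stand-in for a few Chado tables (assumption: the real schema
# has many more columns and constraints, and runs on PostgreSQL).
SCHEMA = """
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT,
                      uniquename TEXT, type_id INTEGER REFERENCES cvterm);
CREATE TABLE featureloc (featureloc_id INTEGER PRIMARY KEY,
                         feature_id INTEGER REFERENCES feature,
                         srcfeature_id INTEGER REFERENCES feature,
                         fmin INTEGER, fmax INTEGER, strand INTEGER);
"""

def update_location(conn, name, type_name, src_name, start, end, strand):
    """Update featureloc only when exactly one feature matches
    name + type + srcfeature; return True if a row was updated."""
    rows = conn.execute(
        """SELECT fl.featureloc_id
             FROM feature f
             JOIN cvterm t ON t.cvterm_id = f.type_id
             JOIN featureloc fl ON fl.feature_id = f.feature_id
             JOIN feature src ON src.feature_id = fl.srcfeature_id
            WHERE f.name = ? AND t.name = ? AND src.name = ?""",
        (name, type_name, src_name),
    ).fetchall()
    if len(rows) != 1:
        return False  # no match, or ambiguous: leave the database untouched
    # GFF3 coordinates are 1-based inclusive; Chado stores interbase
    # coordinates, so fmin = start - 1 and fmax = end.
    conn.execute(
        "UPDATE featureloc SET fmin = ?, fmax = ?, strand = ? "
        "WHERE featureloc_id = ?",
        (start - 1, end, strand, rows[0][0]),
    )
    return True

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executescript("""
INSERT INTO cvterm VALUES (1, 'chromosome'), (2, 'gene');
INSERT INTO feature VALUES (1, 'ChrX', 'ChrX', 1), (2, 'MyGene3', 'MyGene3', 2);
INSERT INTO featureloc VALUES (1, 2, 1, 99, 200, 1);
""")
updated = update_location(conn, "MyGene3", "gene", "ChrX", 150, 300, -1)
```

Refusing to act on zero or multiple matches is a design choice made for this sketch; whether an ambiguous match should instead raise an error is part of the open question above.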

Updating complete gene models

If child features are updated, what happens to the old features? Should their featureloc entries be removed and completely new children created? Should this be allowed only for features of type ‘gene’?

Deleting features

Again, if name, type and srcfeature are the same, should the delete be allowed?
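
A delete touches more than the feature row itself. The sketch below illustrates this, again with in-memory SQLite and a simplified stand-in schema rather than real Chado (here the feature type is a plain text column, which is an assumption made for brevity). It assumes, as I understand the Chado schema, that dependent tables reference feature with ON DELETE CASCADE, so removing the feature row also removes its featureloc and featureprop rows. The function name `delete_feature` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces cascades only when enabled
conn.executescript("""
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE featureloc (
    featureloc_id INTEGER PRIMARY KEY,
    feature_id    INTEGER REFERENCES feature ON DELETE CASCADE,
    srcfeature_id INTEGER REFERENCES feature,
    fmin INTEGER, fmax INTEGER);
CREATE TABLE featureprop (
    featureprop_id INTEGER PRIMARY KEY,
    feature_id     INTEGER REFERENCES feature ON DELETE CASCADE,
    value TEXT);
INSERT INTO feature VALUES (1, 'ChrX', 'chromosome'), (2, 'MyGene1', 'gene');
INSERT INTO featureloc VALUES (1, 2, 1, 99, 200);
INSERT INTO featureprop VALUES (1, 2, 'example note');
""")

def delete_feature(conn, name, type_name, src_name):
    """Delete a feature only when name + type + srcfeature identify
    exactly one row; dependent rows go away through the cascade."""
    ids = [row[0] for row in conn.execute(
        """SELECT f.feature_id
             FROM feature f
             JOIN featureloc fl ON fl.feature_id = f.feature_id
             JOIN feature src ON src.feature_id = fl.srcfeature_id
            WHERE f.name = ? AND f.type = ? AND src.name = ?""",
        (name, type_name, src_name))]
    if len(ids) != 1:
        return False  # nothing deleted when the match is missing or ambiguous
    conn.execute("DELETE FROM feature WHERE feature_id = ?", (ids[0],))
    return True

deleted = delete_feature(conn, "MyGene1", "gene", "ChrX")
remaining_locs = conn.execute("SELECT COUNT(*) FROM featureloc").fetchone()[0]
remaining_props = conn.execute("SELECT COUNT(*) FROM featureprop").fetchone()[0]
```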

Comments

 RefChr  Source  Type  Start  End  Score  Strand  Phase  Attributes
 ChrX    MyDB    gene  .      .    .      .       .      ID=MyGene1;CRUDop=DROP
 ChrX    MyDB    gene  .      .    .      .       .      ID=MyGene2;CRUDop=UPDATE;Dbxref=SW:U1234
 ChrX    MyDB    gene  1      2    9      -       .      ID=MyGene3;CRUDop=REPLACE;Dbxref=SW:U1234;..more..

Dongilbert 16:48, 30 March 2007 (EDT)
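
The CRUDop attribute suggested in the comment above could be read from a GFF3 file along the following lines. This is a minimal Python sketch, not part of any existing loader: the function names `parse_gff3_line` and `crud_operation` are hypothetical, CRUDop is the commenter's proposed tag rather than a standard GFF3 attribute, and the INSERT default for lines without the tag is an assumption made here.

```python
from urllib.parse import unquote

GFF_COLUMNS = ("seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes")

def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict. Attributes become a dict
    of lists, since GFF3 allows comma-separated multiple values."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF_COLUMNS, fields))
    attrs = {}
    for pair in record["attributes"].split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # GFF3 percent-encodes reserved characters in tags and values.
        attrs[unquote(tag)] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attrs
    return record

def crud_operation(record, default="INSERT"):
    """Return the CRUDop value, or a default when the tag is absent.
    The INSERT default is an assumption, not part of the proposal."""
    return record["attributes"].get("CRUDop", [default])[0].upper()

lines = [
    "ChrX\tMyDB\tgene\t.\t.\t.\t.\t.\tID=MyGene1;CRUDop=DROP",
    "ChrX\tMyDB\tgene\t.\t.\t.\t.\t.\tID=MyGene2;CRUDop=UPDATE;Dbxref=SW:U1234",
]
records = [parse_gff3_line(line) for line in lines]
ops = [(rec["attributes"]["ID"][0], crud_operation(rec)) for rec in records]
```

A loader could then dispatch on the returned operation, for example routing DROP lines to a delete routine and UPDATE lines to a property update.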
