There has frequently been interest in updating a Chado database using a GFF file, and I’ve finally gotten around to trying to implement it. My initial efforts centered on converting GFF to Chado XML using Bio::SeqIO::chadoxml, but I was never completely satisfied with the result, and I was unable to load it with XORT or DBIx::DBStag. So I’ve decided to extend the GFF3 bulk loader, gmod_bulk_load_gff3.pl, so that it can do updates and deletes as well. Accordingly, I’ve identified these cases that should be addressed:
Perhaps the simplest case is updating feature properties (for purposes of this discussion, ‘feature properties’ encompasses items in the featureprop, feature_cvterm and feature_dbxref tables). Nevertheless, it poses some possible hang-ups. For instance:
If name, type and srcfeature are the same, allow featureloc updates?
If updating child features, what happens to the old features? Remove their featureloc entries and create completely new children? Only allow this for features of type ‘gene’?
Again, if name, type and srcfeature are the same, allow the delete?
RefChr  Source  Type  Start  End  Score  Strand  Phase  Attributes
ChrX    MyDB    gene  .      .    .      .       .      ID=MyGene1;CRUDop=DROP
ChrX    MyDB    gene  .      .    .      .       .      ID=MyGene2;CRUDop=UPDATE;Dbxref=SW:U1234
ChrX    MyDB    gene  1      2    9      -       .      ID=MyGene3;CRUDop=REPLACE;Dbxref=SW:U1234;..more..
Dongilbert 16:48, 30 March 2007 (EDT)
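To make the CRUDop suggestion above a little more concrete, here is a minimal Python sketch of how a loader might read the requested operation from each GFF line. It is only an illustration, not how gmod_bulk_load_gff3.pl works: CRUDop is the attribute proposed in the example and is not part of the GFF3 specification, and the function names and the default operation of INSERT are assumptions made for this sketch.

```python
from urllib.parse import unquote

# The nine standard GFF3 columns, in order.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_attributes(field):
    """Split a GFF3 attribute column into a dict of tag -> list of values."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" not in pair:
            continue  # skip empty or placeholder entries such as "..more.."
        tag, value = pair.split("=", 1)
        attrs[tag.strip()] = [unquote(v) for v in value.split(",")]
    return attrs


def parse_line(line):
    """Parse one GFF3 feature line into a dict keyed by column name.

    Real GFF3 is tab-delimited; whitespace splitting is a fallback so the
    space-separated example lines above can be parsed as well.
    """
    line = line.rstrip("\n")
    parts = line.split("\t") if "\t" in line else line.split(None, 8)
    if len(parts) != 9:
        raise ValueError("expected 9 GFF3 columns, got %d" % len(parts))
    rec = dict(zip(GFF3_COLUMNS, parts))
    rec["attributes"] = parse_attributes(rec["attributes"])
    return rec


def crud_op(rec, default="INSERT"):
    """Return the requested operation from the hypothetical CRUDop attribute.

    The default of "INSERT" (plain loading) is an assumption of this sketch.
    """
    ops = rec["attributes"].get("CRUDop")
    return ops[0].upper() if ops else default


if __name__ == "__main__":
    example = "ChrX MyDB gene . . . . . ID=MyGene2;CRUDop=UPDATE;Dbxref=SW:U1234"
    rec = parse_line(example)
    print(rec["attributes"]["ID"][0], crud_op(rec))
```

A loader could then dispatch on the returned value (DROP, UPDATE, REPLACE), falling back to its normal insert path for lines that carry no CRUDop attribute. How each operation should treat featureloc entries and child features is exactly the set of open questions listed above.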