Chado Audit Module

From GMOD
Revision as of 18:17, 14 February 2007 by Bosborne (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Introduction

The Audit module is not a module in the same sense as the other Chado modules. It is a set of recommendations on how one might implement auditing in Chado. This documentation describes an example of such an implementation which is in use at FlyBase. The SQL used at Flybase is chado/modules/audit/audit_tables.sql.


Audit Policy

We consider the implementation of auditing policy as a database administration issue and subject to the local requirements of specific database groups. We therefore have not developed a detailed audit module, providing instead this policy statement.

Database project groups who would use IDB as their primary dbms would be well advised to implement as robust and detailed an audit trail as possible, while groups who would use IDB more informally, for instance primarily as a read-only resource, may have no need for auditing.

We here describe how auditing will be implemented at FlyBase, where IDB will serve as the primary dbms.

Auditing will be implemented in a seperate, orthogonal database to the IDB, with each IDB table having a corresponding shadow audit table which will include all of the columns of the IDB table, plus these additional columns:

transaction_date timestamp not null transaction_type char not null

where transaction_type value is one of:

i (insertion) u (update) d (deletion)

Thus an IDB db instance IDB.1 containing table, eg, "gene":

create table gene ( gene_id serial not null, primary key (gene_id), name varchar(255) not null, dbxref_id int, foreign key (dbxref_id) references dbxref(dbxref_id), unique(name), unique(dbxref_id) );

would have a corresponding audit db, IDB.1_audit, containing a shadow audit table, eg, "gene_audit":

create table gene_audit ( gene_id serial not null, primary key (gene_id), name varchar(255) not null, dbxref_id int,

              foreign key (dbxref_id) references dbxref(dbxref_id),
              unique(name),
              unique(dbxref_id),
              transaction_date timestamp not null,
              transaction_type char not null

);


Transaction use-cases


INSERTION (of a new record):

When a record is intially inserted into an IDB table, the same record is inserted into the corresponding shadow audit table, with the current timestamp and transaction_type = 'i'.

UPDATE (of an existing record):

When a record is to be updated, before the update transaction is executed in IDB, the existing IDB record is copied to the corresponding shadow audit table, with the current timestamp and transaction_type = 'u'. The update transaction is then executed in IDB.

DELETION (of an existing record):

When a record is to be deleted, before the delete trasaction is executed in IDB, the existing IDB record is copied to the corresponding shadow audit table, with the corrent timestamp and transaction_type = 'd'. The delete transaction is then executed in IDB.


Implementation Issues

The benefit of this system is that it allows a certain amount of "time travel", or the ability to reconstruct the db to its state at a particular time in the past. We don't anticipate making this functionality available to the general user population, but recognize that from a db programming point of view, it is often quite useful, and sometimes vital, for a programmer to be able to track changes in the db in detail.

While the auditing policy described here fulfills all of our auditing requirements in principle, it remains to be seen how it will work in practice. The size of the shadow audit db may become an issue over time. How the handling of audit transactions will affect overall db transaction performance is also unknown. FlyBase makes a practice of reporting certain audit details to the public (eg, date entered and date last updated of gene records); will implementing this be too expensive using this shadow audit db method?