Loading Chado data into the wiki depends on templates prepared in the wiki to handle gene page formatting and the tables of gene information within those pages. A genome project may have several such templates, so we want to store the logic for populating them from the Chado db inside each template. This will include metadata (not displayed) that middleware can use to format data for each wiki template. Example templates can be found in a wiki via Special:AllPages, category Templates. We will add a set of common gene page templates for this example.
(there was a mediawiki template here)
==Notes==
==References==
<references/>
<headings>
Gene name||gene_name
Description||description
Synonyms||synonyms
</headings>
<type>1</type>
<heading_style>heading_style link</heading_style>
<table_style>Prettytable link</table_style>
usage: php loadwiki.php -p page_template -t table_template -f input_filename
page_template == gene page template for wiki
table_template == table edit template inside gene page
input_filename == gene data in wiki-string format
input file (one line per gene: the page name, a tab, then the row data with ‘||’ delimiting the wiki table columns)
sadA sadA||EGF repeat-containing 9 transmembrane molecule involved in substrate adhesion.||Jim, Don
or, as the PHP expression that builds each line:
$gene_name."\t".$gene_name.'||'.$description.'||'.$synonym_string."\n";
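The line format above can be sketched in Python as follows. This is only an illustration of the format, not the code in loadwiki.php; the function name `format_gene_row` is hypothetical, and joining synonyms with ", " is an assumption taken from the sadA example line.

```python
def format_gene_row(gene_name, description, synonyms):
    """Return one input line: page name, a tab, then the '||'-delimited row."""

    def clean(value):
        # Tabs and newlines would break the line format, so collapse
        # every run of whitespace into a single space.
        return " ".join(str(value).split())

    # Assumption: synonyms are joined with ", " as in the sadA example.
    synonym_string = ", ".join(clean(s) for s in synonyms)
    row = "||".join([clean(gene_name), clean(description), synonym_string])
    return clean(gene_name) + "\t" + row + "\n"


line = format_gene_row(
    "sadA",
    "EGF repeat-containing 9 transmembrane molecule involved in substrate adhesion.",
    ["Jim", "Don"],
)
```

A field that itself contains ‘||’ would still be split into extra columns; this sketch does not try to escape it.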
We plan to extend the above to work with a fuller gene ‘page’ of output from chado. This will use one common wiki Template:gene_page. This page template will have information linking the chado table output fields with the gene wiki table templates.
The format above is extended so that each row of data names its page template and table template, which lets one page use many table templates:
pagename [tab] page_template [tab] table_template [tab] row_data (wiki-string) [tab] metadata [return]
sadA \t gene \t gene_basics \t sadA||EGF repeat-containing 9 transmembrane molecule involved in substrate adhesion.||sadA-like,sadA-by-another-name \t metastring \n
sadA \t gene \t gene_location \t gene-location-wiki-string \t metastring \n
sadA \t gene \t gene_function \t gene-function-value-string \t metastring \n
notA \t gene \t gene_basics \t notA||Another gene ...
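A loader for the extended format would need to split each line into its five fields and collect the rows that belong to one page. A minimal Python sketch of that step is below; the names `Row`, `parse_exchange_line` and `group_by_page` are hypothetical, and it assumes real tab characters between fields (the spaces around `\t` in the examples above are treated as padding and stripped).

```python
from collections import namedtuple

# Field order follows the format line above.
Row = namedtuple(
    "Row", "pagename page_template table_template row_data metadata"
)


def parse_exchange_line(line):
    """Split one tab-delimited exchange line into its five fields."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) != 5:
        raise ValueError("expected 5 tab-separated fields, got %d" % len(fields))
    return Row(*fields)


def group_by_page(lines):
    """Collect rows per page name, keeping the input order."""
    pages = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        row = parse_exchange_line(line)
        pages.setdefault(row.pagename, []).append(row)
    return pages


sample = [
    "sadA\tgene\tgene_basics\tsadA||EGF repeat-containing 9 transmembrane "
    "molecule involved in substrate adhesion.||sadA-like,sadA-by-another-name"
    "\tmetastring\n",
    "sadA\tgene\tgene_location\tgene-location-wiki-string\tmetastring\n",
    "sadA\tgene\tgene_function\tgene-function-value-string\tmetastring\n",
]
pages = group_by_page(sample)
```

Each `row_data` value can then be split on ‘||’ to get the table cells for that table template.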
The page and table templates are stored in the wiki, and can be accessed via a URL of the form wiki/Special:Export/Template:page_template, or via other wiki PHP tools. For GMOD gene pages and tables, we would like to include a mapping of Chado fields to and from wiki table fields. That way the wiki-string in the exchange table above can be generated, if need be, by inspection of the template pages.
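One way to generate the wiki-string by inspecting a template is sketched below in Python. It rests on several assumptions: the template wikitext has already been fetched and unescaped from the export XML; the `<headings>` block holds one `label||field` pair per line; and the second element of each pair names the Chado-derived field. The function names and the `example.org` base URL are hypothetical.

```python
import re


def export_url(wiki_base, template_name):
    """Build the Special:Export URL for a template page."""
    return wiki_base.rstrip("/") + "/Special:Export/Template:" + template_name


def parse_headings(template_text):
    """Return (label, field) pairs from the <headings> block of a template."""
    match = re.search(r"<headings>(.*?)</headings>", template_text, re.DOTALL)
    if match is None:
        raise ValueError("no <headings> block found")
    headings = []
    for line in match.group(1).splitlines():
        line = line.strip()
        if not line:
            continue
        label, _, field = line.partition("||")
        # Assumption: a heading without '||' uses its label as the field name.
        headings.append((label.strip(), field.strip() or label.strip()))
    return headings


def build_wiki_string(headings, record):
    """Join record values in heading order; missing fields become empty."""
    return "||".join(str(record.get(field, "")) for _label, field in headings)


template_text = """<headings>
Gene name||gene_name
Description||description
Synonyms||synonyms
</headings>
<type>1</type>"""

headings = parse_headings(template_text)
wiki_string = build_wiki_string(
    headings,
    {"gene_name": "sadA", "description": "substrate adhesion", "synonyms": "Jim, Don"},
)
url = export_url("http://example.org/wiki", "gene_basics")
```

Because the column order comes from the template, reordering the headings in the wiki would change the generated wiki-string without any change to the Chado query.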
A simple example: script 1 collects gene information from the Chado db and produces an intermediate wiki-text file. Script 2 loads this file into the MediaWiki database using the gene page templates. Community members then edit the genes through the Table Edit mechanism as desired. Finally, script 3 dumps the updated gene info (from the MySQL wikidb), converts it to Chado XML, and loads it into Chado via XORT, with transaction update checks.
- From hackathon
see e.g. http://eugenes.org/gmod/genbank2chado/conf/v_genepage3.sql
>> this is larger; loading into wikipedia db via wikipedia.xml
** FlyBase Harvard has scripts for general bulk data to chado.xml