GMOD RPC API
Revision as of 20:58, 20 January 2009
Contents
- 1 Document Status
- 2 Background
- 3 Members
- 4 Goals
- 5 Data classes
- 6 API Version
- 7 Data version
- 8 Result dates
- 9 Return types
- 10 Compression
- 11 Services
- 12 TODO
Document Status
In progress.
Background
This effort was started after Josh Goodman's talk at the January 2009 GMOD Meeting titled "MOD Web API (A RESTful interface for MODs)". The main idea is to increase interoperability among the various model organism databases by creating an easy-to-use, high-level RESTful API. The queries enumerated below are currently in the proposal stage and have not been implemented at any MOD.
Members
- Josh Goodman - FlyBase
- Robert Buels - SOL Genomics Network
- Add your name here
Goals
- Data model agnostic
- Language agnostic
- Easy to use
- Versioned URLs for API stability over time
Data classes
At present, this API only covers querying and retrieving information for the gene data class.
API Version
In order to provide a stable URL API all web calls should be versioned according to what version of the GMOD REST API they are using. The version number included in the URLs corresponds to the API version and not the data version. As changes to the API are made the version number will be incremented. Access to the older API versions should be provided indefinitely.
Current GMOD REST API Version: 1
Data version
The search results must contain an XML tag called data_version that contains the database release number. If the database does not use release numbers then the date for when the entire data class was updated should be used. If that is not available or if individual records within a data class are updated and released asynchronously then a timestamp derived from the time of query execution should be used.
For example, FlyBase releases its data on a roughly monthly basis and tags all data contained in that release with a release number (e.g. FB2008_10). If a release number was not used then a date stamp for when the entire genes data class was updated should be used. If this is not available or if individual genes are updated in a piecemeal fashion then a timestamp based on the query execution time should be used.
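The fallback order described above can be sketched in a few lines of Python. This is only an illustration: the function name, its parameters, and the date and timestamp formats are assumptions (the timestamp format is still listed as an open item in the TODO section; the one used here is copied from the example results in this document).

```python
from datetime import datetime
from typing import Optional

# Assumed formats; this document does not yet specify them.
DATE_FORMAT = "%Y-%m-%d"
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def choose_data_version(
    release: Optional[str],
    class_updated: Optional[datetime],
    query_time: datetime,
    piecemeal: bool = False,
) -> str:
    """Pick the value for the data_version tag.

    Order of preference: release number, then the date the whole data
    class was updated, then the time the query was executed.
    """
    if release:
        return release
    # The class-wide date only applies when records are released together.
    if class_updated is not None and not piecemeal:
        return class_updated.strftime(DATE_FORMAT)
    return query_time.strftime(TIMESTAMP_FORMAT)
```

A provider such as FlyBase would pass its release tag (e.g. FB2008_10) and always hit the first branch; a provider with piecemeal updates would fall through to the query time.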
Result dates
The individual search results must provide a creation and last modified timestamp via the date_created and last_modified XML tags respectively. If this information is not available then the current timestamp at query execution should be used.
Return types
Each service may define its own return types. The client may request a specific return type by appending the appropriate file extension to the URL. If no file extension is appended then the default return type for that service is used.
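As a rough sketch of how a service provider might apply this rule, the Python function below splits a request path into the resource path and the requested return type. The function name and the set of supported extensions are assumptions made for illustration; each service would supply its own list and default.

```python
from typing import Sequence, Tuple


def split_return_type(
    path: str,
    default: str = "xml",
    supported: Sequence[str] = ("xml", "json"),
) -> Tuple[str, str]:
    """Return (path without extension, return type) for a request path.

    Only a trailing extension that names a supported return type is
    stripped; anything else is treated as part of the path, and the
    service default is used.
    """
    base, dot, ext = path.rpartition(".")
    if dot and ext.lower() in supported:
        return base, ext.lower()
    return path, default
```

Checking the extension against a known list means that a search term that happens to contain a dot is left intact and simply gets the default return type.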
Compression
Sending of compressed XML and JSON results should follow the same rules used for HTTP communication. If a client wishes to receive compressed output it should indicate this by setting the Accept-Encoding HTTP request header. When a service provider receives a request it should check for the Accept-Encoding request header and compress the output if it supports the requested compression algorithm. If the service provider does compress the content it should then set the Content-Encoding HTTP response header to indicate the compression algorithm used.
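A minimal server-side sketch of this negotiation in Python follows. It assumes the provider supports only gzip; the function names are hypothetical, and a real deployment would more likely rely on the web server or framework to do this.

```python
import gzip
from typing import Dict, Tuple


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value lists gzip with a non-zero q-value."""
    for item in accept_encoding.split(","):
        coding, _, params = item.partition(";")
        if coding.strip().lower() != "gzip":
            continue
        params = params.replace(" ", "").lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def encode_response(body: bytes, accept_encoding: str) -> Tuple[bytes, Dict[str, str]]:
    """Compress the body if the client asked for gzip.

    Returns the payload and the extra response headers to send with it.
    """
    if accepts_gzip(accept_encoding):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    # No supported encoding requested: send the body unchanged.
    return body, {}
```

For example, a request carrying `Accept-Encoding: gzip, deflate` would receive a gzip payload with `Content-Encoding: gzip`, while a request without the header would receive the plain XML or JSON.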
Services
Searches
Organism List
Purpose
Provides a list of organisms that can be queried at the service provider.
URL
http://yourmod.org/gmodrest/v<api version>/organisms[.xml | .json]
Return types
XML or JSON
Default return type
XML
Example URLs
- http://flybase.org/gmodrest/v1/organisms
- http://flybase.org/gmodrest/v1/organisms.xml
- http://flybase.org/gmodrest/v1/organisms.json
XML Result
<?xml version="1.0" encoding="UTF-8"?>
<resultset>
  <api_version>1</api_version>
  <data_provider>FlyBase</data_provider>
  <data_version>FB2008_10</data_version>
  <organism>
    <genus>Drosophila</genus>
    <species>melanogaster</species>
    <taxonomy_id>7227</taxonomy_id>
  </organism>
  <organism>
    <genus>Drosophila</genus>
    <species>simulans</species>
    <taxonomy_id>7240</taxonomy_id>
  </organism>
</resultset>
JSON Result
{
  "resultset": {
    "api_version": 1,
    "data_provider": "FlyBase",
    "data_version": "FB2008_10",
    "organism": [
      { "genus": "Drosophila", "species": "melanogaster", "taxonomy_id": 7227 },
      { "genus": "Drosophila", "species": "simulans", "taxonomy_id": 7240 }
    ]
  }
}
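A client could read the organism list along the following lines. This Python sketch assumes the provider emits strict JSON (double-quoted keys and strings) with the field names shown in the example result; the sample payload and the function name are illustrative only.

```python
import json
from typing import List, Tuple

# Sample payload mirroring the example result, written as strict JSON.
SAMPLE = """
{"resultset": {"api_version": 1, "data_provider": "FlyBase",
  "data_version": "FB2008_10",
  "organism": [
    {"genus": "Drosophila", "species": "melanogaster", "taxonomy_id": 7227},
    {"genus": "Drosophila", "species": "simulans", "taxonomy_id": 7240}
  ]}}
"""


def organism_names(payload: str) -> List[Tuple[int, str]]:
    """Return (taxonomy id, binomial name) pairs from an organisms result."""
    resultset = json.loads(payload)["resultset"]
    return [
        (org["taxonomy_id"], f'{org["genus"]} {org["species"]}')
        for org in resultset["organism"]
    ]
```

The taxonomy IDs returned here are the values that the search services accept in their optional organism path segment.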
Gene full text search
Purpose
Performs a full text search on gene records and returns the IDs for matching records.
Description
This service returns genes that contain the search term anywhere in the gene record. Results can be restricted to a specific organism by supplying the NCBI taxonomy id.
URL
http://yourmod.org/gmodrest/v<api version>/fulltext/gene/<search term>[/organism/<taxonomy id>][.xml | .json]
Return types
XML or JSON
Default return type
XML
Example URLs
- http://flybase.org/gmodrest/v1/fulltext/gene/cotransfection - Find genes that contain the term cotransfection.
- http://flybase.org/gmodrest/v1/fulltext/gene/cotransfection/organism/7227 - Find Drosophila melanogaster genes that contain the term cotransfection.
- http://flybase.org/gmodrest/v1/fulltext/gene/AE003845.json - Find genes that contain the term AE003845 and return a JSON result.
- http://flybase.org/gmodrest/v1/fulltext/gene/IPR000483/organism/7240.json - Find Drosophila simulans genes that are labeled with InterPro ID IPR000483.
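The URL pattern above, which the other gene search services in this document also follow, can be assembled with a small helper. The Python sketch below is an assumption-laden illustration: the function and parameter names are hypothetical, and percent-encoding the search term while leaving `:` literal (so that IDs such as GO:12345 appear as in the example URLs) is a choice this proposal does not specify.

```python
from typing import Optional
from urllib.parse import quote


def search_url(
    base: str,
    service: str,
    term: str,
    taxonomy_id: Optional[int] = None,
    return_type: Optional[str] = None,
    api_version: int = 1,
) -> str:
    """Build a gene search URL for a service such as 'fulltext' or 'keyword'."""
    # Encode the term so that spaces and slashes cannot alter the path.
    url = f"{base}/gmodrest/v{api_version}/{service}/gene/{quote(term, safe=':')}"
    if taxonomy_id is not None:
        url += f"/organism/{taxonomy_id}"
    if return_type:
        # Omitting the extension requests the service's default return type.
        url += f".{return_type}"
    return url
```

For instance, `search_url("http://flybase.org", "fulltext", "IPR000483", 7240, "json")` reproduces the last example URL in the list above.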
XML Result
<?xml version="1.0" encoding="UTF-8"?>
<resultset>
  <api_version>1</api_version>
  <data_provider>FlyBase</data_provider>
  <data_version>FB2008_10</data_version>
  <query_time>2009-01-15 09:03:00</query_time>
  <query_url>http://flybase.org/gmodrest/v1/fulltext/gene/cotransfection</query_url>
  <result>
    <id>FBgn0085432</id>
    <date_created>2003-03-08 00:00:00</date_created>
    <last_modified>2005-01-15 09:03:00</last_modified>
  </result>
  <result>
    <id>FBgn0004364</id>
    <date_created>2005-01-08 00:00:00</date_created>
    <last_modified>2009-01-01 00:00:00</last_modified>
  </result>
</resultset>
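Because the search services share this result layout, one parser can serve all of them. The following Python sketch uses the standard library's ElementTree and assumes the response body is a well-formed XML document whose root element is resultset; the sample bytes and the function name are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Sample response body mirroring the example result (abbreviated header fields).
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<resultset>
  <api_version>1</api_version>
  <data_provider>FlyBase</data_provider>
  <data_version>FB2008_10</data_version>
  <result>
    <id>FBgn0085432</id>
    <date_created>2003-03-08 00:00:00</date_created>
    <last_modified>2005-01-15 09:03:00</last_modified>
  </result>
  <result>
    <id>FBgn0004364</id>
    <date_created>2005-01-08 00:00:00</date_created>
    <last_modified>2009-01-01 00:00:00</last_modified>
  </result>
</resultset>"""


def parse_results(body: bytes) -> List[Dict[str, str]]:
    """Return one dict per result element with its ID and timestamps."""
    root = ET.fromstring(body)
    return [
        {
            "id": result.findtext("id", default=""),
            "date_created": result.findtext("date_created", default=""),
            "last_modified": result.findtext("last_modified", default=""),
        }
        for result in root.findall("result")
    ]
```

An empty list is returned when no result elements are present; how a provider should signal a null result is still an open item in the TODO section.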
JSON Result
{
  "resultset": {
    "api_version": 1,
    "data_provider": "FlyBase",
    "data_version": "FB2008_10",
    "query_time": "2009-01-15 09:03:00",
    "query_url": "http://flybase.org/gmodrest/v1/fulltext/gene/cotransfection.json",
    "result": [
      { "id": "FBgn0085432", "date_created": "2003-03-08 00:00:00", "last_modified": "2005-01-15 09:03:00" },
      { "id": "FBgn0004364", "date_created": "2005-01-08 00:00:00", "last_modified": "2009-01-01 00:00:00" }
    ]
  }
}
Gene keyword search
Purpose
Performs a keyword search on gene records and returns the IDs for matching records.
Description
This service returns genes that contain the search term in a keyword field. Keyword fields should include primary and secondary IDs, symbols, synonyms, full names, annotation IDs, or other fields. Results can be restricted to a specific organism by supplying the NCBI taxonomy id.
URL
http://yourmod.org/gmodrest/v<api version>/keyword/gene/<search term>[/organism/<taxonomy id>][.xml | .json]
Return types
XML or JSON
Default return type
XML
Example URLs
- http://flybase.org/gmodrest/v1/keyword/gene/pangolin - Find a gene named pangolin.
- http://flybase.org/gmodrest/v1/keyword/gene/pan/organism/7240 - Find the Drosophila simulans gene named pan.
- http://flybase.org/gmodrest/v1/keyword/gene/FBgn0019664.json - Find the gene with a primary or secondary ID of FBgn0019664 and return a JSON result.
XML Result
<?xml version="1.0" encoding="UTF-8"?>
<resultset>
  <api_version>1</api_version>
  <data_provider>FlyBase</data_provider>
  <data_version>FB2008_10</data_version>
  <query_time>2009-01-15 09:03:00</query_time>
  <query_url>http://flybase.org/gmodrest/v1/keyword/gene/pangolin</query_url>
  <result>
    <id>FBgn0085432</id>
    <date_created>2003-03-08 00:00:00</date_created>
    <last_modified>2005-01-15 09:03:00</last_modified>
  </result>
</resultset>
JSON Result
{
  "resultset": {
    "api_version": 1,
    "data_provider": "FlyBase",
    "data_version": "FB2008_10",
    "query_time": "2009-01-15 09:03:00",
    "query_url": "http://flybase.org/gmodrest/v1/keyword/gene/pangolin.json",
    "result": [
      { "id": "FBgn0085432", "date_created": "2003-03-08 00:00:00", "last_modified": "2005-01-15 09:03:00" }
    ]
  }
}
Gene ontology search
Purpose
Searches for genes that have a particular ontology ID.
Description
This service returns genes that have been annotated with a particular ontology term. Results can be restricted to a specific organism by supplying the NCBI taxonomy id.
URL
http://yourmod.org/gmodrest/v<api version>/ontology/gene/<ontology ID>[/organism/<taxonomy id>][.xml | .json]
Return types
XML or JSON
Default return type
XML
Example URLs
- http://flybase.org/gmodrest/v1/ontology/gene/GO:12345 - Find all genes annotated with GO:12345.
- http://flybase.org/gmodrest/v1/ontology/gene/GO:12345/organism/7227 - Find all Drosophila melanogaster genes annotated with GO:12345.
- http://flybase.org/gmodrest/v1/ontology/gene/GO:12345.json - Find all genes annotated with GO:12345 and return a JSON result.
XML Result
<?xml version="1.0" encoding="UTF-8"?>
<resultset>
  <api_version>1</api_version>
  <data_provider>FlyBase</data_provider>
  <data_version>FB2008_10</data_version>
  <query_time>2009-01-15 09:03:00</query_time>
  <query_url>http://flybase.org/gmodrest/v1/ontology/gene/GO:12345</query_url>
  <result>
    <id>FBgn0085432</id>
    <date_created>2003-03-08 00:00:00</date_created>
    <last_modified>2005-01-15 09:03:00</last_modified>
  </result>
  <result>
    <id>FBgn0004364</id>
    <date_created>2005-01-08 00:00:00</date_created>
    <last_modified>2009-01-01 00:00:00</last_modified>
  </result>
</resultset>
JSON Result
{
  "resultset": {
    "api_version": 1,
    "data_provider": "FlyBase",
    "data_version": "FB2008_10",
    "query_time": "2009-01-15 09:03:00",
    "query_url": "http://flybase.org/gmodrest/v1/ontology/gene/GO:12345.json",
    "result": [
      { "id": "FBgn0085432", "date_created": "2003-03-08 00:00:00", "last_modified": "2005-01-15 09:03:00" },
      { "id": "FBgn0004364", "date_created": "2005-01-08 00:00:00", "last_modified": "2009-01-01 00:00:00" }
    ]
  }
}
Gene ortholog search
Purpose
Search for orthologs of the supplied gene ID.
Description
This service returns genes that have been determined by some means to be orthologous to the supplied gene ID. If the supplied gene ID is within the namespace of the web service provider, then all known orthologs of that gene are returned. If the supplied gene ID is not within the namespace of the web service provider, then only genes for organisms offered by the service provider are returned.
For example, for a given gene FlyBase stores orthology calls to other FlyBase genes and to non-FlyBase genes. Thus, given a FlyBase gene ID you can obtain a list of IDs for orthologous genes within FlyBase and in other, non-Drosophila species. In addition, given a non-FlyBase gene ID you can obtain a list of FlyBase genes that are orthologous to it.
Results can be restricted to a specific organism by supplying the NCBI taxonomy id.
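The namespace rule can be illustrated with a toy lookup in Python. Everything here is hypothetical: the orthology calls are stored as simple ID pairs, the provider's namespace is recognised by an ID prefix, and the `OTHER:0001` identifier is invented purely to show the filtering. A real provider would determine namespace membership and organism coverage from its own database.

```python
from typing import List, Sequence, Tuple


def orthologs(
    gene_id: str,
    calls: Sequence[Tuple[str, str]],
    local_prefix: str = "FBgn",
) -> List[str]:
    """Return ortholog IDs for gene_id following the namespace rule.

    Local IDs yield every known partner; foreign IDs yield only partners
    that belong to the provider's own namespace.
    """
    partners = []
    for left, right in calls:
        if left == gene_id:
            partners.append(right)
        elif right == gene_id:
            partners.append(left)
    if gene_id.startswith(local_prefix):
        return partners
    return [p for p in partners if p.startswith(local_prefix)]


# Toy orthology calls; OTHER:0001 is a made-up identifier.
CALLS = [
    ("FBgn0000490", "FBgn0097591"),
    ("FBgn0000490", "ENSBTAP00000004992"),
    ("WBGene12345", "FBgn0004364"),
    ("WBGene12345", "OTHER:0001"),
]
```

With these calls, a query for FBgn0000490 returns both its FlyBase and non-FlyBase partners, whereas a query for WBGene12345 returns only FBgn0004364.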
URL
http://yourmod.org/gmodrest/v<api version>/ortholog/gene/<gene ID>[/organism/<taxonomy id>][.xml | .json]
Return types
XML or JSON
Default return type
XML
Example URLs
- http://flybase.org/gmodrest/v1/ortholog/gene/FBgn0004364 - Find all FlyBase and non-FlyBase genes that are orthologous to FBgn0004364.
- http://flybase.org/gmodrest/v1/ortholog/gene/FBgn0004364/organism/7240 - Find out if FBgn0004364 has an ortholog in Drosophila simulans.
- http://flybase.org/gmodrest/v1/ortholog/gene/WBGene12345.json - Find all FlyBase genes that are orthologous to WBGene12345 and return a JSON result.
XML Result
<?xml version="1.0" encoding="UTF-8"?>
<resultset>
  <api_version>1</api_version>
  <data_provider>FlyBase</data_provider>
  <data_version>FB2008_10</data_version>
  <query_time>2009-01-15 09:03:00</query_time>
  <query_url>http://flybase.org/gmodrest/v1/ortholog/gene/FBgn0000490</query_url>
  <result>
    <id>FBgn0097591</id>
    <date_created>2003-03-08 00:00:00</date_created>
    <last_modified>2005-01-15 09:03:00</last_modified>
  </result>
  <result>
    <id>ENSBTAP00000004992</id>
    <date_created>2005-01-08 00:00:00</date_created>
    <last_modified>2009-01-01 00:00:00</last_modified>
  </result>
</resultset>
JSON Result
{
  "resultset": {
    "api_version": 1,
    "data_provider": "FlyBase",
    "data_version": "FB2008_10",
    "query_time": "2009-01-15 09:03:00",
    "query_url": "http://flybase.org/gmodrest/v1/ortholog/gene/FBgn0000490.json",
    "result": [
      { "id": "FBgn0097591", "date_created": "2003-03-08 00:00:00", "last_modified": "2005-01-15 09:03:00" },
      { "id": "ENSBTAP00000004992", "date_created": "2005-01-08 00:00:00", "last_modified": "2009-01-01 00:00:00" }
    ]
  }
}
Fetching
Gene records
Purpose
To fetch gene records in the Generic gene page XML format as implemented by Bio GMOD GenericGenePage.
Description
URL
http://yourmod.org/gmodrest/v<api version>/fetch/<gene ID>
Return types
XML
Default return type
XML
Example URLs
- http://flybase.org/gmodrest/v1/fetch/FBgn0097591
XML Result
See Bio GMOD GenericGenePage for example XML.
TODO
- Should we use the NCBI eUtils XML formats?
- Write schemas for the XML formats.
- Double check REST compliance.
- Do we need additional fields for the full text result format (organism, etc.)?
- Specify a timestamp format.
- Add more details on HTTP headers required to use compression.
- Add ability to use multiple search and ontology terms in one query and perform basic logical operations (AND, OR, NOT).
- Add support for GO's NOT operator.
- Need a null result format.