Difference between revisions of "GSoC"

From GMOD

Revision as of 21:56, 28 February 2011

Welcome to the Genome Informatics GSoC

“Google Summer of Code (GSoC) is a global program that offers student developers stipends to write code for various open source software projects. We have worked with several open source, free software, and technology-related groups to identify and fund several projects over a three month period. Since its inception in 2005, the program has brought together over 4,500 students and more than 4,000 mentors & co-mentors from over 85 countries worldwide, all for the love of code. Through Google Summer of Code, accepted student applicants are paired with a mentor or mentors from the participating projects, thus gaining exposure to real-world software development scenarios and the opportunity for employment in areas related to their academic pursuits. In turn, the participating projects are able to more easily identify and bring in new developers. Best of all, more source code is created and released for the use and benefit of all.”

GSoC has several goals:

  • get more open source code created and released for the benefit of all
  • inspire young developers to begin participating in open source development
  • help open source projects identify and bring in new developers and committers
  • provide students the opportunity to do work related to their academic pursuits during the summer
  • give students more exposure to real-world software development scenarios.

Google Summer of Code (GSoC)

About Genome Informatics GSoC

The Genome Informatics group is organizing the joint efforts of WormBase, Reactome, and GBrowse (see below). This is a great opportunity for students to contribute to the work of three different bioinformatics projects.

WormBase is an online bioinformatics database of the biology and genome of the model organism Caenorhabditis elegans and related nematodes. The C. elegans research community uses it both as an information resource and as a means of publishing and distributing results. The database is constantly updated, and new versions are released on a monthly basis. WormBase is a collaboration among the Wellcome Trust Sanger Institute, Ontario Institute for Cancer Research, Washington University in St. Louis, and the California Institute of Technology. Links: Website.

Reactome is a manually curated database of core pathways and reactions in human biology that functions as a data mining resource and electronic textbook. The Reactome data model describes diverse processes in the human system, including the pathways of intermediary metabolism, regulatory pathways, signal transduction, and high-level processes, such as the cell cycle. Reactome software uses only freely available (and often open source) components and has been created with cross-platform compatibility and wide usability in mind. Data is stored in a MySQL database, the website is implemented in Perl, and the data entry tool is written in Java. The Reactome team is composed of individuals who are both biologists and programmers at the Ontario Institute for Cancer Research, New York University Langone Medical Center, Cold Spring Harbor Laboratory, and the European Bioinformatics Institute. Links: Website, ReactomeWiki.

Generic Model Organism Database (GMOD) is an open source project to develop a complete set of software for creating and administering a model organism database. Components of this project include genome visualization and editing tools, literature curation tools, a robust database schema, biological ontology tools, and a set of standard operating procedures. This project is a collaboration among several database projects, including WormBase, FlyBase, Mouse Genome Informatics, Gramene, the Rat Genome Database, TAIR, EcoCyc, and the Saccharomyces Genome Database. Links: Website, GMOD Blog.

Contact Us

  • Email: robin.haw[AT]oicr.on.ca - contact me to find out more about a project or your potential mentor(s).
  • Discussion mailing lists: Genome Informatics Google Groups - ask about our projects; join the community!

How to apply

We would like to know who you are and how you think. Incorporate the following into your application:

  • Your information
    • Name, email, and website (optional)
  • Brief background: education and relevant work experience
  • Your programming interests and strengths
    • What are your languages of choice?
    • Any prior experience with open source development?
  • Your interest and background in biology or bioinformatics
    • Any prior exposure to biology or bioinformatics?
  • Your ideas for a project (an original idea or one expanded from our Ideas Page)
    • Provide as much detail as possible
    • Strong applicants include an implementation plan and timeline (hint!)
    • Refer to and link to other projects or products that illustrate your ideas
    • Identify possible hurdles and questions that will require more research/planning
  • What can you bring to the team?

Guidelines and Advice

Project Ideas

As we develop new features for WormBase, Reactome and GBrowse, we are exploring a number of areas ideal for Google Summer of Code students. These projects call for a broad set of skills, technologies and domains, such as GUIs, database integration and algorithms. You are also encouraged to propose your own ideas related to our projects. If you have strong computer skills and an interest in biology or bioinformatics, then you should apply!

IDEA 1: Original Idea

Feel free to propose your own idea. As long as it relates to one of our projects, we will give it serious consideration. Creativity and self-motivation are great traits for open source programmers, but make sure your proposal is also relevant.

IDEA 2: Export Reactome layout in BioPAX

It is difficult to exchange pathway layouts or diagrams among different pathway databases. BioPAX is a pathway exchange format, but it does not support pathway diagrams. In this project, we propose to create an XML file format for visualization, based on Reactome pathway diagrams, for BioPAX level 2 and 3 exports, and a Java applet or Web Start visualization tool, based on our curator tool, to display this format.

Language and Skills: Java, XML, OWL and BioPAX
Idea by: Guanming Wu
Potential Mentors: Guanming Wu

IDEA 3: HTML 5 canvas based visualization tool for pathways and networks

Now that modern browsers support the HTML5 canvas tag, it is time to develop a canvas-based, dynamic visualization tool for biological interaction networks and pathways. In this project, we propose to develop a canvas-based network interaction visualization prototype that can run in modern browsers on both a full-fledged computer and a tablet (e.g. iPad). The idea is based on two canvas-based web applications: LucidChart and HTML 5 canvas painting.

Language and Skills: HTML5 (especially canvas), JavaScript, Java, GWT
Idea by: Guanming Wu
Potential Mentors: Guanming Wu, Robin Haw and Marc Gillespie

IDEA 4: Cytoscape Web based visualization tool

This would reuse server-side code to replace the Reactome pathway browser.

Language and Skills: GWT
Idea by: David Croft
Potential Mentors: David Croft and Guanming Wu

IDEA 5: Pathway Summary visualization

This tool would give users an overview of all of the pathways in Reactome and would be able to show expression and species comparison information in a multi-pathway context.

Language and Skills: GWT
Idea by: David Croft
Potential Mentors: David Croft

IDEA 6: Reactome RESTful API

Reactome needs a good API so that its annotated data can be used programmatically and integrated into other web applications (e.g. WormBase) easily and with very low maintenance. In this project, we propose to develop a Reactome RESTful API that is lightweight, easy to use, and well-documented.

Language and Skills: RESTful Web Service, Java, Jersey, MySQL
Idea by: Guanming Wu
Potential Mentors: Guanming Wu
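
To make "lightweight and easy to use" concrete, here is a minimal sketch of a resource-oriented, read-only JSON endpoint. It is written in Python with the standard library only to keep it short, whereas the project itself targets Java and Jersey. The /pathways route, the record fields, and the in-memory PATHWAYS table are all hypothetical placeholders, not part of any existing Reactome API.

```python
import json
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory stand-in for pathway records; a real service
# would read these from the Reactome MySQL database.
PATHWAYS = {
    "1": {"id": "1", "name": "Example pathway A", "species": "Homo sapiens"},
    "2": {"id": "2", "name": "Example pathway B", "species": "Homo sapiens"},
}


def app(environ, start_response):
    """WSGI app: GET /pathways lists records, GET /pathways/<id> returns one."""
    parts = [p for p in environ.get("PATH_INFO", "").split("/") if p]
    if environ.get("REQUEST_METHOD") != "GET":
        status, body = "405 Method Not Allowed", {"error": "only GET is supported"}
    elif parts == ["pathways"]:
        status, body = "200 OK", sorted(PATHWAYS.values(), key=lambda p: p["id"])
    elif len(parts) == 2 and parts[0] == "pathways" and parts[1] in PATHWAYS:
        status, body = "200 OK", PATHWAYS[parts[1]]
    else:
        status, body = "404 Not Found", {"error": "no such resource"}
    payload = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def call(path, method="GET"):
    """Invoke the app in-process and return (status line, decoded JSON body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    environ["REQUEST_METHOD"] = method
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)
```

Calling call("/pathways/1") exercises the app without opening a socket; the same app object could be served with wsgiref.simple_server.make_server for manual testing.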

IDEA 7: Chado RESTful API

Chado is the database schema underlying several GMOD projects, and there are many installed instances around the world. A RESTful API would both allow the creation of "mashup" sites that pull data from various Chado instances and facilitate data sharing between sites that have Chado installed.

Language and Skills: RESTful Web Service, Perl (Dancer or Catalyst), PostgreSQL
Idea by: Scott Cain
Potential Mentors: Scott Cain, Josh Goodman
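
As an illustration of the "mashup" use case, the following Python sketch merges feature records from several Chado instances. It assumes each instance exposes a /features?organism=... route that returns a JSON list of objects; that route, the response shape, and the gather_features helper are hypothetical, since no such API exists yet.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode its JSON body (used against live instances)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def gather_features(base_urls, organism, fetch=fetch_json):
    """Query each instance for one organism's features and merge the results.

    The /features route and the list-of-objects response are assumptions
    made for this sketch, not an existing Chado interface.
    """
    merged = []
    for base in base_urls:
        url = f"{base.rstrip('/')}/features?organism={quote(organism)}"
        try:
            records = fetch(url)
        except OSError:
            continue  # skip instances that cannot be reached
        for record in records:
            # Tag each record with the instance it came from.
            merged.append({**record, "source": base})
    return merged
```

The fetch parameter is injectable so that the merging logic can be tried with a stub function before any real endpoint exists.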

IDEA 8: GIS Indexing for Chado

Genome feature sequence data is stored in Chado and getting it out quickly can be a bottleneck. In this project, we propose creating "geographic" coordinate indexes on the feature location table, as well as functions to use them.

Language and Skills: PostgreSQL and plpgsql, Perl
Idea by: Scott Cain
Potential Mentors: Scott Cain
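
The kind of query such an index would speed up is "which features overlap this region?". The Python sketch below answers it with a fixed-width bin index, a deliberately simpler stand-in for the tree-based spatial index the project would build inside PostgreSQL. The names fmin and fmax, the half-open coordinate convention, and the bin width are assumptions made for illustration.

```python
from collections import defaultdict

BIN_SIZE = 10_000  # assumed bin width in bases; a tuning choice, not from the source


class FeatureLocIndex:
    """Fixed-width bin index over (feature_id, fmin, fmax) intervals on one sequence."""

    def __init__(self, bin_size=BIN_SIZE):
        self.bin_size = bin_size
        self.bins = defaultdict(list)

    def _bin_range(self, fmin, fmax):
        # Intervals are treated as half-open [fmin, fmax): the last covered
        # base is fmax - 1. max() keeps zero-length intervals in one bin.
        last = max(fmin, fmax - 1)
        return range(fmin // self.bin_size, last // self.bin_size + 1)

    def add(self, feature_id, fmin, fmax):
        """Register a feature in every bin its interval touches."""
        for b in self._bin_range(fmin, fmax):
            self.bins[b].append((feature_id, fmin, fmax))

    def overlapping(self, qmin, qmax):
        """Return sorted ids of features whose interval overlaps [qmin, qmax)."""
        hits = set()
        for b in self._bin_range(qmin, qmax):
            for feature_id, fmin, fmax in self.bins.get(b, ()):
                if fmin < qmax and qmin < fmax:
                    hits.add(feature_id)
        return sorted(hits)
```

Only the bins touched by the query region are scanned, so lookup cost depends on local feature density rather than on the total number of features, which is the same benefit a database-side spatial index aims for.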

Resources

For Students
For Mentors