Difference between revisions of "GSoC"

From GMOD
m (Students)
m (Google Summer of Code 2019 @ Open Genome Informatics)
 
(47 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Google Summer of Code 2015 @ Genome Informatics ==
+
[[File:GoogleSummer_2016logo.jpg|373px|right|link=GSoC]]
  
'''[http://code.google.com/soc/ Google Summer of Code]''' is a global program that offers student developers stipends to write code for various open source software projects. We work with many open source, free software, and technology-related groups to identify and fund projects over a three month period. Since its inception in 2005, the program has brought together over 8,500 successful student participants from 101 countries and over 8,300 mentors from over 109 countries worldwide to produce over 50 million lines of code. Through Google Summer of Code, accepted student applicants are paired with a mentor or mentors from the participating projects, thus gaining exposure to real-world software development scenarios and the opportunity for employment in areas related to their academic pursuits. In turn, the participating projects are able to more easily identify and bring in new developers. Best of all, more source code is created and released for the use and benefit of all. (''Excerpt from the [http://www.google-melange.com Google Summer of Code website]'')
+
== Google Summer of Code 2019 @ Open Genome Informatics ==
  
Since 2011, the Genome Informatics group has served as an "umbrella organization" to a variety of bioinformatics projects, including [http://gmod.org GMOD] and its software projects -- GBrowse, JBrowse, etc.; [http://galaxy.psu.edu Galaxy]; [http://porteco.org PortEco]; [http://www.reactome.org Reactome]; [http://seqware.github.io SeqWare]; [http://www.wormbase.org WormBase]; and others. More information about this year's participating bioinformatics groups can be found [[GSOC_Groups | here]].
+
'''[https://summerofcode.withgoogle.com/ Google Summer of Code]''' is a global program that offers student developers stipends to write code for various open source software projects. We work with many open source, free software, and technology-related groups to identify and fund projects over a three month period. Since its inception in 2005, the program has brought together over 14,000 successful student participants from 118 countries, 651 open source organizations, and over 35 million lines of code. Through Google Summer of Code, accepted student applicants are paired with a mentor or mentors from the participating projects, thus gaining exposure to real-world software development scenarios and the opportunity for employment in areas related to their academic pursuits. In turn, the participating projects are able to more easily identify and bring in new developers. Best of all, more source code is created and released for the use and benefit of all. (''Excerpt from the [https://summerofcode.withgoogle.com/ Google Summer of Code website]'')
  
To learn more about this year's event and how GSoC works, please refer to the [http://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2015/help_page#7._How_does_the_program_work GSoC FAQ].
+
Since 2011, the Open Genome Informatics group has served as an "umbrella organization" to a variety of bioinformatics projects, including [[Main Page|GMOD]] and its software projects -- [[JBrowse]], [[Apollo]], [[Chado]], [[Galaxy]] etc.; [http://www.informatics.jax.org/ Mouse Genome Informatics]; [https://oicr.on.ca/research-portfolio/ OICR]; [http://www.reactome.org Reactome]; [http://www.wormbase.org WormBase]; and [https://bioconda.github.io/ Bioconda].
  
 +
'''More information about this year's participating bioinformatics groups can be found [[GSOC_Groups | here]].'''
 +
 +
To learn more about this year's event and how GSoC works, please refer to the [https://developers.google.com/open-source/gsoc/faq FAQ].
  
 
==Mailing lists, IRC, and other ways to get in touch  ==
 
*Email: [mailto:help@gmod.org help@gmod.org] '''and''' [mailto:robin.haw@oicr.on.ca robin.haw@oicr.on.ca] -- find out more about GSoC, a specific project, or your potential mentor(s).
+
 
 +
*Email: [mailto:robin.haw@oicr.on.ca robin.haw@oicr.on.ca] '''and''' [mailto:help@gmod.org help@gmod.org] -- find out more about GSoC, a specific project, or your potential mentor(s).
 
*Discussion mailing lists: [http://groups.google.com/group/genome-informatics Genome Informatics Google Groups] - ask about our projects; join the community!
 
 
*IRC channel: #genomeinformatics on Freenode.
 
* Mentors can email both Robin and Scott to get more information about the program and get signed up.
+
* Students and Mentors can email both [[User:Robin.haw|Robin]] and [[User:Scott|Scott]] to get more information about the program.
  
 +
== [[GSOC_Project_Ideas_2019 | Project Ideas]] ==
  
== [[GSOC_Project_Ideas_2015 | Project Ideas]] ==
+
'''Got an idea for a GSOC project? [[GSOC_Project_Ideas_2019 |Add it here]].'''  Ideas will be included in the proposal we send to GSOC, and great ideas make for a great proposal, so please add yours now.
There are plenty of challenging and interesting [[Project_Ideas_2015 |project ideas]] this year. These projects draw on a broad set of skills, technologies, and domains, such as GUIs, database integration, and algorithms.
+
 +
These projects can use a broad set of skills, technologies, and domains, such as GUIs, database integration, and algorithms. Students are also encouraged to propose their own ideas related to our projects. If you have strong computer skills and have an interest in biology or bioinformatics, you should definitely apply! '''Do not hesitate to propose your own project idea: some of the best applications we see are by students that go this route.''' As long as it is relevant to one of our projects, we will give it serious consideration. Creativity and self-motivation are great traits for open source programmers.
  
Students are also encouraged to propose their own ideas related to our projects. If you have strong computer skills and have an interest in biology or bioinformatics, you should definitely apply! '''Do not hesitate to propose your own project idea: some of the best applications we see are by students that go this route.''' As long as it is relevant to one of our projects, we will give it serious consideration. Creativity and self-motivation are great traits for open source programmers.<br>
 
  
 
+
== Preparing for GSoC 2019 ==
== Preparing for GSoC 2015 ==
+
The organization application period for GSoC is under way, so we won't know whether Open Genome Informatics has been accepted as a GSoC 2019 mentoring organization until [https://developers.google.com/open-source/gsoc/timeline February 6th]. In the meantime, this is a perfect time for students to talk to mentors about project ideas. If you are interested in mentoring, please check the Mentors section below and contact the organization admin.
Right now it is off-season for GSoC - we won't know if Genome Informatics has been accepted as a GSoC 2015 mentoring organization until March 2nd. The timeline for GSoC 2015 has now been posted [https://www.google-melange.com/gsoc/events/google/gsoc2015 here].
+
  
 
===Students===
 
Line 30: Line 34:
 
We encourage mentors and mentoring organizations to think about new projects year round! If you'd like help with your ideas page or your separate mentoring org application, please feel free to contact the organization admins. Links to [[GSOC_Mentoring_Guide | advice about mentoring and other resources]] are available.
 
  
 
If you have any difficulty using the wiki, please email your project proposal to [mailto:help@gmod.org help@gmod.org] and we will add it for you.
 
 
=== Example of Idea ===
 
 
Brief description of the idea, including any relevant links, etc.
 
 
*Languages and skills: programming language(s) to be used, plus any other particular computer science skills needed
 
*Idea: ''name + contact details of the person(s) who thought up the idea''
 
*Mentor(s): ''name + contact details of the proposed mentor(s)''
 
 
== 2014 Project Ideas ==
 
=== Reactome: Visualising Large Diagrams ===
 
 
Reactome is a free, open-source, curated and peer-reviewed database of biomolecular pathways with about 12,000 distinct visitors per month. The Reactome Pathway Diagram viewer was initially developed as a GSoC project and has become part of the Reactome Pathway Browser (http://www.reactome.org/PathwayBrowser/). The widget works well for diagrams of the current size, but larger diagrams will need to be supported in the future, so the current implementation needs to be improved using a different approach.
 
 
*'''Languages and skills''': Java, GWT, HTML5 Canvas, Data visualisation
 
*'''Idea''': Henning Hermjakob <hhe@ebi.ac.uk>, Antonio Fabregat <fabregat@ebi.ac.uk>
 
*'''Mentor(s)''': Antonio Fabregat Mundo <fabregat@ebi.ac.uk>, Robin Haw <robin.haw@oicr.on.ca>
 
 
'''Description''': The current pathway diagram widget works well for the pathways in Reactome, but diagrams with a large number of entities, for example large biomolecular disease maps, slow the widget down unacceptably. A different approach is needed in order to draw larger pathways in the canvas. Techniques used in game development can help here: for example, quadtrees would help to filter the number of objects to be drawn in each canvas iteration (depending on the zoom level and the targeted frame) and would also speed up hover detection while the user moves the mouse over the diagram. Another useful improvement could be a multi-layer approach that uses several canvases to represent different layers of information. In that case exporting the view as an image will be a little more complicated, but it is a good use case to take into account at the end of the internship.
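
The quadtree filtering mentioned in the description can be sketched as follows. This is only a minimal illustration in Python, not the widget's actual Java/GWT code: the <code>QuadTree</code> class, its <code>capacity</code> and <code>max_depth</code> parameters, and the rectangle layout <code>(x, y, width, height)</code> are all assumptions made for the example. Each diagram object is stored in the smallest quadrant that fully contains it, so a query for the current viewport only visits quadrants that overlap it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height), hypothetical layout


def intersects(a: Rect, b: Rect) -> bool:
    """True when two rectangles overlap with non-zero area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def contains(outer: Rect, inner: Rect) -> bool:
    """True when `inner` lies completely inside `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


@dataclass
class QuadTree:
    bounds: Rect
    capacity: int = 4      # items a node holds before it splits
    max_depth: int = 8     # guards against endless splitting
    depth: int = 0
    items: List[Tuple[str, Rect]] = field(default_factory=list)
    children: List["QuadTree"] = field(default_factory=list)

    def insert(self, name: str, box: Rect) -> None:
        # Push the item into a child quadrant that fully contains it, if any.
        for child in self.children:
            if contains(child.bounds, box):
                child.insert(name, box)
                return
        # Otherwise keep it here (it straddles a boundary or this is a leaf).
        self.items.append((name, box))
        if (not self.children and len(self.items) > self.capacity
                and self.depth < self.max_depth):
            self._split()

    def _split(self) -> None:
        x, y, w, h = self.bounds
        hw, hh = w / 2, h / 2
        self.children = [
            QuadTree((cx, cy, hw, hh), self.capacity, self.max_depth, self.depth + 1)
            for cx, cy in ((x, y), (x + hw, y), (x, y + hh), (x + hw, y + hh))
        ]
        pending, self.items = self.items, []
        for name, box in pending:
            self.insert(name, box)  # redistribute into the new quadrants

    def query(self, viewport: Rect) -> List[str]:
        """Names of all stored objects that overlap the viewport."""
        if not intersects(self.bounds, viewport):
            return []
        found = [name for name, box in self.items if intersects(box, viewport)]
        for child in self.children:
            found.extend(child.query(viewport))
        return found


# Usage: index a 10 x 10 grid of nodes, then ask what the current frame shows.
tree = QuadTree((0, 0, 1000, 1000))
for i in range(10):
    for j in range(10):
        tree.insert(f"n{i}_{j}", (i * 100 + 10, j * 100 + 10, 50, 30))
visible = tree.query((0, 0, 200, 200))
```

The same <code>query</code> call with a one-pixel rectangle around the mouse position would serve hover detection, which is why a single index can cover both uses named in the description.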
 
 
=== Pathway Comparison Widget ===
 
The Pathway Comparison viewer was developed initially as a GSoC project and it has become part of the Reactome Analysis Tool (http://www.reactome.org).
 
The idea is to improve the pathway comparison widget in order to make it interactive so the user can navigate through the result by clicking the nodes, edges and using the zoom level.
 
 
*'''Languages and skills''': Java, GWT, HTML5 Canvas, Data visualisation, BioJS
 
*'''Idea''': Henning Hermjakob <hhe@ebi.ac.uk>, Antonio Fabregat <fabregat@ebi.ac.uk>
 
*'''Mentor(s)''': Antonio Fabregat Mundo <fabregat@ebi.ac.uk>, Robin Haw <robin.haw@oicr.on.ca>
 
 
'''Description''': The current widget represents pathways as circles whose size is determined by the number of proteins contained in the pathway and whose color reflects the mean expression level. The width of the line connecting two nodes is determined by the similarity of the two pathways in terms of the proteins they contain (see Figure below).
 
 
[[Image:reactome_network_summary.png | thumb | 400px | centre|  Reactome Network Summary]]
 
 
<br>
 
 
For the new version we would like a slightly different look and feel (studying different options is required), and adding interactivity will be one of the main requirements. A basic approach would be to show a popup when the user clicks a node or a link between nodes. The popup would show a summary of the proteins contained in the pathway in the case of nodes, or a similarity summary in the case of links.

A more advanced improvement could be allowing the user to move nodes across the canvas or to show and hide nodes (so that only links between visible nodes are shown), and adding zoom in order to show different data granularity depending on the zoom level.
 
The last requirement is to include the widget in the [https://www.ebi.ac.uk/Tools/biojs/registry/index.html EBI BioJS registry].
 
 
=== SeqWare ===
 
 
*'''Languages and skills''': Java, Bash/Linux, AWS, Google Cloud, Ansible, Vagrant, HBase/NOSQL, MapReduce+associated Hadoop technologies<br>
 
*'''Mentor(s)''': Brian O'Connor <boconnor@oicr.on.ca>, Denis Yuen <denis.yuen@oicr.on.ca><br>
 
 
There are quite a few projects that I would like to see happen for SeqWare and it would be great to get a student to help on these:
 
 
* add hybrid workflow support to SeqWare Pipeline so users can write workflows that include support for Hadoop tools (Pig, Hive, M/R, etc) and traditional command line tools
 
* push forward the design of our multi-cloud cluster provisioning technology stack based on Vagrant. This includes incorporating provisioning technologies like Ansible.
 
* leverage Elastic Map Reduce on Amazon's AWS as an environment to run SeqWare
 
* leverage the Google cloud, add support for spinning up SeqWare clusters in this environment and to interact with their bucket store
 
* work with the Galaxy tool and finish the compatibility layer that allows SeqWare workflows to run/interact with Galaxy
 
* write an AngularJS-based web application on top of our HBase variant/read NOSQL database, and write proof-of-concept analytical plugins that use machine learning and other advanced techniques to analyze data stored in this scalable backend
 
 
 
===InterMine===
 
 
*'''Languages and skills''': Java, JavaScript, Python
 
*'''Mentor(s)''': InterMine team members
 
 
Some brief ideas for InterMine projects:
 
 
* InterMine and the Semantic Web - make InterMine more semantic.
 
* Building biological tools - eg: a synteny viewer
 
* Mobile (Android/iOS) apps
 
* Data importer/Mine builder: an application to build a mine from a set of standard files and web-services.
 
 
=== Tripal Pedigree Viewer ===
 
 
*'''Languages and skills''': PHP, HTML 5 and Javascript
 
*'''Mentor(s)''': Lacey-Anne Sanderson <lacey.sanderson@usask.ca>
 
 
'''Description''': Development of an interactive, collapsible pedigree diagram to be displayed on Tripal Germplasm pages. The nodes of the diagram need to contain the name of the stock with a link to its page, and the edges of the diagram need to be labelled with the relationship type (i.e., maternal parent of). All of the data is already stored within a PHP tree class with traversal methods. Thus we are looking for a student to use the traversal methods to generate the markup needed for their application and to implement the actual drawing of the pedigree using languages and libraries of their choosing. Here is [http://mbostock.github.io/d3/talk/20111018/tree.html an example] showing the collapsibility desired; however, names within the node circles (rather than beside them, as in the example) and labelled connector lines (edges) are needed.
 
 
'''Background''': Tripal is a Drupal module that implements display and management of biological data within a Drupal site. Drupal is a PHP-based, database-driven content management system used for development of websites (from blogs, to ecommerce sites, and now organism community sites such as [http://knowpulse2.usask.ca/portal KnowPulse: Legume Breeding & Genomics], [http://www.citrusgenomedb.org/ Citrus Genome] and many more). See [http://tripal.info our website] for more Tripal sites as well as additional information. The Tripal Germplasm module provides the ability to display and manage plant/animal breeding programs. Currently the pedigree is displayed in the community standard textual format (i.e., ParentA//ParentB1/ParentB2, which means that the offspring of ParentB1 & ParentB2 was mated with ParentA to produce the current germplasm). Although this is descriptive and common in the community, a graphical diagram showing these relationships would be a lot more intuitive, which is the motivation behind this project.
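
As a rough illustration of the "traverse the tree, emit markup" step from the description, the sketch below walks a pedigree and produces nested data that a tree-drawing library could consume. It is written in Python for brevity; the real data lives in a PHP tree class whose methods are not shown on this page, so the <code>Stock</code> class, its fields, and the <code>relationship</code> key are hypothetical. The nested <code>name</code>/<code>children</code> shape is the one D3's hierarchical layouts read by default.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Stock:
    """Hypothetical stand-in for one node of the pedigree tree."""
    name: str
    url: str
    # Each entry is (relationship type, parent stock).
    parents: List[Tuple[str, "Stock"]] = field(default_factory=list)


def to_tree_data(stock: Stock, relationship: Optional[str] = None) -> dict:
    """Depth-first traversal producing nested name/children dictionaries."""
    node = {"name": stock.name, "url": stock.url}
    if relationship is not None:
        # Label for the edge that connects this node to the node above it.
        node["relationship"] = relationship
    if stock.parents:
        node["children"] = [to_tree_data(parent, rel) for rel, parent in stock.parents]
    return node


# Usage with made-up stock names and URLs.
parent_b1 = Stock("ParentB1", "/stock/b1")
parent_b2 = Stock("ParentB2", "/stock/b2")
cross_b = Stock("B1 x B2", "/stock/b", [("maternal parent of", parent_b1),
                                        ("paternal parent of", parent_b2)])
parent_a = Stock("ParentA", "/stock/a")
current = Stock("Current germplasm", "/stock/current",
                [("maternal parent of", parent_a), ("paternal parent of", cross_b)])

pedigree_json = json.dumps(to_tree_data(current), indent=2)
```

Keeping the relationship on the child node means each edge label travels with the node it leads to, so collapsing a subtree also hides its labels.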
 
 
 
===JBrowse: REST daemon for Chado ===
 
 
Implement a self-contained server in the language of your choice (such as Python/WSGI, Perl/Plack, node.js, or Java/Jetty) to serve feature data and name completions out of a GMOD Chado database schema according to the JBrowse 1 REST API, enabling an instance of JBrowse 1 to run directly atop a Chado database. Possible addition: implement another daemon in Perl/Plack that does the same thing for a GBrowse 2 installation.
 
 
*'''Skills''': server-side language of student's choice
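
To give a feel for the core of such a daemon, here is a hedged Python sketch of the function that would answer a feature request for one reference sequence and range. It assumes, without being checked against the JBrowse REST API documentation, that the response is a JSON object with a <code>features</code> array whose entries carry <code>start</code> and <code>end</code>. The input rows are plain dictionaries whose keys are modelled on Chado's <code>feature</code> and <code>featureloc</code> columns; in a real server they would come from a database query, which is omitted here, and <code>features_response</code> is a made-up name.

```python
import json
from typing import Dict, List


def features_response(rows: List[Dict], refseq: str, start: int, end: int) -> str:
    """Build a JSON body for the features overlapping [start, end) on refseq.

    Coordinates are treated as interbase (0-based, half-open), the same
    convention Chado uses for fmin/fmax, so no conversion is applied.
    """
    hits = [
        {
            "uniqueID": row["uniquename"],
            "name": row["name"],
            "type": row["type"],
            "start": row["fmin"],
            "end": row["fmax"],
            "strand": row["strand"],
        }
        for row in rows
        # Keep only features on this reference sequence that overlap the range.
        if row["srcfeature"] == refseq and row["fmin"] < end and row["fmax"] > start
    ]
    return json.dumps({"features": hits})


# Usage with in-memory rows standing in for a Chado query result.
rows = [
    {"uniquename": "gene0001", "name": "abc-1", "type": "gene",
     "srcfeature": "chrI", "fmin": 100, "fmax": 900, "strand": 1},
    {"uniquename": "gene0002", "name": "def-2", "type": "gene",
     "srcfeature": "chrI", "fmin": 5000, "fmax": 6000, "strand": -1},
    {"uniquename": "gene0003", "name": "ghi-3", "type": "gene",
     "srcfeature": "chrII", "fmin": 100, "fmax": 900, "strand": 1},
]
body = features_response(rows, "chrI", 0, 1000)
```

Wrapping this function in a WSGI, Plack, or similar handler, and adding the name-completion endpoint, would be the remaining work for the project.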
 
 
 
===JBrowse "regions of interest" lists===
 
 
Add functionality to JBrowse 2 to manage lists of "regions of interest" on a per-user basis, storing the lists using the JavaScript localStorage API.  Allow a user to "apply" a regions list to a view in JBrowse 2 so that the view shows only the user's regions, without any of the intervening space in between.
 
 
*'''Skills''': advanced JavaScript
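
The "show only the user's regions, without the intervening space" behaviour comes down to a coordinate mapping between the genome and a collapsed view. The sketch below shows one way to do that mapping; it is written in Python rather than JavaScript purely for illustration, it leaves out the localStorage persistence, and the <code>CollapsedView</code> class with its half-open <code>(start, end)</code> regions is an assumption, not JBrowse code.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Region = Tuple[int, int]  # half-open (start, end) on one reference sequence


def merge_regions(regions: List[Region]) -> List[Region]:
    """Sort regions, drop empty ones, and merge any that overlap or touch."""
    merged: List[Region] = []
    for start, end in sorted(r for r in regions if r[1] > r[0]):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


class CollapsedView:
    """Maps genome positions to positions in a view that shows only the regions."""

    def __init__(self, regions: List[Region]) -> None:
        self.regions = merge_regions(regions)
        self.starts = [start for start, _ in self.regions]
        self.offsets: List[int] = []  # view position where each region begins
        total = 0
        for start, end in self.regions:
            self.offsets.append(total)
            total += end - start
        self.length = total  # width of the collapsed view in bases

    def to_view(self, pos: int) -> Optional[int]:
        """View coordinate for a genome position, or None if it is hidden."""
        i = bisect_right(self.starts, pos) - 1
        if i < 0:
            return None
        start, end = self.regions[i]
        if pos >= end:
            return None
        return self.offsets[i] + (pos - start)

    def to_genome(self, view_pos: int) -> Optional[int]:
        """Genome position for a view coordinate, or None if out of range."""
        if not 0 <= view_pos < self.length:
            return None
        i = bisect_right(self.offsets, view_pos) - 1
        return self.regions[i][0] + (view_pos - self.offsets[i])


# Usage: two of the three regions overlap, so they are merged first.
view = CollapsedView([(100, 200), (500, 600), (150, 250)])
```

Merging up front keeps the mapping one-to-one, so a feature that falls in two overlapping user regions is drawn once rather than twice.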
 
 
 
===Drupal-based GMOD Tool Information Tool===
 
 
*'''Skills''': PHP / Drupal, HTML, Javascript
 
*'''Mentors''': Lacey-Anne Sanderson, Amelia Ireland
 
 
Description to be added soon.
 
 
 
===GMOD Virtual Server Configurator===
 
 
*'''Skills''': cgi-capable language of your choice (e.g. Perl, PHP, JS), html, javascript
 
*'''Mentors''': Scott Cain
 
*'''Idea''': Amelia Ireland, Scott Cain
 
 
[[:Category:GMOD_virtual_server|GMOD virtual servers]] are preconfigured sets of GMOD components that allow users a quick, easy way to set up a bioinformatics resource for their data. Each tool has configuration options that are currently set using plain text files. Create a user-friendly configuration client that will allow users to customise components without having to dig into a text editor.
 
 
 
=== Galaxy CloudMan ===
 
 
* '''Languages and skills:''' Python, JavaScript, Backbone, Mako, Bash/Linux, AWS
 
* '''Idea:''' Enis Afgan (afgane AT gmail.com)
 
* '''Mentor(s):''' [https://wiki.galaxyproject.org/EnisAfgan Enis Afgan] (afgane AT gmail DOT com), [https://wiki.galaxyproject.org/DannonBaker Dannon Baker] (dannon DOT baker AT gmail DOT com)
 
 
Galaxy CloudMan (http://usecloudman.org) is a cloud manager that orchestrates all the steps required to provision and manage a set of cloud resources to deliver a functional compute cluster in the cloud. A deployed instance of CloudMan comes preconfigured with the Galaxy application, dozens of bioinformatics tools and gigabytes of genome reference data. The application is used around the world to launch hundreds of clusters per month. The following are suggested improvements that would help the project grow further (each would be a separate student project):
 
* A new web interface, exposing key application functionality and focusing on scalability and accessibility
 
* An automated process for deploying/replicating Galaxy on the Cloud across all AWS regions
 
* Advanced cluster autoscaling (responsive, based on individual cluster’s workload, taking advantage of different cloud instance types)
 
 
=== Galaxy Charts and Open Requests ===
 
 
* '''Languages and skills:''' Python, JavaScript, Bash/Linux
 
* '''Idea:''' Aysam Guerler  (aysam.guerler AT gmail.com)
 
* '''Mentor(s):''' [https://wiki.galaxyproject.org/SamGuerler Sam Guerler] (aysam DOT guerler AT gmail.com)
 
 
Ideas:
 
 
* Improving Galaxy Charts by e.g. adding new visualizations or options to customize visualizations. This is a very confined project. It has the advantage that the student can (basically) not break code and does not have to grasp Galaxy’s inner layers, but still would be able to make a major contribution.
 
 
* Something from the [https://trello.com/b/75c1kASa/galaxy-development Tool requests and Developer ideas lists at Trello] although one card may not be enough.
 
 
 
=== dictyBase: Integration of HTML5 based live content editor ===
 
 
'''Languages and skills:''' HTML5, Javascript('''angularjs''') and CSS('''Bootstrap/Pure framework''') markup.
 
 
'''Idea:''' Siddhartha Basu(siddhartha DASH basu AT northwestern DOT edu)
 
 
'''Mentor(s):''' Siddhartha Basu(siddhartha DASH basu AT northwestern DOT edu), Petra Fey(pfey AT northwestern DOT edu)
 
==== Idea ====
 
[http://dictybase.org dictyBase] has quite a lot of static HTML pages (for example the front page) that are handcrafted and maintained by manually editing on the server side. The pages are content heavy, however the manual nature of it makes it incredibly difficult to add new content, integrate third party widgets (such as a Twitter feed) or do collaborative editing. The proposal is to integrate one of [https://www.raptor-editor.com/ raptor], [http://jejacks0n.github.io/mercury/ mercury] or [http://vitalets.github.io/x-editable/ bootstrap X-editable] client side HTML5 editor to make the content editable right from the browser. The content will be pushed back and forth through a RESTful backend. The project is expected to be split into the following sections...
 
 
* Generate a bootstrap(optionally pure framework) based markup of core page structure. This includes header/footers and parts of pages that are not editable.
 
* Identify the content blocks and integrate one of the editors (student's choice) to make them editable.
 
* Use the angularjs framework ([https://github.com/mgonto/restangular restangular] preferred) to save the edited content to a RESTful backend. The RESTful backend (written in golang) along with the HTTP resource specification would be made available (deployable binary) to the student.
 
* Integrate image inclusion. Could explore angularjs based option such as [https://github.com/danialfarid/angular-file-upload angular file upload]
 
* Make the editor available only to authorized users. For this, integrate the frontend to our RESTful authentication backend.
 
 
 
 
=== WormBase: data visualization ===
 
[http://www.wormbase.org WormBase (www.wormbase.org)] is a central data repository supporting the nematode research community.
 
*'''Languages and skills''': JavaScript, HTML5, JS graphical library of your choice (e.g. d3), some Perl
 
* '''Mentor(s)''': Abigail Cabunoc <abigail.cabunoc@oicr.on.ca>
 
There are several areas of improvement for data visualization on the WormBase website. Here are a couple of requests we've received from the community, but we are open to other ideas:
 
* Create a chromosome map tool - allow users to input and visualize the position of genetic loci.
 
** Original community request: https://github.com/WormBase/website/issues/1103
 
* Create a central dogma view to tie together our gene/protein/sequence pages
 
** Original community request: https://bitbucket.org/tharris/wormbase/issue/557/add-central-dogma-nav-to-overview
 
 
[[Category:Galaxy]]
 
[[Category:JBrowse]]
 
[[Category:WormBase]]
 
[[Category:GSoC]]
 
 
[[Category:Galaxy]]

[[Category:JBrowse]]
 +
[[Category:MGI]]

[[Category:WormBase]]

[[Category:GSoC]]
 +
[[Category:Reactome]]
 +
[[Category:WebApollo]]

Latest revision as of 17:09, 18 December 2018


Google Summer of Code 2019 @ Open Genome Informatics

Google Summer of Code is a global program that offers student developers stipends to write code for various open source software projects. We work with many open source, free software, and technology-related groups to identify and fund projects over a three month period. Since its inception in 2005, the program has brought together over 14,000 successful student participants from 118 countries, 651 open source organizations, and over 35 million lines of code. Through Google Summer of Code, accepted student applicants are paired with a mentor or mentors from the participating projects, thus gaining exposure to real-world software development scenarios and the opportunity for employment in areas related to their academic pursuits. In turn, the participating projects are able to more easily identify and bring in new developers. Best of all, more source code is created and released for the use and benefit of all. (Excerpt from the Google Summer of Code website)

Since 2011, the Open Genome Informatics group has served as an "umbrella organization" to a variety of bioinformatics projects, including GMOD and its software projects -- JBrowse, Apollo, Chado, Galaxy etc.; Mouse Genome Informatics; OICR; Reactome; WormBase; and Bioconda.

More information about this year's participating bioinformatics groups can be found here.

To learn more about this year's event and how GSoC works, please refer to the FAQ.

Mailing lists, IRC, and other ways to get in touch

Project Ideas

Got an idea for a GSOC project? Add it here. Ideas will be included in the proposal we send to GSOC, and great ideas make for a great proposal, so please add yours now.

These projects can use a broad set of skills, technologies, and domains, such as GUIs, database integration, and algorithms. Students are also encouraged to propose their own ideas related to our projects. If you have strong computer skills and have an interest in biology or bioinformatics, you should definitely apply! Do not hesitate to propose your own project idea: some of the best applications we see are by students that go this route. As long as it is relevant to one of our projects, we will give it serious consideration. Creativity and self-motivation are great traits for open source programmers.


Preparing for GSoC 2019

The organization application period for GSoC is under way, so we won't know whether Open Genome Informatics has been accepted as a GSoC 2019 mentoring organization until February 6th. In the meantime, this is a perfect time for students to talk to mentors about project ideas. If you are interested in mentoring, please check the Mentors section below and contact the organization admin.

Students

More information about writing your application will be available closer to the start of the student application period.

Mentors

We encourage mentors and mentoring organizations to think about new projects year round! If you'd like help with your ideas page or your separate mentoring org application, please feel free to contact the organization admins. Links to advice about mentoring and other resources are available.