GSOC Project Ideas 2022
Got an idea for GSOC 2022?
Then please post it. You can
- Add it here, by directly editing this page. Just copy, paste, and
update the template below. This requires that you have,
or create, a GMOD.org login.
Projects can use a broad set of skills, technologies, and domains, such
as GUIs, database integration and algorithms.
Students are also encouraged to propose their own ideas related to our
projects. If you have strong computer skills and have an interest in
biology or bioinformatics, you should definitely apply! Do not hesitate
to propose your own project idea: some of the best applications we see
are by students who go this route. As long as it is relevant to one of
our projects, we will give it serious consideration. Creativity and
self-motivation are great traits for open-source programmers.
Proposed project ideas for 2022
- Brief explanation: Write a new JBrowse 2 plugin to support MSPCrunch data input, BLAST
results, and .coord files from MUMmer
- Expected results: a new JBrowse 2 plugin that adds support for one
or more of the data formats listed above
- Project Home Page URL:
JBrowse.org
- Project chat:
[1]
- Knowledge prerequisites: JavaScript
- Skill level: Medium
- Project Time: 175 hours
- Mentors: JBrowse development team
JBrowse 2 Multi-way synteny visualization
- Brief explanation: JBrowse 2 can currently compare two genomes at a
time, but comparing more than two genomes is common! Write new JBrowse
2 features to support this functionality using data from PAF files or
other synteny pipelines.
- Expected results: a new method to compare more than two genomes at a
time
- Project Home Page URL:
JBrowse.org
- Project chat:
[2]
- Knowledge prerequisites: JavaScript, TypeScript, React
- Skill level: Difficult
- Project Time: 350 hours
- Mentors: JBrowse development team
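For orientation on the PAF input mentioned above, here is a minimal sketch of reading PAF records and grouping them by sequence pair, which is one possible starting point for comparing more than two genomes. It is written in Python for brevity and is not JBrowse 2 code; the names `PafRecord`, `parse_paf_line`, and `group_by_sequence_pair` are hypothetical. It assumes only the 12 mandatory tab-separated PAF columns and ignores any optional tags that follow them.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class PafRecord(NamedTuple):
    """The 12 mandatory PAF columns, in file order."""
    query_name: str
    query_length: int
    query_start: int   # 0-based
    query_end: int
    strand: str        # "+" or "-"
    target_name: str
    target_length: int
    target_start: int  # 0-based
    target_end: int
    matches: int       # number of matching residues
    block_length: int  # alignment block length
    mapq: int          # mapping quality


def parse_paf_line(line: str) -> PafRecord:
    """Parse one PAF line; optional tags after column 12 are ignored."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected at least 12 fields, got {len(fields)}")
    return PafRecord(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]), fields[4],
        fields[5], int(fields[6]), int(fields[7]), int(fields[8]),
        int(fields[9]), int(fields[10]), int(fields[11]),
    )


def group_by_sequence_pair(
    records: Iterable[PafRecord],
) -> Dict[Tuple[str, str], List[PafRecord]]:
    """Collect alignments per (query, target) pair (hypothetical helper)."""
    groups: Dict[Tuple[str, str], List[PafRecord]] = defaultdict(list)
    for rec in records:
        groups[(rec.query_name, rec.target_name)].append(rec)
    return dict(groups)
```

Grouping by sequence name is an assumption made for illustration; a real multi-way view would also need to know which assembly each sequence belongs to.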
Use Galaxy to run Reactome analysis and processes on proteomics data (Reactome)
- Brief explanation: Reactome is a free, open-source, curated and
peer-reviewed pathway database. Our goal is to provide intuitive
bioinformatics tools for the visualization, interpretation and
analysis of pathway knowledge to support basic research, genome
analysis, modelling, systems biology and education. Galaxy is an open,
web-based platform for data-intensive biomedical research, which
allows users to perform, reproduce, and share complete analyses.
- Expected results: There are two potential sub-projects. 1) Adding
Reactome as a data resource in Galaxy, to enable Galaxy users to use
Reactome reaction and pathway annotation data, and 2) Performing
identifier mapping and over-representation analysis workflows from
Reactome in Galaxy. See the Reactome
GitHub.
- Project Home Page URL: if there is one.
- Project paper reference and URL:
reactome.org,
galaxyproject.org,
ProteoRE (Proteomics Research Environment)
- Knowledge prerequisites: Galaxy, Java, web services.
- Skill level: Medium.
- Project Time: 350 hours
- Mentors: Robin Haw (robin.haw[AT]oicr.on.ca)
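As background for the over-representation analysis mentioned above, the following is a minimal sketch of a one-sided hypergeometric test, a common choice for this kind of analysis. Whether the Reactome workflows use exactly this test is not stated here, and the function name and parameters are hypothetical. It uses only the Python standard library.

```python
from math import comb


def overrepresentation_p_value(
    overlap: int,          # submitted identifiers that fall in the pathway
    background_size: int,  # all identifiers in the background set
    pathway_size: int,     # background identifiers annotated to the pathway
    sample_size: int,      # identifiers in the submitted list
) -> float:
    """Return P(X >= overlap) under a hypergeometric model.

    X counts pathway members among `sample_size` identifiers drawn
    without replacement from the background.
    """
    upper = min(pathway_size, sample_size)
    total = comb(background_size, sample_size)
    favourable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, sample_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total
```

In practice the resulting p-values would be corrected for multiple testing across pathways; that step is left out of this sketch.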
Centralized dashboard or metrics system (Reactome)
- Brief explanation: Reactome has both manual and automated
statistical tracking of its quarterly release data. This project would
seek to fully automate and consolidate the quantification of release
data measurement for metrics such as the number of pathways,
reactions, distinct proteins (with and without UniProt isoforms),
complexes, small molecules, drugs/therapeutics, literature references,
etc. for human (curated) and non-human (electronically inferred)
species and stratified for normal and disease biology. A centralized
dashboard would be useful to the team for discussing metrics
externally and for community outreach.
- Expected results: A program which will produce a standardized report
of statistics for a Reactome release database with aesthetic visuals
- Project Home Page URL:
reactome.org.
- Knowledge prerequisites: Java, MySQL and/or Neo4j, creating visuals
for statistical data (preferred but not required)
- Skill level: Medium.
- Project Time: 175 hours
- Mentors: Robin Haw (robin.haw[AT]oicr.on.ca)
- Brief explanation: Reactome generates new pathway and other
annotation data on a quarterly basis. With each new release, the
preceding data set is archived to an AWS S3 bucket. As part of our
data sharing policy, we would like to develop a web interface to allow
users to request specific versions of archived data and to make it
available to download.
- Expected results: A web interface for users to request data and
download it via a shareable link that expires either after a set
time or once the data has been downloaded.
- Project Home Page URL:
reactome.org.
- Knowledge prerequisites: Java, AWS, Joomla, CloudFront
- Skill level: Medium.
- Project Time: 175 hours
- Mentors: Robin Haw (robin.haw[AT]oicr.on.ca)
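To illustrate the time-limited shareable link described above, here is a minimal sketch of a signed token that carries its own expiry time. It is a generic HMAC scheme built on the Python standard library, not an AWS or Reactome API; `SECRET_KEY`, `make_token`, and `is_valid` are hypothetical names. The "expires once downloaded" variant would need server-side state and is not covered.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side secret; a real deployment would load this securely.
SECRET_KEY = b"replace-with-a-real-secret"


def _sign(payload: str) -> str:
    """Return a hex HMAC-SHA256 signature for the payload."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_token(archive_name: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Build 'name:expiry:signature' for a download that expires after ttl_seconds."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    payload = f"{archive_name}:{expires}"
    return f"{payload}:{_sign(payload)}"


def is_valid(token: str, now: Optional[float] = None) -> bool:
    """Check the signature and that the expiry time has not passed."""
    try:
        archive_name, expires_str, signature = token.rsplit(":", 2)
        expires = int(expires_str)
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{archive_name}:{expires}")
    current = time.time() if now is None else now
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(signature.encode(), expected.encode()) and current < expires
```

The token would be appended to a download URL and checked before the archived file is served; tampering with the archive name or the expiry invalidates the signature.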
Datatypes Help in Galaxy (Galaxy)
- Brief explanation: Create infrastructure for providing datatype help
in Galaxy. Includes expanding datatype definitions and updating Galaxy
user interface to take advantage of it.
- Expected results: Datatype format and semantics help would be widely
available when using Galaxy, including in tools that consume and
produce particular datatypes, as well as server-wide help describing
supported datatypes.
- Project Home Page URL:
https://galaxyproject.org/
- Project paper reference and URL: Jalili, V., Afgan, E., Gu, Q.,
Clements, D., Blankenberg, D., Goecks, J., Taylor, J., &
Nekrutenko, A. (2020). The Galaxy platform for accessible,
reproducible and collaborative biomedical analyses: 2020 update.
Nucleic Acids Research, 48(W1), W395–W402.
https://doi.org/10.1093/nar/gkaa434
- Knowledge prerequisites: Python and JavaScript. Will use Vue.js
components in front end.
- Skill level: Basic
- Project Time: 175 hours
- Mentors: Björn Grüning, University of Freiburg
(bjoern.gruening[at]gmail.com); Galaxy Support Working Group,
global.
- Brief explanation: Publish each server’s quotas in a standard way;
provide users with more information about what analyses and datasets
are consuming their quota allocation.
- Expected results: Users will know immediately what a server’s quotas
are, and what items are contributing most to consuming their quota.
Users will have a clear idea of what they can expect, and what they
can do to increase their available resources.
- Project Home Page URL:
https://galaxyproject.org/
- Project paper reference and URL: Jalili, V., Afgan, E., Gu, Q.,
Clements, D., Blankenberg, D., Goecks, J., Taylor, J., &
Nekrutenko, A. (2020). The Galaxy platform for accessible,
reproducible and collaborative biomedical analyses: 2020 update.
Nucleic Acids Research, 48(W1), W395–W402.
https://doi.org/10.1093/nar/gkaa434
- Knowledge prerequisites: Python and JavaScript. Will use Vue.js
components in front end.
- Skill level: Medium
- Project Time: 175 hours
- Mentors: Björn Grüning, University of Freiburg
(bjoern.gruening[at]gmail.com); Galaxy Support Working Group,
global.
Creating learning paths within the Galaxy Training Network
- Brief explanation: Implement and display learning paths in the
Galaxy Training Material infrastructure to show learners, especially
newcomers, which tutorial they should take first or which sequence of
tutorials to follow to become knowledgeable about a particular topic.
- Expected results: An easy way to configure and update learning
paths. Easy to understand and navigate learning paths for web site
users.
- Project Home Page URL:
https://training.galaxyproject.org/
- Project paper reference and URL: Batut, B., Hiltemann, S.,
Bagnacani, A., Baker, D., Bhardwaj, V., Blank, C., Bretaudeau, A.,
Brillet-Guéguen, L., Čech, M., Chilton, J., Clements, D.,
Doppelt-Azeroual, O., Erxleben, A., Freeberg, M. A., Gladman, S.,
Hoogstrate, Y., Hotz, H.-R., Houwaart, T., Jagtap, P., … Grüning, B.
(2018). Community-Driven Data Analysis Training for Biology. Cell
Systems, 6(6), 752-758.e1.
https://doi.org/10.1016/j.cels.2018.05.012
- Knowledge prerequisites: Ruby, JavaScript (some Python).
- Skill level: Medium
- Project Time: 175 hours
- Mentors: Bérénice Batut, University of Freiburg (berenice DOT
batut[at] gmail.com). Galaxy Outreach & Training Working Group,
global.
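One way to think about the tutorial sequencing described above is as a prerequisite graph that is ordered topologically. The sketch below uses Python's standard `graphlib` module for illustration only; the Galaxy Training Material infrastructure itself is not written this way, and the tutorial names and the `learning_path` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def learning_path(prerequisites: Dict[str, Set[str]]) -> List[str]:
    """Order tutorials so that each one appears after its prerequisites.

    `prerequisites` maps a tutorial to the set of tutorials to take first.
    Raises graphlib.CycleError if the prerequisites are circular.
    """
    return list(TopologicalSorter(prerequisites).static_order())


# Hypothetical tutorial names, used only to show the input shape.
example = {
    "introduction": set(),
    "quality-control": {"introduction"},
    "mapping": {"quality-control"},
    "variant-calling": {"mapping", "quality-control"},
}

if __name__ == "__main__":
    try:
        print(" -> ".join(learning_path(example)))
    except CycleError as err:
        print(f"circular prerequisites: {err}")
```

Keeping the path definition as plain data like this would make it easy to configure and update, which is one of the expected results listed above.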
Integrate MP-BioPath into Reactome FI-Viz Cytoscape plugin
- Brief explanation: MP-BioPath is an algorithm similar to the options
available within the Reactome FI Viz Cytoscape plugin
(https://reactome.org/tools/reactome-fiviz). We are
interested in incorporating our latest algorithm, MP-BioPath
(https://github.com/OICR/mp-biopath), into the plugin.
- Expected results: A new version of the FI-Viz plugin that
incorporates MP-BioPath with customized visualizations for MP-BioPath
pathways and results
- Project Home Page URL:
https://reactome.org/tools/reactome-fiviz
- Project paper reference and URL:
- Knowledge prerequisites: Java programming language
- Skill level: Medium
- Project Time: 175 hours
- Mentors: Adam Wright (adam.wright@oicr.on.ca)
Converting Release Jenkins Pipeline to incorporate Continuous Integration / Continuous Deployment (CI/CD) principles and AWS services where useful
- Brief explanation: To produce the rich displays within the
Reactome portal, Extract, Transform, Load (ETL) Jenkins pipelines,
written mainly in Java, are currently run. This process could be
made more robust, and could facilitate rapid development cycles, by
incorporating CI/CD principles into our existing pipelines.
- Expected results: Have Jenkins CI/CD pipelines pulling the latest
changes from GitHub automatically, compiling the Java code,
dockerizing each component and deploying the code after all relevant
tests have passed.
- Project Home Page URL:
https://github.com/Reactome/
- Project paper reference and URL:
- Knowledge prerequisites: Groovy (pipeline code), Bash
- Skill level: Basic
- Project Time: 175 hours
- Mentors: Adam Wright (adam.wright@oicr.on.ca), Joel Weiser
(joel.weiser@oicr.on.ca)
Template: Project Idea Name (Project Name/Lab Name)
- Brief explanation: Brief description of the idea, including any
relevant links, etc.
- Expected results: describe the outcome of the project idea.
- Project Home Page URL: if there is one.
- Project paper reference and URL: Is there a paper about the project
this effort will be a part of?
- Knowledge prerequisites: programming language(s) to be used, plus
any other particular computer science skills needed.
- Skill level: Basic, Medium or Advanced.
- Project Time: 175 hours or 350 hours
- Mentors: name + contact details of the lead mentor, name + contact
details of 1 or 2 backup mentors.