2009 GMOD Community Survey

The 2009 GMOD Community Survey focused on genome and comparative genomics visualization. The survey was open for 10 days in September 2009 and received 45 responses.

For a broader picture of the GMOD community and how GMOD Components are used, see the 2008 GMOD Community Survey.

Which components have you used?

GMOD has a wealth of genome and comparative genomics browsers. Which of the following have you used?

Component  %
GBrowse  87% 
CMap  31% 
JBrowse  29% 
GBrowse_syn  29% 
Apollo's synteny viewer  13% 
BLAST Graphic Viewer  13% 
SynBrowse  7% 
GBrowse karyotype  7% 
SynView  4% 
Sybil  2% 
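
For a rough sense of scale, the percentages above can be converted into approximate respondent counts. The sketch below is illustrative only: it assumes that all 45 respondents answered this question, which the survey summary does not state, and the names `usage_percent` and `approx_count` are made up for the example.

```python
# Total responses reported for the survey. ASSUMPTION: every respondent
# answered the "which components have you used?" question.
TOTAL_RESPONSES = 45

# Percentages copied from the table above.
usage_percent = {
    "GBrowse": 87,
    "CMap": 31,
    "JBrowse": 29,
    "GBrowse_syn": 29,
    "Apollo's synteny viewer": 13,
    "BLAST Graphic Viewer": 13,
    "SynBrowse": 7,
    "GBrowse karyotype": 7,
    "SynView": 4,
    "Sybil": 2,
}


def approx_count(percent, total=TOTAL_RESPONSES):
    """Return the nearest whole number of respondents for a rounded percentage."""
    return round(percent * total / 100)


counts = {name: approx_count(pct) for name, pct in usage_percent.items()}

for name, count in counts.items():
    print(f"{name}: about {count} of {TOTAL_RESPONSES}")
```

Because the published percentages are already rounded, the counts are estimates. Other questions in the survey may have had fewer respondents, so the same total should not be assumed for them.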

Documentation

How satisfied are you with the documentation for these components?

Administration Documentation - e.g. GBrowse Configuration HOWTO, GBrowse_syn Tutorial, …

 %
Very  50% 
Average  35% 
Not at all  5% 
No Opinion  10% 

End user Documentation - e.g. GBrowse User Tutorial, CMap Tutorial, …

 %
Very  45% 
Average  33% 
Not at all  3% 
No Opinion  20% 

Overview Documentation - e.g. Comparative Genomics, Overview, …

 %
Very  23% 
Average  40% 
Not at all  0% 
No Opinion  38% 

Use Cases

The aim of this section was to determine what types of questions people want to ask with visualization software.

What's an example question that you would like to answer using genome or comparative genomics browsing software? Leave this question blank to skip to the next section of the survey. To the best of your knowledge, which if any, existing tools support this type of question? 
What is the distribution of transposable elements (TEs) relative to the gene coding regions of the genome? What is the global distribution of genes and TEs in the genome (i.e., a heatmap-based view)? I would like to be able to look at heatmap distributions using multiple algorithms for partitioning the data by color (i.e., equal interval, quantiles, etc.). Difficult to do this easily with existing tools.
Display deep sequencing paired reads in a pileup that is vertically sorted by length of the matching pair. I think the AJAX code in Lookseq could be integrated into JBrowse or GBrowse2. Lookseq
Within a genome I want to be able to make it easy to visually find areas of the genome where two or more features overlap or coincide e.g. a QTL and a gene GBrowse
Compare genomes between related species.
I would like to be able to see SNPs existing between a reference genome strain and other genome strains of interest. I think it could be possible using GBrowse
My request is actually more general than that. I'd like some way to say:
  • show me all regions holding this feature (i.e., a protein annotation) AND also holding this kind of feature (i.e., SNP, QTL, ...)
None, but I'm only very familiar with GBrowse.
We want to visualize high-throughput data coming from CGH or expression arrays and also from short-read technologies. The organisms we study are not always well annotated. We want to aggregate annotation data and gene models from other sources.

In the end, the ideal would be the possibility of efficiently querying the browser on numerical data.

GBrowse2 with Bio::DB::Sam is a good starting point, even if the documentation about the connection between GBrowse and the Perl module is too short. I intend to install MAKER for the annotation part and its connection to GBrowse2.
I know about the expression of DNA (for instance I have the mRNA or protein sequence that is expressed) and I would like to view the stretch of DNA this maps to while aligning cDNA, EST, predictor, promoter, etc. information Apollo
In bacterial genomics we often look at regions of many (>20) strains and species stacked above each other, aligned at a gene of interest based on selections or homology searches. These regions may consist of inter AND intra species locations. To my knowledge there are a few tools that can do this, like the commercial ERGO package, Genoscope has these capabilities and the Microbial Genome Viewer as well (although MGV lacks other functionalities that make GBrowse unique)
How can we view repetitive regions using GBrowse_syn?
Ability to easily leverage data from CMap and GBrowse for comparative genomics Gbrowse Sythn (GBrowse_syn?)
Good multiple genome support SynBrowse, but not very well
I want to find lots of annotation and context for genes of interest from new papers or new data discoveries. UCSC Genome Browser, GBrowse (varies by species), Ensembl, Map Viewer
I have a few.
  • I'd like to be able to move back and forth between a phylogenetic tree viewer where I can select a clade (which may include paralogs) and then see the selected members of this clade in a genomic context. And vice versa.
  • I'd also like to be able to see a protein multiple sequence alignment and select a region or set of regions and then see where these regions are overlaid onto the 3-D structure of the protein (i.e., where each of the proteins has been threaded into a 3-D protein rendering of this family).
  • I'd like to have some visual clues indicating common/shared functions for genes in a given syntenic region. Do they appear in the same order?
None that I know of
What insight does the coverage of next-generation sequence data give regarding repetitive elements within the genome? This sort of data is quite complicated to view using GBrowse currently.
  • Genome visualisation (very specialised tracks)
  • NGS support
  • Comparative maps support
GBrowse, CMap
I would like to integrate the information from physical and genetic maps as well as the genome sequence. Including BACs, markers, BAC-ends, unigenes, etc. Chado, GBrowse and CMap.
Scalable view of genomes side by side, linked by markers, locations, or features (genes) GBrowse
I like to easily visualize microarray and next generation sequencing data from chromatin IP experiments across the genome and in relation to genomic features. Adjusting data plot parameters, order, and graph type spontaneously is important. Multiple competing genome browsers are capable of this function, but I am most familiar with GBrowse.
I would like to compare ongoing sequencing projects (e.g. an incomplete genome, chromosome or plasmid) with closely related finished and annotated sequences (e.g. a finished genome). It would be great to see the reference genome with annotation and the pieces of the unfinished project together, to manually infer putative genes, for example. Maybe with some of the synteny-aimed software, but this is not their scope.
If possible, I'd like a genome browser that can show sequence similarity (i.e. multiple sequence alignments) between portions of genomes. I'm not aware.
Is the region that is transcribed according to tiling array data associated with published data (such as siRNAs, DNA methylation, chromatin modifications etc)? GBrowse
I'd like to map all types of information onto the genome in the browser, e.g. microarray data. But would also like to see it mapped to multiple genomes with comparative displays. Much like ACT (from Sanger) can do, but in a web interface ACT, although this is a standalone tool
We would like to visualize where EST contigs sequenced from a novel genome align to the genome of a model organism. I would assume that any genome browser should be able to do this relatively well. While I have used GBrowse, JBrowse, UCSC Genome Browser, and Apollo, I have used only GBrowse extensively.
Given a class of transposable elements, how does the distribution of these elements differ between two genomes. This class could be very broad such as DNA transposons, or it could be a superfamily of DNA TEs or an individual family.
With multiple genomes I want to make it easy to find regions where there has been some sort of rearrangement (indel or inversion or translocation) GBrowse_syn
I would like to be able to enter a AA sequence and then get all tblastn (not really a blast search allowing gaps, but more a typical substring search) hits and have all those hits displayed within their neighborhood. none- Typically I have to come up with the coordinates of genome hits myself and then get it displayed somehow.
I want to quickly find SNPs in regions of interest, and have them color coded based on my own criteria. UCSC Genome Browser
Ability to add arbitrary number of genome elements/features, choices to make them permanently or privately part of a database GBrowse
Where are the start codons, according to the conservation in a clade?

I mean, to be able to determine mispredicted start codons based solely on previous annotations and the conservation of all the potential start codons (of course, assuming this is a good criterion based on the previous knowledge of the protein(s) analysed).

Any with comparative capabilities, but I ask because is great!
I'd like to show members of one gene family in different genomes as a stack on a synteny viewer. With the functional domains (in the sequence) highlighted. I'm still exploring GBrowse_syn so maybe it can be customized to do this.
Right now I am working on integrating GBrowse system to my gene prediction program. So user will be able to visualize the complete genome picture according to their input raw sequence. I was very happy with previous work and I did not look for alternative.
Would like to get guidance for the annotation of a novel sequence by showing similarity to a well-annotated sequence Apollo?
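
Several of the responses above ask for regions where two kinds of features overlap or coincide (for example, a QTL and a gene). The following is a minimal sketch of that kind of query, not how GBrowse or any other GMOD tool implements it. The `Feature` type, the `find_overlaps` function and the sample coordinates are all hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A hypothetical, simplified genome feature (coordinates are inclusive)."""
    seqid: str  # chromosome or contig name
    start: int
    end: int
    kind: str   # e.g. "gene", "QTL", "SNP"
    name: str


def overlaps(a, b):
    """True if two features are on the same sequence and share at least one base."""
    return a.seqid == b.seqid and a.start <= b.end and b.start <= a.end


def find_overlaps(features, kind_a, kind_b):
    """Return all (a, b) pairs where a feature of kind_a overlaps one of kind_b."""
    group_a = [f for f in features if f.kind == kind_a]
    group_b = [f for f in features if f.kind == kind_b]
    return [(a, b) for a in group_a for b in group_b if overlaps(a, b)]


# Made-up example data, for illustration only.
features = [
    Feature("chr1", 100, 500, "gene", "geneA"),
    Feature("chr1", 400, 900, "QTL", "qtl1"),
    Feature("chr1", 1000, 1200, "gene", "geneB"),
    Feature("chr2", 100, 500, "QTL", "qtl2"),
]

pairs = find_overlaps(features, "gene", "QTL")
print([(a.name, b.name) for a, b in pairs])  # [('geneA', 'qtl1')]
```

The pairwise comparison is quadratic in the number of features, which is fine for a sketch; a real browser backend would more likely rely on a database or an interval index.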

Features

This section asked participants to prioritize features in browsers.

For each feature, please indicate that feature’s importance to you. Please try to classify no more than 1/3 of the features as high importance.

Features are listed in the order they appeared in the survey.

Feature  High  Medium  Low  Not at all  No opinion
Browser response time (speed!)  71%   22%   4%   2%   0% 
Data loading speed. How long it takes to process and load data into backing databases.  31%   33%   29%   2%   4% 
Browser install and setup script. The GBrowse NetInstaller is an example.  20%   31%   29%   4%   16% 
Graphical user interface for administering the browser. Update configuration, add tracks, load data, ... through a GUI.  24%   24%   28%   7%   7% 
Configuration file checker with helpful error messages.  31%   51%   9%   0%   9% 
User management. Allow users to log in. This would enable other functionality.  27%   29%   24%   4%   16% 
Community annotation. Support users adding annotation to individual features and/or uploading features or tracks for sharing with others.  38%   40%   18%   2%   2% 
Package browser software within a ready-to-install virtual machine that includes several other commonly used GMOD components. For example, see the Community Annotation System.  16%   33%   24%   11%   16% 
Make browser instance metadata available via web services. See this page for an explanation of how this might be done in GBrowse.  13%   36%   20%   7%   24% 
A public repository of browser-ready reference genomes, including example annotations such as gene models, NGS data, quantitative data (wiggle), ... GBrowse.org is a step in this direction.  29%   44%   20%   0%   7% 
Extensibility. Support for plugins and user-defined glyphs.  27%   38%   24%   0%   11% 
Individual feature display customization. Allow browser admin to write their own code to adjust how a feature is shown (height, color, border, ...), based on the feature's attributes. (This is done with Perl callbacks in GBrowse.)  42%   36%   16%   2%   4% 
Individual base display customization. Allow browser admin to write their own code to adjust how an individual base is shown (height, color, border, ...) at run time, based on the base's attributes. This could show the alignment quality, or coverage, or ... for next generation sequencing data.  22%   36%   27%   4%   11% 
Admin control of browser layout. The browser admin configures what sections (e.g., search box, instructions, etc.) appear and where, what text appears and where, and so on. GBrowse allows admins to control some aspects of the layout.  20%   42%   24%   4%   9% 
Hierarchical listing of available tracks. GBrowse already supports this.  31%   29%   22%   0%   18% 
Show multiple regions simultaneously. Select and then show multiple regions of the genome.  29%   40%   18%   0%   13% 
Comparing two or more genomes. GBrowse_syn, for example, does this.  49%   24%   7%   2%   18% 
Whole genome/chromosome browsing, e.g., GBrowse karyotype  22%   38%   13%   4%   22% 
Browsing on mobile devices  4%   7%   16%   44%   29% 
Semantic zooming  22%   27%   27%   0%   24% 
Autocompletion of Search Terms  18%   40%   18%   11%   13% 
Popup Balloons  20%   36%   24%   7%   13% 
Rubber Band Selection  27%   40%   16%   0%   18% 
Linkage disequilibrium tracks  20%   18%   18%   4%   40% 
Support next generation sequencing individual reads. Visualize the NGS short reads themselves, showing items like read quality for individual bases.  53%   24%   11%   0%   11% 
Display markers from CMap in the genome browser. See this page for an explanation of how this might be done in GBrowse.  22%   18%   20%   2%   38% 
Quantitative data shown with color intensity i.e., wiggle_density tracks  53%   20%   11%   2%   13% 
Quantitative data shown on an x-y graph i.e., wiggle_xyplot tracks  49%   22%   11%   2%   16% 
Log scaling for quantitative data  36%   20%   16%   2%   27% 
Show multiple datasets in a single quantitative track. Data in color intensity tracks (wiggle_density) could be stacked; data in x-y tracks (wiggle_xyplot) could be superimposed (may require multiple scales), or stacked.  42%   18%   20%   2%   18% 
Aggregation functions for quantitative data, e.g., show mean, max, min, across all data or sliding windows.  44%   9%   22%   2%   22% 
Alignment tracks. Showing insertions, deletions, ...  33%   33%   9%   4%   20% 
Geolocation data. Show things like genotype and allele frequencies, phenotypes, environment, ... by geolocation.  11%   11%   29%   13%   36% 
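
The "aggregation functions for quantitative data" item in the table above (mean, max, min, across all data or sliding windows) can be illustrated with a short sketch. This is not code from GBrowse or any other browser; the function name `sliding_window_aggregate` and the sample values are hypothetical.

```python
from statistics import mean


def sliding_window_aggregate(values, window, step, func):
    """Apply func to each full window of `window` values, advancing by `step`.

    Trailing values that do not fill a complete window are ignored.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(values) - window
    return [func(values[i:i + window]) for i in range(0, last_start + 1, step)]


# Made-up per-position scores, e.g. read coverage along a region.
coverage = [2, 4, 6, 8, 10, 12]

window_means = sliding_window_aggregate(coverage, window=3, step=3, func=mean)
window_maxes = sliding_window_aggregate(coverage, window=3, step=3, func=max)
window_mins = sliding_window_aggregate(coverage, window=3, step=3, func=min)

# Aggregating across all data is the same call without windows.
overall_mean = mean(coverage)

print(window_means, window_maxes, window_mins, overall_mean)
```

Passing the aggregation as a function keeps the windowing logic separate from the statistic, so adding another statistic (a median, say) needs no change to the loop.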

Expansion / Clarification

If you want to explain/expand any of your answers above, please do so here.

Other High Priority Features

Are there other high priority features you would like to see that are not in the list above?

Other Medium Priority Features

Are there other medium priority features you would like to see that are not in the list above?

Other Low Priority Features

Are there other low priority features you would like to see that are not in the list above?

Other Feedback on Visualization Tools

Do you have any other feedback on any of these tools?

Of the tools you have used, were they useful and why (or why not)? Did you try to use any of them, but couldn’t get them to work?

Other Feedback

If you have any additional feedback, questions, or information you would like to provide, please tell us here.
