This glossary explains terms that
This glossary does not define biology terms.
AJAX is a web user interface technology used in some GMOD Components. It is used to provide a richer user experience than was typically available during the first 10 years of the web. AJAX stands for Asynchronous JavaScript and XML.
See Also:
API stands for Application Programming Interface. An API is a well-defined programmatic interface to some resource. That is, it is an interface meant to be used by other programs to access that resource. It is distinct from, and sometimes complementary to, a Graphical User Interface or GUI, which is a direct user interface to a resource.
BAM is a binary version of Sequence Alignment/Map (SAM) format. BAM and SAM are both part of SAMtools. BAM is a compressed, binary, indexed format for Next Generation Sequencing data. GBrowse 2 has an adaptor that can read BAM data.
CPAN is the Comprehensive Perl Archive Network, a repository of Perl modules that bring additional functionality to the Perl language.
See also
Cascading Style Sheets (CSS) are a way to control the appearance of web pages. CSS is used to separate style (colors, fonts, layout, etc.) from content (the actual information on a page), allowing styles to be defined in a single place and then referred to from many pages.
See also
CVS is a source code control system that was formerly used by most of GMOD. Source code control systems, also known as revision control or version control systems, are used to record changes to computer files. GMOD now uses SVN.
See Also:
A directed acyclic graph (DAG) is a set of nodes and connections between the nodes where every connection has a direction, and there are no loops in the connections. That is, if you start at any node, and follow connections out of that node, you will never return to it.
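The "never return to it" property can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a dictionary mapping each node to the list of nodes it points to; the function name `has_cycle` and the node names are hypothetical and chosen only for illustration.

```python
def has_cycle(graph):
    """Return True if the directed graph contains a loop.

    `graph` maps each node to a list of the nodes it points to.
    Uses a depth-first search with three node states.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in graph.get(node, []):
            nxt_state = state.get(nxt, UNVISITED)
            if nxt_state == IN_PROGRESS:
                # We followed connections and came back to a node
                # that is still on the current path: a loop.
                return True
            if nxt_state == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNVISITED and visit(n) for n in list(graph))


# A small hierarchy with no loops (a DAG): two routes lead to "d".
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

# Not a DAG: following "a" -> "b" -> "a" returns to the start.
loop = {"a": ["b"], "b": ["a"]}
```

Here `has_cycle(dag)` is false and `has_cycle(loop)` is true, so only the first graph qualifies as a DAG.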
See also:
See Distributed Annotation System
A database can be any set of organized data that is readable by a computer. It can be anywhere from an implementation of a database schema in a particular database management system to regular files that have a defined format.
For example, the database behind the FlyBase web site contains data on drosophilids, and uses the Chado schema and the PostgreSQL database management system.
See also:
Database management systems (DBMSs) are software systems that can manage data. PostgreSQL, MySQL, Oracle and Sybase are all examples of DBMSs. DBMSs are containers of databases. That is, they are the systems that manage databases, which is distinct from the data that they manage.
Most DBMSs are relational, which is a particular way of representing data. All DBMSs that GMOD is concerned with are relational, so GMOD uses the terms database management system and relational database management system (RDBMS) interchangeably.
See also:
A database schema is the design of a particular database, independent of its contents. Chado is an example of a database schema. Designs (like Chado) can be reused across multiple databases.
See also:
See Database Management System.
The topmost hierarchical element in a DBMS’s collection of data. By definition, data stored within different databases cannot be related by the DBMS, by query or otherwise.
See also:
The layer below the topmost in a DBMS’s collection of data. An organizing concept somewhat similar to that of a folder or directory. Unlike data stored within different DBMS-Databases, data stored within different schemas of the same DBMS-Database can be related and otherwise mutually manipulated within the DBMS.
See also:
FASTA is a widely used text-based data format for representing nucleic acid and peptide sequence data. FASTA entries start with a header line, followed by the sequence on the immediately following lines. The header line starts with the sequence identifier. It can also contain additional information, which is often pipe (“|”) separated.
A basic example, showing “ctg123”, a DNA sequence that is 288 nucleotides long:
>ctg123
cttctgggcgtacccgattctcggagaacttgccgcaccattccgccttg
tgttcattgctgcctgcatgttcattgtctacctcggctacgtgtggcta
tctttcctcggtgccctcgtgcacggagtcgagaaaccaaagaacaaaaa
aagaaattaaaatatttattttgctgtggtttttgatgtgtgttttttat
aatgatttttgatgtgaccaattgtacttttcctttaaatgaaatgtaat
cttaaatgtatttccgacgaattcgaggcctgaaaagt
FASTA entries can be included at the end of GFF3 files.
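As a rough illustration of the header-then-sequence layout described above, the sketch below reads FASTA text into a dictionary. It assumes the identifier is the first whitespace-delimited token on the header line and that sequence lines are simply concatenated; `parse_fasta` is a hypothetical helper, not part of any GMOD tool.

```python
def parse_fasta(text):
    """Return {identifier: sequence} for each entry in FASTA-formatted text."""
    records = {}
    seq_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the identifier comes first, extra information may follow.
            fields = line[1:].split(maxsplit=1)
            seq_id = fields[0] if fields else ""
            records[seq_id] = []
        elif seq_id is not None:
            # Sequence line belonging to the most recent header.
            records[seq_id].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


example = ">seq1 demo entry\nacgt\nac\n>seq2\ntt\n"
sequences = parse_fasta(example)
lengths = {name: len(seq) for name, seq in sequences.items()}
```

For real data, an established parser such as the one in BioPerl is likely a better choice than a hand-written reader.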
See also:
In a database, related tables are linked together by taking the primary key from one table and placing it in the related table. The primary key then becomes a foreign key.
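A small sketch of this linkage, using Python's built-in `sqlite3` module with an in-memory database. The `organism` and `feature` tables here are simplified, hypothetical tables made up for the example and are not meant to reproduce any real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE organism (organism_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# organism_id is the primary key of organism; placed here it is a foreign key.
conn.execute(
    """CREATE TABLE feature (
           feature_id INTEGER PRIMARY KEY,
           name TEXT NOT NULL,
           organism_id INTEGER NOT NULL REFERENCES organism (organism_id)
       )"""
)

conn.execute("INSERT INTO organism VALUES (1, 'Drosophila melanogaster')")
conn.execute("INSERT INTO feature VALUES (10, 'gene_a', 1)")

# The foreign key lets a query relate rows in the two tables.
rows = conn.execute(
    """SELECT feature.name, organism.name
       FROM feature JOIN organism ON feature.organism_id = organism.organism_id"""
).fetchall()


def insert_orphan():
    """Try to add a feature that points at a non-existent organism."""
    try:
        conn.execute("INSERT INTO feature VALUES (11, 'gene_b', 99)")
    except sqlite3.IntegrityError:
        return False
    return True
```

With enforcement on, the DBMS rejects the orphan row, so `insert_orphan()` returns `False`.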
A former name for GFF.
See GFF.
A former name for GFF.
GFF is a standard file format for storing genomic features in a text file. GFF stands for Generic Feature Format. GFF files are plain text, 9 column, tab-delimited files. GFF databases also exist. They use a schema custom built to represent GFF data. GFF is frequently used in GMOD for data exchange and representation of genomic data.
There are two versions of GFF supported in GMOD: GFF3 and GFF2. GFF2 is now deprecated.
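To make the 9-column, tab-delimited layout concrete, here is a minimal sketch that splits one GFF line into named fields. The column names follow the GFF3 specification rather than this glossary, the sample line is invented for illustration, and `parse_gff_line` is a hypothetical helper that does no attribute decoding or validation beyond counting columns.

```python
from typing import NamedTuple


class GffRecord(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: str


def parse_gff_line(line):
    """Split one tab-delimited GFF feature line into its 9 columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, attributes = fields
    # Start and end are coordinates, so convert them to integers.
    return GffRecord(seqid, source, ftype, int(start), int(end),
                     score, strand, phase, attributes)


sample = "ctg123\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001;Name=EDEN"
record = parse_gff_line(sample)
feature_length = record.end - record.start + 1
```

Comment lines and directives (lines starting with `#`) are not handled here and would need to be skipped before calling the function.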
See also:
GFF2 is a supported GFF format in GMOD, but it is now deprecated; if you have a choice, you should use GFF3. Unfortunately, data is sometimes only available in GFF2 format. GFF2 has a number of shortcomings compared to GFF3.
See also:
GFF3 is the most recent version of the GFF format. It has many advantages over the now deprecated GFF2 and should be used in favor of GFF2 whenever possible.
See also:
Git is a version control system, like Subversion (SVN), that is used to track and coordinate updates to files, usually software and/or documentation. Git is a distributed version control system, in that it does not require use of a central server. However, in practice, most projects use a central server, either hosted themselves or on a public host such as GitHub.
GTF is a genomic annotation file format that is very similar to GFF2 and is sometimes referred to as GFF2.5. GTF is not a supported format in GMOD, so if you have a GTF file you’ll need to convert it to GFF3.
See also:
GUI is an acronym for Graphical User Interface. GUIs are interfaces to computer programs that use graphics, mice, pull down menus, check boxes, and other interactive elements. GUIs contrast with command line interfaces, where you interact with the program using only the keyboard.
Java is arguably the world’s most popular programming language but it is not as popular for command-line work on Unix as Perl. It’s encountered in GMOD primarily as a language to construct user interfaces (e.g. Apollo).
See also:
Java programs run in a virtual machine known as a Java Runtime Environment or JRE.
JSON is an acronym for JavaScript Object Notation, a lightweight data-interchange format. It is used in GMOD in Galaxy and JBrowse.
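A short sketch of JSON as an interchange format, using Python's standard `json` module. The dictionary keys are hypothetical and do not correspond to any actual Galaxy or JBrowse configuration.

```python
import json

# A made-up settings object; the keys are for illustration only.
track = {"label": "genes", "visible": True, "height": 50}

# Serialize to JSON text, which another program or language can read.
text = json.dumps(track, sort_keys=True)

# Parse the text back into native Python data.
restored = json.loads(text)
```

The round trip gives back an equal dictionary, which is what makes the format convenient for passing data between a server and a browser.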
See also:
Linux is an open source operating system that is based on the Unix operating system. Linux is the default operating system for GMOD.
See also:
Middleware is software that connects other software components so they can talk together. You can think of it as project plumbing. Like plumbing, it is hard to do well, and people take it for granted until it does not work.
See also:
Objects and relations are two different ways to represent information in computing. Objects tend to be used by programming languages such as Java, while relations are widely used in databases, particularly relational databases. Object-relational mapping (ORM) converts information from one model to the other, usually at the point of interaction between object-oriented languages and relational databases.
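The conversion can be sketched by hand in a few lines. This is a deliberately tiny illustration, assuming one class maps to one table and one attribute maps to one column; the `Feature` class, the `feature` table, and the `save` and `load` helpers are all hypothetical, and real ORM libraries do considerably more.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class Feature:
    """The object side: one instance corresponds to one table row."""
    feature_id: int
    name: str
    length: int


def save(conn, feature):
    """Object -> relation: write the object's attributes as a row."""
    conn.execute(
        "INSERT INTO feature (feature_id, name, length) VALUES (?, ?, ?)",
        astuple(feature),
    )


def load(conn, feature_id):
    """Relation -> object: build an object from a row, or None if absent."""
    row = conn.execute(
        "SELECT feature_id, name, length FROM feature WHERE feature_id = ?",
        (feature_id,),
    ).fetchone()
    return Feature(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT, length INTEGER)"
)
save(conn, Feature(1, "contig_a", 100))
loaded = load(conn, 1)
```

Application code works with `Feature` objects only, while the mapping functions are the single place that knows about tables and columns.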
See also:
An operating system (OS) is the software that controls a computer and manages the sharing of resources on that computer. Example operating systems are Microsoft Windows and Linux.
See also:
See Object-Relational Mapping.
See Operating System.
Perl is the programming language most used in the bioinformatics realm, and it is the language most used by GMOD developers. It is well-suited to text and data processing and is also characterized by an extensive open source library, so it’s highly functional. Many GMOD components use BioPerl, a bioinformatics toolkit written in Perl.
Some parts of GMOD, like GBrowse, can be extended or customized using Perl, but beginner-level Perl skills are sufficient for this work.
See also:
See Database Management System.
Most Database Management Systems (DBMSs) are relational, which is a particular way of representing data. All DBMSs that GMOD is concerned with are relational, so GMOD uses the terms database management system and relational database management system (RDBMS) interchangeably.
See also:
See Relational and Database Management System.
Sequence Alignment/Map format. SAM is a text format for Next Generation Sequencing data. It is a part of SAMtools. GBrowse 2 has an adaptor that can read SAM data.
SAMtools is a set of formats and programs for storing, manipulating, and accessing Next Generation Sequencing data.
See Database Schema
SQL is a standard query language used with relational database management systems (DBMSs). It is used to update and retrieve data that is in a database.
SQL is generally similar for different DBMSs but varies in many details from one DBMS to another.
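The following sketch shows the update and retrieve roles with a few basic statements, run through Python's built-in `sqlite3` module. The `gene` table and its values are invented for the example, and other DBMSs may differ in details such as data types and placeholder syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, symbol TEXT, chromosome TEXT)"
)

# Add data.
conn.executemany(
    "INSERT INTO gene (symbol, chromosome) VALUES (?, ?)",
    [("abc1", "2L"), ("xyz2", "X")],
)

# Update data that is already in the database.
conn.execute("UPDATE gene SET chromosome = ? WHERE symbol = ?", ("3R", "abc1"))

# Retrieve data with a query.
symbols = [
    row[0]
    for row in conn.execute(
        "SELECT symbol FROM gene WHERE chromosome = ? ORDER BY symbol", ("3R",)
    )
]
```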
SVN, short for Subversion, is a source code control system that is used by most of GMOD. Source code control systems, also known as revision control or version control systems, are used to record changes to computer files. GMOD converted from CVS to SVN on 2009/09/15.
GMOD’s main source code repository is at SourceForge. Subversion explains how to both download and update the main GMOD repository at SourceForge.
See Also:
Unix is a group of operating systems that are descended from the original Unix operating system developed in the 1970s. This includes Solaris, HP-UX, Linux, Mac OS X, and many others.
XML is an acronym for eXtensible Markup Language, a data format used primarily for sharing data. It looks similar to HTML, but has a much tighter syntax than does HTML.
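As a small illustration of both points, the sketch below reads a made-up XML fragment with Python's standard `xml.etree.ElementTree` module and then shows that an unclosed tag, which browsers commonly tolerate in HTML, is rejected by the XML parser. The element names and the `is_well_formed` helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# An invented document; the element names are for illustration only.
doc = "<feature id='gene1'><name>example</name><start>1000</start></feature>"
root = ET.fromstring(doc)
name = root.find("name").text
feature_id = root.get("id")

# Every XML element must be closed, so this HTML-style fragment fails to parse.
html_style = "<p>one<br>two</p>"
```

Here `is_well_formed(doc)` is true, while `is_well_formed(html_style)` is false because `<br>` is never closed.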
See also: