WebApollo Installation 1.x

From GMOD
== Introduction ==
 
This guide will walk you through the server side installation for Web Apollo.  Web Apollo is a web-based application, so the only client side requirement is a web browser.  Note that Web Apollo has only been tested on Chrome, Firefox, and Safari.  It has not been tested with Internet Explorer.
 
  
 
== Quick start guide ==
 
 
While there are a number of prerequisites to WebApollo, we hope that this quick-start guide can help by automating some setup steps. Commands can be run in the console. This assumes the use of a package manager for some program installation, and cpanm for perl package management.
 
 
This "Quick start guide" can be used to initialize a "blank" machine with a WebApollo instance from scratch. More discussion of the particular configurations can be seen in the full [[#Installation|installation guide]] below.
 
 
# install system prerequisites (debian/ubuntu)
 
sudo apt-get install tomcat7 openjdk-7-jdk cpanminus libpng12-0 libpng12-0-dev zlib1g zlib1g-dev libexpat1-dev postgresql-9.3 postgresql-server-dev-9.3 nodejs-legacy git maven
 
# install system prerequisites (centOS/redhat)
 
sudo yum install epel-release
 
sudo yum install tomcat cpanminus zlib-devel libpng-devel gcc postgresql postgresql-devel git maven npm
 
# install system prerequisites (macOSX/homebrew)
 
brew install git maven tomcat node cpanminus --no-tcl postgresql
 
 
# on centOS/redhat, manually init and start postgres
 
sudo su -c "service postgresql initdb && service postgresql start"
 
 
# setup cpanm and install jbrowse and webapollo perl prerequisites
 
cpanm --local-lib=~/perl5 local::lib && eval $(perl -I ~/perl5/lib/perl5/ -Mlocal::lib)
 
cpanm DateTime Text::Markdown Crypt::PBKDF2 DBI DBD::Pg
 
 
# init postgres by logging into postgres user and creating the web_apollo_users_admin user and web_apollo_users database
 
sudo su postgres
 
createuser --username=postgres -RDIElPS web_apollo_users_admin
 
createdb --username=postgres -E UTF-8 -O web_apollo_users_admin web_apollo_users
 
exit
 
 
Edit your pg_hba.conf to have a line for your web_apollo_users_admin (see [[#Authentication|Authentication]] for details)
 
# TYPE  DATABASE        USER            ADDRESS                METHOD
 
local  all            web_apollo_users_admin                        md5
 
 
# restart to setup privileges for web_apollo_users_admin
 
sudo su -c "service postgresql restart"
 
 
# clone Apollo repository and download sample data to WEB_APOLLO_ROOT/pyu_data
 
git clone --depth 1 https://github.com/gmod/Apollo.git
 
cd Apollo
 
wget http://icebox.lbl.gov/webapollo/data/pyu_data.tgz
 
tar xvzf pyu_data.tgz
 
 
# initialize PostgreSQL database for sample data. Enter the password web_apollo_users_admin for the first step
 
psql -U web_apollo_users_admin web_apollo_users < tools/user/user_database_postgresql.sql
 
tools/user/add_user.pl -D web_apollo_users -U web_apollo_users_admin -P web_apollo_users_admin -u web_apollo_admin -p web_apollo_admin
 
tools/user/extract_seqids_from_fasta.pl -p Annotations- -i pyu_data/scf1117875582023.fa -o seqids.txt
 
tools/user/add_tracks.pl -D web_apollo_users -U web_apollo_users_admin -P web_apollo_users_admin -t seqids.txt
 
tools/user/set_track_permissions.pl -D web_apollo_users -U web_apollo_users_admin -P web_apollo_users_admin -u web_apollo_admin -t seqids.txt -a
 
 
 
# build a compressed release package and install jbrowse binaries (also installs many perl prerequisites using cpanm)
 
./build.sh release
 
./install_jbrowse_bin.sh cpanm
 
 
 
 
# setup jbrowse data directory in WEB_APOLLO_ROOT/data
 
mkdir split_gff
 
tools/data/split_gff_by_source.pl -i pyu_data/scf1117875582023.gff -d split_gff
 
prepare-refseqs.pl --fasta pyu_data/scf1117875582023.fa --out data
 
flatfile-to-json.pl --gff split_gff/maker.gff --arrowheadClass trellis-arrowhead --subfeatureClasses '{"wholeCDS": null, "CDS":"brightgreen-80pct", "UTR": "darkgreen-60pct", "exon":"container-100pct"}' --className container-16px --type mRNA --trackLabel maker --out data
 
client/apollo/bin/add-webapollo-plugin.pl -i data/trackList.json
 
 
 
# configure data directories in config.properties, using "web_apollo_users_admin" as the psql login for the web_apollo_users database
 
mkdir annotations
 
echo jbrowse.data=`pwd`/data > config.properties
 
echo datastore.directory=`pwd`/annotations >> config.properties
 
echo database.url=jdbc:postgresql:web_apollo_users >> config.properties
 
echo database.username=web_apollo_users_admin >> config.properties
 
echo database.password=web_apollo_users_admin >> config.properties
 
 
# launch instance, login to the webapollo app as web_apollo_admin with password web_apollo_admin
 
./run.sh
 
 
== Installation ==
 
You can download the latest Web Apollo release as a [https://github.com/gmod/Apollo.git tarball] or from [http://genomearchitect.org genomearchitect.org] (not available for the 1.x release branch yet).  All installation steps will be done through a shell.  We'll be using Tomcat 7 as our servlet container and PostgreSQL as our relational database management system.  We'll use sample data from the Pythium ultimum genome, provided as a [http://icebox.lbl.gov/webapollo/data/pyu_data.tgz separate download].
 
 
===Server operating system===
 
Any Unix-like system (e.g., Unix, Linux, Mac OS X)
 
 
===Prerequisites===
 
 
Note: see the [[#Quick start guide|Quick-start guide]] for the quickest way to take care of pre-requisites.
 
 
 
* System prerequisites
 
** Servlet container (must support servlet spec 3.0+) [officially supported: Tomcat 7]
 
** Java 7+
 
** Maven3+ (most package managers will have this)
 
** Relational Database Management System [officially supported: PostgreSQL]
 
** Git
 
** NodeJS
 
* Perl prerequisites that need manual installation
 
** DateTime
 
** Text::Markdown
 
** Crypt::PBKDF2
 
** DBI
 
** DBD::Pg
 
* Data generation pipeline prerequisites (see [[JBrowse#Prerequisites|JBrowse prerequisites]] for more information on its prerequisites)
 
** System packages
 
*** libpng12-0
 
*** libpng12-dev
 
*** zlib1g (Debian/Ubuntu)
 
*** zlib1g-dev (Debian/Ubuntu)
 
*** zlib (RedHat/CentOS)
 
*** zlib-devel (RedHat/CentOS)
 
*** libexpat1-dev (Debian/Ubuntu)
 
* Sequence search (optional)
 
** Blat (download [http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/ Linux] or [http://hgdownload.cse.ucsc.edu/admin/exe/macOSX.x86_64/ Mac OSX] binaries)
 
 
===Tomcat memory===
 
The memory requirements will depend on the size of your genome and how many instances of Web Apollo you host in the same Tomcat instance.  We recommend at least 1g for the heap size and 256m for the permgen size as a starting point.  Suggested settings are:
 
 
<span class="enter">-Xms512m -Xmx1g -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled -XX:+UseConcMarkSweepGC -XX:MaxPermSize=256m</span>
 
 
The location of your Tomcat environment configuration will depend on how you installed it (manually vs. using a package manager).  It's recommended that you add this configuration in <tt>$TOMCAT_BIN_DIR/setenv.sh</tt>, where <tt>$TOMCAT_BIN_DIR</tt> is the directory where the Tomcat binaries reside.
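A minimal <tt>setenv.sh</tt> applying the settings above might look like this (the file location and the use of <tt>CATALINA_OPTS</tt> are standard Tomcat conventions, but adjust the path for your install):

```shell
# Minimal setenv.sh sketch applying the suggested memory settings above.
# Tomcat's startup scripts source this file and pick up CATALINA_OPTS;
# place it at $TOMCAT_BIN_DIR/setenv.sh (location depends on your install).
export CATALINA_OPTS="-Xms512m -Xmx1g -XX:+CMSClassUnloadingEnabled \
-XX:+CMSPermGenSweepingEnabled -XX:+UseConcMarkSweepGC -XX:MaxPermSize=256m"
```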
 
 
===Conventions===
 
This guide will use the following conventions to make it more concise (you might want to keep these convention definitions handy so that you can easily reference them as you go through this guide):
 
 
* WEB_APOLLO_DIR
 
** Location where the tarball was uncompressed and will include <tt>WebApollo-RELEASE_DATE</tt> (e.g., <tt>~/webapollo/WebApollo-2012-10-08</tt>)
 
* WEB_APOLLO_SAMPLE_DIR
 
** Location where the sample tarball was uncompressed (e.g., <tt>~/webapollo/webapollo_sample</tt>)
 
* WEB_APOLLO_DATA_DIR
 
** Location for WebApollo annotations (e.g., <tt>/data/webapollo/annotations</tt>)
 
* JBROWSE_DATA_DIR
 
** Location for JBrowse data (e.g., <tt>/data/webapollo/jbrowse/data</tt>)
 
* TOMCAT_WEBAPPS_DIR
 
** Location where deployed servlets for Tomcat go (e.g., <tt>/var/lib/tomcat7/webapps</tt>)
 
* BLAT_DIR
 
** Location where the Blat binaries are installed (e.g., <tt>/usr/local/bin</tt>)
 
* BLAT_TMP_DIR
 
** Location for temporary Blat files (e.g., <tt>/data/webapollo/blat/tmp</tt>)
 
* BLAT_DATABASE
 
**Location for the Blat database (e.g., <tt>/data/webapollo/blat/db/pyu.2bit</tt>)
 
 
The Tomcat related paths are the ones used by default in Ubuntu 12.04 and Ubuntu's provided Tomcat7 package.  Paths will likely be different in your system depending on how Tomcat was installed.
 
 
===Authentication===
 
Postgres can use Ident and password authentication. Because it is set up to use Ident by default, you might have to add a line to ''pg_hba.conf'' specifying that the user will connect via password authentication.
 
 
Edit /etc/postgresql/8.4/main/pg_hba.conf and add the following line:
 
local    all    web_apollo_users_admin    md5
 
 
Restart the postgres server for changes to take effect
 
$ <span class="">/etc/init.d/postgresql-8.4 restart</span>
 
 
=== User database ===
 
Web Apollo uses a database to determine who can access and edit annotations for a given sequence.
 
 
First we’ll need to create a database.  You can call it whatever you want (remember the name as you’ll need to point the configuration to it).  For the purposes of this guide, we’ll call it <tt>web_apollo_users</tt>.  You might want to create a separate account to manage the database.  We’ll have the user <tt>web_apollo_users_admin</tt> with password <tt>web_apollo_users_admin</tt> who has database creation privilege.  Depending on how your database server is set up, you might not need to set a password for the user.  See the [http://www.postgresql.org/docs PostgreSQL documentation] for more information.  We'll assume that the database is on the same server where Web Apollo is being installed ("localhost").
 
These commands will be run as the ''postgres'' user.
 
 
$ <span class="enter">sudo su postgres</span>
 
$ <span class="enter">createuser -P web_apollo_users_admin
 
Enter password for new role:
 
Enter it again:
 
Shall the new role be a superuser? (y/n) n
 
Shall the new role be allowed to create databases? (y/n) y
 
Shall the new role be allowed to create more new roles? (y/n) n
 
</span>
 
 
Next we'll create the user database.
 
 
$ <span class="enter">createdb -U web_apollo_users_admin web_apollo_users</span>
 
 
If you get an authentication error, use the -W flag to get a password prompt.
 
 
$ <span class="enter">createdb -U web_apollo_users_admin -W web_apollo_users</span>
 
 
Now that the database is created, we need to load the schema to it.
 
 
$ <span class="enter">cd WEB_APOLLO_DIR/tools/user</span>
 
$ <span class="enter">psql -U web_apollo_users_admin web_apollo_users < user_database_postgresql.sql</span>
 
 
Now the user database has been setup.
 
 
Let's populate the database.
 
 
First we’ll create a user with access to Web Apollo.  We’ll use the <tt>add_user.pl</tt> script in <tt>WEB_APOLLO_DIR/tools/user</tt>.  Let’s create a user named <tt>web_apollo_admin</tt> with the password <tt>web_apollo_admin</tt>.
 
 
$ <span class="enter">./add_user.pl -D web_apollo_users -U web_apollo_users_admin -P web_apollo_users_admin \
 
-u web_apollo_admin -p web_apollo_admin</span>
 
 
Next we’ll add the annotation track ids for the genomic sequences of our organism.  We’ll use the <tt>add_tracks.pl</tt> script in the same directory.  We need to generate a file of genomic sequence ids for the script.  For convenience, there’s a script called <tt>extract_seqids_from_fasta.pl</tt> in the same directory which will go through a FASTA file and extract all the ids from the deflines.  Let’s first create the list of genomic sequence ids.  We'll store it in <tt>~/scratch/seqids.txt</tt>.  We’ll want to add the prefix “Annotations-” to each identifier.
 
 
$ <span class="enter">mkdir ~/scratch</span>
 
$ <span class="enter">./extract_seqids_from_fasta.pl -p Annotations- -i WEB_APOLLO_SAMPLE_DIR/scf1117875582023.fa \
 
-o ~/scratch/seqids.txt</span>
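The output is one id per line: the first token of each FASTA defline with the prefix applied.  A rough stand-in using sed (a hypothetical example file and one-liner for illustration, not the actual script) shows the transformation:

```shell
# Rough stand-in for what extract_seqids_from_fasta.pl -p Annotations- does:
# take the first token of each FASTA defline and prepend "Annotations-".
printf '>scf1117875582023 Pythium ultimum scaffold\n' > /tmp/example.fa
sed -n 's/^>\([^ ]*\).*/Annotations-\1/p' /tmp/example.fa
# Annotations-scf1117875582023
```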
 
 
Now we’ll add those ids to the user database.
 
 
$ <span class="enter">./add_tracks.pl -D web_apollo_users -U web_apollo_users_admin -P web_apollo_users_admin \
 
-t ~/scratch/seqids.txt</span>
 
 
Now that we have a user created and the annotation track ids loaded, we’ll need to give the user permissions to access the sequence.  We’ll grant all permissions (read, write, publish, user manager).  We’ll use the <tt>set_track_permissions.pl</tt> script in the same directory.  We’ll need to provide the script a list of genomic sequence ids, like in the previous step.
 
 
$ <span class="enter">./set_track_permissions.pl -D web_apollo_users -U web_apollo_users_admin \
 
-P web_apollo_users_admin -u web_apollo_admin -t ~/scratch/seqids.txt -a</span>
 
 
We’re all done setting up the user database.
 
 
Note that we’re only using a subset of the options for all the scripts mentioned above.  You can get more detailed information on any given script (and other available options) using the “-h” or “--help” flag when running the script.
 
 
== Installing WebApollo ==
 
 
From the top level of the downloaded release, you need to run Maven to build a war file.  This is then placed in Tomcat's webapps directory, and Tomcat is responsible for extracting the file.
 
 
'''IMPORTANT: the JBrowse data directories should no longer be placed anywhere inside the Tomcat webapps folder, not even when using symlinks!! The data directory should be created outside of the webapps folder to avoid data loss when doing Undeploy operations!!'''
 
 
=== Before you build ===
 
 
You need to configure your instance using a config.properties and a config.xml file, which are copied into the war file. 
 
 
* Copy the sample config / logging files to the right location.
 
$ <span class="enter">cd WEB_APOLLO_DIR</span>
 
$ <span class="enter">cp sample_config.properties config.properties </span>
 
$ <span class="enter">cp sample_config.xml config.xml </span>
 
$ <span class="enter">cp sample_log4j2.json log4j2.json </span>
 
$ <span class="enter">cp sample_log4j2-test.json log4j2-test.json </span>
 
 
* Edit the config.properties file and config.xml to point to the appropriate directories.
 
** Note: You ''must'' edit the config.properties file to point to your jbrowse data directory, e.g. jbrowse.data=/opt/apollo/jbrowse/data to point to your data directory.  The other parameters are optional and can still be configured in your config.xml file (to comment out, prepend with a #).
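A filled-in config.properties might look like the following; the paths are example values, and the database keys mirror the ones used in the quick start (comment out any you prefer to keep in config.xml by prepending a #):

```properties
jbrowse.data=/opt/apollo/jbrowse/data
datastore.directory=/opt/apollo/annotations
database.url=jdbc:postgresql:web_apollo_users
database.username=web_apollo_users_admin
database.password=web_apollo_users_admin
```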
 
 
=== Building the servlet ===
 
 
$ <span class="enter">cd WEB_APOLLO_DIR</span>
 
 
There are a variety of targets available to build the war.  For the debug and release targets, make sure you have the prerequisites for building, including NodeJS, {{CPAN|DateTime}}, and {{CPAN|Text::Markdown}}.
 
 
To generate a build, run deploy.sh with some optional parameters
 
 
$ <span class="enter"> ./deploy.sh [release|debug|github|help]      </span>
 
 
This generates a release build, a debug build, or an unoptimized copy of the GitHub repo for the WebApollo+JBrowse plugin, and creates a war file for deployment in <tt>WEB_APOLLO_DIR/target/</tt>.
 
 
=== Install JBrowse binaries for WebApollo ===
 
 
 
For WebApollo, it is best to install the JBrowse binaries using the following script:
 
 
 
$ <span class="enter"> ./install_jbrowse_bin.sh [cpanm]</span>
 
 
This will install the binaries to the system via cpan or cpanm. If you are using cpanm, you can use the <tt>PERL_CPANM_OPT</tt> environment variable to set a specific install directory, e.g.
 
 
$ <span class="enter"> export PERL_CPANM_OPT="--local-lib=~/perl5"</span>
 
 
=== Deploying the servlet ===
 
 
After the war file is generated in the WEB_APOLLO_DIR/target directory (e.g. target/apollo-1.0.war), it needs to be copied into the tomcat7 webapps directory:
 
 
* Stop tomcat.
 
* Remove the previously installed Apollo installation, if it exists (war and directory files).
 
* cp WEB_APOLLO_DIR/target/apollo-1.x.war TOMCAT_WEBAPPS_DIR
 
* Start tomcat.
 
 
There are other ways to do this, but ''NEVER'' expand the war file yourself or touch it after it has been expanded.
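The steps above can be sketched as follows.  The service name, war version, and directories are assumptions; stand-in temp paths are used here so the sketch runs anywhere, so substitute your real TOMCAT_WEBAPPS_DIR and war file:

```shell
# Sketch of the deploy steps above, with stand-in paths.
TOMCAT_WEBAPPS_DIR=$(mktemp -d)   # stand-in for e.g. /var/lib/tomcat7/webapps
WAR=apollo-1.0.war                # the war built in WEB_APOLLO_DIR/target
touch "$WAR"                      # stand-in for the real build artifact

# 1. stop tomcat (e.g. `sudo service tomcat7 stop`)
# 2. remove any previous deployment: both the war and the expanded directory
rm -rf "$TOMCAT_WEBAPPS_DIR/${WAR%.war}" "$TOMCAT_WEBAPPS_DIR/$WAR"
# 3. copy the new war in; tomcat expands it itself -- never expand it by hand
cp "$WAR" "$TOMCAT_WEBAPPS_DIR/"
# 4. start tomcat (e.g. `sudo service tomcat7 start`)
```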
 
 
=== Development targets ===
 
 
We've moved the build to Maven and provide a pom.xml file.  It can be opened using most modern Java IDEs, including [https://www.eclipse.org/downloads/ Eclipse], [https://netbeans.org/downloads/ Netbeans], or [http://www.jetbrains.com/idea/download/ IntelliJ].
 
 
To run tomcat on 8080:
 
$ <span class="enter"> ./run.sh </span>
 
 
To run tomcat on 8080 and listen on debug port 8000:
 
$ <span class="enter"> ./debug.sh </span>
 
 
 
To run all unit tests:
 
$ <span class="enter"> ./test.sh </span>
 
 
==Configuration==
 
Most configuration files will reside in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config</tt>.  We’ll need to configure a number of things before we can get Web Apollo up and running.
 
 
===Supported annotation types===
 
Many configurations will require you to define which annotation types the configuration will apply to.  WebApollo supports the following "higher level" types (from the Sequence Ontology):
 
 
* sequence:gene
 
* sequence:pseudogene
 
* sequence:transcript
 
* sequence:mRNA
 
* sequence:tRNA
 
* sequence:snRNA
 
* sequence:snoRNA
 
* sequence:ncRNA
 
* sequence:rRNA
 
* sequence:miRNA
 
* sequence:repeat_region
 
* sequence:transposable_element
 
 
===Main configuration===
 
The main configuration is stored in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/config.xml</tt>.  Let’s take a look at the file.
 
 
<syntaxhighlight lang="xml">
 
<?xml version="1.0" encoding="UTF-8"?>
 
<server_configuration>
 
 
<!-- mapping configuration for GBOL data structures -->
 
<gbol_mapping>/config/mapping.xml</gbol_mapping>
 
 
<!-- directory where JE database will be created -->
 
<datastore_directory>ENTER_DATASTORE_DIRECTORY_HERE</datastore_directory>
 
 
<!-- minimum size for introns created -->
 
<default_minimum_intron_size>1</default_minimum_intron_size>
 
 
<!-- size of history for each feature - setting to 0 means unlimited history -->
 
<history_size>0</history_size>
 
 
        <!-- overlapping strategy for adding transcripts to genes -->
 
        <overlapper_class>org.bbop.apollo.web.overlap.OrfOverlapper</overlapper_class>
 
 
        <!-- javascript file for comparing track names (refseqs) (used for sorting in selection table) -->
 
        <track_name_comparator>/config/track_name_comparator.js</track_name_comparator>
 
 
        <!-- whether to use an existing CDS when creating new transcripts -->
 
        <use_cds_for_new_transcripts>true</use_cds_for_new_transcripts>
 
 
<!-- set to false to use hybrid disk/memory store which provides a little slower performance
 
but uses a lot less memory - great for annotation rich genomes -->
 
<use_pure_memory_store>true</use_pure_memory_store>
 
 
<!-- user authentication/permission configuration -->
 
<user>
 
 
<!-- database configuration -->
 
<database>
 
 
<!-- driver for user database -->
 
<driver>org.postgresql.Driver</driver>
 
 
<!-- JDBC URL for user database -->
 
<url>ENTER_USER_DATABASE_JDBC_URL</url>
 
 
<!-- username for user database -->
 
<username>ENTER_USER_DATABASE_USERNAME</username>
 
 
<!-- password for user database -->
 
<password>ENTER_USER_DATABASE_PASSWORD</password>
 
 
</database>
 
 
<!-- class for generating user authentication page
 
(login page) -->
 
<authentication_class>org.bbop.apollo.web.user.localdb.LocalDbUserAuthentication</authentication_class>
 
 
</user>
 
 
<tracks>
 
 
<!-- path to JBrowse refSeqs.json file -->
 
<refseqs>ENTER_PATH_TO_REFSEQS_JSON_FILE</refseqs>
 
 
<!-- annotation track name the current convention is to append
 
the genomic region id to the the name of the annotation track
 
e.g., if the annotation track is called "Annotations" and the
 
genomic region is chr2L, the track name will be
 
"Annotations-chr2L".-->
 
<annotation_track_name>Annotations</annotation_track_name>
 
 
<!-- organism being annotated -->
 
<organism>ENTER_ORGANISM</organism>
 
 
<!-- CV term for the genomic sequences - should be in the form
 
of "CV:term".  This applies to all sequences -->
 
<sequence_type>ENTER_CVTERM_FOR_SEQUENCE</sequence_type>
 
 
<!-- path to file containing translation table.
 
optional - defaults to NCBI translation table 1 if absent -->
 
<translation_table>/config/translation_tables/ncbi_1_translation_table.txt</translation_table>
 
 
<!-- splice acceptor and donor sites. Multiple entries may be
 
added to allow multiple accepted sites.
 
optional - defaults to GT for donor and AG for acceptor
 
if absent -->
 
<splice_sites>
 
<donor_site>GT</donor_site>
 
<acceptor_site>AG</acceptor_site>
 
</splice_sites>
 
 
</tracks>
 
 
<!-- path to file containing canned comments XML -->
 
<canned_comments>/config/canned_comments.xml</canned_comments>
 
 
<!-- configuration for what to display in the annotation info editor.
 
Sections can be commented out to not be displayed or uncommented
 
to make them active -->
 
<annotation_info_editor>
 
 
<!-- grouping for the configuration.  The "feature_types" attribute takes a list of
 
SO terms (comma separated) to apply this configuration to
 
(e.g., feature_types="sequence:transcript,sequence:mRNA" will make it so the group
 
configuration will only apply to features of type "sequence:transcript" or "sequence:mRNA").
 
A value of "default" will make this the default configuration for any types not explicitly
 
defined in other groups.  You can have any many groups as you'd like -->
 
<annotation_info_editor_group feature_types="default">
 
 
<!-- display status section.  The text for each <status_flag>
 
element will be displayed as a radio button in the status
 
section, in the same order -->
 
<!--
 
<status>
 
<status_flag>Approved</status_flag>
 
<status_flag>Needs review</status_flag>
 
</status>
 
-->
 
 
<!-- display generic attributes section -->
 
<attributes />
 
 
<!-- display dbxrefs section -->
 
<dbxrefs />
 
 
<!-- display PubMed IDs section -->
 
<pubmed_ids />
 
 
<!-- display GO IDs section -->
 
<go_ids />
 
 
<!-- display comments section -->
 
<comments />
 
 
</annotation_info_editor_group>
 
 
</annotation_info_editor>
 
 
<!-- tools to be used for sequence searching.  This is optional.
 
If this is not setup, WebApollo will not have sequence search support -->
 
<sequence_search_tools>
 
 
<!-- one <sequence_search_tool> element per tool -->
 
<sequence_search_tool>
 
 
<!-- display name for the search tool -->
 
<key>BLAT nucleotide</key>
 
 
<!-- class for handling search -->
 
<class>org.bbop.apollo.tools.seq.search.blat.BlatCommandLineNucleotideToNucleotide</class>
 
 
<!-- configuration for search tool -->
 
<config>/config/blat_config.xml</config>
 
 
</sequence_search_tool>
 
 
<sequence_search_tool>
 
 
<!-- display name for the search tool -->
 
<key>BLAT protein</key>
 
 
<!-- class for handling search -->
 
<class>org.bbop.apollo.tools.seq.search.blat.BlatCommandLineProteinToNucleotide</class>
 
 
<!-- configuration for search tool -->
 
<config>/config/blat_config.xml</config>
 
 
</sequence_search_tool>
 
 
</sequence_search_tools>
 
 
<!-- data adapters for writing annotation data to different formats.
 
These will be used to dynamically generate data adapters within
 
WebApollo.  This is optional.  -->
 
<data_adapters>
 
 
<!-- one <data_adapter> element per data adapter -->
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>GFF3</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.gff3.Gff3DataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/gff3_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip</options>
 
 
</data_adapter>
 
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>Chado</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.chado.ChadoDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>publish</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/chado_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>display_features=false</options>
 
 
</data_adapter>
 
 
<!-- group the <data_adapter> children elements together -->
 
<data_adapter_group>
 
 
<!-- display name for adapter group -->
 
<key>FASTA</key>
 
 
<!-- required permission for using data adapter group
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- one child <data_adapter> for each data adapter in the group -->
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>peptide</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.fasta.FastaDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/fasta_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip&amp;seqType=peptide</options>
 
 
</data_adapter>
 
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>cDNA</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.fasta.FastaDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/fasta_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip&amp;seqType=cdna</options>
 
 
</data_adapter>
 
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>CDS</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.fasta.FastaDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/fasta_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip&amp;seqType=cds</options>
 
 
</data_adapter>
 
 
</data_adapter_group>
 
 
</data_adapters>
 
 
</server_configuration>
 
</syntaxhighlight>
 
 
Let’s look through each element in more detail with values filled in.
 
 
<syntaxhighlight lang="xml">
 
<!-- mapping configuration for GBOL data structures -->
 
<gbol_mapping>/config/mapping.xml</gbol_mapping>
 
</syntaxhighlight>
 
 
File that contains type mappings used by the underlying data model.  It’s best not to change the default option.
 
 
<syntaxhighlight lang="xml">
 
<!-- directory where JE database will be created -->
 
<datastore_directory>WEB_APOLLO_DATA_DIR</datastore_directory>
 
</syntaxhighlight>
 
 
Directory where user generated annotations will be stored.  The data is stored using Berkeley DB.
 
 
<syntaxhighlight lang="xml">
 
<!-- minimum size for introns created -->
 
<default_minimum_intron_size>1</default_minimum_intron_size>
 
</syntaxhighlight>
 
 
Minimum length of intron to be created when using the “Make intron” operation.  The operation will try to make the shortest intron that’s at least as long as this parameter.  So if you set it to a value of “40”, then all calculated introns will be at least 40 bases long.
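For instance, to enforce that 40-base minimum, the element would read:

```xml
<!-- all introns created by "Make intron" will be at least 40 bp long -->
<default_minimum_intron_size>40</default_minimum_intron_size>
```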
 
 
<syntaxhighlight lang="xml">
 
<!-- size of history for each feature - setting to 0 means unlimited history -->
 
<history_size>0</history_size>
 
</syntaxhighlight>
 
 
The size of your history stack, meaning how many “Undo/Redo” steps you can do.  The larger the number, the larger the storage space needed.  Setting it to “0” means there’s no limit.
 
 
<syntaxhighlight lang="xml">
 
<!-- overlapping strategy for adding transcripts to genes -->
 
<overlapper_class>org.bbop.apollo.web.overlap.OrfOverlapper</overlapper_class>
 
</syntaxhighlight>
 
 
Defines the strategy to be used for deciding whether overlapping transcripts should be considered splice variants to the same gene.  This points to a Java class implementing the <tt>org.bbop.apollo.overlap.Overlapper</tt> interface.  This allows you to create your own custom overlapping strategy should the need arise.  Currently available options are:
 
*<tt>org.bbop.apollo.web.overlap.NoOverlapper</tt>
 
**No transcripts should be considered splice variants, regardless of overlap.
 
*<tt>org.bbop.apollo.web.overlap.SimpleOverlapper</tt>
 
**Any overlapping of transcripts will cause them to be part of the same gene
 
*<tt>org.bbop.apollo.web.overlap.OrfOverlapper</tt>
 
**Only transcripts that overlap within the coding region and within frame are considered part of the same gene
 
 
<syntaxhighlight lang="xml">
 
<!-- javascript file for comparing track names (refseqs) (used for sorting in selection table) -->
 
<track_name_comparator>/config/track_name_comparator.js</track_name_comparator>
 
</syntaxhighlight>
 
Defines how to compare genomic sequence names for sorting purposes in the genomic region selection list.  Points to a javascript file.  You can implement your logic to allow whatever sorting you’d like for your own organism.  This doesn't make much of a difference in our case since we're only dealing with one genomic region.  The default behavior is to sort names lexicographically.
 
 
<syntaxhighlight lang="xml">
 
<!-- whether to use an existing CDS when creating new transcripts -->
 
<use_cds_for_new_transcripts>true</use_cds_for_new_transcripts>
 
</syntaxhighlight>
 
Tells Web Apollo whether to use an existing CDS when creating a new transcript (otherwise it computes the longest ORF).  This can be useful when gene predictors suggest a CDS that's not the longest ORF and you want to use that instead.  This is only applicable when using features that have a CDS associated with them.
 
 
<syntaxhighlight lang="xml">
 
<!-- set to false to use hybrid disk/memory store which provides a little slower performance
 
but uses a lot less memory - great for annotation rich genomes -->
 
<use_pure_memory_store>true</use_pure_memory_store>
 
</syntaxhighlight>
 
Defines whether the internal data store is purely a memory one or a hybrid memory/disk store.  The memory store provides faster performance at the cost of more memory.  The hybrid store provides a little slower performance but uses a lot less memory, so it's a good option for annotation rich genomes.  Set to <tt>true</tt> to use the memory store and <tt>false</tt> to use the hybrid one.
 
 
Let’s take a look at the <tt>user</tt> element, which handles configuration for user authentication and permission handling.
 
 
<syntaxhighlight lang="xml">
 
<!-- user authentication/permission configuration -->
 
<user>
 
 
<!-- database configuration -->
 
<database>
 
 
<!-- driver for user database -->
 
<driver>org.postgresql.Driver</driver>
 
 
<!-- JDBC URL for user database -->
 
<url>ENTER_USER_DATABASE_JDBC_URL</url>
 
 
<!-- username for user database -->
 
<username>ENTER_USER_DATABASE_USERNAME</username>
 
 
<!-- password for user database -->
 
<password>ENTER_USER_DATABASE_PASSWORD</password>
 
 
</database>
 
 
<!-- class for generating user authentication page (login page) -->
 
<authentication_class>org.bbop.apollo.web.user.localdb.LocalDbUserAuthentication</authentication_class>
 
 
</user>
 
</syntaxhighlight>
 
 
Let’s first look at the <tt>database</tt> element that defines the database that will handle user permissions (which we created previously).
 
 
<syntaxhighlight lang="xml">
 
<!-- driver for user database -->
 
<driver>org.postgresql.Driver</driver>
 
</syntaxhighlight>
 
 
This should point to the JDBC driver used to communicate with the database.  We’re using the PostgreSQL driver since that’s the database we’re using for user permission management.
 
 
<syntaxhighlight lang="xml">
 
<!-- JDBC URL for user database -->
 
<url>jdbc:postgresql://localhost/web_apollo_users</url>
 
</syntaxhighlight>
 
 
JDBC URL for the user permission database.  We'll use <tt>jdbc:postgresql://localhost/web_apollo_users</tt> since the database is running on the same server as the annotation editing engine and we named the database <tt>web_apollo_users</tt>.
 
 
<syntaxhighlight lang="xml">
 
<!-- username for user database -->
 
<username>web_apollo_users_admin</username>
 
</syntaxhighlight>
 
 
User name with read/write access to the user database.  In our setup, this is <tt>web_apollo_users_admin</tt>.
 
 
<syntaxhighlight lang="xml">
 
<!-- password for user database -->
 
<password>web_apollo_users_admin</password>
 
</syntaxhighlight>
 
 
Password for accessing the user database.  In our setup, this is <tt>web_apollo_users_admin</tt>.
 
 
Now let’s look at the other elements in the <tt>user</tt> element.
 
 
<syntaxhighlight lang="xml">
 
<!-- class for generating user authentication page (login page) -->
 
<authentication_class>org.bbop.apollo.web.user.localdb.LocalDbUserAuthentication</authentication_class>
 
</syntaxhighlight>
 
 
Defines how user authentication is handled.  This points to a class implementing the <tt>org.bbop.apollo.web.user.UserAuthentication</tt> interface.  This allows you to implement any type of authentication you’d like (e.g., LDAP).  Currently available options are:
 
*<tt>org.bbop.apollo.web.user.localdb.LocalDbUserAuthentication</tt>
 
**Uses the user permission database to also store authentication information, meaning it stores user passwords in the database
 
*<tt>org.bbop.apollo.web.user.browserid.BrowserIdUserAuthentication</tt>
 
**Uses Mozilla’s [https://browserid.org BrowserID] service for authentication.  This offloads all authentication security to Mozilla and allows one account to access multiple resources (as long as they support BrowserID).  Since the service is provided through Mozilla, users will need to create a BrowserID account
 
 
Now let’s look at the configuration for accessing the annotation tracks for the genomic sequences.
 
 
<syntaxhighlight lang="xml">
 
<tracks>
 
 
<!-- path to JBrowse refSeqs.json file -->
 
<refseqs>ENTER_PATH_TO_REFSEQS_JSON_FILE</refseqs>
 
 
<!-- annotation track name the current convention is to append
 
the genomic region id to the name of the annotation track
 
e.g., if the annotation track is called "Annotations" and the
 
genomic region is chr2L, the track name will be
 
"Annotations-chr2L".-->
 
<annotation_track_name>Annotations</annotation_track_name>
 
 
<!-- organism being annotated -->
 
<organism>ENTER_ORGANISM</organism>
 
 
<!-- CV term for the genomic sequences - should be in the form
 
of "CV:term".  This applies to all sequences -->
 
<sequence_type>ENTER_CVTERM_FOR_SEQUENCE</sequence_type>
 
 
<!-- path to file containing translation table.
 
optional - defaults to NCBI translation table 1 if absent -->
 
<translation_table>/config/translation_tables/ncbi_1_translation_table.txt</translation_table>
 
 
<!-- splice acceptor and donor sites. Multiple entries may be
 
added to allow multiple accepted sites.
 
optional - defaults to GT for donor and AG for acceptor
 
if absent -->
 
<splice_sites>
 
<donor_site>GT</donor_site>
 
<acceptor_site>AG</acceptor_site>
 
</splice_sites>
 
 
</tracks>
 
</syntaxhighlight>
 
 
Let’s look at each element individually.
 
 
<syntaxhighlight lang="xml">
 
<!-- path to JBrowse refSeqs.json file -->
 
<refseqs>TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/data/seq/refSeqs.json</refseqs>
 
</syntaxhighlight>
 
 
Location where the <tt>refSeqs.json</tt> file resides, which is created by the data generation pipeline (see the [[#Data generation|data generation]] section).  By default, the JBrowse data needs to reside in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/data</tt>.  If you want the data to reside elsewhere, you’ll need to configure your servlet container to handle the appropriate alias to <tt>jbrowse/data</tt> or symlink the <tt>data</tt> directory to another location.  Web Apollo is pre-configured to allow symlinks.
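

For example, the symlink approach might look like the following (both paths here are placeholders for illustration; substitute your actual Tomcat webapps directory and external data location):

<syntaxhighlight lang="bash">
# Move the JBrowse data to an external location (hypothetical paths)
mv TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/data /data/webapollo/jbrowse_data

# Symlink it back into the deployed webapp so jbrowse/data still resolves
ln -s /data/webapollo/jbrowse_data TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/data
</syntaxhighlight>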
 
 
<font color="red">IMPORTANT</font>: In previous versions of Web Apollo (2013-05-16 and prior), this element pointed to the symlink created by the data generation pipeline.  The current pipeline no longer creates the symlink, so you need to point to the actual file itself (hence <tt>jbrowse/data/<font color="red">seq</font>/refSeqs.json</tt> as opposed to <tt>jbrowse/data/refSeqs.json</tt> in the previous versions).  If you're accessing data generated by a previous version of Web Apollo, you'll still need to point to the symlink.
 
 
<syntaxhighlight lang="xml">
 
<annotation_track_name>Annotations</annotation_track_name>
 
</syntaxhighlight>
 
 
Name of the annotation track.  Leave it as the default value of <tt>Annotations</tt>.
 
 
<syntaxhighlight lang="xml">
 
<!-- organism being annotated -->
 
<organism>Pythium ultimum</organism>
 
</syntaxhighlight>
 
 
Scientific name of the organism being annotated (genus and species).  We're annotating <tt>Pythium ultimum</tt>.
 
 
<syntaxhighlight lang="xml">
 
<!-- CV term for the genomic sequences - should be in the form
 
of "CV:term".  This applies to all sequences -->
 
<sequence_type>sequence:contig</sequence_type>
 
</syntaxhighlight>
 
 
The type for the genomic sequences.  Should be in the form of <tt>CV:term</tt>.  Our genomic sequences are of the type <tt>sequence:contig</tt>.
 
 
<syntaxhighlight lang="xml">
 
<!-- path to file containing translation table.
 
optional - defaults to NCBI translation table 1 if absent -->
 
<translation_table>/config/translation_tables/ncbi_1_translation_table.txt</translation_table>
 
</syntaxhighlight>
 
File that contains the codon translation table.  This is optional and defaults to NCBI translation table 1 if absent.  See the [[#Translation tables|translation tables]] section for details on which tables are available and how to customize your own table.
 
 
<syntaxhighlight lang="xml">
 
<!-- splice acceptor and donor sites. Multiple entries may be
 
added to allow multiple accepted sites.
 
optional - defaults to GT for donor and AG for acceptor
 
if absent -->
 
<splice_sites>
 
<donor_site>GT</donor_site>
 
<acceptor_site>AG</acceptor_site>
 
</splice_sites>
 
</syntaxhighlight>
 
Defines the accepted donor and acceptor splice sites.  The client uses this to decide whether to display a warning on a splice site (if the splice site's sequence doesn't match what's defined here, it flags the splice site).  You can add multiple <tt><donor_site></tt> and <tt><acceptor_site></tt> elements if your organism supports multiple values.  This is optional and defaults to <tt>GT</tt> for the donor and <tt>AG</tt> for the acceptor site.
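

For instance, a sketch of a configuration that also accepts the minor <tt>GC</tt> donor site (whether this is appropriate depends on your organism) would simply repeat the <tt><donor_site></tt> element:

<syntaxhighlight lang="xml">
<splice_sites>
    <donor_site>GT</donor_site>
    <donor_site>GC</donor_site>
    <acceptor_site>AG</acceptor_site>
</splice_sites>
</syntaxhighlight>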
 
 
<syntaxhighlight lang="xml">
 
<!-- path to file containing canned comments XML -->
 
<canned_comments>/config/canned_comments.xml</canned_comments>
 
</syntaxhighlight>
 
 
File that contains canned comments (predefined comments that will be available from a pull-down menu when creating comments).  It’s best not to change the default option.  See the [[#Canned comments|canned comments]] section for details on configuring canned comments.
 
 
<syntaxhighlight lang="xml">
 
<!-- configuration for what to display in the annotation info editor.
 
Sections can be commented out to not be displayed or uncommented
 
to make them active -->
 
<annotation_info_editor>
 
 
<!-- grouping for the configuration.  The "feature_types" attribute takes a list of
 
SO terms (comma separated) to apply this configuration to
 
(e.g., feature_types="sequence:transcript,sequence:mRNA" will make it so the group
 
configuration will only apply to features of type "sequence:transcript" or "sequence:mRNA").
 
A value of "default" will make this the default configuration for any types not explicitly
 
defined in other groups.  You can have as many groups as you'd like -->
 
<annotation_info_editor_group feature_types="default">
 
 
<!-- display status section.  The text for each <status_flag>
 
element will be displayed as a radio button in the status
 
section, in the same order -->
 
<!--
 
<status>
 
<status_flag>Approved</status_flag>
 
<status_flag>Needs review</status_flag>
 
</status>
 
-->
 
 
<!-- display generic attributes section -->
 
<attributes />
 
 
<!-- display dbxrefs section -->
 
<dbxrefs />
 
 
<!-- display PubMed IDs section -->
 
<pubmed_ids />
 
 
<!-- display GO IDs section -->
 
<go_ids />
 
 
<!-- display comments section -->
 
<comments />
 
 
</annotation_info_editor_group>
 
 
</annotation_info_editor>
 
</syntaxhighlight>
 
 
Here's the configuration for what to display in the annotation info editor.  It will always display <tt>Name</tt>, <tt>Symbol</tt>, and <tt>Description</tt>, but the rest is optional.  This allows you to make the editor more compact if you're not interested in editing certain metadata.  Let's look at the options in more detail.
 
 
<syntaxhighlight lang="xml">
 
<!-- grouping for the configuration.  The "feature_types" attribute takes a list of
 
SO terms (comma separated) to apply this configuration to
 
(e.g., feature_types="sequence:transcript,sequence:mRNA" will make it so the group
 
configuration will only apply to features of type "sequence:transcript" or "sequence:mRNA").
 
A value of "default" will make this the default configuration for any types not explicitly
 
defined in other groups.  You can have as many groups as you'd like -->
 
<annotation_info_editor_group feature_types="default">
 
...
 
</annotation_info_editor_group>
 
</syntaxhighlight>
 
Each configuration is grouped by annotation type.  This allows you to display different options for specified types.  The <tt>feature_types</tt> attribute defines which types the group applies to.  <tt>feature_types</tt> takes a comma-separated list of SO terms, such as <tt>"sequence:transcript,sequence:mRNA"</tt>, which applies the configuration to annotations of type <tt>sequence:transcript</tt> and <tt>sequence:mRNA</tt>.  Alternatively, you can set the value to <tt>"default"</tt>, which makes it the default configuration for any types not explicitly defined in other groups.  You can have as many groups as you'd like.  All [[#Supported_annotation_types|supported annotation types]] can be used.
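

As a sketch of multiple groups (the type and the chosen sections here are illustrative, not recommendations), you could pair a full default group with a slimmer group for tRNAs:

<syntaxhighlight lang="xml">
<annotation_info_editor>

    <!-- full editor for any type not covered by another group -->
    <annotation_info_editor_group feature_types="default">
        <attributes />
        <dbxrefs />
        <pubmed_ids />
        <go_ids />
        <comments />
    </annotation_info_editor_group>

    <!-- more compact editor for tRNAs: comments only -->
    <annotation_info_editor_group feature_types="sequence:tRNA">
        <comments />
    </annotation_info_editor_group>

</annotation_info_editor>
</syntaxhighlight>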
 
 
Next, let's look at each item to configure in each group.
 
 
<syntaxhighlight lang="xml">
 
<!-- display status section.  The text for each <status_flag>
 
element will be displayed as a radio button in the status
 
section, in the same order -->
 
<status>
 
<status_flag>Approved</status_flag>
 
<status_flag>Needs review</status_flag>
 
</status>
 
</syntaxhighlight>
 
 
Allows selecting the status for a particular annotation.  The value for <tt><status_flag></tt> is arbitrary (you can enter any text) and you can add as many as you'd like, but you need at least one (they'll show up as selectable buttons in the editor).
 
 
<syntaxhighlight lang="xml">
 
<!-- display generic attributes section -->
 
<attributes />
 
</syntaxhighlight>
 
 
Allows editing of generic attributes (tag/value pairs).  Think non-reserved GFF3 tags for column 9.
 
 
<syntaxhighlight lang="xml">
 
<!-- display dbxrefs section -->
 
<dbxrefs />
 
</syntaxhighlight>
 
 
Allows editing of database cross references.
 
 
<syntaxhighlight lang="xml">
 
<!-- display PubMed IDs section -->
 
<pubmed_ids />
 
</syntaxhighlight>
 
 
Allows editing of PubMed IDs (for associating an annotation with a publication).
 
 
<syntaxhighlight lang="xml">
 
<!-- display GO IDs section -->
 
<go_ids />
 
</syntaxhighlight>
 
 
Allows editing of Gene Ontology terms (for associating an annotation to a particular function).
 
 
<syntaxhighlight lang="xml">
 
<!-- display comments section -->
 
<comments />
 
</syntaxhighlight>
 
 
Allows editing of comments for annotations.
 
 
<syntaxhighlight lang="xml">
 
<!-- tools to be used for sequence searching.  This is optional.
 
If this is not setup, WebApollo will not have sequence search support -->
 
<sequence_search_tools>
 
 
<!-- one <sequence_search_tool> element per tool -->
 
<sequence_search_tool>
 
 
<!-- display name for the search tool -->
 
<key>BLAT nucleotide</key>
 
 
<!-- class for handling search -->
 
<class>org.bbop.apollo.tools.seq.search.blat.BlatCommandLineNucleotideToNucleotide</class>
 
 
<!-- configuration for search tool -->
 
<config>/config/blat_config.xml</config>
 
 
</sequence_search_tool>
 
 
<sequence_search_tool>
 
 
<!-- display name for the search tool -->
 
<key>BLAT protein</key>
 
 
<!-- class for handling search -->
 
<class>org.bbop.apollo.tools.seq.search.blat.BlatCommandLineProteinToNucleotide</class>
 
 
<!-- configuration for search tool -->
 
<config>/config/blat_config.xml</config>
 
 
</sequence_search_tool>
 
 
</sequence_search_tools>
 
</syntaxhighlight>
 
 
Here’s the configuration for sequence search tools (allows searching your genomic sequences).  Web Apollo does not implement any search algorithms, but instead relies on different tools and resources to handle searching (this provides much more flexible search options).  This is optional.  If it’s not configured, Web Apollo will not have sequence search support.  You'll need one <tt>sequence_search_tool</tt> element per search tool.  Let's look at the element in more detail.
 
 
<syntaxhighlight lang="xml">
 
<!-- display name for the search tool -->
 
<key>BLAT nucleotide</key>
 
</syntaxhighlight>
 
 
This is the string that will be used as the search tool's display name in the pull-down menu from which users select a search tool.
 
 
<syntaxhighlight lang="xml">
 
<!-- class for handling search -->
 
<class>org.bbop.apollo.tools.seq.search.blat.BlatCommandLineNucleotideToNucleotide</class>
 
</syntaxhighlight>
 
 
Should point to the class that will handle the search request.  Searching is handled by classes that implement the <tt>org.bbop.apollo.tools.seq.search.SequenceSearchTool</tt> interface.  This allows you to add support for your own favorite search tools (or resources).  We currently only have support for command line Blat, in the following flavors:
 
*<tt>org.bbop.apollo.tools.seq.search.blat.BlatCommandLineNucleotideToNucleotide</tt>
 
**Blat search for a nucleotide query against a nucleotide database
 
*<tt>org.bbop.apollo.tools.seq.search.blat.BlatCommandLineProteinToNucleotide</tt>
 
**Blat search for a protein query against a nucleotide database
 
 
<syntaxhighlight lang="xml">
 
<!-- configuration for search tool -->
 
<config>/config/blat_config.xml</config>
 
</syntaxhighlight>
 
 
File that contains the configuration for the searching plugin chosen.  If you’re using Blat, you should not change this.  If you’re using your own plugin, you’ll want to point this to the right configuration file (which will be dependent on your plugin).  See the [[#Blat|Blat]] section for details on configuring Web Apollo to use Blat.
 
 
<syntaxhighlight lang="xml">
 
<!-- data adapters for writing annotation data to different formats.
 
These will be used to dynamically generate data adapters within
 
WebApollo.  It contains either <data_adapter> or <data_adapter_group> elements.
 
<data_adapter_group> will allow grouping adapters together and will provide a
 
submenu for those adapters in WebApollo. This is optional.  -->
 
<data_adapters>
 
 
<!-- one <data_adapter> element per data adapter -->
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>GFF3</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.gff3.Gff3DataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/gff3_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip</options>
 
 
</data_adapter>
 
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>Chado</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.chado.ChadoDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>publish</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/chado_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>display_features=false</options>
 
 
</data_adapter>
 
 
<!-- group the <data_adapter> children elements together -->
 
<data_adapter_group>
 
 
<!-- display name for adapter group -->
 
<key>FASTA</key>
 
 
<!-- required permission for using data adapter group
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- one child <data_adapter> for each data adapter in the group -->
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>peptide</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.fasta.FastaDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/fasta_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip&amp;seqType=peptide</options>
 
 
</data_adapter>
 
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>cDNA</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.fasta.FastaDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/fasta_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip&amp;seqType=cdna</options>
 
 
</data_adapter>
 
 
<data_adapter>
 
 
<!-- display name for data adapter -->
 
<key>CDS</key>
 
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.fasta.FastaDataAdapter</class>
 
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>read</permission>
 
 
<!-- configuration file for data adapter -->
 
<config>/config/fasta_config.xml</config>
 
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip&amp;seqType=cds</options>
 
 
</data_adapter>
 
 
</data_adapter_group>
 
 
</data_adapters>
 
</syntaxhighlight>
 
 
Here’s the configuration for data adapters (allows writing annotations to different formats). This is optional. If it’s not configured, Web Apollo will not have data writing support. You'll need one <tt>&lt;data_adapter&gt;</tt> element per data adapter. You can group data adapters by placing each <tt>&lt;data_adapter&gt;</tt> inside a <tt>&lt;data_adapter_group&gt;</tt> element.  Let's look at the <tt>&lt;data_adapter&gt;</tt> element in more detail.
 
 
<syntaxhighlight lang="xml">
 
<!-- display name for data adapter -->
 
<key>GFF3</key>
 
</syntaxhighlight>
 
 
This is a string that will be used for the data adapter name, in the dynamically generated data adapters list for the user.
 
 
<syntaxhighlight lang="xml">
 
<!-- class for data adapter plugin -->
 
<class>org.bbop.apollo.web.dataadapter.gff3.Gff3DataAdapter</class>
 
</syntaxhighlight>
 
 
Should point to the class that will handle the write request. Writing is handled by classes that implement the <tt>org.bbop.apollo.web.dataadapter.DataAdapter</tt> interface. This allows you to add support for writing to different formats.  We currently only have support for:
 
*<tt>org.bbop.apollo.web.dataadapter.gff3.Gff3DataAdapter</tt>
 
**GFF3 (see the [[#GFF3|GFF3]] section for details on this adapter)
 
*<tt>org.bbop.apollo.web.dataadapter.chado.ChadoDataAdapter</tt>
 
**Chado (see the [[#Chado|Chado]] section for details on this adapter)
 
 
<syntaxhighlight lang="xml">
 
<!-- required permission for using data adapter
 
available options are: read, write, publish -->
 
<permission>publish</permission>
 
</syntaxhighlight>
 
 
Required user permission for accessing this data adapter.  If the user does not have the required permission, it will not be available in the list of data adapters.  Available permissions are <tt>read</tt>, <tt>write</tt>, and <tt>publish</tt>.
 
 
<syntaxhighlight lang="xml">
 
<!-- configuration for data adapter -->
 
<config>/config/gff3_config.xml</config>
 
</syntaxhighlight>
 
 
File that contains the configuration for the data adapter plugin chosen.
 
 
<syntaxhighlight lang="xml">
 
<!-- options to be passed to data adapter -->
 
<options>output=file&amp;format=gzip</options>
 
</syntaxhighlight>
 
 
Options to be passed to the data adapter.  These are dependent on the data adapter.
 
 
Next, let's look at the <tt>&lt;data_adapter_group&gt;</tt> element:
 
 
<syntaxhighlight lang="xml">
 
<!-- display name for adapter group -->
 
<key>FASTA</key>
 
</syntaxhighlight>
 
This is a string that will be used for the data adapter submenu name.
 
 
<syntaxhighlight lang="xml">

<!-- required permission for using data adapter group

available options are: read, write, publish -->

<permission>read</permission>

</syntaxhighlight>
 
Required user permission for accessing this data adapter group.  If the user does not have the required permission, it will not be available in the list of data adapters.  Available permissions are <tt>read</tt>, <tt>write</tt>, and <tt>publish</tt>.
 
 
===Translation tables===
 
 
Web Apollo has support for alternate translation tables.  For your convenience, Web Apollo comes packaged with the current NCBI translation tables.  They reside in the <tt>config/translation_tables</tt> directory in your installation (<tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/translation_tables</tt>).  They're all named <tt>ncbi_#_translation_table.txt</tt> where <tt>#</tt> represents the NCBI translation table number (for example, for ciliates, you'd use <tt>ncbi_6_translation_table.txt</tt>).
 
 
You can also customize your own translation table.  The format is tab delimited, with each entry containing either 2 or 3 columns.  The 3rd column is only used in the cases of start and stop codons.  You only need to put entries for codons that differ from the standard translation table (#1).  The first column has the codon triplet and the second has the IUPAC single letter representation for the translated amino acid.  The stop codon should be represented as <tt>*</tt> (asterisk).
 
 
<syntaxhighlight lang="text">
 
TAA Q
 
</syntaxhighlight>
 
 
As mentioned previously, you'll only need the 3rd column for start and stop codons.  To denote a codon as a start codon, put <tt>start</tt> in the third column.  For example, to assign <tt>GTG</tt> as a start codon, we'd enter:
 
 
<syntaxhighlight lang="text">
 
GTG V start
 
</syntaxhighlight>
 
 
For stop codons, if we enter an IUPAC single letter representation for the amino acid in the 3rd column, we're denoting that amino acid to be used in the case of a readthrough stop codon.  For example, to use pyrrolysine, we'd enter:
 
 
<syntaxhighlight lang="text">
 
TAG * O
 
</syntaxhighlight>
 
 
If you write your own customized translation table, make sure to update the <tt><translation_table></tt> element in your configuration to your customized file.
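

Putting the entries above together, a small custom table (for a hypothetical organism) that reassigns <tt>TAA</tt> to glutamine, adds <tt>GTG</tt> as a start codon, and marks <tt>TAG</tt> as a pyrrolysine readthrough stop would be a tab-delimited file like:

<syntaxhighlight lang="text">
TAA	Q
GTG	V	start
TAG	*	O
</syntaxhighlight>

Remember that codons not listed fall back to the standard translation table (#1).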
 
 
===Canned comments===
 
 
You can configure a set of predefined comments that will be available to users, via a dropdown menu, when adding comments.  The configuration is stored in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/canned_comments.xml</tt>.  Let’s take a look at the configuration file.
 
 
<syntaxhighlight lang="xml">
 
<?xml version="1.0" encoding="UTF-8"?>
 
 
<canned_comments>
 
<!-- one <comment> element per comment.
 
it must contain either the attribute "feature_type" that defines
 
the type of feature this comment will apply to or the attribute "feature_types"
 
that defines a list (comma separated) of types of features this comment will
 
apply to.
 
types must be in the form of "CV:term" (e.g., "sequence:gene")
 
 
<comment feature_type="sequence:gene">This is a comment for sequence:gene</comment>
 
or
 
<comment feature_types="sequence:tRNA,sequence:ncRNA">This is a comment for both sequence:tRNA and sequence:ncRNA</comment>
 
-->
 
</canned_comments>
 
</syntaxhighlight>
 
 
You’ll need one <tt><comment></tt> element for each predefined comment.  The element needs either a <tt>feature_type</tt> attribute in the form of <tt>CV:term</tt> that the comment applies to, or a <tt>feature_types</tt> attribute with a comma-separated list of types the comment applies to, where each type is also in the form of <tt>CV:term</tt>.  Let’s make a few comments for features of type <tt>sequence:gene</tt>, <tt>sequence:transcript</tt>, and <tt>sequence:mRNA</tt>:
 
 
<syntaxhighlight lang="xml">
 
<comment feature_type="sequence:gene">This is a comment for a gene</comment>
 
<comment feature_type="sequence:gene">This is another comment for a gene</comment>
 
<comment feature_types="sequence:transcript,sequence:mRNA">This is a comment for both a transcript or mRNA</comment>
 
</syntaxhighlight>
 
 
All [[#Supported_annotation_types|supported annotation types]] can be used.
 
 
===Search tools===
 
 
As mentioned previously, Web Apollo makes use of tools for sequence searching rather than employing its own search algorithm.  The only currently supported tool is command line Blat.
 
 
====Blat====
 
 
You’ll need to have Blat installed and a search database with your genomic sequences available to make use of this feature.  You can get documentation on the Blat command line suite of tools at [http://genome.ucsc.edu/goldenPath/help/blatSpec.html BLAT Suite Program Specifications and User Guide] and get information on setting up the tool in the official [http://genome.ucsc.edu/FAQ/FAQblat.html#blat3 BLAT FAQ].  The configuration is stored in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/blat_config.xml</tt>.  Let’s take a look at the configuration file:
 
 
<syntaxhighlight lang="xml">
 
<?xml version="1.0" encoding="UTF-8"?>
 
 
<!-- configuration file for setting up command line Blat support -->
 
 
<blat_config>
 
 
<!-- path to Blat binary -->
 
<blat_bin>ENTER_PATH_TO_BLAT_BINARY</blat_bin>
 
 
<!-- path to where to put temporary data -->
 
<tmp_dir>ENTER_PATH_FOR_TEMPORARY_DATA</tmp_dir>
 
 
<!-- path to Blat database -->
 
<database>ENTER_PATH_TO_BLAT_DATABASE</database>
 
 
<!-- any Blat options (directly passed to Blat) e.g., -minMatch -->
 
<blat_options>ENTER_ANY_BLAT_OPTIONS</blat_options>
 
 
<!-- true to remove temporary data path after search (set to false for debugging purposes) -->
 
<remove_tmp_dir>true</remove_tmp_dir>
 
 
</blat_config>
 
</syntaxhighlight>
 
 
Let’s look at each element with values filled in.
 
 
<syntaxhighlight lang="xml">
 
<!-- path to Blat binary -->
 
<blat_bin>BLAT_DIR/blat</blat_bin>
 
</syntaxhighlight>
 
 
We need to point to the location where the Blat binary resides.  For this guide, we'll assume Blat is installed in <tt>/usr/local/bin</tt>.
 
 
<syntaxhighlight lang="xml">
 
<!-- path to where to put temporary data -->
 
<tmp_dir>BLAT_TMP_DIR</tmp_dir>
 
</syntaxhighlight>
 
 
We need to point to a location for storing temporary files used in the Blat search.  It can be any location you’d like.
 
 
<syntaxhighlight lang="xml">
 
<!-- path to Blat database -->
 
<database>BLAT_DATABASE</database>
 
</syntaxhighlight>
 
 
We need to point to the location of the search database to be used by Blat.  See the Blat documentation for more information on generating search databases.
 
 
<syntaxhighlight lang="xml">
 
<!-- any Blat options (directly passed to Blat) e.g., -minMatch -->
 
<blat_options>-minScore=100 -minIdentity=60</blat_options>
 
</syntaxhighlight>
 
 
Here we can configure any extra options to be passed to Blat.  These options are passed verbatim to the program.  In this example, we’re passing the <tt>-minScore</tt> parameter with a minimum score of <tt>100</tt> and the <tt>-minIdentity</tt> parameter with a value of <tt>60</tt> (60% identity).  See the Blat documentation for information on all available options.
 
 
<syntaxhighlight lang="xml">
 
<!-- true to remove temporary data path after search (set to false for debugging purposes) -->
 
<remove_tmp_dir>true</remove_tmp_dir>
 
</syntaxhighlight>
 
 
Whether to delete the temporary files generated for the BLAT search.  Set it to <tt>false</tt> to not delete the files after the search, which is useful for debugging why your search may have failed or returned no results.
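

Putting the values above together, a filled-in <tt>blat_config.xml</tt> might look like the following sketch (the temporary directory and database paths are examples only; adjust them to your system):

<syntaxhighlight lang="xml">
<?xml version="1.0" encoding="UTF-8"?>

<blat_config>

    <!-- assumes Blat is installed in /usr/local/bin -->
    <blat_bin>/usr/local/bin/blat</blat_bin>

    <!-- example temporary directory -->
    <tmp_dir>/tmp/web_apollo_blat</tmp_dir>

    <!-- example path to a Blat search database -->
    <database>/data/blat/genome.2bit</database>

    <blat_options>-minScore=100 -minIdentity=60</blat_options>

    <remove_tmp_dir>true</remove_tmp_dir>

</blat_config>
</syntaxhighlight>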
 
 
===Data adapters===
 
 
====GFF3====
 
 
The GFF3 data adapter will allow exporting the current annotations as a GFF3 file.  You can get more information about the GFF3 format at [http://www.sequenceontology.org/gff3.shtml The Sequence Ontology GFF3 page].  The configuration is stored in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/gff3_config.xml</tt>.  Let’s take a look at the configuration file:
 
 
<syntaxhighlight lang="xml">
 
<?xml version="1.0" encoding="UTF-8"?>
 
 
<!-- configuration file for GFF3 data adapter -->
 
 
<gff3_config>
 
 
<!-- path to where to put generated GFF3 file.  This path is
 
relative path that will be where you deployed your
 
instance (so that it's accessible from HTTP download requests) -->
 
<tmp_dir>tmp</tmp_dir>
 
 
<!-- value to use in the source column (column 2) of the generated
 
GFF3 file. -->
 
<source>.</source>
 
 
<!-- which metadata to export as an attribute - optional.
 
Default is to export everything except owner, date_creation, and date_last_modified -->
 
<!--
 
<metadata_to_export>
 
<metadata type="name" />
 
<metadata type="symbol" />
 
<metadata type="description" />
 
<metadata type="status" />
 
<metadata type="dbxrefs" />
 
<metadata type="attributes" />
 
<metadata type="comments" />
 
<metadata type="owner" />
 
<metadata type="date_creation" />
 
<metadata type="date_last_modified" />
 
</metadata_to_export>
 
-->
 
 
<!-- whether to export underlying genomic sequence - optional.
 
Defaults to true -->
 
<export_source_genomic_sequence>true</export_source_genomic_sequence>
 
 
</gff3_config>
 
</syntaxhighlight>
 
 
<syntaxhighlight lang="xml">
 
<tmp_dir>tmp</tmp_dir>
 
</syntaxhighlight>
 
 
This is the root directory where the GFF3 files will be generated.  The actual GFF3 files will be in subdirectories that are generated to prevent collisions from concurrent requests.  This directory is relative to <tt>TOMCAT_WEBAPPS_DIR/WebApollo</tt>.  This is done to allow the generated GFF3 to be accessible from HTTP requests.
 
 
<syntaxhighlight lang="xml">
 
<!-- value to use in the source column (column 2) of the generated
 
GFF3 file. -->
 
<source>.</source>
 
</syntaxhighlight>
 
 
This is what to put as the source (column 2) in the generated GFF3 file.  You can change the value to anything you'd like.
 
 
<syntaxhighlight lang="xml">
 
<!-- which metadata to export as an attribute - optional.
 
Default is to export everything except owner, date_creation, and date_last_modified -->
 
<metadata_to_export>
 
<metadata type="name" />
 
<metadata type="symbol" />
 
<metadata type="description" />
 
<metadata type="status" />
 
<metadata type="dbxrefs" />
 
<metadata type="attributes" />
 
<metadata type="comments" />
 
<metadata type="owner" />
 
<metadata type="date_creation" />
 
<metadata type="date_last_modified" />
 
</metadata_to_export>
 
</syntaxhighlight>
 
 
This defines which metadata to export in the GFF3 (in column 9).  This configuration is optional.  The default is to export everything except owner, date_creation, and date_last_modified.  You need to define one <tt>&lt;metadata&gt;</tt> element with the appropriate <tt>type</tt> attribute per metadata type you want to export.  Available types are:
 
* name
 
* symbol
 
* description
 
* status
 
* dbxrefs
 
* attributes
 
* comments
 
* owner
 
* date_creation
 
* date_last_modified
 
 
<syntaxhighlight lang="xml">
 
<!-- whether to export underlying genomic sequence - optional.
 
Defaults to true -->
 
<export_source_genomic_sequence>true</export_source_genomic_sequence>
 
</syntaxhighlight>
 
 
Determines whether to export the underlying genomic sequence as FASTA attached to the GFF3 file.  Set to <tt>false</tt> to disable it.  Defaults to <tt>true</tt>.
 
 
Note that the generated files will reside in that directory indefinitely to allow users to download them.  You'll eventually need to remove those files to keep the file system from getting cluttered.  There's a script that will traverse the directory, remove any files older than a provided age, and clean up directories as they become empty.  It's recommended to set up this script as a <tt>cron</tt> job that runs hourly and removes any files older than an hour (which should give users plenty of time to download their files).  The script is in <tt>WEB_APOLLO_DIR/tools/cleanup/remove_temporary_files.sh</tt>.
 
 
$ WEB_APOLLO_DIR/tools/cleanup/remove_temporary_files.sh -d TOMCAT_WEBAPPS_DIR/WebApollo/tmp -m 60
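The bundled script is the supported way to do this cleanup; the sketch below only approximates its effect with <tt>find</tt>, using an invented demo directory (<tt>tmpdir_demo</tt>) and a deliberately stale file so the result is visible:

```shell
# Sketch of the cleanup behavior: delete files older than 60 minutes,
# then prune directories that have become empty.
# tmpdir_demo stands in for TOMCAT_WEBAPPS_DIR/WebApollo/tmp.
TMP_EXPORT_DIR=tmpdir_demo
mkdir -p "$TMP_EXPORT_DIR/old_export"
touch -t 202001010000 "$TMP_EXPORT_DIR/old_export/annotations.gff3"  # stale file
touch "$TMP_EXPORT_DIR/fresh.gff3"                                   # just downloaded

find "$TMP_EXPORT_DIR" -type f -mmin +60 -delete
find "$TMP_EXPORT_DIR" -mindepth 1 -type d -empty -delete

ls "$TMP_EXPORT_DIR"   # only fresh.gff3 remains
```

To run the real script hourly, a crontab entry along the lines of <tt>0 * * * * WEB_APOLLO_DIR/tools/cleanup/remove_temporary_files.sh -d TOMCAT_WEBAPPS_DIR/WebApollo/tmp -m 60</tt> would do (with the placeholder paths expanded for your installation).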
 
 
====Chado====
 
 
The Chado data adapter will allow writing the current annotations to a Chado database.  You can get more information about Chado at the [http://gmod.org/wiki/Chado GMOD Chado page].  The configuration is stored in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/chado_config.xml</tt>.  Let’s take a look at the configuration file:
 
 
<syntaxhighlight lang="xml">
 
<?xml version="1.0" encoding="UTF-8"?>
 
 
<!-- configuration file for Chado data adapter -->
 
 
<chado_config>
 
 
<!-- Hibernate configuration file for accessing Chado database -->
 
<hibernate_config>/config/hibernate.xml</hibernate_config>
 
 
</chado_config>
 
</syntaxhighlight>
 
 
There's only one element to be configured:
 
 
<syntaxhighlight lang="xml">
 
<hibernate_config>/config/hibernate.xml</hibernate_config>
 
</syntaxhighlight>
 
 
This points to the Hibernate configuration for accessing the Chado database.  Hibernate provides an ORM (Object Relational Mapping) for relational databases.  This is used to access the Chado database.  The Hibernate configuration is stored in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/hibernate.xml</tt>.  It is quite large (as it contains a lot of mapping resources), so let's take a look at the parts of the configuration file that are of interest (near the top of the file):
 
 
<syntaxhighlight lang="xml">
 
<?xml version="1.0" encoding="UTF-8"?>
 
<!DOCTYPE hibernate-configuration PUBLIC
 
"-//Hibernate/Hibernate Configuration DTD 3.0//EN"
 
"http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
 
<hibernate-configuration>
 
<session-factory name="SessionFactory">
 
<property name="hibernate.connection.driver_class">org.postgresql.Driver</property>
 
<property name="hibernate.connection.url">ENTER_DATABASE_CONNECTION_URL</property>
 
<property name="hibernate.connection.username">ENTER_USERNAME</property>
 
<property name="hibernate.connection.password">ENTER_PASSWORD</property>
 
 
...
 
 
</session-factory>
 
</hibernate-configuration>
 
</syntaxhighlight>
 
 
Let's look at each element:
 
 
<syntaxhighlight lang="xml">
 
<property name="hibernate.connection.driver_class">org.postgresql.Driver</property>
 
</syntaxhighlight>
 
 
The database driver for the RDBMS where the Chado database exists.  It will most likely be PostgreSQL (as it's the officially recommended RDBMS for Chado), in which case you should leave this at its default value.
 
 
<syntaxhighlight lang="xml">
 
<property name="hibernate.connection.url">ENTER_DATABASE_CONNECTION_URL</property>
 
</syntaxhighlight>
 
 
JDBC URL to connect to the Chado database.  It takes the form <tt>jdbc:$RDBMS://$SERVERNAME:$PORT/$DATABASE_NAME</tt>, where <tt>$RDBMS</tt> is the RDBMS used for the Chado database, <tt>$SERVERNAME</tt> is the server's name, <tt>$PORT</tt> is the database port, and <tt>$DATABASE_NAME</tt> is the database's name.  For example, for a Chado database running on PostgreSQL on server <tt>my_server</tt>, port <tt>5432</tt> (PostgreSQL's default), with database name <tt>my_organism</tt>, the connection URL would be <tt>jdbc:postgresql://my_server:5432/my_organism</tt>.
 
 
<syntaxhighlight lang="xml">
 
<property name="hibernate.connection.username">ENTER_USERNAME</property>
 
</syntaxhighlight>
 
 
User name used to connect to the database.  This user should have write privileges to the database.
 
 
<syntaxhighlight lang="xml">
 
<property name="hibernate.connection.password">ENTER_PASSWORD</property>
 
</syntaxhighlight>
 
 
Password for the provided user name.
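Putting the pieces together, the connection properties for the example database above would look like the following (the user name and password shown here are of course placeholders to be replaced with your own credentials):

```xml
<property name="hibernate.connection.driver_class">org.postgresql.Driver</property>
<property name="hibernate.connection.url">jdbc:postgresql://my_server:5432/my_organism</property>
<property name="hibernate.connection.username">chado_user</property>
<property name="hibernate.connection.password">chado_pass</property>
```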
 
 
 
 
'''Important note for first-time export'''
 
 
Make sure to load your chromosomes into Chado before you do an export.
 
 
To do this, export your GFF3 file from Apollo (using the GFF3 export also detailed in this section) and [http://gmod.org/wiki/Load_GFF_Into_Chado import the GFF3 file into Chado].  Example:
 
 
<syntaxhighlight lang="bash">
 
./load/bin/gmod_bulk_load_gff3.pl --gfffile ~/Amel/Amel_4.5_scaffolds.gff --dbuser USERNAME \
 
--dbpass PASSWORD --dbname CHADO_DB --organism "Apis mellifera"
 
</syntaxhighlight>
 
 
====FASTA====
 
 
The FASTA data adapter will allow exporting the current annotations to a FASTA file.  The configuration is stored in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/config/fasta_config.xml</tt>.  Let’s take a look at the configuration file:
 
 
<syntaxhighlight lang="xml">
 
<?xml version="1.0" encoding="UTF-8"?>
 
 
<!-- configuration file for FASTA data adapter -->
 
 
<fasta_config>
 
 
<!-- path to where to put generated FASTA file.  This path is a
 
relative path that will be where you deployed your WebApollo
 
instance (so that it's accessible from HTTP download requests) -->
 
<tmp_dir>tmp</tmp_dir>
 
 
<!-- feature types to process when dumping FASTA sequence -->
 
<feature_types>
 
 
<!-- feature type to process - one element per type -->
 
<feature_type>sequence:mRNA</feature_type>
 
 
<feature_type>sequence:transcript</feature_type>
 
 
</feature_types>
 
 
<!-- which metadata to export as an attribute - optional.
 
Default does not export any metadata -->
 
<!--
 
<metadata_to_export>
 
<metadata type="name" />
 
<metadata type="symbol" />
 
<metadata type="description" />
 
<metadata type="status" />
 
<metadata type="dbxrefs" />
 
<metadata type="attributes" />
 
<metadata type="comments" />
 
<metadata type="owner" />
 
<metadata type="date_creation" />
 
<metadata type="date_last_modified" />
 
</metadata_to_export>
 
-->
 
 
</fasta_config>
 
</syntaxhighlight>
 
 
<syntaxhighlight lang="xml">
 
<!-- path to where to put generated FASTA file.  This path is a
 
relative path that will be where you deployed your WebApollo
 
instance (so that it's accessible from HTTP download requests) -->
 
<tmp_dir>tmp</tmp_dir>
 
</syntaxhighlight>
 
 
This is the root directory where the FASTA files will be generated. The actual FASTA files will be in subdirectories that are generated to prevent collisions from concurrent requests. This directory is relative to TOMCAT_WEBAPPS_DIR/WebApollo. This is done to allow the generated FASTA to be accessible from HTTP requests.
 
 
<syntaxhighlight lang="xml">
 
<!-- feature types to process when dumping FASTA sequence -->
 
<feature_types>
 
 
<!-- feature type to process - one element per type -->
 
<feature_type>sequence:mRNA</feature_type>
 
 
<feature_type>sequence:transcript</feature_type>
 
 
</feature_types>
 
</syntaxhighlight>
 
 
This defines which annotation types should be processed when exporting the FASTA data.  You'll need one <tt>&lt;feature_type&gt;</tt> element for each type you want processed.  Only the defined <tt>feature_type</tt> elements will be processed, so you might want to have different configuration files for processing different types of annotations (which you can point to in the FASTA data adapter's <tt>config</tt> element in <tt>config.xml</tt>).  All [[#Supported_annotation_types|supported annotation types]] can be used as the value of <tt>feature_type</tt>, with the addition of <tt>sequence:exon</tt>.
 
 
In <tt>config.xml</tt>, in the <tt>&lt;options&gt;</tt> element in the <tt>&lt;data_adapter&gt;</tt> configuration for the FASTA adapter, you'll notice that there's a <tt>seqType</tt> option.  You can change that value to modify which type of sequence will be exported as FASTA.  Available options are:
 
 
* peptide
 
** Export the peptide sequence.  This will only apply to protein coding transcripts and protein coding exons
 
* cdna
 
** Export the cDNA sequence.  This will only apply to transcripts and exons
 
* cds
 
** Export the CDS sequence.  This will only apply to protein coding transcripts and protein coding exons
 
* genomic
 
** Export the genomic sequence within the feature's boundaries.  This applies to all feature types.
 
 
<syntaxhighlight lang="xml">
 
<!-- which metadata to export as an attribute - optional.
 
Default does not export any metadata -->
 
<!--
 
<metadata_to_export>
 
<metadata type="name" />
 
<metadata type="symbol" />
 
<metadata type="description" />
 
<metadata type="status" />
 
<metadata type="dbxrefs" />
 
<metadata type="attributes" />
 
<metadata type="comments" />
 
<metadata type="owner" />
 
<metadata type="date_creation" />
 
<metadata type="date_last_modified" />
 
</metadata_to_export>
 
-->
 
</syntaxhighlight>
 
 
Defines which metadata to export in the defline for each feature.  The default is to not output any of the listed metadata.  Uncomment to turn on this option.  Note that you can remove (or comment) any <tt>&lt;metadata&gt;</tt> elements that you're not interested in exporting.
 
 
Note that like the GFF3 adapter, the generated files will reside in that directory indefinitely to allow users to download them.  You'll eventually need to remove those files to keep the file system from getting cluttered.  You can use the <tt>remove_temporary_files.sh</tt> script to handle the cleanup.  In fact, if you configure both the GFF3 and FASTA adapters to use the same temporary directory, you'll only need to worry about cleanup in a single location.  See the [[#GFF3|GFF3]] section for information about <tt>remove_temporary_files.sh</tt>.
 
 
==Data generation==
 
 
The steps for generating data (in particular static data) are mostly similar to [[JBrowse]] data generation steps, with some extra steps required.  The scripts for data generation reside in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/bin</tt>.  Let's go into WebApollo's JBrowse directory.
 
 
$ <span class="enter">cd TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse</span>
 
 
It will make things easier if we make sure that the scripts in the <tt>bin</tt> directory are executable.
 
 
$ <span class="enter">chmod 755 bin/*</span>
 
 
As mentioned previously, the data resides in the <tt>data</tt> directory by default.  Symlinking <tt>JBROWSE_DATA_DIR</tt> to <tt>data</tt> gives you a lot of flexibility, since you can point your WebApollo instance at a new data directory just by changing the symlink.
 
 
$ <span class="enter">ln -sf JBROWSE_DATA_DIR data</span>
 
 
<font color="red">IMPORTANT</font>: If you're using data generated in previous versions of WebApollo (2013-09-04 and prior), you won't need to regenerate the data, but you will need to run the [[#Adding the WebApollo plugin|Adding the WebApollo plugin]] step.
 
 
===DNA track setup===
 
 
The first thing we need to do before processing our evidence is to generate the reference sequence data to be used by JBrowse.  We'll use the <tt>prepare-refseqs.pl</tt> script.
 
 
$ <span class="enter">bin/prepare-refseqs.pl --fasta WEB_APOLLO_SAMPLE_DIR/scf1117875582023.fa</span>
 
 
We now have the DNA track setup.  Note that you can also use a GFF3 file containing the genomic sequence by using the <tt>--gff</tt> option instead of <tt>--fasta</tt> and point it to the GFF3 file.
 
 
===Adding the WebApollo plugin===
 
 
We now need to setup the data configuration to use the WebApollo plugin. We'll use the <tt>add-webapollo-plugin.pl</tt> script to do so.
 
 
$ <span class="enter">bin/add-webapollo-plugin.pl -i data/trackList.json</span>
 
 
===Static data generation===
 
 
Generating data from GFF3 works best by having a separate GFF3 per source type.  If your GFF3 has all source types in the same file, we need to split up the GFF3.  We can use the <tt>split_gff_by_source.pl</tt> script in <tt>WEB_APOLLO_DIR/tools/data</tt> to do so.  We'll output the split GFF3 to some temporary directory (we'll use <tt>WEB_APOLLO_SAMPLE_DIR/split_gff</tt>).
 
 
$ <span class="enter">mkdir -p WEB_APOLLO_SAMPLE_DIR/split_gff</span>
 
$ <span class="enter">WEB_APOLLO_DIR/tools/data/split_gff_by_source.pl \
 
-i WEB_APOLLO_SAMPLE_DIR/scf1117875582023.gff -d WEB_APOLLO_SAMPLE_DIR/split_gff</span>
 
 
If we look at the contents of <tt>WEB_APOLLO_SAMPLE_DIR/split_gff</tt>, we can see we have the following files:
 
 
$ <span class="enter">ls WEB_APOLLO_SAMPLE_DIR/split_gff</span>
 
blastn.gff  est2genome.gff  protein2genome.gff  repeatrunner.gff
 
blastx.gff  maker.gff      repeatmasker.gff    snap_masked.gff
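Under the hood, this split is just a group-by on the source column (column 2) of the GFF3.  The <tt>split_gff_by_source.pl</tt> script is the supported tool; the <tt>awk</tt> sketch below only illustrates the same idea, using an invented three-line <tt>demo.gff</tt>:

```shell
# Build a tiny demo GFF3 (tab-separated; file and directory names are invented)
mkdir -p split_demo
printf 'scf1\tmaker\tgene\t1\t500\t.\t+\t.\tID=gene1\n'   > demo.gff
printf 'scf1\tblastn\tmatch\t10\t90\t.\t+\t.\tID=m1\n'   >> demo.gff
printf 'scf1\tmaker\tmRNA\t1\t500\t.\t+\t.\tID=mrna1\n'  >> demo.gff

# Route each feature line to a file named after its source column (column 2)
awk -F'\t' '!/^#/ { print > ("split_demo/" $2 ".gff") }' demo.gff

ls split_demo   # maker.gff and blastn.gff
```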
 
 
We need to process each file and create the appropriate tracks.
 
 
(If you've previously used JBrowse, you may know that JBrowse also has an alternative approach to generating multiple static data tracks from a GFF3 file, which uses the biodb-to-json script and a configuration file.  However, WebApollo is not yet compatible with that approach.)
 
 
====GFF3 with gene/transcript/exon/CDS/polypeptide features====
 
 
We'll start off with <tt>maker.gff</tt>.  We need to handle that file a bit differently than the rest of the files since the GFF represents the features as gene, transcript, exons, and CDSs.
 
 
$ <span class="enter">bin/flatfile-to-json.pl --gff WEB_APOLLO_SAMPLE_DIR/split_gff/maker.gff \
 
--arrowheadClass trellis-arrowhead --getSubfeatures \
 
--subfeatureClasses '{"wholeCDS": null, "CDS":"brightgreen-80pct", "UTR": "darkgreen-60pct", "exon":"container-100pct"}' \
 
--className container-16px --type mRNA --trackLabel maker</span>
 
 
Note that <tt>brightgreen-80pct</tt>, <tt>darkgreen-60pct</tt>, <tt>container-100pct</tt>, <tt>container-16px</tt>, <tt>gray-center-20pct</tt> are all CSS classes defined in WebApollo stylesheets that describe how to display their respective features and subfeatures.  WebApollo also tries to use reasonable default CSS styles, so it is possible to omit these CSS class arguments.  For example, to accept default styles for maker.gff, the above could instead be shortened to:
 
 
$ <span class="enter">bin/flatfile-to-json.pl --gff WEB_APOLLO_SAMPLE_DIR/split_gff/maker.gff \
 
--getSubfeatures --type mRNA --trackLabel maker</span>
 
 
See the [[#Customizing features|Customizing features]] section for more information on CSS styles.  There are also many other configuration options for flatfile-to-json.pl, see [[JBrowse_Configuration_Guide#Data_Formatting|JBrowse data formatting]] for more information.
 
 
====GFF3 with match/match_part features====
 
 
Now we need to process the other remaining GFF3 files.  The entries in those are stored as "match/match_part", so they can all be handled in a similar fashion.
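For reference, a match/match_part pairing in GFF3 looks roughly like the following (the coordinates and IDs are invented for illustration; the columns are tab-separated in a real file):

```
scf1117875582023   blastn   match        1000   2000   .   +   .   ID=match1
scf1117875582023   blastn   match_part   1000   1500   .   +   .   ID=part1;Parent=match1
scf1117875582023   blastn   match_part   1800   2000   .   +   .   ID=part2;Parent=match1
```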
 
 
We'll start off with <tt>blastn</tt> as an example.
 
 
$ <span class="enter">bin/flatfile-to-json.pl --gff WEB_APOLLO_SAMPLE_DIR/split_gff/blastn.gff \
 
--arrowheadClass webapollo-arrowhead --getSubfeatures \
 
--subfeatureClasses '{"match_part": "darkblue-80pct"}' \
 
--className container-10px --trackLabel blastn</span>
 
 
Again, <tt>container-10px</tt> and <tt>darkblue-80pct</tt> are CSS class names that define how to display those elements.  See the [[#Customizing features|Customizing features]] section for more information.
 
 
We need to follow the same steps for the remaining GFF3 files.  It can be a bit tedious to do this for the remaining six files, so we can use a simple Bash shell script to help us out (write the script to a file and execute as shown below).  Don't worry if the script doesn't make sense, you can always process each file manually on the command line:
 
 
  <span class="enter">for i in $(ls WEB_APOLLO_SAMPLE_DIR/split_gff/*.gff | grep -v maker); do
 
    j=$(basename $i)
 
    j=${j/.gff/}
 
    echo "Processing $j"
 
    bin/flatfile-to-json.pl --gff $i --arrowheadClass webapollo-arrowhead \
 
    --getSubfeatures --subfeatureClasses "{\"match_part\": \"darkblue-80pct\"}" \
 
    --className container-10px --trackLabel $j
 
  done
 
 
$ /bin/bash myscript.sh
 
</span>
 
 
====Generate searchable name index====
 
Once data tracks have been created, you will need to generate a searchable index of names using the generate-names.pl script: 
 
 
$ <span class="enter">bin/generate-names.pl</span>
 
 
This script creates an index of sequence names and feature names in order to enable auto-completion in the navigation text box.  This index is required, so if you do not wish any of the feature tracks to be indexed for auto-completion, you can instead run generate-names.pl immediately after running prepare-refseqs.pl, but before generating other tracks.
 
 
The script can be also rerun after any additional tracks are generated if you wish feature names from that track to be added to the index (using the <tt>--incremental</tt> option).
 
 
<font color="red">IMPORTANT</font>: If you're running this script with a Perl version 5.10 or older, you'll need to add the <tt>--safeMode</tt> option.  Note that running it in safe mode will be much slower.
 
 
====BAM data====
 
 
Now let's look how to configure BAM support.  WebApollo has native support for BAM, so no extra processing of the data is required.
 
 
First we'll copy the BAM data into the WebApollo data directory.  We'll put it in the <tt>data/bam</tt> directory.  Keep in mind that this BAM data was randomly generated, so there's really no biological meaning to it.  We only created it to show BAM support.
 
 
$ <span class="enter">mkdir data/bam</span>
 
$ <span class="enter">cp WEB_APOLLO_SAMPLE_DIR/*.bam* data/bam</span>
 
 
Now we need to add the BAM track.
 
 
$ <span class="enter">bin/add-bam-track.pl --bam_url bam/simulated-sorted.bam \
 
    --label simulated_bam --key "simulated BAM"</span>
 
 
You should now have a <tt>simulated BAM</tt> track available.
 
 
====BigWig data====
 
 
WebApollo has native support for BigWig files (.bw), so no extra processing of the data is required.
 
 
Configuring a BigWig track is very similar to configuring a BAM track.  First we'll copy the BigWig data into the WebApollo data directory.  We'll put it in the <tt>data/bigwig</tt> directory.  Keep in mind that this BigWig data was generated as a coverage map derived from the randomly generated BAM data, so like the BAM data there's really no biological meaning to it.  We only created it to show BigWig support.
 
 
$ <span class="enter">mkdir data/bigwig</span>
 
$ <span class="enter">cp WEB_APOLLO_SAMPLE_DIR/*.bw data/bigwig</span>
 
 
Now we need to add the BigWig track.
 
 
$ <span class="enter">bin/add-bw-track.pl --bw_url bigwig/simulated-sorted.coverage.bw \
 
  --label simulated_bw --key "simulated BigWig"</span>
 
 
You should now have a <tt>simulated BigWig</tt> track available.
 
 
===Customizing different annotation types===
 
To change how the different annotation types look in the annotation track, you'll need to update the mapping of the annotation type to the appropriate CSS class.  This data resides in <tt>trackList.json</tt> after running <tt>add-webapollo-plugin.pl</tt>.  You'll need to modify the JSON entry whose label is <tt>Annotations</tt>.  Of particular interest is the <tt>alternateClasses</tt> element.  Let's look at that default element:
 
 
<pre>
 
"alternateClasses": {
 
    "pseudogene" : {
 
      "className" : "light-purple-80pct",
 
      "renderClassName" : "gray-center-30pct"
 
    },
 
    "tRNA" : {
 
      "className" : "brightgreen-80pct",
 
      "renderClassName" : "gray-center-30pct"
 
    },
 
    "snRNA" : {
 
      "className" : "brightgreen-80pct",
 
      "renderClassName" : "gray-center-30pct"
 
    },
 
    "snoRNA" : {
 
      "className" : "brightgreen-80pct",
 
      "renderClassName" : "gray-center-30pct"
 
    },
 
    "ncRNA" : {
 
      "className" : "brightgreen-80pct",
 
      "renderClassName" : "gray-center-30pct"
 
    },
 
    "miRNA" : {
 
      "className" : "brightgreen-80pct",
 
      "renderClassName" : "gray-center-30pct"
 
    },
 
    "rRNA" : {
 
      "className" : "brightgreen-80pct",
 
      "renderClassName" : "gray-center-30pct"
 
    },
 
    "repeat_region" : {
 
      "className" : "magenta-80pct"
 
    },
 
    "transposable_element" : {
 
      "className" : "blue-ibeam",
 
      "renderClassName" : "blue-ibeam-render"
 
    }
 
},
 
</pre>
 
 
For each annotation type, you can override the default class mapping for both <tt>className</tt> and <tt>renderClassName</tt> to use another CSS class.  Check out the [[#Customizing_features|Customizing features]] section for more information on customizing the CSS classes.
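For example, to render tRNA annotations with a custom gold style instead of the default green, you could change that entry as shown below (this assumes a <tt>gold-90pct</tt> class is defined in a stylesheet you've loaded; see the next section for defining and loading custom styles):

```
"tRNA" : {
  "className" : "gold-90pct",
  "renderClassName" : "gray-center-30pct"
}
```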
 
 
===Customizing features===
 
The visual appearance of biological features in WebApollo (and JBrowse) is handled by CSS stylesheets.  Every feature and subfeature is given a default CSS "class" that matches a default CSS style in a CSS stylesheet.  These styles are defined in <tt>TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/track_styles.css</tt> and <tt>TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/plugins/WebApollo/css/webapollo_track_styles.css</tt>.  Additional styles are also defined in these files, and can be used by explicitly specifying them in the --className, --subfeatureClasses, --renderClassname, or --arrowheadClass parameters to flatfile-to-json.pl.  See the example [[#GFF3_with_gene/transcript/exon/CDS/polypeptide_features|above]].
 
 
WebApollo differs from JBrowse in some of its styling, largely in order to help with feature selection, edge-matching, and dragging.  WebApollo by default uses invisible container elements (with style class names like "container-16px") for features that have children, so that the children are fully contained within the parent feature.  This is paired with another styled element that gets rendered ''within'' the feature but underneath the subfeatures, and is specified by the --renderClassname argument to flatfile-to-json.pl.  Exons are also by default treated as special invisible containers, which hold styled elements for UTRs and CDSs.
 
 
It is relatively easy to add other stylesheets that have custom style classes that can be used as parameters to flatfile-to-json.pl.  An example is <tt>TOMCAT_WEBAPPS_DIR/WebApollo/jbrowse/sample_data/custom_track_styles.css</tt> which contains two new styles:
 
 
<pre>
 
.gold-90pct,
 
.plus-gold-90pct,
 
.minus-gold-90pct  {
 
    background-color: gold;
 
    height: 90%;
 
    top: 5%;
 
    border: 1px solid gray;
 
}
 
 
.dimgold-60pct,
 
.plus-dimgold-60pct,
 
.minus-dimgold-60pct  {
 
    background-color: #B39700;
 
    height: 60%;
 
    top: 20%;
 
}
 
</pre>
 
 
In this example, two subfeature styles are defined, and the ''top'' property is being set to (100%-height)/2 to ensure that the subfeatures are centered vertically within their parent feature.  When defining new styles for features, it is important to specify rules that apply to plus-''stylename'' and minus-''stylename'' in addition to ''stylename'', as WebApollo adds the "plus-" or "minus-" prefix to the class of the feature if the feature has a strand orientation.
 
 
You need to tell WebApollo where to find these styles.  This can be done via standard CSS loading, by adding a <tt>&lt;link&gt;</tt> element to the index.html file:
 
<pre>
<link rel="stylesheet" type="text/css" href="sample_data/custom_track_styles.css">
</pre>
 
Or alternatively, to avoid modifying the web application, additional CSS can be specified in the trackList.json file that is created in the data directory during static data generation, by adding a "css" property to the JSON data:
 
 
<pre>
 
  "css" : "sample_data/custom_track_styles.css"
 
</pre>
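If you'd rather script that edit than modify the JSON by hand, here is a minimal sketch (it assumes python3 is available; the <tt>trackList.json</tt> below is a stand-in file created for the demonstration, whereas in practice you would operate on the one in your data directory):

```shell
# Add a "css" property to trackList.json without hand-editing the file.
python3 - <<'EOF'
import json

# Create a minimal stand-in for data/trackList.json (demo only; the real
# file is produced during static data generation).
with open("trackList.json", "w") as fh:
    json.dump({"tracks": []}, fh)

with open("trackList.json") as fh:
    conf = json.load(fh)

conf["css"] = "sample_data/custom_track_styles.css"  # extra stylesheet to load

with open("trackList.json", "w") as fh:
    json.dump(conf, fh, indent=2)
EOF

grep '"css"' trackList.json
```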
 
 
Then these new styles can be used as arguments to flatfile-to-json.pl, for example:
 
 
<pre>
 
bin/flatfile-to-json.pl --gff WEB_APOLLO_SAMPLE_DIR/split_gff/maker.gff
 
--getSubfeatures --type mRNA --trackLabel maker --webApollo
 
--subfeatureClasses '{"CDS":"gold-90pct", "UTR": "dimgold-60pct"}'
 
</pre>
 
 
Depending on how your Tomcat server is set up, you might need to restart the server to pick up all the changes (or at least restart the WebApollo web application).  You'll also need to do this any time you change the configuration files (it's not needed when changing the data files).
 
 
===Bulk loading annotations to the user annotation track===
 
 
====GFF3====
 
You can use the <tt>WEB_APOLLO_DIR/tools/data/add_transcripts_from_gff3_to_annotations.pl</tt> script to bulk load GFF3 files with transcripts to the user annotation track.  Let's say we want to load our <tt>maker.gff</tt> transcripts.
 
 
$ <span class="enter">WEB_APOLLO_DIR/tools/data/add_transcripts_from_gff3_to_annotations.pl \
 
-U localhost:8080/WebApollo -u web_apollo_admin -p web_apollo_admin \
 
-i WEB_APOLLO_SAMPLE_DIR/split_gff/maker.gff</span>
 
 
The default options should handle most GFF3 files that contain genes, transcripts, and exons.
 
 
You can still use this script even if the file you're loading does not contain transcripts and exons.  Let's say we want to load <tt>match</tt> and <tt>match_part</tt> features as transcripts and exons respectively.  We'll use the <tt>blastn.gff</tt> file as an example.
 
 
$ <span class="enter">WEB_APOLLO_DIR/tools/data/add_transcripts_from_gff3_to_annotations.pl \
 
-U localhost:8080/WebApollo -u web_apollo_admin -p web_apollo_admin \
 
-i WEB_APOLLO_SAMPLE_DIR/split_gff/blastn.gff -t match -e match_part</span>
 
 
Look at the script's help (<tt>-h</tt>) for all available options.
 
 
Congratulations, you're done configuring WebApollo!
 
 
==Upgrading existing instances==
 
We suggest creating a new instance to avoid disrupting existing instances and to have a staging site before making the upgrade public.  Since the local storage is file-based, you can just copy the BerkeleyDB databases to another directory and point the new instance at them:
 
 
$ <span class="enter">cp -R WEB_APOLLO_DATA_DIR WEB_APOLLO_DATA_DIR_STAGING</span>
 
 
Create a staging instance in your <tt>TOMCAT_WEBAPPS_DIR</tt>:
 
 
$ <span class="enter">cd TOMCAT_WEBAPPS_DIR<span>
 
$ <span class="enter">mkdir WebApolloStaging</span>
 
 
Unpack the WAR in <tt>WebApolloStaging</tt> and point <tt>&lt;datastore_directory&gt;</tt> in the <tt>TOMCAT_WEBAPPS_DIR/WebApolloStaging/config.xml</tt> file to wherever <tt>WEB_APOLLO_DATA_DIR_STAGING</tt> is.  Afterwards, just set up the configuration as normal.
 
 
To use the existing static data, we can just copy the data symlink (or directory if you chose not to use a symlink):
 
 
$ <span class="enter">cp -R WebApollo/jbrowse/data WebApolloStaging/jbrowse/data
 
 
You can also copy over any custom CSS modifications you may have made to the staging site.
 
 
Once you've had a chance to test out the upgrade and make sure everything's working fine, just delete the old instance (or move it somewhere else for backup purposes) and rename the staging site:
 
 
$ <span class="enter">rm -rf WebApollo</span>
 
$ <span class="enter">mv WebApolloStaging WebApollo</span>
 
 
You might also want to update <tt>&lt;datastore_directory&gt;</tt> back to <tt>WEB_APOLLO_DATA_DIR</tt> and delete <tt>WEB_APOLLO_DATA_DIR_STAGING</tt> so that you can continue to keep the data in the same location.  It's also recommended that you restart Tomcat after this.
 
 
====Upgrading existing JBrowse data stores====
 
You'll need to upgrade the <tt>trackList.json</tt> file in your JBROWSE_DATA_DIR directory.  The WebApollo plugin needs to be reconfigured, so run through the steps in the  [[#Adding_the_WebApollo_plugin|Adding the WebApollo plugin]] section.
 
 
====Upgrading existing annotation data stores====
 
 
=====Transcript type updating=====
 
Releases 2013-09-04 and prior only supported annotating protein coding genes.  WebApollo now supports annotating other feature types.  If you're running WebApollo on annotation data generated from the 2013-09-04 and prior releases, you might want to run a tool that will update all protein coding transcripts from type "sequence:transcript" to "sequence:mRNA".  Although this step is not required (WebApollo has proper backwards support for the generic "sequence:transcript" type), we recommend updating your data.
 
 
Although issues with the update are not expected, it's highly recommended to backup the databases before the update (you can delete them once you've tested the update and made sure that everything's ok).
 
 
$ <span class="enter">cp -R WEB_APOLLO_DATA_DIR WEB_APOLLO_DATA_DIR.bak</span>
 
 
Note that before you run the update, you'll need to stop WebApollo (either by shutting down Tomcat or stopping WebApollo through Tomcat's Application Manager).
 
 
You'll need to run <tt>update_transcripts_to_mrna.sh</tt>, located in WEB_APOLLO_DIR/tools/data.  You'll only need to run this tool when first upgrading your WebApollo version.  You can either choose to run the tool on individual annotation data stores (using the <tt>-i</tt> option) or, more conveniently, run through all the data stores within a parent directory (using the <tt>-d</tt> option).  We'll go ahead with the latter.  You can choose to update either the annotation data store or the history data store (using the <tt>-H</tt> option).  You'll need to tell the tool where you deployed WebApollo (using the <tt>-w</tt> option).
 
 
$ <span class="enter">WEB_APOLLO_DIR/tools/data/update_transcripts_to_mrna.sh -w TOMCAT_WEBAPPS_DIR/WebApollo -d WEB_APOLLO_DATA_DIR</span>
 
$ <span class="enter">WEB_APOLLO_DIR/tools/data/update_transcripts_to_mrna.sh -w TOMCAT_WEBAPPS_DIR/WebApollo -d WEB_APOLLO_DATA_DIR -H</span>
 
 
Restart WebApollo and test out that the update didn't break anything.  Once you're satisfied, you can go ahead and remove the backup we made:
 
 
$ <span class="enter">rm -rf WEB_APOLLO_DATA_DIR.bak</span>
 
 
=====Sequence alterations updating=====
 
We've modified how sequence alterations are indexed compared to releases 2013-09-04 and prior.  If you're running WebApollo on annotation data generated from the 2013-09-04 and prior releases, you'll need to run a tool that will update all your sequence alterations.  You only need to run this tool if you've annotated sequence alterations (e.g., insertion, deletion, substitution).  If you haven't annotated those types, you can skip this step.
 
 
Although issues with the update are not expected, it's highly recommended to backup the databases before the update (you can delete them once you've tested the update and made sure that everything's ok).
 
 
$ <span class="enter">cp -R WEB_APOLLO_DATA_DIR WEB_APOLLO_DATA_DIR.bak</span>
 
 
Note that before you run the update, you'll need to stop WebApollo (either by shutting down Tomcat or stopping WebApollo through Tomcat's Application Manager).
 
 
You'll need to run <tt>update_sequence_alterations.sh</tt>.  You can get the tarball [http://genomearchitect.org/webapollo/releases/patches/2013-11-22/update_sequence_alterations.tgz here].
 
 
Uncompress the tarball:
 
 
$ <span class="enter">tar -xvzf update_sequence_alterations.tgz</span>
 
$ <span class="enter">cd update_sequence_alterations</span>
 
 
You'll only need to run this tool when first upgrading your WebApollo version.  You can either choose to run the tool on individual annotation data stores (using the <tt>-i</tt> option) or, more conveniently, run through all the data stores within a parent directory (using the <tt>-d</tt> option).  We'll go ahead with the latter.  You'll need to tell the tool where you deployed WebApollo (using the <tt>-w</tt> option).
 
 
$ <span class="enter">./update_sequence_alterations.sh -w TOMCAT_WEBAPPS_DIR/WebApollo -d WEB_APOLLO_DATA_DIR</span>
 
 
Restart WebApollo and test out that the update didn't break anything.  Once you're satisfied, you can go ahead and remove the backup we made:
 
 
$ <span class="enter">rm -rf WEB_APOLLO_DATA_DIR.bak</span>
 
 
=Accessing your WebApollo installation=
 
 
Let's test out our installation.  Point your browser to <tt><nowiki>http://SERVER_ADDRESS:8080/WebApollo</nowiki></tt>.
 
 
[[File:web_apollo_login_page_with_credentials_doc.jpg|220px|WebApollo login page|center|border]]
 
 
The user name and password are both <tt>web_apollo_admin</tt> as we configured earlier.  Enter them into the login dialog.
 
 
[[File:web_apollo_select_refseq_doc.jpg|800px|WebApollo reference sequence selection|center|border]]
 
 
We only see one reference sequence to annotate since we're only working with one contig.  Click on <tt>scf1117875582023</tt> under the <tt>Name</tt> column.
 
 
Now have fun annotating!!!
 

Latest revision as of 05:08, 12 November 2014