WebApollo Installation 1.x
Introduction
This guide will walk you through the server side installation for Web Apollo. Web Apollo is a web-based application, so the only client side requirement is a web browser. Note that Web Apollo has only been tested on Chrome, Firefox, and Safari. It has not been tested with Internet Explorer.
Quick start guide
Web Apollo has a number of prerequisites, and this quick start guide automates some of the setup steps. It can be used to set up a WebApollo instance from scratch on a "blank" machine, and assumes the following:
- Use of a command line
- Ability to install some prerequisites to the system with a package manager
- Ability to preconfigure a postgres user
- Use of cpan or cpanm for perl package management
More discussion of the particular configurations can be seen in the full installation guide.
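Before running the commands, a small preflight check can report which tools are still missing. This is only a sketch: `missing_tools` is a hypothetical helper, and the command names passed to it (`mvn`, `psql`, `node`, `cpanm`) are assumptions about what the packages install on your platform.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The tool names below are assumptions; adjust them for your platform
# (for example, some systems install NodeJS as "nodejs" rather than "node").
missing=$(missing_tools git mvn psql perl cpanm node)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing
else
    echo "All listed prerequisites were found."
fi
```

Anything it reports can usually be installed with the package manager commands in the steps that follow.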
# set some environment variables
export PGUSER=`whoami`
export PGPASSWORD=password
export WEBAPOLLO_USER=web_apollo_admin
export WEBAPOLLO_PASSWORD=web_apollo_admin
export WEBAPOLLO_DATABASE=web_apollo_users
export ORGANISM="Pythium ultimum"
# install system prerequisites (debian/ubuntu)
sudo apt-get install tomcat7 openjdk-7-jdk libexpat1-dev cpanminus postgresql-9.3 postgresql-server-dev-9.3 nodejs-legacy git maven
# install system prerequisites (centOS/redhat)
sudo yum install epel-release
sudo yum install tomcat cpanminus zlib-devel libpng-devel gcc postgresql postgresql-devel git maven npm
# install system prerequisites (macOSX/homebrew)
brew install git maven tomcat node cpanminus --no-tcl postgresql
# on centOS/redhat, manually init and start postgres
sudo su -c "service postgresql initdb && service postgresql start"
# setup cpanm and install jbrowse and webapollo perl prerequisites
cpanm --local-lib=~/perl5 local::lib && eval $(perl -I ~/perl5/lib/perl5/ -Mlocal::lib)
cpanm DateTime Text::Markdown Crypt::PBKDF2 DBI DBD::Pg
# init postgres by logging into postgres user and creating the webapollo user
sudo su postgres -c "createuser -RDIElPS $PGUSER"
sudo su postgres -c "createdb -E UTF-8 -O $PGUSER $WEBAPOLLO_DATABASE"
# clone Apollo repository and download sample data to WEB_APOLLO_ROOT/pyu_data
git clone --depth 1 https://github.com/gmod/Apollo.git
cd Apollo
wget http://icebox.lbl.gov/webapollo/data/pyu_data.tgz
tar xvzf pyu_data.tgz
# initialize PostgreSQL database for sample data. Enter the password web_apollo_users_admin for the first step
psql -U $PGUSER $WEBAPOLLO_DATABASE < tools/user/user_database_postgresql.sql
tools/user/add_user.pl -D $WEBAPOLLO_DATABASE -U $PGUSER -P $PGPASSWORD -u $WEBAPOLLO_USER -p $WEBAPOLLO_PASSWORD
# add the chromosome names to the webapollo database
tools/user/extract_seqids_from_fasta.pl -p Annotations- -i pyu_data/scf1117875582023.fa -o seqids.txt
tools/user/add_tracks.pl -D $WEBAPOLLO_DATABASE -U $PGUSER -P $PGPASSWORD -t seqids.txt
tools/user/set_track_permissions.pl -D $WEBAPOLLO_DATABASE -U $PGUSER -P $PGPASSWORD -u $WEBAPOLLO_USER -t seqids.txt -a
# build a compressed release package and install jbrowse binaries (also installs many perl prerequisites using cpanm)
./build.sh release
./install_jbrowse_bin.sh cpanm
# setup jbrowse data directory in WEB_APOLLO_ROOT/data
mkdir split_gff
tools/data/split_gff_by_source.pl -i pyu_data/scf1117875582023.gff -d split_gff
prepare-refseqs.pl --fasta pyu_data/scf1117875582023.fa --out data
flatfile-to-json.pl --gff split_gff/maker.gff --arrowheadClass trellis-arrowhead \
    --subfeatureClasses '{"wholeCDS": null, "CDS":"brightgreen-80pct", "UTR": "darkgreen-60pct", "exon":"container-100pct"}' \
    --className container-16px --type mRNA --trackLabel maker --out data
# add the webapollo plugin to the jbrowse config
client/apollo/bin/add-webapollo-plugin.pl -i data/trackList.json
# configure data directories using config.properties
mkdir annotations
echo jbrowse.data=`pwd`/data > config.properties
echo datastore.directory=`pwd`/annotations >> config.properties
echo database.url=jdbc:postgresql:$WEBAPOLLO_DATABASE >> config.properties
echo database.username=$PGUSER >> config.properties
echo database.password=$PGPASSWORD >> config.properties
echo organism=$ORGANISM >> config.properties
# launch instance for testing, login to http://localhost:8080/apollo as web_apollo_admin:web_apollo_admin
./run.sh
Note: you may have to shut down any running instances of Tomcat before using run.sh for testing. Alternatively, continue to the Deploying the servlet section for instructions on deploying to production.
Installation
You can download the latest Web Apollo release as a tarball or from [genomearchitect.org] (not available for 1.x release branch yet). All installation steps will be done through a shell. We'll be using Tomcat 7 as our servlet container and PostgreSQL as our relational database management system. We'll use sample data from the Pythium ultimum genome, provided as a separate download.
Server operating system
Any Unix-like system (e.g., Unix, Linux, Mac OS X)
Prerequisites
Note: see the Quick-start guide for the quickest way to take care of pre-requisites.
- System prerequisites
  - Servlet container (must support servlet spec 3.0+) [officially supported: Tomcat 7]
  - Java 7+
  - Maven 3+ (most package managers will have this)
  - Relational Database Management System [officially supported: PostgreSQL]
  - Git
  - NodeJS
- Perl prerequisites that need manual installation
  - DateTime
  - Text::Markdown
  - Crypt::PBKDF2
  - DBI
  - DBD::Pg
- Data generation pipeline prerequisites (see JBrowse prerequisites for more information on its prerequisites)
  - System packages
    - libpng12-0
    - libpng12-dev
    - zlib1g (Debian/Ubuntu)
    - zlib1g-dev (Debian/Ubuntu)
    - zlib (RedHat/CentOS)
    - zlib-devel (RedHat/CentOS)
    - libexpat1-dev (Debian/Ubuntu)
- Sequence search (optional)
Tomcat memory
The memory requirements will depend on the size of your genome and how many instances of Web Apollo you host in the same Tomcat instance. We recommend at least 1g for the heap size and 256m for the permgen size as a starting point. Suggested settings are:
-Xms512m -Xmx1g -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled -XX:+UseConcMarkSweepGC -XX:MaxPermSize=256m
The location of your Tomcat environment configuration depends on how you installed it (manually vs. using a package manager). We recommend adding this configuration to $TOMCAT_BIN_DIR/setenv.sh, where $TOMCAT_BIN_DIR is the directory where the Tomcat binaries reside.
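As a minimal sketch, a setenv.sh carrying the suggested settings might look like the following. It assumes a Java 7 runtime (the PermGen options have no effect on Java 8 and later) and uses CATALINA_OPTS, the standard Tomcat variable for JVM options that apply when the server is started. Tune the values to your own installation.

```shell
#!/bin/sh
# Sketch of $TOMCAT_BIN_DIR/setenv.sh; Tomcat's catalina.sh sources this
# file at startup if it exists. Existing CATALINA_OPTS values are kept.
CATALINA_OPTS="${CATALINA_OPTS:-} -Xms512m -Xmx1g"
CATALINA_OPTS="$CATALINA_OPTS -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled"
CATALINA_OPTS="$CATALINA_OPTS -XX:+UseConcMarkSweepGC -XX:MaxPermSize=256m"
export CATALINA_OPTS
```

Restart Tomcat after creating or changing this file so that the new options are picked up.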
Conventions
This guide will use the following conventions to make it more concise (you might want to keep these convention definitions handy so that you can easily reference them as you go through this guide):
- WEB_APOLLO_DIR: Location where the tarball was uncompressed and will include WebApollo-RELEASE_DATE (e.g., ~/webapollo/WebApollo-2012-10-08)
- WEB_APOLLO_SAMPLE_DIR: Location where the sample tarball was uncompressed (e.g., ~/webapollo/webapollo_sample)
- WEB_APOLLO_DATA_DIR: Location for WebApollo annotations (e.g., /data/webapollo/annotations)
- JBROWSE_DATA_DIR: Location for JBrowse data (e.g., /data/webapollo/jbrowse/data)
- TOMCAT_WEBAPPS_DIR: Location where deployed servlets for Tomcat go (e.g., /var/lib/tomcat7/webapps)
- BLAT_DIR: Location where the Blat binaries are installed (e.g., /usr/local/bin)
- BLAT_TMP_DIR: Location for temporary Blat files (e.g., /data/webapollo/blat/tmp)
- BLAT_DATABASE: Location for the Blat database (e.g., /data/webapollo/blat/db/pyu.2bit)
The Tomcat related paths are the ones used by default in Ubuntu 12.04 and Ubuntu's provided Tomcat7 package. Paths will likely be different in your system depending on how Tomcat was installed.
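If you prefer to work with shell variables instead of substituting the placeholders by hand, the conventions can be captured once in your shell session. The values below are simply the example paths from the list above; treat every one of them as an assumption to replace with the paths on your own system.

```shell
#!/bin/sh
# Example values copied from the conventions list; all paths are site-specific.
WEB_APOLLO_DIR="$HOME/webapollo/WebApollo-2012-10-08"
WEB_APOLLO_SAMPLE_DIR="$HOME/webapollo/webapollo_sample"
WEB_APOLLO_DATA_DIR=/data/webapollo/annotations
JBROWSE_DATA_DIR=/data/webapollo/jbrowse/data
TOMCAT_WEBAPPS_DIR=/var/lib/tomcat7/webapps
BLAT_DIR=/usr/local/bin
BLAT_TMP_DIR=/data/webapollo/blat/tmp
BLAT_DATABASE=/data/webapollo/blat/db/pyu.2bit
export WEB_APOLLO_DIR WEB_APOLLO_SAMPLE_DIR WEB_APOLLO_DATA_DIR JBROWSE_DATA_DIR
export TOMCAT_WEBAPPS_DIR BLAT_DIR BLAT_TMP_DIR BLAT_DATABASE
```

With these set, a command written in this guide as `cd WEB_APOLLO_DIR` can be typed as `cd "$WEB_APOLLO_DIR"`.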
Authentication
Postgres can use Ident and password authentication. Because it is set up to use Ident by default, you might have to add a line to pg_hba.conf specifying that the user will connect via password authentication.
Also see How do I find the path to pg_hba.conf from the shell?
Then add the following line, above the others that refer to all users:
local all web_apollo_users_admin md5
Restart the postgres server for changes to take effect.
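The edit can also be scripted. The sketch below uses a hypothetical helper, `add_hba_rule`, which prints a copy of a pg_hba.conf-style file with the new rule inserted above the first existing `local` rule, which places it above any `local` rule for all users. It does not modify the file itself, so you can review the result before installing it.

```shell
#!/bin/sh
# Hypothetical helper: print the contents of the file named by $1 with the
# rule in $2 inserted above the first line that starts with "local"
# (or appended at the end if no such line exists).
add_hba_rule() {
    awk -v rule="$2" '
        !inserted && /^local[ \t]/ { print rule; inserted = 1 }
        { print }
        END { if (!inserted) print rule }
    ' "$1"
}

# Typical use (paths and privileges depend on your system):
#   hba=$(sudo -u postgres psql -t -A -c 'SHOW hba_file;')
#   add_hba_rule "$hba" 'local all web_apollo_users_admin md5' > /tmp/pg_hba.conf.new
#   # review /tmp/pg_hba.conf.new, install it over "$hba", then restart PostgreSQL
```

The `SHOW hba_file;` query in the usage comment asks the running server for the path to its pg_hba.conf.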
User database
Web Apollo uses a database to determine who can access and edit annotations for a given sequence.
First we’ll need to create a database. You can call it whatever you want (remember the name, as you’ll need to point the configuration to it). For the purposes of this guide, we’ll call it web_apollo_users. You might want to create a separate account to manage the database. We’ll use the user web_apollo_users_admin with password web_apollo_users_admin, who has database creation privilege. Depending on how your database server is set up, you might not need to set a password for the user. See the PostgreSQL documentation for more information. We'll assume that the database is on the same server where Web Apollo is being installed ("localhost"). These commands will be run as the postgres user.
$ sudo su postgres
$ createuser -P web_apollo_users_admin
Enter password for new role:
Enter it again:
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) y
Shall the new role be allowed to create more new roles? (y/n) n
Next we'll create the user database.
$ createdb -U web_apollo_users_admin web_apollo_users
If you get an authentication error, use the -W flag to get a password prompt.
$ createdb -U web_apollo_users_admin -W web_apollo_users
Now that the database is created, we need to load the schema into it.
$ cd WEB_APOLLO_DIR/tools/user
$ psql -U web_apollo_users_admin web_apollo_users < user_database_postgresql.sql
Now the user database has been set up.
Let's populate the database.
First we’ll create a user with access to Web Apollo. We’ll use the add_user.pl script in WEB_APOLLO_DIR/tools/user. Let’s create a user named web_apollo_admin with the password web_apollo_admin.
$ ./add_user.pl -D web_apollo_users -U web_apollo_users_admin -P web_apollo_users_admin \
-u web_apollo_admin -p web_apollo_admin
Next we’ll add the annotation track ids for the genomic sequences of our organism. We’ll use the add_tracks.pl script in the same directory. The script needs a file of genomic sequence ids. For convenience, there’s a script called extract_seqids_from_fasta.pl in the same directory which goes through a FASTA file and extracts all the ids from the deflines. Let’s first create the list of genomic sequence ids. We'll store it in ~/scratch/seqids.txt. We’ll want to add the prefix “Annotations-” to each identifier.
$ mkdir ~/scratch
$ ./extract_seqids_from_fasta.pl -p Annotations- -i WEB_APOLLO_SAMPLE_DIR/scf1117875582023.fa \
-o ~/scratch/seqids.txt
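To sanity-check the generated file, it can help to know roughly what to expect in it. The sketch below is an illustration, not the project's script: `list_seqids` is a hypothetical helper that prints one prefixed id per defline, assuming the id is the text between `>` and the first whitespace character. The real script's output format may differ in detail.

```shell
#!/bin/sh
# Illustration only: print "<prefix><id>" for every FASTA defline in the
# file named by $2, where the id is taken to be the first
# whitespace-delimited token after ">" (an assumption).
list_seqids() {
    awk -v prefix="$1" '/^>/ { print prefix substr($1, 2) }' "$2"
}

# Example, with the guide's placeholder for the sample directory:
#   list_seqids Annotations- WEB_APOLLO_SAMPLE_DIR/scf1117875582023.fa
```

For the single-contig sample data, the list should contain one entry for scf1117875582023.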
Now we’ll add those ids to the user database.
$ ./add_tracks.pl -D web_apollo_users -U web_apollo_users_admin -P web_apollo_users_admin \
-t ~/scratch/seqids.txt
Now that we have a user created and the annotation track ids loaded, we’ll need to give the user permission to access the sequence. We’ll grant all permissions (read, write, publish, user manager). We’ll use the set_track_permissions.pl script in the same directory. We’ll need to provide the script with a list of genomic sequence ids, as in the previous step.
$ ./set_track_permissions.pl -D web_apollo_users -U web_apollo_users_admin \
-P web_apollo_users_admin -u web_apollo_admin -t ~/scratch/seqids.txt -a
We’re all done setting up the user database.
Note that we’re only using a subset of the options for all the scripts mentioned above. You can get more detailed information on any given script (and other available options) using the “-h” or “--help” flag when running the script.
Installing WebApollo
From the top level of the downloaded release, you need to run Maven to build a war file. This is then placed in Tomcat's webapps directory. Tomcat is responsible for extracting the file.
IMPORTANT: the JBrowse data directories should no longer be placed anywhere inside the Tomcat webapps folder, not even when using symlinks!! The data directory should be created outside of the webapps folder to avoid data loss when doing Undeploy operations!!
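A quick way to guard against this mistake is to compare the two paths before deploying. The sketch below uses a hypothetical helper, `is_inside`, and the example paths from the conventions section. It compares paths as plain strings and does not resolve symlinks, so it only catches the direct case.

```shell
#!/bin/sh
# Hypothetical helper: succeed if path $1 is the same as, or lies inside,
# directory $2. Pass absolute paths without a trailing slash; symlinks
# are not resolved.
is_inside() {
    case "$1/" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example paths taken from the conventions section (assumptions for your system).
JBROWSE_DATA_DIR=/data/webapollo/jbrowse/data
TOMCAT_WEBAPPS_DIR=/var/lib/tomcat7/webapps
if is_inside "$JBROWSE_DATA_DIR" "$TOMCAT_WEBAPPS_DIR"; then
    echo "WARNING: $JBROWSE_DATA_DIR is inside $TOMCAT_WEBAPPS_DIR; move it before deploying."
else
    echo "OK: $JBROWSE_DATA_DIR is outside $TOMCAT_WEBAPPS_DIR."
fi
```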
Before you build
You need to configure your instance using a config.properties and a config.xml file, which are copied into the war file.
- Copy the sample config / logging files to the right location.
$ cd WEB_APOLLO_DIR
$ cp sample_config.properties config.properties
$ cp sample_config.xml config.xml
$ cp sample_log4j2.json log4j2.json
$ cp sample_log4j2-test.json log4j2-test.json
- Edit the config.properties file and config.xml to point to the appropriate directories.
- Note: You must edit the config.properties file to point to your JBrowse data directory, e.g. jbrowse.data=/opt/apollo/jbrowse/data. The other parameters are optional and can still be configured in your config.xml file (to comment out a parameter, prepend the line with a #).
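Before building, you may want to confirm that jbrowse.data is actually set. The sketch below uses a hypothetical helper, `get_prop`, that reads a value from a simple key=value properties file and ignores lines commented out with `#`. It does not handle the full Java properties syntax (escapes, continuation lines, `:` separators).

```shell
#!/bin/sh
# Hypothetical helper: print the value of key $2 from the properties file $1.
# Commented-out lines are skipped; if the key appears more than once, the
# last value wins. Exits with a non-zero status if the key is not set.
get_prop() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }
        $1 == key { value = substr($0, length($1) + 2); found = 1 }
        END { if (!found) exit 1; print value }
    ' "$1"
}

# Example:
#   data_dir=$(get_prop config.properties jbrowse.data) \
#       || echo "jbrowse.data is not set in config.properties"
```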
Building the servlet
$ cd WEB_APOLLO_DIR
There are a variety of targets available to build the war. For the debug and release targets, make sure you have the prerequisites for building, including NodeJS, DateTime, and Text::Markdown.
To generate a build, run deploy.sh with one of the optional parameters:
$ ./deploy.sh [release|debug|github|help]
This is used to generate a WAR file that is ready for deployment. The parameters are for generating compiled javascript (release), unoptimized javascript (debug) or unmodified straight-from-github deployments (github).
The script creates a war file at WEB_APOLLO_DIR/target/apollo-1.x.war.
Install JBrowse binaries for WebApollo
For WebApollo, it is best to install the JBrowse binaries using the following script:
$ ./install_jbrowse_bin.sh [cpanm]
This will install the binaries to the system via cpan or cpanm. If you are using cpanm, you can use environment variables to set specific install directories, e.g.:
$ export PERL_CPANM_OPT="--local-lib=~/perl5"
Configuration
Aside from the config.properties file, which is all that is essential to get the database up and running, several other configuration files reside in WEB_APOLLO_DIR/config/. They can be used to tailor parameters for your installation and to set up BLAT, Chado, etc. You can also add evidence tracks and adjust the JBrowse configuration. Please see WebApollo_Configuration_Guide_1.x for details.
Deploying the servlet
After the war file is generated by the ./deploy.sh script in the WEB_APOLLO_DIR/target directory (e.g. target/apollo-1.0.war), it can be copied into the tomcat7 webapps directory:
$ cp WEB_APOLLO_DIR/target/apollo-1.x.war TOMCAT_WEBAPPS_DIR/WebApollo.war
We recommend that you NEVER expand the war file manually or touch its contents after deployment. The configuration steps should be done in your checkout directory; the ./deploy.sh script packages the configuration into the war file.
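The copy step can be wrapped in a small check so that a missing build or a wrong webapps path is caught early. This is a sketch with a hypothetical helper, `deploy_war`. Note that Tomcat derives the application's URL path from the war file name, so copying to WebApollo.war is what makes the application available under /WebApollo.

```shell
#!/bin/sh
# Hypothetical helper: copy the war file $1 into the Tomcat webapps
# directory $2 as $3.war, after checking that both paths exist.
deploy_war() {
    war=$1
    webapps_dir=$2
    app_name=$3
    if [ ! -f "$war" ]; then
        echo "war file not found: $war" >&2
        return 1
    fi
    if [ ! -d "$webapps_dir" ]; then
        echo "webapps directory not found: $webapps_dir" >&2
        return 1
    fi
    cp "$war" "$webapps_dir/$app_name.war"
}

# Example, using the guide's placeholders (replace them with real paths):
#   deploy_war WEB_APOLLO_DIR/target/apollo-1.x.war TOMCAT_WEBAPPS_DIR WebApollo
```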
Accessing your WebApollo installation
After copying your WAR file to the tomcat webapps directory, the app will be automatically started. Let's test out our installation. Point your browser to http://SERVER_ADDRESS:8080/WebApollo .
The user name and password are both web_apollo_admin as we configured earlier. Enter them into the login dialog.
We only see one reference sequence to annotate since we're only working with one contig. Click on scf1117875582023 under the Name column.
Now have fun annotating!!!