WebApollo Tutorial 2013
GMOD Summer School 2013
These are the steps I took to install WebApollo on the prebuilt AWS image. Some steps need only be performed once, such as installing the custom valves in Tomcat or installing BLAT, to ensure that the system is ready. These steps will be denoted with an asterisk (*).
Note: These notes follow along with the general WebApollo installation guide (http://www.gmod.org/wiki/WebApollo_Installation). The guide below details the steps I used to set up the GMOD Summer School 2013 AMI.
For more information about using WebApollo, please see the user guide, located here: http://genomearchitect.org/webapollo/docs/webapollo_user_guide.pdf
Contents
Prerequisites
- * Install Tomcat7*
- * Install BLAT/verify location*
- * PostgreSQL is installed and configured*
- * Download WebApollo (and sample data)*
mkdir ~/dataHome/WebApollo2 cd ~/dataHome/WebApollo2 wget https://apollo-web.googlecode.com/files/WebApollo-2013-05-16.tgz wget http://genomearchitect.org/webapollo/data/pyu_data.tgz tar -xvzf WebApollo-2013-05-16.tgz tar -xvzf pyu_data.tgz
Database Setup
- Create PostgreSQL user with proper permissions for WebApollo to use
$ sudo su postgres $ createuser -P web_apollo_users_admin Enter password for new role: Enter it again: Shall the new role be a superuser? (y/n) n Shall the new role be allowed to create databases? (y/n) y Shall the new role be allowed to create more new roles? (y/n) n
- Create a database for managing the WebApollo users
$ createdb -U web_apollo_users_admin web_apollo_users
- Populate the database with the WebApollo User schema
cd /data/dataHome/WebApollo2/WebApollo-2013-05-16/tools/user psql -U web_apollo_users_admin web_apollo_users < \ user_database_postgresql.sql
- Create a user with access to WebApollo
./add_user.pl -D web_apollo_users -U web_apollo_users_admin -P \ web_apollo_users_admin -u web_apollo_admin -p web_apollo_admin
- The database is ready and has an initial user, which we will use as an administrator.
- We now need to add in the sequences and grant permissions to the user we just created.
cd /data/dataHome/WebApollo2/WebApollo-2013-05-16/tools/user mkdir /data/dataHome/WebApollo2/pyu_data/scratch/ ./extract_seqids_from_fasta.pl -p Annotations- \ -i /data/dataHome/WebApollo2/pyu_data/scf1117875582023.fa \ -o /data/dataHome/WebApollo2/pyu_data/scratch/scratch/seqids.txt
- Add those ids to the user database.
cd /data/dataHome/WebApollo2/WebApollo-2013-05-16/tools/user ./add_tracks.pl -D web_apollo_users -U web_apollo_users_admin \ -P web_apollo_users_admin -t /data/dataHome/WebApollo2/pyu_data/scratch/seqids.txt
- Now that we have created the user and have loaded the annotation track ids, we’ll need to give the user
- permissions to access the sequence. We’ll have the all permissions (read, write, publish, user manager).
- We’ll use the set_track_permissions.pl script in the same directory. We’ll need to provide the script a list
- of genomic sequence ids, like in the previous step.
cd /data/dataHome/WebApollo2/WebApollo-2013-05-16/tools/user ./set_track_permissions.pl -D web_apollo_users –U \ web_apollo_users_admin -P web_apollo_users_admin -u \ web_apollo_admin -t \ /data/dataHome/WebApollo/pyu_data/scratch/seqids.txt –a
Update Tomcat for WebApollo
- * Install the WebApollo custom valves in the Tomcat7 lib directory*
sudo cp \ ~/dataHome/WebApollo2/WebApollo-2013-05-16/tomcat/custom-valves.jar \ /usr/share/tomcat7/lib/
- * Update the Tomcat7 server.xml to include the WebApollo error reporting VALVE*
- Add errorReportValveClass="org.bbop.apollo.web.ErrorReportValve" as an attribute to the existing <Host> element in: /var/lib/tomcat7/conf/server.xml
- The final file should look like the example below:
$ less /var/lib/tomcat7/conf/server.xml <Host name="localhost" appBase="webapps" unpackWARs="true" autoDeploy="true" errorReportValveClass="org.bbop.apollo.web.ErrorReportValve"> </Host>
- At this point the server should be ready for WebApollo.
Deploy WebApollo
- Create the location for the new instance
sudo mkdir /var/lib/tomcat7/webapps/WebApollo2 cd WebApollo2
- Deploy servlet
sudo jar –xvf ~/dataHome/WebApollo2/WebApollo-2013-05-16/war/WebApollo.war
- Change ownership of the base WebApollo directory
sudo chown tomcat7:tomcat7 /var/lib/tomcat7/webapps/WebApollo2
- Tomcat will create a tmp directory to hold exported GFF3, and needs permission to create and modify the contents of that directory.
- Once the servlet is deployed we need to configure JBrowse and load the data.
- Once JBrowse and BLAT are set up and configured, we will use this information to finish configuring WebApollo.
Data Processing and JBrowse setup
- Create ancillary directories for the new instance
cd /data/dataHome/WebApollo2 mkdir Pyu Pyu/Blat Pyu/Blat/tmp Pyu/Annotations Pyu/data
- Change ownership to user tomcat7
sudo chown -R tomcat7:tomcat7 /data/data/Home/WebApollo2/Pyu/Annotations sudo chown -R tomcat7:tomcat7 /data/data/Home/WebApollo2/Pyu/Blat/tmp
- Set up the JBrowse instance
cd /var/lib/tomcat7/webapps/WebApollo2/jbrowse
- Create a symbolic link to the large disk where the browser data will be stored
ln -s /data/dataHome/WebApollo2/Pyu/data/ .
- To make loading easier, make another link to the original data directory
ln -s /data/dataHome/WebApollo2/Pyu_data .
- WebApollo uses two special tracks, one to display annotations, and one for sequence alterations. We need to copy those from the original downloaded location to the data directory
cp /data/dataHome/WebApollo2/WebApollo-2013-05-16/json/* ./data/
- Process the data in preparation for loading
cd /var/lib/tomcat7/webapps/WebApollo2/jbrowse/pyu_data mkdir scratch scratch/split_gff
- This script splits the Maker output GFF3 file into separate files based on the source
~/dataHome/WebApollo2/WebApollo-2013-05-16/tools/data/split_gff_by_source.pl -i scf1117875582023.gff \ -d scratch/split_gff
- Set up the BLAT database
cd /data/data/Home/WebApollo2/Pyu/Blat ln -s ../../pyu_data/scf1117875582023.fa faToTwoBit scf1117875582023.fa Pyu.2bit
Data Loading
- Load the reference sequence(s) and evidence tracks.
- Note: This example uses the results from Maker and simulated BAM and BigWig files.
- Depending on the data you use in your project, these commands will be different.
- Make the scripts in bin/ executable
cd /var/lib/tomcat7/webapps/WebApollo2/jbrowse chmod +x bin/*
- Load the reference sequence(s)
bin/prepare-refseqs.pl --fasta pyu_data/scf1117875582023.fa
- Load the Maker track
bin/flatfile-to-json.pl --gff pyu_data/scratch/split_gff/maker.gff \ --arrowheadClass trellis-arrowhead --getSubfeatures \ --subfeatureClasses '{"wholeCDS": null, "CDS":"brightgreen-80pct", \ "UTR": "darkgreen-60pct", "exon":"container-100pct"}' \ --cssClass container-16px --type mRNA --trackLabel maker \ --webApollo --renderClassName gray-center-20pct
- This is a bash script to automate loading the non-Maker tracks. You can either paste it into the command prompt or use it to build a shell script.
for i in $(ls pyu_data/scratch/split_gff/*.gff | grep -v maker); do j=$(basename $i) j=${j/.gff/} echo "Processing $j" bin/flatfile-to-json.pl --gff $i --arrowheadClass webapollo-arrowhead \ --getSubfeatures --subfeatureClasses "{\"match_part\": \"darkblue-80pct\"}" \ --cssClass container-10px --trackLabel $j \ --webApollo --renderClassName gray-center-20pct done
### Output #Processing blastn #Processing blastx #Processing est2genome #Processing protein2genome #Processing repeatmasker #Processing repeatrunner #Processing snap_masked
- populate the name hash to generate a searchable list of names
bin/generate-names.pl
- Prepare the BAM data for loading
mkdir data/bam cp pyu_data/*.bam* data/bam
- Load the BAM track
bin/add_bam_track.pl --bam_url bam/simulated-sorted.bam --label \ simulated_bam --key "simulated BAM"
- Prepare the BigWig data for loading
mkdir data/bigwig cp pyu_data/*.bw data/bigwig/
- Load the BigWig track
bin/add_bw_track.pl --bw_url bigwig/simulated-sorted.coverage.bw \ --label simulated_bw --key "simulated BigWig"
- The data for this instance should now be all loaded.
Final WebApollo configuration.
- Edit the config.xml file for this instance
- Proper configuration requires us to set several sections of this file to work with our project.
<datastore_directory>$WEB_APOLLO_DATA_DIR</datastore_directory>
- For this instance, we need to set it to:
<datastore_directory/data/dataHome/WebApollo2/Pyu/Annotations</datastore_directory>
- Next, edit the user section to connect this instance to the database back end. The database will handle
- user accounts, including login authentication and account permissions.
<url>ENTER_USER_DATABASE_JDBC_URL</url>
- For this instance, we need to set it to:
<url>jdbc:postgresql://localhost/web_apollo_users</url>
<username>ENTER_USER_DATABASE_USERNAME</username>
- For this instance, we need to set it to:
<username>web_apollo_users_admin</username>
<password>ENTER_USER_DATABASE_PASSWORD</password>
- For this instance, we need to set it to:
<password>web_apollo_users_admin</password>
<refseqs>ENTER_PATH_TO_REFSEQS_JSON_FILE</refseqs>
- For this instance, we need to set it to:
<refseqs>/var/lib/tomcat7/webapps/WebApollo2/jbrowse/data/refSeqs.json</refseqs>
<organism>ENTER_ORGANISM</organism>
- For this instance, we need to set it to:
<organism>Pythium ultimum</organism>
<sequence_type>ENTER_CVTERM_FOR_SEQUENCE</sequence_type>
- This is formatted “CV:term”, and may change depending on your reference assembly.
<sequence_type>sequence:contig</sequence_type>
- Edit the blat.xml file for this instance
<blat_bin>ENTER_PATH_TO_BLAT_BINARY</blat_bin>
- For this instance, we need to set it to:
<blat_bin>/opt/bin/blat</blat_bin>
<tmp_dir>ENTER_PATH_FOR_TEMPORARY_DATA</tmp_dir>
- For this instance, we need to set it to:
<tmp_dir>/data/dataHome/WebApollo2/Pyu/Blat/tmp</tmp_dir>
<database>ENTER_PATH_TO_BLAT_DATABASE</database>
- For this instance, we need to set it to:
<database>/data/dataHome/WebApollo2/Pyu/Blat/Pyu.2bit</database>
<blat_options>ENTER_ANY_BLAT_OPTIONS</blat_options>
- For this instance, we need to set it to:
<blat_options>-minScore=100 -minIdentity=60</blat_options>
Restart Tomcat
- The WebApollo instance should be set up now.
- Tomcat will need to be restarted for WebApollo to display properly.
sudo restart tomcat7
View your WebApollo instance
- You may view your instance by pointing your browser to
http://YOUR _HOST:8080/WebApollo2
- To view the site in JBrowse mode use this link format:
http://YOUR _HOST:8080/WebApollo2/jbrowse
Note on track styling
There are several pre-defined track styles included with the WebApollo release. If you'd like to use an included style, replace the subfeature section with the style name:
whatever-80pct green-80pct blue-80pct purple-80pct springgreen-80pct blueviolet-80pct mediumpurple-80pct orange-80pct darkorange-60pct
For example, to style a match type track with light green, adjust the load command like this:
--subfeatureClasses "{\"match_part\": \"darkblue-80pct\"}"
Becomes:
--subfeatureClasses "{\"match_part\": \"springgreen-80pct\"}"
These styles are located in the custom_track_styles.css file:
jbrowse/plugins/WebApollo2/css/custom_track_styles.css
You can also edit the styles after loading. The information on track styles are located in
jbrowse/data/trackList.json