Difference between revisions of "Integrating CMAE"

Latest revision as of 18:53, 23 January 2008

Note: This document was generated from a POD-formatted document checked in at SourceForge (http://gmod.cvs.sourceforge.net/gmod/cmap/editor/Integrating_CMAE.pod?view=markup). Edits made to this page will not be kept in the long term.

Integrating the CMap Assembly Editor (CMAE) with In-House Systems

VERSION

$Revision: 1.1 $

This document is intended to give a clear idea of what it will take to integrate the CMap Assembly Editor (CMAE) into an organization's in-house data system.

Overview

CMAE is built upon the CMap code base. The machines that run the program will need to have the CMap Perl modules installed (at a minimum).

CMAE uses CMap configuration files and reads data from the CMap database. These can be on the local machine or on a web server that the program can access.

For installation instructions, see the "Installing CMAE" section of this document.

Since CMAE uses the CMap database, the data must first be loaded into it. The "Importing Data" section discusses what kind of data is needed and how to use the CMap API to do data imports.

CMAE can make modifications to the data. In order for these to become permanent outside of the CMAE environment, plug-ins need to be written to launch external scripts to change the in-house data. For information on this topic, see the "Modifying Data" section.

Installing CMAE

Since CMAE uses much of the CMap code base, the CMap modules need to be installed on each machine using it (even if the data and config files are being served from another machine).

Download CMap

Download CMap from the SourceForge CVS repository:


 $ cvs -d:pserver:anonymous@gmod.cvs.sourceforge.net:/cvsroot/gmod login
 $ cvs -z3 -d:pserver:anonymous@gmod.cvs.sourceforge.net:/cvsroot/gmod co -P cmap

Install Pre-Requisites

CMAE (and CMap) require a number of Perl modules to be in place before installation.

CMap Pre-Requisites

Running "perl Build.PL" will provide a list of missing modules, which can be downloaded from CPAN.

A bundle can be used to install most of these at once. To use this bundle, run:


 $ sudo perl -MCPAN -e "install Bundle::CMap"

The GD module requires the libgd library, which can be found at http://www.libgd.org/ .

CMAE Pre-Requisites

In addition to the CMap requirements, CMAE requires:

- Perl/Tk (http://www.perltk.org/)
Perl/Tk can be downloaded from CPAN, http://search.cpan.org/~ni-s/Tk-804.027/ .
- Tkzinc (http://www.tkzinc.org/)
Zinc can render images using OpenGL. It can be downloaded from http://www.tkzinc.org/tkzinc/pmwiki.php?n=Main.Download .
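
Before building, it can help to confirm that the prerequisite modules are actually loadable by the perl on your PATH. The following is a minimal sketch, not part of the CMap distribution; the function names are illustrative, and the module names in the example call (GD, Tk, Tk::Zinc) are assumptions about how the packages above are exposed to Perl on your system.

```shell
#!/bin/sh
# Return 0 if the named Perl module can be loaded by the perl on PATH,
# non-zero otherwise (including when perl itself is not installed).
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Print a one-line status for each module name given as an argument.
report_perl_modules() {
    for module in "$@"; do
        if have_perl_module "$module"; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
        fi
    done
}

# Example call (module names are assumptions; adjust to your install):
#   report_perl_modules GD Tk Tk::Zinc
```

Any module reported as MISSING can then be installed from CPAN or from the download pages listed above.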


Install CMap

The install process will install CMap on the machine as well as CMAE. It will ask you about the location of various web-related directories. On a Linux system, these questions should be easy to answer.

The install process is simply:


 $ perl Build.PL
 $ ./Build
 $ sudo ./Build install

Create the Database

If you will be serving the data from a web page, the database only needs to be created on the web server.

Create the CMap database schema by reading the schema file into the database. There are schema files provided for MySQL, Oracle, PostgreSQL, Sybase and SQLite. Each is named cmap.create.dbname (e.g. cmap.create.mysql). They are in the sql directory in the distribution.
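
As a rough illustration of "reading the schema file into the database", the sketch below locates the schema file by the cmap.create.dbname naming convention and shows, in a comment, how it might be fed to the MySQL command-line client. It assumes an empty database already exists; the function name, the database name CMAP, and the user cmap_admin are hypothetical, and other database systems will need their own client commands.

```shell
#!/bin/sh
# Print the path of the schema file for a given database suffix,
# following the cmap.create.<suffix> naming described above.
# Fails with a message on stderr if the file is not present.
cmap_schema_file() {
    dist_dir=$1   # top of the unpacked CMap distribution
    suffix=$2     # e.g. "mysql", as in cmap.create.mysql
    schema="$dist_dir/sql/cmap.create.$suffix"
    if [ -f "$schema" ]; then
        printf '%s\n' "$schema"
    else
        echo "no schema file at $schema" >&2
        return 1
    fi
}

# Example for MySQL (database name CMAP and user cmap_admin are hypothetical):
#   schema=$(cmap_schema_file ./cmap mysql) &&
#       mysql -u cmap_admin -p CMAP < "$schema"
```

Checking for the file first avoids creating a half-initialized database when the suffix is mistyped.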

Create the configuration files

If you will be serving the data from a web page, the config files only need to be created on the web server.

The configuration files are important to CMap (and hence CMAE). They define which database is to be used and provide information about the different types of maps, features and correspondence evidences in the database.

For more information about the configuration files, see the ADMINISTRATION.pod document in the docs/ directory.

Importing Data

The simplest way to import data is with a Perl script using the CMap API.

Using the API, the following data types will need to be created:

- Species 
Each species to which maps in the data set belong must be entered into the database.
- Map Sets 
A map set is a collection of maps. The maps are of the same type (sequence, FPC, etc) and are usually from the same analysis set. For instance, the contigs from a particular assembly run would be in a set.
- Maps 
Maps can represent many different data types: sequence, physical, genetic, etc. Simply put, a map is any type of data that can be represented as a line with features on it.
- Features 
Features can be placed on maps. They provide the anchor points for correspondences; for example, a read is one anchor for a line drawn between read pairs. There are also other types of features that can be used to create banding patterns or heat maps.
- Map_to_Features 
In order to place a map underneath another map, CMAE requires a link between the child map and a feature on the parent. That feature represents the exact placement of the child.
- Correspondences 
Correspondences are links between features.
- Attributes and External References (xrefs) 
CMap also allows attributes and external references to be assigned to its objects (features, maps, etc.). These can be useful for adding descriptions or for providing data that an external script needs to work on an object (such as the location of a contig's ACE file).

Modifying Data

After modifying data in CMAE, the user can save the changes to the CMap database. To carry those changes through to the underlying in-house data, however, a plug-in system has been created.

There are several hooks in the code (and more to be added) where a plug-in can be attached. For instance, there is a plug-in hook attached to the right-click menu. A plug-in can then be written to add a button that gets the selected maps, figures out where their ACE files are and passes them to Consed for viewing.

A hook will be added to the "Save changes" method, so any modifications in CMAE can be appropriately handled for the underlying data.

After modifying the underlying data, the plug-in could then modify the data in the CMap database to be viewed in CMAE.
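
A plug-in of this kind would typically hand its work to an external script. The following is a minimal sketch of what the core of such a script might look like for the Consed example. It assumes, hypothetically, that ACE file locations are available as a tab-delimited file of map name and path; the file format, the function names, and the VIEWER variable are all illustrative and are not part of CMAE.

```shell
#!/bin/sh
# Look up the ACE file recorded for a map name in a tab-delimited
# lookup file (column 1: map name, column 2: ACE file path).
# The lookup file is a hypothetical stand-in for wherever your site
# stores this information (for example, a CMap attribute).
ace_file_for_map() {
    lookup=$1
    map_name=$2
    awk -F '\t' -v name="$map_name" \
        '$1 == name { print $2; found = 1; exit } END { exit !found }' "$lookup"
}

# Run the viewer command on the ACE file of every map name given.
# VIEWER defaults to "echo" so the sketch is safe to try; a site
# would point it at its own wrapper around Consed.
view_maps() {
    lookup=$1
    shift
    for map_name in "$@"; do
        if ace=$(ace_file_for_map "$lookup" "$map_name"); then
            ${VIEWER:-echo} "$ace"
        else
            echo "no ACE file recorded for $map_name" >&2
        fi
    done
}
```

Keeping the lookup separate from the viewer call means the same script can later be reused by a "Save changes" hook that needs the ACE file path for a different purpose.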

Conclusion

Hopefully, the barrier to entry for using CMAE isn't too great. Please let me know if you see any improvements that can be made. Questions and comments can be emailed to the CMAE mailing list, gmod-cmap@lists.sourceforge.net.

AUTHOR

Ben Faga, faga@cshl.edu

Copyright (c) 2007 Cold Spring Harbor Laboratory