iRefIndex Data Preparation for iRefScape

Revision as of 15:26, 14 November 2011 by PaulBoddie (→ Processing the Files: Changed the index-making command.)

A set of index files needs to be generated for iRefScape using a program that reads from database tables already produced during the build process.

Database Modifications

Before running the index generation tools, some database modifications need to be made.

Obtaining the SQL Scripts

Get the scripts from this location:

Using CVS with the appropriate CVSROOT setting, run the following command:

cvs co bioscape/bioscape/modules/interaction/Sabry/SQL_commands

The CVSROOT environment variable should be set to the following for this to work:

export CVSROOT=:ext:<username>@hfaistos.uio.no:/mn/hfaistos/storage/cvsroot

(The <username> should be replaced with your actual username.)
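
The setting and the checkout can be combined in a small shell helper. This is only a sketch: the function names cvsroot_for and checkout_sql_commands are hypothetical and not part of the iRefIndex tooling, while the host, repository path and module path are the ones given above. It passes the repository location with the cvs -d option instead of exporting CVSROOT.

```shell
# Build the CVSROOT value given above for a particular username.
# (cvsroot_for is an illustrative helper, not part of iRefIndex.)
cvsroot_for() {
    printf ':ext:%s@hfaistos.uio.no:/mn/hfaistos/storage/cvsroot\n' "$1"
}

# Check out the SQL_commands module as the given user, passing the
# repository location with "cvs -d" rather than through the environment.
# (checkout_sql_commands is likewise an illustrative name.)
checkout_sql_commands() {
    cvs -d "$(cvsroot_for "$1")" co \
        bioscape/bioscape/modules/interaction/Sabry/SQL_commands
}
```

With these definitions loaded, checkout_sql_commands <username> would perform the checkout.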

Running the SQL Scripts

In the SQL_commands directory, the preprocess_for_iRefScape.sql script should be run as follows:

mysql -h <hostname> -u <username> -p -A -D <database> < preprocess_for_iRefScape.sql

Here, the database should be the iRefIndex build database.

Building the Software

  1. Get the program's source code from this location:

    https://hfaistos.uio.no/cgi-bin/viewvc.cgi/bioscape/bioscape/modules/interaction/Sabry/Index_maker_for_iRefScape/

    Using CVS with the appropriate CVSROOT setting, run the following command:

    cvs co bioscape/bioscape/modules/interaction/Sabry/Index_maker_for_iRefScape

    The CVSROOT environment variable should be set to the following for this to work:

    export CVSROOT=:ext:<username>@hfaistos.uio.no:/mn/hfaistos/storage/cvsroot
    (The <username> should be replaced with your actual username.)
  2. Compile the source code. The build.xml file refers to a .jar file whose name begins with mysql-connector-java. Since this name changes between versions of that library, it might be necessary to edit build.xml so that it matches the filename actually present. Compile and create the .jar file as follows:
    ant jar
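
If the connector filename in build.xml does need changing, the edit can be scripted. The sketch below assumes that build.xml refers to the library by a filename of the form mysql-connector-java...jar, as described above. The function name set_connector_jar is hypothetical, and the new filename must be supplied by whoever runs it.

```shell
# Rewrite every mysql-connector-java*.jar filename in a build file so that
# it names the given .jar file. The original is kept with a .orig suffix.
# The new name must not contain "|", "&" or "\".
# (set_connector_jar is an illustrative helper, not part of the source tree.)
set_connector_jar() {
    build_file=$1
    jar_name=$2
    cp "$build_file" "$build_file.orig" || return 1
    sed -e "s|mysql-connector-java[^\"'/<> ]*\.jar|$jar_name|g" \
        "$build_file.orig" > "$build_file"
}
```

It would be called as set_connector_jar build.xml <connector .jar filename> before running ant jar.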

Running the Software

Run the program as follows:

 java -Xms256m -Xmx4g -cp dist/Index_maker_for_iRefScape.jar graph.no.uio.biotek.MakeGraph <hostname> <database> <username> <output directory>

The specified output directory must already exist and be writable. The program will produce a file called graph in that directory.
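
Since the output directory has to exist and be writable beforehand, a quick check before starting the Java program can save a failed run. The following is a minimal sketch; require_writable_dir is a hypothetical helper name.

```shell
# Succeed only if the given path is an existing, writable directory.
# (require_writable_dir is an illustrative helper name.)
require_writable_dir() {
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    if [ ! -w "$1" ]; then
        echo "not writable: $1" >&2
        return 1
    fi
}
```

The java command above could then be run only if require_writable_dir <output directory> succeeds.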

Exporting the Files

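In the SQL_commands directory, the Cytoscape.sql script must be parameterised with the actual locations of the gene2go and morbidmap files. Since these locations normally contain / characters, the sed expressions use | as the delimiter:

 sed -e 's|<gene2go_filename>|<actual_location_of_gene2go>|;s|<morbidmap_filename>|<actual_location_of_morbidmap>|' Cytoscape.sql > Cytoscape_specific.sql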
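
The substitution can also be wrapped in a function that checks its own result. This is a sketch under the assumption that the placeholders appear in Cytoscape.sql literally as <gene2go_filename> and <morbidmap_filename>; the function name parameterise_cytoscape_sql is hypothetical.

```shell
# Write a parameterised copy of a Cytoscape.sql-style template ($1) to $4,
# replacing <gene2go_filename> with $2 and <morbidmap_filename> with $3.
# "|" is the sed delimiter so that "/" in the locations needs no escaping;
# the locations themselves must not contain "|", "&" or "\".
# Fails if either placeholder is still present in the output.
# (parameterise_cytoscape_sql is an illustrative helper name.)
parameterise_cytoscape_sql() {
    sed -e "s|<gene2go_filename>|$2|" \
        -e "s|<morbidmap_filename>|$3|" "$1" > "$4" || return 1
    if grep -q -e '<gene2go_filename>' -e '<morbidmap_filename>' "$4"; then
        echo "placeholders remain in $4" >&2
        return 1
    fi
}
```

A call would take the form parameterise_cytoscape_sql Cytoscape.sql <actual_location_of_gene2go> <actual_location_of_morbidmap> Cytoscape_specific.sql.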

The resulting Cytoscape_specific.sql script may then be run as follows:

 mysql -h <hostname> -u <username> -p -A -D <database> < Cytoscape_specific.sql

Next, a script needs to be run to parameterise the SQL script that produces the output files:

 python make_iRefScape_export_files.py <export directory>

Here, the specified directory should preferably be empty, so that it contains only the files that will be processed for the final indexes.
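
Whether the export directory is empty can be checked before running the script. A minimal sketch, with dir_is_empty being a hypothetical helper name:

```shell
# Succeed only if the given directory exists and contains no entries,
# hidden files included. (dir_is_empty is an illustrative helper name.)
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}
```

For example, dir_is_empty <export directory> || echo "export directory is not empty" would give a warning without stopping the process.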

The Python script should produce a file called make_iRefScape_export_files_specific.sql, which can then be run as follows:

 mysql -h <hostname> -u <username> -p -A -D <database> < make_iRefScape_export_files_specific.sql

Here, the database should be the iRefIndex build database.

Processing the Files

First, the graph file must be processed and a resulting file placed in the designated export directory:

 java -cp dist/Index_maker_for_iRefScape.jar jar.no.uio.biotek.MakeIndexs <directory>/graph <export directory>

Typically, the export directory can be the same directory as the one holding the graph file.

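Then, a program needs to be run on the export directory. The -Xms10g option sets the initial Java heap size to 10 GB, so the machine must have at least that much memory available: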

 java -Xms10g -cp dist/Index_maker_for_iRefScape.jar jar.no.uio.biotek.Index2index <export directory>

The result of this program's execution should be a file with a name of the form iRefDATA_MMDDYYYY.irfz. For example:

 iRefDATA_10282011.irfz
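
The field order in the filename (month, day, year) is easy to get wrong when looking for the file. The helper below spells it out. It is only a sketch: irfz_name is a hypothetical name, the YYYY-MM-DD input format is a choice made here, and the page does not state which date the program puts into the name.

```shell
# Print the archive name of the form iRefDATA_MMDDYYYY.irfz for a date
# given as YYYY-MM-DD. (irfz_name is an illustrative helper name.)
irfz_name() {
    year=${1%%-*}       # text before the first "-"
    rest=${1#*-}        # text after the first "-"
    month=${rest%%-*}
    day=${rest#*-}
    printf 'iRefDATA_%s%s%s.irfz\n' "$month" "$day" "$year"
}
```

Called as irfz_name 2011-10-28, it prints the example name shown above.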

Uploading the Files

The iRefScape software and data archive are made available via FTP so that Cytoscape can download them. To publish the files, a directory is made in the following local location:

 /mn/biotroll/ftp/proteas/irefindex/Cytoscape/plugin/archive

For iRefScape 1.0, the following directory is made:

 /mn/biotroll/ftp/proteas/irefindex/Cytoscape/plugin/archive/beta10

Where the iRefScape software is not itself being updated, it suffices to copy the data archive into an existing directory, typically the one referenced by the following symbolic link:

 /mn/biotroll/ftp/proteas/irefindex/Cytoscape/plugin/current

The copy command will thus resemble the following:

 cp iRefDATA_10282011.irfz /mn/biotroll/ftp/proteas/irefindex/Cytoscape/plugin/current/
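
Before copying into the published location, it can be worth checking that the file has the expected name and that the destination really is a directory. The following is a sketch of such a guarded copy; publish_archive is a hypothetical helper name, not part of iRefScape.

```shell
# Copy a data archive into a publication directory after checking that the
# filename has the form iRefDATA_<8 digits>.irfz and that the destination
# is a directory (a symbolic link to a directory also passes the -d test).
# (publish_archive is an illustrative helper name.)
publish_archive() {
    case $(basename "$1") in
        iRefDATA_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].irfz) ;;
        *) echo "unexpected archive name: $1" >&2; return 1 ;;
    esac
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    [ -d "$2" ] || { echo "no such directory: $2" >&2; return 1; }
    cp "$1" "$2"/
}
```

The copy above would then be written as publish_archive iRefDATA_10282011.irfz /mn/biotroll/ftp/proteas/irefindex/Cytoscape/plugin/current.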
