Difference between revisions of "iRefIndex Data Preparation for iRefWeb"
(Added database dump and testing details along with a category link.)
(Added identifier mapping table notes.)
|Line 107:||Line 107:|
== All iRefIndex Pages ==
== All iRefIndex Pages ==
Revision as of 17:11, 17 August 2011
A special iRefWeb database needs to be created and populated so that iRefWeb may publish iRefIndex data.
Creating a Database
Enter MySQL using a command like the following:
mysql -h <host> -u <admin> -p -A
The <admin> is the name of the user with administrative privileges. For example:
mysql -h myhost -u admin -p -A
Then create a database using commands of the following form:
create database <database>; grant all privileges on <database>.* to '<username>'@'%';
For example, with <database> given as iRefWeb, <username> given as irefindex, and a substitution for <password>:
create database iRefWeb; grant all privileges on iRefWeb.* to 'irefindex'@'%';
If difficulties occur granting privileges in this way, try the following statements:
grant select, insert, update, delete, create, drop, references, index, alter, create temporary tables, lock tables, execute, create view, show view, create routine, alter routine on <database>.* to '<username>'@'%'; grant process, file on *.* to '<username>'@'%';
You should choose the same user as the one used to build the iRefIndex database since that database will need to be accessed during this activity.
Accessing the Previous iRefWeb Database
This should probably be changed so that the process of building an iRefWeb database depends only on the iRefIndex database.
In order to populate the iRefWeb database, the previous version of the database needs to be referenced. Thus, a command of the following form may be required:
grant select on <old_irefweb_db>.* to '<username>'@'%';
Obtaining the SQL Scripts
Get the scripts from this location:
Using CVS with the appropriate CVSROOT setting, run the following command:
cvs co bioscape/bioscape/modules/interaction/Sabry/SQL_commands
The CVSROOT environment variable should be set to the following for this to work:
(The <username> should be replaced with your actual username.)
Preparing the SQL Scripts
The make_iRefWeb.sql and test_iRefWeb.sql scripts needs to be parameterised to refer to specific databases. The following commands should prepare a suitable version of the scripts given appropriate values for <actual_irefindex_db>, <actual_old_irefweb_db> and <actual_irefweb_db>:
sed -e 's/<irefindex_db>/<actual_irefindex_db>/g;s/<old_irefweb_db>/<actual_old_irefweb_db>/g' make_iRefWeb.sql > make_iRefWeb_specific.sql sed -e 's/<irefweb_db>/<actual_irefweb_db>/g;s/<old_irefweb_db>/<actual_old_irefweb_db>/g' test_iRefWeb.sql > test_iRefWeb_specific.sql
sed -e 's/<irefindex_db>/iRefIndex_full_beta9/g;s/<old_irefweb_db>/iRefWeb8/g' make_iRefWeb.sql > make_iRefWeb_specific.sql sed -e 's/<irefweb_db>/iRefWeb9/g;s/<old_irefweb_db>/iRefWeb8/g' test_iRefWeb.sql > test_iRefWeb_specific.sql
Running the SQL Scripts
In the SQL_commands directory, two scripts - preprocess_for_iRefWeb.sql and make_iRefWeb.sql - together provide a large number of SQL statements for the creation of iRefWeb data. The first of these scripts should be run as follows and any error conditions noted:
mysql -h <hostname> -u <username> -p -A -D <irefindex_database> < preprocess_for_iRefWeb.sql
The prepared version of the second script can then be run if no errors were experienced:
mysql -h <hostname> -u <username> -p -A -D <irefweb_database> < make_iRefWeb_specific.sql
Here, <irefweb_database> refers to the current iRefWeb database created above, whereas <irefindex_database> refers to the current iRefIndex database.
The built database can be tested by running the following:
mysql -h <hostname> -u <username> -p -A -D <irefweb_database> < test_iRefWeb_specific.sql
This will output a number of figures corresponding to changes in the number of interactions and interactors. Large deviations from the typical scale of change may indicate a problem in the generation or assignment of identifiers.
Generating the Database Dump
A database dump can be produced by running a command of the following form:
mysqldump -h <hostname> -u <username> -p --databases <irefweb_database> --tables \ alias interaction interaction_detection_type interaction_interactor interaction_interactor_assignment \ interaction_source_db interaction_source_db_experiment interaction_type interactor \ interactor_alias interactor_detection_type interactor_type name_space score \ sequence sequence_source_db source_db statistics geneid2rog \ > iRefWeb.sql
For iRefIndex 9.0, a table will be made available mapping the RIGIDs used in that release (and subsequent releases) to "legacy" RIGIDs used in earlier releases of iRefIndex. This can be produced by running the following SQL command in the iRefIndex database:
select sourceid, experiment_uid, Represent_rigid, Represent_rogid from join_2 order by sourceid, experiment_uid into outfile '/tmp/iRefIndex_interaction_interactors';
This output file can then be processed using the make_legacy_mapping.py program:
python make_legacy_mapping.py /tmp/iRefIndex_interaction_interactors /tmp/iRefIndex_interaction_rigids
(The program needs Python 2.5 or later since it uses the hashlib module. It could probably be made compatible with earlier Python versions if necessary.)
The resulting output file (/tmp/iRefIndex_interaction_rigids in the above example) is a tab-separated file with correct RIGIDs in the first column and "legacy" RIGIDs in the second column.
All iRefIndex Pages
Follow this link for a listing of all iRefIndex related pages (archived and current).