A reference index for protein interaction data
iRefIndex provides an index of protein interactions available in a number of primary interaction databases, including BIND, BioGRID, CORUM, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID. The index covers multiple interaction types, including physical and genetic interactions (mapped to their corresponding protein products), as determined by a variety of methods. It allows the user to search for a protein and retrieve a non-redundant list of interactors for that protein.
iRefIndex assigns a globally unique identifier (rigid), which looks like 'tjWXXjgPyHyT2J6EwED8zK2x18U', to identify interactions that are identical (according to the sequences and taxon ids of the interactors). iRefIndex also assigns similar-looking keys to protein interactors. These keys are global, meaning that anyone can generate them using the method described in the paper. This method allows users to integrate their own data with iRefIndex in a way that ensures proteins with exactly the same sequence are represented only once.
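The paper's procedure is not reproduced on this page. As a rough sketch of how such keys could be derived, the example below assumes that a key is a SHA-1 digest encoded in base64 with the padding removed (this is consistent with the 27-character example above, but it is an assumption), that an interactor key combines the sequence digest with the taxon id, and that an interaction key is the digest of the sorted interactor keys. The function names are hypothetical and are not part of any iRefIndex software.

```python
import base64
import hashlib


def _digest_key(text: str) -> str:
    """Return the base64-encoded SHA-1 digest of text, without '=' padding."""
    digest = hashlib.sha1(text.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def interactor_key(sequence: str, taxon_id: int) -> str:
    """Assumed interactor key: digest of the upper-cased sequence plus the taxon id."""
    return _digest_key(sequence.upper()) + str(taxon_id)


def interaction_key(interactor_keys: list[str]) -> str:
    """Assumed interaction key: digest of the sorted, concatenated interactor keys.

    Sorting makes the key independent of the order in which a source
    database happens to list the interactors.
    """
    return _digest_key("".join(sorted(interactor_keys)))


# Illustrative (made-up) sequences; 9606 is the NCBI taxon id for human.
a = interactor_key("MKTAYIAKQR", 9606)
b = interactor_key("MSDNELLQKA", 9606)
print(interaction_key([a, b]) == interaction_key([b, a]))
```

Because the keys depend only on the sequence and taxon id, two groups running the same procedure on the same protein would arrive at the same key without consulting a central registry.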
Publications and further reading
iRefIndex related publications, references for source databases and works citing and using the iRefIndex are provided on the iRefIndex Citations page.
Long term goals of the iRefIndex project
We believe that protein interaction data hold incredible potential for biomedical research. Presently, these data are collected and archived by multiple groups around the world and the number of groups taking part in this work is growing rather than diminishing.
As such, it is important that these databases have the means to effectively exchange and compare data and that they are curating and representing data using similar standards in order to make their data accessible and allow effective use.
To this end, the iRefIndex project has three long term objectives:
- 1) to facilitate exchange of interaction data between interaction databases.
- The iRefIndex paper describes a method for assigning unique and global identifiers to protein interactors, interactions and complexes. This method is independent of the iRefIndex resource and may be used by anyone to facilitate exchange and consolidation of data.
- 2) to consolidate interaction data from multiple sources.
- The method has been used to index interaction records from multiple sources. The resulting iRefIndex may be used to search for the existence of interaction data for any protein, regardless of the original resource. Nine interaction databases have been incorporated so far; others will follow.
- 3) to provide feedback to source interaction databases.
- During the process of data consolidation, iRefIndex uses a sophisticated method to keep track of potential problems with source records such as outdated or unfound protein identifiers or incorrectly assigned taxonomy identifiers. These data are provided as feedback files to source interaction databases for correction, clarification or improvements to our own system. This process will help to harmonize data representation and improve the overall quality of interaction records for all source databases. This process will also help source databases to exchange data with one another.
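The second and third objectives can be illustrated together with a minimal sketch. It assumes each source record has already been reduced to a tuple of source database, source record id and interaction key, with the key left as `None` when a protein identifier could not be resolved. The record layout, the record ids and the `consolidate` function are hypothetical and do not describe the actual iRefIndex pipeline.

```python
from collections import defaultdict

# Hypothetical records: (source database, source record id, interaction key or None).
records = [
    ("IntAct", "rec-1", "keyA"),
    ("MINT", "rec-2", "keyA"),    # same interaction reported by a second source
    ("DIP", "rec-3", None),       # identifier could not be resolved
]


def consolidate(records):
    """Group records by interaction key and collect unresolved ones as feedback."""
    index = defaultdict(set)   # interaction key -> set of source databases
    feedback = []              # (source, record id, problem description)
    for source, record_id, key in records:
        if key is None:
            feedback.append((source, record_id, "unresolved protein identifier"))
            continue
        index[key].add(source)
    return dict(index), feedback


index, feedback = consolidate(records)
print(index)
print(feedback)
```

Each key appears once in the index regardless of how many sources report it, and the feedback list holds the records that would be returned to the source databases for review.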