Difference between revisions of "iRefIndex"

From irefindex
(Linked references, tidied the formatting.)

Revision as of 14:02, 5 August 2008

A reference index for protein interaction data

iRefIndex provides an index of protein interactions available in a number of primary interaction databases including BIND, BioGRID, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID. This index allows the user to search for a protein and retrieve a non-redundant list of interactors for that protein.

iRefIndex uses the Sequence Global Unique Identifier (SEGUID) [13] to group redundant proteins and interactions. Users can integrate their own data with iRefIndex in the same way, which ensures that proteins with exactly the same sequence are represented only once.
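As an illustration of the grouping idea, the following minimal Python sketch computes a SEGUID as it is commonly defined (the base64-encoded SHA-1 digest of the upper-cased sequence, with trailing "=" padding removed) and groups entries that share one. The accessions, the sequences and the helper name group_by_sequence are hypothetical, and the actual iRefIndex pipeline may differ in its details.

```python
import base64
import hashlib
from collections import defaultdict


def seguid(sequence: str) -> str:
    """Return the SEGUID of a protein sequence.

    Assumed definition: base64 of the SHA-1 digest of the upper-cased
    sequence, with the trailing "=" padding removed.
    """
    digest = hashlib.sha1(sequence.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def group_by_sequence(proteins: dict[str, str]) -> dict[str, list[str]]:
    """Group accession -> sequence entries by the SEGUID of their sequence."""
    groups: dict[str, list[str]] = defaultdict(list)
    for accession, sequence in proteins.items():
        groups[seguid(sequence)].append(accession)
    return dict(groups)


# Hypothetical accessions and sequences, for illustration only.
example = {
    "dbA:P1": "MKTAYIAKQR",
    "dbB:X9": "mktayiakqr",  # same sequence, different case
    "dbC:Q7": "MKTAYIAKQL",  # differs in the last residue
}
groups = group_by_sequence(example)
```

Here the first two entries fall into one group because their sequences are identical once upper-cased, while the third forms a group of its own.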


Data availability and download

A subset of iRefIndex data is provided as a tab-delimited text file in PSI-MITAB format. There is currently no web interface or web service.

Data is available via anonymous FTP at:

ftp://ftp.no.embnet.org/irefindex/data

Username: ftp

Password: enter anonymous or your email address
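The login described above could be scripted roughly as follows, using Python's standard ftplib. This is a sketch only: the function name list_data_files and the ftp_factory parameter (included so that the function can be exercised without a network connection) are assumptions of this example, and the server may not be reachable from every network.

```python
from ftplib import FTP

HOST = "ftp.no.embnet.org"
DATA_DIR = "/irefindex/data"


def list_data_files(host=HOST, directory=DATA_DIR, email="anonymous",
                    ftp_factory=FTP):
    """Log in anonymously and return the names listed in the data directory."""
    ftp = ftp_factory(host)
    try:
        # Username "ftp"; the password is "anonymous" or an email address.
        ftp.login(user="ftp", passwd=email)
        ftp.cwd(directory)
        return ftp.nlst()
    finally:
        ftp.close()


# Example call (requires network access, so it is left commented out):
# print(list_data_files(email="you@example.org"))
```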


Data format and help

iRefIndex data is provided as a tab-delimited text file in PSI-MITAB 2.5 format. The format is described at

ftp://ftp.no.embnet.org/irefindex/data/current/README

This file can also be viewed at http://biotin.uio.no/wiki/Readme.

iRefIndex data is provided as a single file or in a number of data sets specific to the organism in which the interaction occurs. See the above link for details.
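To show how such a file might be used to retrieve a non-redundant list of interactors for one protein, here is a minimal Python sketch. It assumes the 15 standard PSI-MITAB 2.5 columns in their usual order; the short column names, the helper names and the sample rows are invented for this example, and the README remains the authority on the columns actually present in iRefIndex files.

```python
import csv
import io

# The 15 columns of PSI-MITAB 2.5, in order. The short names are labels
# chosen for this sketch, not names defined by the format or by iRefIndex.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication", "taxid_a", "taxid_b",
    "interaction_type", "source_db", "interaction_id", "confidence",
]


def read_mitab(handle):
    """Yield one dict per data row; columns beyond the first 15 are ignored."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and a "#"-prefixed header line
        yield dict(zip(MITAB25_COLUMNS, row))


def interactors_of(protein_id, rows):
    """Return the sorted, de-duplicated interaction partners of protein_id."""
    partners = set()
    for row in rows:
        if row["id_a"] == protein_id:
            partners.add(row["id_b"])
        elif row["id_b"] == protein_id:
            partners.add(row["id_a"])
    return sorted(partners)


def _row(a, b, source):
    """Build a made-up 15-field MITAB line with "-" for unused fields."""
    fields = ["-"] * 15
    fields[0], fields[1], fields[12] = a, b, source
    return "\t".join(fields)


# Hypothetical identifiers and sources, for illustration only.
sample = "\n".join([
    "#uidA\tuidB",
    _row("ex:P1", "ex:P2", "dbA"),
    _row("ex:P2", "ex:P1", "dbB"),  # same pair reported by another source
    _row("ex:P1", "ex:P3", "dbA"),
])
rows = list(read_mitab(io.StringIO(sample)))
partners = interactors_of("ex:P1", rows)
```

For a real download, the io.StringIO object would be replaced by an open file handle on one of the distributed files.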

Source data for the current build is described at

ftp://ftp.no.embnet.org/irefindex/data/current/sources.htm

If you need help, see the contact at the bottom of this page.


Data availability and license

iRefIndex data distributed on the FTP site includes only those data that may be freely distributed under the copyright license of the source database. This includes data from BIND, BioGRID, IntAct, MINT, MPPI and OPHID.

Data on the public FTP site is released under the Creative Commons Attribution 2.5 License: http://creativecommons.org/licenses/by/2.5/

iRefIndex also integrates data from DIP, HPRD and MPact. These data are not distributed publicly, but may be made available to academic users under a collaborative agreement.

Contact ian.donaldson at biotek.uio.no if you are interested in using the iRefIndex database or would like your database included in the public release of the index.


Disclaimer

Data is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Statistics

Statistics for the iRefIndex include a breakdown of interactors and interactions from each data source. These will be made available in the paper.

The complete index contains:

Number of non-redundant proteins: 77,827

Number of non-redundant interactions and complexes: 318,349

The public dataset distributed on the FTP site contains:

Number of non-redundant proteins: 74,423

Number of non-redundant interactions and complexes: 284,310


Credits

iRefIndex was developed at the Biotechnology Centre of Oslo, University of Oslo, in the Donaldson lab by Sabry Razick and Ian Donaldson. George Magklaras provided systems engineering support and EMBNet Norway provided hardware support.


Citation

iRefIndex has been submitted for publication. If you use iRefIndex, please cite iRefIndex (http://irefindex.uio.no) and the source databases listed below.

iRefIndex consolidates protein interaction data from BIND [1], [2], BioGRID [3], DIP [4], HPRD [5], [6], IntAct [7], [8], MINT [9], MPact [10], MPPI [11] and OPHID [12].


References

1. Bader GD, Betel D, Hogue CW: BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res 2003, 31(1):248-250.
2. Alfarano C, Andrade CE, Anthony K, Bahroos N, Bajec M, Bantoft K, Betel D, Bobechko B, Boutilier K, Burgess E et al: The Biomolecular Interaction Network Database and related tools 2005 update. Nucleic Acids Res 2005, 33(Database issue):D418-424.
3. Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Res 2006, 34(Database issue):D535-539.
4. Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D: The Database of Interacting Proteins: 2004 update. Nucleic Acids Res 2004, 32(Database issue):D449-451.
5. Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK, Surendranath V, Niranjan V, Muthusamy B, Gandhi TK, Gronborg M et al: Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res 2003, 13(10):2363-2371.
6. Mishra GR, Suresh M, Kumaran K, Kannabiran N, Suresh S, Bala P, Shivakumar K, Anuradha N, Reddy R, Raghavan TM et al: Human protein reference database--2006 update. Nucleic Acids Res 2006, 34(Database issue):D411-414.
7. Hermjakob H, Montecchi-Palazzi L, Lewington C, Mudali S, Kerrien S, Orchard S, Vingron M, Roechert B, Roepstorff P, Valencia A et al: IntAct: an open source molecular interaction database. Nucleic Acids Res 2004, 32(Database issue):D452-455.
8. Kerrien S, Alam-Faruque Y, Aranda B, Bancarz I, Bridge A, Derow C, Dimmer E, Feuermann M, Friedrichsen A, Huntley R et al: IntAct--open source resource for molecular interaction data. Nucleic Acids Res 2007, 35(Database issue):D561-565.
9. Chatr-aryamontri A, Ceol A, Palazzi LM, Nardelli G, Schneider MV, Castagnoli L, Cesareni G: MINT: the Molecular INTeraction database. Nucleic Acids Res 2007, 35(Database issue):D572-574.
10. Guldener U, Munsterkotter M, Oesterheld M, Pagel P, Ruepp A, Mewes HW, Stumpflen V: MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Res 2006, 34(Database issue):D436-441.
11. Pagel P, Kovac S, Oesterheld M, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C, Mark P, Stumpflen V, Mewes HW et al: The MIPS mammalian protein-protein interaction database. Bioinformatics 2005, 21(6):832-834.
12. Brown KR, Jurisica I: Online predicted human interaction database. Bioinformatics 2005, 21(9):2076-2082.
13. Babnigg G, Giometti CS: A database of unique protein sequence identifiers for proteome studies. Proteomics 2006, 6(16):4514-4522.

Contact

Suggestions, requests and comments are welcome. Please email

ian.donaldson at biotek.uio.no.

Visiting and mail address information is available at http://www.biotek.uio.no/research_groups/donaldson_group.html.