Difference between revisions of "Statistics iRefIndex 5.0"

From irefindex
Line 2: Line 2:
  
 
*Total interactions : 871,172
 
*Total interactions : 871,172
*Total distinct interactions (based on RIGID): 357,156
+
*Total distinct interactions (based on RIGID): 357,170
*Total distinct proteins (based on ROGID)    :  83,938
+
*Total distinct proteins (based on ROGID)    :  83,940
  
This page lists statistics for our internal version of iRefIndex that includes all of the data from sources used for the current build [[Sources_iRefIndex_4.0]].  This full build of the iRefIndex contains data that cannot be redistributed according to usage policies of the source databases (namely, from DIP, HPRD and MPact databases).  Please contact  ian.donaldson at biotek.uio.no if you would like to obtain a copy of the full iRefIndex build under an academic, collaborative agreement.
+
This page lists statistics for our internal version of iRefIndex, which includes all of the data from the sources used for the current build [[Sources_iRefIndex_5.0]].  This full build of iRefIndex contains data that cannot be redistributed under the usage policies of the source databases (namely DIP, HPRD and MPact).  Please contact ian.donaldson at biotek.uio.no if you would like to obtain a copy of the full iRefIndex build under an academic, collaborative agreement.
  
The data that are freely available at ftp://ftp.no.embnet.org/irefindex/data/current/ are a subset of the full build that we can freely redistribute according to the usage policies of the source databases. Please refer to http://irefindex.uio.no/wiki/Statistics_iRefIndex_free_4.0 for statistics that are applicable to this free dataset.
+
The data that are freely available at ftp://ftp.no.embnet.org/irefindex/data/current/ are a subset of the full build that we can freely redistribute according to the usage policies of the source databases. Please refer to http://irefindex.uio.no/wiki/Statistics_iRefIndex_free_5.0 for statistics that are applicable to this free dataset.
  
 
== Interactions available from major taxonomies ==
 
== Interactions available from major taxonomies ==
Line 15: Line 15:
 
| align="center" style="background:#f0f0f0;"|''' Name'''
 
| align="center" style="background:#f0f0f0;"|''' Name'''
 
| align="center" style="background:#f0f0f0;"|'''Number of interactions'''
 
| align="center" style="background:#f0f0f0;"|'''Number of interactions'''
 +
| align="center" style="background:#f0f0f0;"|''''''
 
|-
 
|-
 
| 4932|| Saccharomyces cerevisiae        ||115570||
 
| 4932|| Saccharomyces cerevisiae        ||115570||
 
|-
 
|-
| 9606|| Homo sapiens                    ||103700||
+
| 9606|| Homo sapiens                    ||103695||
 
|-
 
|-
| 7227|| Drosophila melanogaster        ||46240||
+
| 7227|| Drosophila melanogaster        ||46260||
 
|-
 
|-
 
| 40674|| Mammalia                        ||35023||
 
| 40674|| Mammalia                        ||35023||
Line 48: Line 49:
 
|  
 
|  
 
|}
 
|}
 
 
* For the full list: http://irefindex.uio.no/wikifiles//images/9/90/Interactions_by_taxonomy_beta5_full.pdf

* For the full list: http://irefindex.uio.no/wikifiles//images/9/90/Interactions_by_taxonomy_beta5_full.pdf
  
Line 57: Line 57:
 
| align="center" style="background:#f0f0f0;"|'''115562'''
 
| align="center" style="background:#f0f0f0;"|'''115562'''
 
|-
 
|-
| 9606|| Homo sapiens                    ||114618
+
| 9606|| Homo sapiens                    ||114613
 
|-
 
|-
| 7227|| Drosophila melanogaster        ||46244
+
| 7227|| Drosophila melanogaster        ||46264
 
|-
 
|-
 
| 197|| Campylobacter jejuni            ||11998
 
| 197|| Campylobacter jejuni            ||11998
Line 91: Line 91:
  
 
== Interactions  (Corresponds to Table 6 in PMID 18823568)==
 
== Interactions  (Corresponds to Table 6 in PMID 18823568)==
{|
+
 
| BIND||62896
 
|-
 
| BIOGRID||20564  || 164770
 
|-
 
| DIP||25930  || 29004  || 56430
 
|-
 
| HPRD||2947  || 2016  || 858  || 39966
 
|-
 
| INTACT||24400  || 26946  || 25008  || 8442  || 113877
 
|-
 
| MINT||22066  || 34683  || 30052  || 6563  || 45338  || 76602
 
|-
 
| MPACT||6938  || 8489  || 6793  || 0  || 6132  || 6426  || 13321
 
|-
 
| MPPI||385  || 27  || 41  || 304  || 89  || 71  || 0  || 826
 
|-
 
| OPHID||2226  || 1357  || 899  || 18063  || 7248  || 6396  || 0  || 183  || 47297
 
|-
 
| CORUM||116  || 18  || 29  || 403  || 122  || 66  || 0  || 9  || 158  || 1919
 
|-
 
| ||BIND||BIOGRID||DIP||HPRD||INTACT||MINT||MPACT||MPPI||OPHID||CORUM
 
|-
 
| ||(25793)||(111301)||(13496)||(16472)||(56805)||(15495)||(1132)||(235)||(26461)||(1398)
 
|}
 
  
 
== Interactors ==
 
== Interactors ==
{|
+
 
| BIND||40783
 
|-
 
| BIOGRID||14503  || 27694
 
|-
 
| DIP||15398  || 13106  || 20108
 
|-
 
| HPRD||3357  || 2522  || 1251  || 9750
 
|-
 
| INTACT||18205  || 16989  || 15721  || 5967  || 42061
 
|-
 
| MINT||16198  || 15112  || 14999  || 4669  || 23460  || 28424
 
|-
 
| MPACT||4651  || 4560  || 4639  || 0  || 4874  || 4756  || 4972
 
|-
 
| MPPI||674  || 219  || 294  || 430  || 579  || 505  || 0  || 861
 
|-
 
| OPHID||3253  || 2325  || 1242  || 7426  || 5809  || 4708  || 1  || 422  || 9626
 
|-
 
| CORUM||1561  || 756  || 671  || 1867  || 2312  || 1733  || 0  || 322  || 1849  || 3581
 
|-
 
| ||BIND||BIOGRID||DIP||HPRD||INTACT||MINT||MPACT||MPPI||OPHID||CORUM
 
|-
 
| ||(18529)||(8247)||(1834)||(1094)||(12125)||(3228)||(16)||(42)||(1257)||(620)
 
|}
 
  
 
==Summary of mapping interaction records to RIGs (Corresponds to Table 5 in PMID 18823568) ==
 
==Summary of mapping interaction records to RIGs (Corresponds to Table 5 in PMID 18823568) ==
{|
+
 
| align="center" style="background:#f0f0f0;"|'''Source'''||align="center" style="background:#f0f0f0;"|'''Total records'''||align="center" style="background:#f0f0f0;"|'''Protein-only interactors'''||align="center" style="background:#f0f0f0;"|'''PPI Assigned to RIGID'''||align="center" style="background:#f0f0f0;"|'''Unique RIGIDs'''
 
|-
 
| bind||193648||93957||91265(97.1349%)||62896(68.9158%)
 
|-
 
| grid||242126||242126||241823(99.8749%)||164770(68.1366%)
 
|-
 
| dip||57675||57675||56597(98.1309%)||56430(99.7049%)
 
|-
 
| intact||133302||132525||132077(99.6620%)||113877(86.2202%)
 
|-
 
| mint||109398||109398||107808(98.5466%)||76602(71.0541%)
 
|-
 
| HPRD||40075||40075||40075(100.0000%)||39966(99.7280%)
 
|-
 
| ophid||73257||73257||72907(99.5222%)||47297(64.8731%)
 
|-
 
| MPACT||16504||16504||16286(98.6791%)||13321(81.7942%)
 
|-
 
| MPPI||1814||1814||1688(93.0540%)||826(48.9336%)
 
|-
 
| CORUM||2104||2104||2104(100.0000%)||1919(91.2072%)
 
|-
 
| ALL||869903||769435||762630(99.1156%)||372649(48.8637%)
 
|}
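
The percentages in the table above appear to be ratios of adjacent columns: the "PPI Assigned to RIGID" percentage is the assigned count over the protein-only count, and the "Unique RIGIDs" percentage is the unique RIGID count over the assigned count. The following is a minimal Python sketch that checks this reading against a few rows. The function name <code>rig_percentages</code> is illustrative and not part of iRefIndex.

```python
def rig_percentages(protein_only, assigned, unique_rigids):
    """Return (assigned %, unique %) rounded to four decimals.

    Assumption (inferred from the table, not stated by the source):
    assigned % = assigned / protein_only,
    unique %   = unique_rigids / assigned.
    """
    pct_assigned = round(100.0 * assigned / protein_only, 4)
    pct_unique = round(100.0 * unique_rigids / assigned, 4)
    return pct_assigned, pct_unique


# Rows copied from the table:
# (protein-only interactors, PPI assigned to RIGID, unique RIGIDs)
rows = {
    "bind": (93957, 91265, 62896),
    "HPRD": (40075, 40075, 39966),
    "ALL": (769435, 762630, 372649),
}

for source, counts in rows.items():
    print(source, rig_percentages(*counts))
```

If the recomputed values match the published ones, the two percentage columns can be regenerated from the raw counts whenever the counts change.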
 
  
 
== Assignment of protein interactors to ROGs (Corresponds to Table 3 in PMID 18823568)  ==
 
== Assignment of protein interactors to ROGs (Corresponds to Table 3 in PMID 18823568)  ==
{|
 
| align="center" style="background:#f0f0f0;"|'''Source'''||align="center" style="background:#f0f0f0;"|'''Protein_Interactors'''||align="center" style="background:#f0f0f0;"|'''Assigned'''||align="center" style="background:#f0f0f0;"|'''%'''||align="center" style="background:#f0f0f0;"|'''Arbitrary'''||align="center" style="background:#f0f0f0;"|'''New'''||align="center" style="background:#f0f0f0;"|'''Unassigned'''||align="center" style="background:#f0f0f0;"|'''Unique proteins'''
 
|-
 
| bind||285482||273658||95.8582||0||7887||3937||40783
 
|-
 
| CORUM||10316||10314||99.9806||0||2||0||3581
 
|-
 
| dip||20728||18527||89.3815||1246||477||478||20108
 
|-
 
| grid||29599||19354||65.3873||10141||6||98||27694
 
|-
 
| HPRD||9773||9676||99.0075||55||42||0||9750
 
|-
 
| intact||100752||97166||96.4408||19||3323||244||42061
 
|-
 
| mint||76898||73745||95.8998||2||2678||473||28424
 
|-
 
| MPACT||40349||40112||99.4126||0||0||237||4972
 
|-
 
| MPPI||3628||3457||95.2867||0||30||141||861
 
|-
 
| ophid||146423||145362||99.2754||103||699||259||9626
 
|-
 
| All||723948||691371||95.5001||11566||15144||5867||83809
 
|}
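
In each row of the table above, the Assigned, Arbitrary, New and Unassigned counts appear to add up to the protein interactor total, and the % column appears to be Assigned divided by that total. The sketch below checks both observations for a few rows. It is a hedged consistency check based on the published numbers, not a description of how the build computes them, and <code>check_rog_row</code> is a hypothetical helper name.

```python
def check_rog_row(total, assigned, arbitrary, new, unassigned):
    """Return (do the four categories sum to total?, assigned % to 4 decimals).

    Assumption inferred from the table: the four categories partition
    the protein interactors of a source.
    """
    consistent = assigned + arbitrary + new + unassigned == total
    return consistent, round(100.0 * assigned / total, 4)


# Rows copied from the table:
# (protein interactors, assigned, arbitrary, new, unassigned)
rows = {
    "bind": (285482, 273658, 0, 7887, 3937),
    "dip": (20728, 18527, 1246, 477, 478),
    "All": (723948, 691371, 11566, 15144, 5867),
}

for source, counts in rows.items():
    print(source, check_rog_row(*counts))
```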
 
  
 
==  ROG summary (Corresponds to Table 4 in PMID 18823568) ==
 
==  ROG summary (Corresponds to Table 4 in PMID 18823568) ==
{|
+
 
| align="center" style="background:#f0f0f0;"|'''Decimal_score'''||align="center" style="background:#f0f0f0;"|'''Binary_flag'''||align="center" style="background:#f0f0f0;"|'''String_score'''||align="center" style="background:#f0f0f0;"|'''Score_class'''||align="center" style="background:#f0f0f0;"|'''Proteins'''||align="center" style="background:#f0f0f0;"|'''Percentage'''||align="center" style="background:#f0f0f0;"|'''BIND'''||align="center" style="background:#f0f0f0;"|'''BioGrid'''||align="center" style="background:#f0f0f0;"|'''DIP'''||align="center" style="background:#f0f0f0;"|'''MINT'''||align="center" style="background:#f0f0f0;"|'''HPRD'''||align="center" style="background:#f0f0f0;"|'''OPHID'''||align="center" style="background:#f0f0f0;"|'''MPPI'''||align="center" style="background:#f0f0f0;"|'''MPACT'''||align="center" style="background:#f0f0f0;"|'''IntAct'''||align="center" style="background:#f0f0f0;"|'''CORUM'''
 
|-
 
| 1||000000000000000001||P||1||565134||78.0628%||232690||7520||0||71606||0||125715||3023||30666||93914||0
 
|-
 
| 554||000000001000101010||SVGO||1||624||0.0862%||0||0||0||0||624||0||0||0||0||0
 
|-
 
| 66||000000000001000010||SD||1||2||0.0003%||0||2||0||0||0||0||0||0||0||0
 
|-
 
| 65||000000000001000001||PD||1||9581||1.3234%||8084||1494||0||3||0||0||0||0||0||0
 
|-
 
| 42||000000000000101010||SVG||1||163||0.0225%||0||0||0||0||163||0||0||0||0||0
 
|-
 
| 8193||000010000000000001||PI||1||49||0.0068%||0||2||0||0||0||0||0||0||47||0
 
|-
 
| 129||000000000010000001||PM||1||523||0.0722%||473||1||0||0||0||0||32||0||17||0
 
|-
 
| 8194||000010000000000010||SI||1||12399||1.7127%||12336||63||0||0||0||0||0||0||0||0
 
|-
 
| 10||000000000000001010||SV||1||13||0.0018%||0||0||2||4||0||0||0||0||7||0
 
|-
 
| 2||000000000000000010||S||1||35124||4.8517%||0||7473||17447||252||2772||0||0||6927||253||0
 
|-
 
| 130||000000000010000010||SM||1||570||0.0787%||0||570||0||0||0||0||0||0||0||0
 
|-
 
| 778||000000001100001010||SVO+||2||1||0.0001%||0||0||0||0||0||0||0||0||1||0
 
|-
 
| 774||000000001100000110||SUO+||2||1||0.0001%||0||0||0||0||0||0||0||0||1||0
 
|-
 
| 16385||000100000000000001||PE||2||184||0.0254%||0||0||0||0||0||0||0||0||184||0
 
|-
 
| 16386||000100000000000010||SE||2||5414||0.7478%||5414||0||0||0||0||0||0||0||0||0
 
|-
 
| 773||000000001100000101||PUO+||2||12||0.0017%||0||0||0||3||0||1||0||0||8||0
 
|-
 
| 5||000000000000000101||PU||2||22812||3.1511%||0||0||0||264||0||19519||320||2519||190||0
 
|-
 
| 6||000000000000000110||SU||2||767||0.1059%||0||690||60||4||5||0||0||0||8||0
 
|-
 
| 145||000000000010010001||PTM||3||170||0.0235%||132||0||0||0||0||0||35||0||3||0
 
|-
 
| 8210||000010000000010010||STI||3||905||0.1250%||855||50||0||0||0||0||0||0||0||0
 
|-
 
| 8209||000010000000010001||PTI||3||12||0.0017%||0||0||0||0||0||0||0||0||12||0
 
|-
 
| 17||000000000000010001||PT||3||26392||3.6456%||11873||0||0||1604||0||122||47||0||2456||10290
 
|-
 
| 18||000000000000010010||ST||3||8547||1.1806%||0||1487||1015||0||6042||0||0||0||3||0
 
|-
 
| 26||000000000000011010||SVT||3||1||0.0001%||0||0||0||0||0||0||0||0||1||0
 
|-
 
| 81||000000000001010001||PTD||3||1487||0.2054%||1486||0||0||1||0||0||0||0||0||0
 
|-
 
| 146||000000000010010010||STM||3||1||0.0001%||0||1||0||0||0||0||0||0||0||0
 
|-
 
| 790||000000001100010110||SUTO+||4||1||0.0001%||0||0||0||0||1||0||0||0||0||0
 
|-
 
| 16401||000100000000010001||PTE||4||3||0.0004%||0||0||0||0||0||0||0||0||3||0
 
|-
 
| 789||000000001100010101||PUTO+||4||14||0.0019%||0||0||0||0||0||0||0||0||14||0
 
|-
 
| 16402||000100000000010010||STE||4||315||0.0435%||315||0||0||0||0||0||0||0||0||0
 
|-
 
| 22||000000000000010110||SUT||4||18||0.0025%||0||1||3||0||14||0||0||0||0||0
 
|-
 
| 131073||100000000000000001||PQ||5||2||0.0003%||0||0||0||0||0||0||0||0||2||0
 
|-
 
| 21||000000000000010101||PUT||5||33||0.0046%||0||0||0||4||0||5||0||0||0||24
 
|-
 
| 12546||000011000100000010||SLI+||5||6716||0.9277%||0||6716||0||0||0||0||0||0||0||0
 
|-
 
| 131077||100000000000000101||PUQ||5||1||0.0001%||0||0||0||0||0||0||0||0||1||0
 
|-
 
| 131089||100000000000010001||PTQ||5||38||0.0052%||0||0||0||0||0||0||0||0||38||0
 
|-
 
| 4373||000001000100010101||PUTL+||5||9||0.0012%||0||0||0||1||0||0||0||0||8||0
 
|-
 
| 4354||000001000100000010||SL+||5||4208||0.5813%||0||3014||1194||0||0||0||0||0||0||0
 
|-
 
| 1802||000000011100001010||SVOX+||5||3||0.0004%||0||0||0||0||0||0||0||0||3||0
 
|-
 
| 810||000000001100101010||SVGO+||5||55||0.0076%||0||0||0||0||55||0||0||0||0||0
 
|-
 
| 4357||000001000100000101||PUL+||5||84||0.0116%||0||0||0||0||0||84||0||0||0||0
 
|-
 
| 4374||000001000100010110||SUTL+||5||52||0.0072%||0||0||52||0||0||0||0||0||0||0
 
|-
 
| 4394||000001000100101010||SVGL+||5||52||0.0072%||0||0||0||0||52||0||0||0||0||0
 
|-
 
| 4482||000001000110000010||SML+||5||411||0.0568%||0||411||0||0||0||0||0||0||0||0
 
|-
 
| 5381||000001010100000101||PUXL+||5||29||0.0040%||0||0||0||0||0||19||0||0||10||0
 
|-
 
| 5382||000001010100000110||SUXL+||5||3||0.0004%||0||0||0||0||3||0||0||0||0||0
 
|-
 
| 5386||000001010100001010||SVXL+||5||2||0.0003%||0||0||0||1||0||0||0||0||1||0
 
|-
 
| 86274||010101000100000010||SLEN+||6||3||0.0004%||0||2||1||0||0||0||0||0||0||0
 
|-
 
| 81938||010100000000010010||STEN||6||24||0.0033%||24||0||0||0||0||0||0||0||0||0
 
|-
 
| 81937||010100000000010001||PTEN||6||3||0.0004%||3||0||0||0||0||0||0||0||0||0
 
|-
 
| 81922||010100000000000010||SEN||6||5766||0.7965%||5397||4||364||1||0||0||0||0||0||0
 
|-
 
| 81921||010100000000000001||PEN||6||2858||0.3948%||2462||0||0||49||0||98||30||0||217||2
 
|-
 
| 65601||010000000001000001||PDN||6||1||0.0001%||1||0||0||0||0||0||0||0||0||0
 
|-
 
| 65553||010000000000010001||PTN||6||10||0.0014%||0||0||0||0||0||0||0||0||10||0
 
|-
 
| 65537||010000000000000001||PN||6||6478||0.8948%||0||0||112||2628||42||601||0||0||3095||0
 
|-
 
| 196625||110000000000010001||PTNQ||6||1||0.0001%||0||0||0||0||0||0||0||0||1||0
 
|}
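
The first three columns of the table above encode the same information: Binary_flag is the 18-bit binary form of Decimal_score, and String_score has one character per set bit. The Python sketch below decodes a decimal score under a bit-to-character mapping inferred from the rows of this table. That mapping is an assumption, not an official specification. Bits 11 and 15 never occur in the table, so the sketch leaves them unmapped and rejects them.

```python
# Bit position (0 = least significant bit) -> score character.
# ASSUMPTION: inferred from the rows of the ROG summary table;
# bits 11 and 15 do not occur there and are deliberately unmapped.
BIT_TO_CHAR = {
    0: "P", 1: "S", 2: "U", 3: "V", 4: "T", 5: "G", 6: "D", 7: "M",
    8: "+", 9: "O", 10: "X", 12: "L", 13: "I", 14: "E", 16: "N", 17: "Q",
}
PLUS_BIT = 8


def decode_score(decimal_score, width=18):
    """Return (binary_flag, string_score) for a decimal ROG score."""
    set_bits = [b for b in range(decimal_score.bit_length())
                if (decimal_score >> b) & 1]
    unknown = [b for b in set_bits if b not in BIT_TO_CHAR]
    if unknown:
        raise ValueError(f"unmapped bit positions: {unknown}")
    # Characters follow ascending bit order, except '+',
    # which the table always writes last.
    chars = [BIT_TO_CHAR[b] for b in set_bits if b != PLUS_BIT]
    if PLUS_BIT in set_bits:
        chars.append("+")
    return format(decimal_score, f"0{width}b"), "".join(chars)


for score in (554, 12546, 196625):
    print(score, decode_score(score))
```

For example, 554 = 512 + 32 + 8 + 2 sets bits 9, 5, 3 and 1, which under this mapping reads as SVGO, in agreement with the corresponding table row.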
 
  
 
== Scores (Corresponds to Table 2 in PMID 18823568) ==
 
== Scores (Corresponds to Table 2 in PMID 18823568) ==
  
{|
 
| align="center" style="background:#f0f0f0;"|'''Character'''||align="center" style="background:#f0f0f0;"|'''Description of feature (when the value is 1)'''||align="center" style="background:#f0f0f0;"|'''Frequency'''
 
|-
 
| D||The source database (D) listed in the interaction record differs from what is expected for the given protein accession. In specific cases, this difference is tolerated and the assignment is made.||11071(1.5417%)
 
|-
 
| E||The protein reference was a retired NCBI Identifier. NCBI's eUtils (E) were used to retrieve the current accession and/or sequence.||14570(2.029%)
 
|-
 
| G||The interaction record's reference for the protein was an EntrezGene (G) identifier. The corresponding products of the gene were used to make the assignment.||894(0.1245%)
 
|-
 
| L||More than one assignment is possible (see + above). The assignment with the largest (L) SEGUID is arbitrarily chosen (see Methods).||11569(1.6111%)
 
|-
 
| M||The protein reference listed by the interaction record was a typographical modification (M) of a known accession. In specific cases, this variation is tolerated and the assignment is made.||1675(0.2333%)
 
|-
 
| +||More than one assignment is possible (+). This case may arise in one of three ways. 1) The reference supplied by the interaction record requires updating but more than one possibility exists. For example, Q7XJL8 was found to be a secondary accession in three separate UniProt records (Q3EBZ2, Q6DR20, and Q8GWA9). 2) The secondary references supplied by the interaction record point to more than one unique protein sequence. 3) An EntrezGene identifier is provided in the interaction record as a protein reference. This identifier points to more than one protein product. An attempt is made to resolve this ambiguity as indicated by ROG score features O, X or L (see below).||11656(1.6232%)
 
|-
 
| N||The protein reference, taxonomy identifier and sequence for the protein as provided in the interaction record are used to make a new entry in the SEGUID table. The protein interactor is assigned the newly (N) generated ROG identifier.||15144(2.109%)
 
|-
 
| O||More than one assignment is possible (see + above). The assignment chosen has a SEGUID that is identical to the SEGUID of the original (O) sequence provided in the interaction record.||711(0.099%)
 
|-
 
| I||The protein reference used was an NCBI GenInfo Identifier (I).||20081(2.7965%)
 
|-
 
| U||The protein reference listed in the interaction record and used to make the assignment was a secondary UniProt accession and was updated (U) to a primary UniProt accession in order to make the assignment.||23836(3.3194%)
 
|-
 
| T||The taxonomy (T) identifier for the protein (as supplied by the interaction record) differed from what was found in the protein sequence record. This discrepancy was tolerated and the assignment was made.||38036(5.2969%)
 
|-
 
| V||The protein reference listed by the interaction record contained version (V) information that was ignored. For example, RefSeq accession.version NP_012420.1 was listed but treated as RefSeq accession NP_012420.||914(0.1273%)
 
|-
 
| Q||The protein reference used to make the assignment was of the type 'see-also'. See PSI-MI Path: entrySet/entry/interactorList/interactor/xref/primaryRef/refType = 'see-also'.||42(0.0058%)
 
|-
 
| P||The interaction record's primary (P) reference for the protein was used to make the assignment||635920(88.5583%)
 
|-
 
| S||One of the interaction record's secondary (S) references for the protein was used to make the assignment||82161(11.4417%)
 
|-
 
| X||More than one assignment is possible (see + above). The assignment chosen has the same taxonomy (X) identifier as listed in the interaction record.||37(0.0052%)
 
|}
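
The frequencies in the table above appear to share one denominator, 718,081. This equals the P and S counts added together, and also the "All" protein interactor total (723,948) minus the "All" unassigned count (5,867) from the ROG assignment table. The sketch below recomputes a few percentages under that assumption. The denominator is inferred from the published numbers, and the names used are illustrative.

```python
# Counts copied from the tables on this page.
TOTAL_INTERACTORS = 723948  # "All" row, protein interactors
UNASSIGNED = 5867           # "All" row, unassigned
FREQUENCY = {"P": 635920, "S": 82161, "N": 15144}


def feature_percentage(count, denominator):
    """Share of assigned protein interactors carrying a feature, in percent."""
    return round(100.0 * count / denominator, 4)


# ASSUMPTION: percentages are taken over interactors that received a ROG.
assigned_total = TOTAL_INTERACTORS - UNASSIGNED

for feature, count in FREQUENCY.items():
    print(feature, feature_percentage(count, assigned_total))
```

Under this reading, P and S partition the assigned interactors, which is why their percentages sum to 100.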
 
  
  
 
[[Category:iRefIndex]]
 
[[Category:iRefIndex]]

Revision as of 10:13, 27 July 2009

Summary

  • Total interactions : 871,172
  • Total distinct interactions (based on RIGID): 357,170
  • Total distinct proteins (based on ROGID) : 83,940

This page lists statistics for our internal version of iRefIndex, which includes all of the data from the sources used for the current build Sources_iRefIndex_5.0. This full build of iRefIndex contains data that cannot be redistributed under the usage policies of the source databases (namely DIP, HPRD and MPact). Please contact ian.donaldson at biotek.uio.no if you would like to obtain a copy of the full iRefIndex build under an academic, collaborative agreement.

The data that are freely available at ftp://ftp.no.embnet.org/irefindex/data/current/ are a subset of the full build that we can freely redistribute according to the usage policies of the source databases. Please refer to http://irefindex.uio.no/wiki/Statistics_iRefIndex_free_5.0 for statistics that are applicable to this free dataset.

Interactions available from major taxonomies

Top 15 uncorrected taxonomy groups in iRefIndex (Taxonomy identifiers as they appear in original source)

NCBI taxonomy identifier Name Number of interactions
4932 Saccharomyces cerevisiae 115570
9606 Homo sapiens 103695
7227 Drosophila melanogaster 46260
40674 Mammalia 35023
197 Campylobacter jejuni 11998
6239 Caenorhabditis elegans 11793
284812 Schizosaccharomyces pombe 972h- 11556
10090 Mus musculus 9964
562 Escherichia coli 8512
3702 Arabidopsis thaliana 5348
160 Treponema pallidum 3646
83333 Escherichia coli K-12 3490
10116 Rattus norvegicus 3477
1142 Synechocystis 3057
36329 Plasmodium falciparum 3D7 2731

Top 15 corrected taxonomy groups in iRefIndex (Taxonomy identifiers corrected using sequence database information)

4932 Saccharomyces cerevisiae 115562
9606 Homo sapiens 114613
7227 Drosophila melanogaster 46264
197 Campylobacter jejuni 11998
4896 Schizosaccharomyces pombe 11831
6239 Caenorhabditis elegans 11793
10090 Mus musculus 8569
83333 Escherichia coli K-12 7482
3702 Arabidopsis thaliana 5354
155864 Escherichia coli O157:H7 EDL933 4928
160 Treponema pallidum 3646
1148 Synechocystis sp. PCC 6803 3166
36329 Plasmodium falciparum 3D7 2731
10116 Rattus norvegicus 2650
85962 Helicobacter pylori 26695 1598

Interactions (Corresponds to Table 6 in PMID 18823568)

Interactors

Summary of mapping interaction records to RIGs (Corresponds to Table 5 in PMID 18823568)

Assignment of protein interactors to ROGs (Corresponds to Table 3 in PMID 18823568)

ROG summary (Corresponds to Table 4 in PMID 18823568)

Scores (Corresponds to Table 2 in PMID 18823568)