Statistics iRefIndex 8.0

Revision as of 13:33, 17 November 2010 by Sabry (talk | contribs)

Summary

Last updated: 2010-11-17

  • Total interaction source records :
  • Total distinct interactions (based on RIGID): ( % of total interactions)
  • Total distinct proteins (based on ROGID) :

This page lists statistics for our internal version of iRefIndex, which includes all of the data from the sources used for the current build (see Sources_iRefIndex_8.0).

Interactions available from major taxonomies

Top 15 uncorrected taxonomy groups in iRefIndex (Taxonomy identifiers as they appear in original source)

  • Full list [[1]]

Top 15 corrected taxonomy groups in iRefIndex (Taxonomy identifiers corrected using sequence database information)

  • Full list [[2]]

Interactions (Corresponds to Table 6 in PMID 18823568)

BIND 62896
BIOGRID 23631 241088
DIP 26457 29866 68301
HPRD 2530 11000 553 40490
INTACT 24320 30088 25124 6603 130316
MINT 22144 36491 30784 5193 47949 83727
MPACT 6965 8287 6857 0 6175 6470 13328
MPPI 382 114 41 198 94 74 0 832
OPHID 0 0 0 0 0 0 0 0 0
CORUM 231 116 66 311 236 108 0 15 0 2607
I2D 31771 67039 34961 24460 69947 58975 8382 646 0 500 481732
BIND_TRANSLATION 55057 23657 23874 3167 23943 22041 6316 379 0 190 33437 61348
INNATEDB 289 656 236 1186 546 431 0 46 0 73 4122 398 6114
MATRIXDB 2 5 3 20 15 2 0 2 0 0 40 5 4 201
BIND BIOGRID DIP HPRD INTACT MINT MPACT MPPI OPHID CORUM I2D BIND_TRANSLATION INNATEDB MATRIXDB
(3656) (156331) (22561) (12571) (47752) (12213) (1112) (91) (0) (1807) (340528) (2311) (1838) (146)
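
The matrix above is lower-triangular: each row names a source and then lists one count per source, in the column order given on the label line. The diagonal entry for each source equals its Unique RIGIDs count in the RIGID mapping table on this page. As a minimal sketch, assuming each off-diagonal entry is the number of distinct interactions shared by the two sources, the rows can be loaded into a symmetric lookup. The names `parse_triangle` and `shared` are illustrative, and only the first four rows are reproduced.

```python
# First four rows of the interaction matrix, copied from the table above.
ROWS = """\
BIND 62896
BIOGRID 23631 241088
DIP 26457 29866 68301
HPRD 2530 11000 553 40490
"""

def parse_triangle(text):
    """Return (sources, counts) where counts[(a, b)] is symmetric in a and b."""
    sources, counts = [], {}
    for line in text.splitlines():
        name, *values = line.split()
        sources.append(name)
        # The i-th row must carry exactly i values (columns follow row order).
        if len(values) != len(sources):
            raise ValueError(
                f"row {name!r} has {len(values)} values, expected {len(sources)}"
            )
        for other, value in zip(sources, values):
            counts[(name, other)] = counts[(other, name)] = int(value)
    return sources, counts

def shared(counts, a, b):
    """Assumed meaning: interactions found in both a and b (a == b gives the source total)."""
    return counts[(a, b)]

sources, counts = parse_triangle(ROWS)
print(shared(counts, "BIOGRID", "DIP"))
```

Storing both `(a, b)` and `(b, a)` keeps lookups independent of argument order, at the cost of holding each off-diagonal value twice.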

Interactors

BIND 40784
BIOGRID 17136 30716
DIP 15599 13707 21334
HPRD 3205 6004 1081 9846
INTACT 18675 20495 16619 6141 50583
MINT 16570 16738 15432 4534 24840 30382
MPACT 4663 4554 4652 0 4879 4771 4978
MPPI 680 406 291 359 625 524 0 867
OPHID 0 0 0 0 0 0 0 0 0
CORUM 2036 1921 859 1967 3115 2280 0 416 0 4365
I2D 19257 20632 14835 8231 29594 22941 4956 816 0 3843 51871
BIND_TRANSLATION 38396 17468 15289 3592 19247 16899 4379 688 0 2208 20031 40956
INNATEDB 1440 1449 778 1602 2167 1722 0 339 0 1047 2908 1601 2998
MATRIXDB 109 92 57 126 136 111 0 18 0 52 189 115 85 222
BIND BIOGRID DIP HPRD INTACT MINT MPACT MPPI OPHID CORUM I2D BIND_TRANSLATION INNATEDB MATRIXDB
(1352) (5320) (1815) (728) (13291) (2757) (8) (14) (0) (305) (14805) (896) (62) (22)


Summary of mapping interaction records to RIGs (Corresponds to Table 5 in PMID 18823568)

Source            Total records  Protein-only interactors (PPI)  Assigned to RIGID    Unique RIGIDs
bind              193648         93957                           91264(97.1338%)      62896(68.9165%)
grid              355947         351586                          351145(99.8746%)     241088(68.6577%)
dip               69463          69463                           68390(98.4553%)      68301(99.8699%)
intact            154833         153547                          152961(99.6184%)     130316(85.1956%)
mint              118153         118153                          117679(99.5988%)     83727(71.1486%)
HPRD              40618          40618                           40618(100.0000%)     40490(99.6849%)
MPACT             16504          16504                           16293(98.7215%)      13328(81.8020%)
MPPI              1814           1814                            1703(93.8809%)       832(48.8550%)
CORUM             2844           2844                            2844(100.0000%)      2607(91.6667%)
I2D               847845         847845                          847837(99.9991%)     481732(56.8189%)
BIND_Translation  213166         89275                           83179(93.1717%)      61348(73.7542%)
InnateDB          13609          10595                           9970(94.1010%)       6114(61.3240%)
MatrixDB          846            349                             321(91.9771%)        201(62.6168%)
ALL               2029290        1796550                         1784204(99.3128%)    792679(44.4276%)
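
The percentages in this table appear to be computed against the preceding column: the assigned count relative to the protein-only records, and the unique RIGID count relative to the assigned records. A short check under that assumption follows. The helper `pct` is illustrative and not part of iRefIndex.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 4 decimals like the table."""
    return round(100.0 * part / whole, 4)

# (protein_only, assigned, unique) copied from the bind and ALL rows above.
rows = {
    "bind": (93957, 91264, 62896),
    "ALL": (1796550, 1784204, 792679),
}

for source, (protein_only, assigned, unique) in rows.items():
    print(source, pct(assigned, protein_only), pct(unique, assigned))
```

Read this way, the ALL row says that roughly 44% of the records assigned a RIGID are distinct interactions, so more than half of the assigned records duplicate an interaction reported elsewhere.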


Assignment of protein interactors to ROGs (Corresponds to Table 3 in PMID 18823568)

Source Protein_Interactors Assigned % Arbitrary N_and_Y Unassigned Unique proteins
bind 285482 272821 95.5650 0 8732 3923 40784
BIND_Translation 297101 265324 89.3043 10290 1549 19938 40956
CORUM 12916 12916 100.0000 0 0 0 4365
dip 21970 20290 92.3532 671 532 477 21334
grid 38197 31321 81.9986 6668 2 206 30716
HPRD 9871 9630 97.5585 192 49 0 9846
I2D 52222 52107 99.7798 4 108 3 51871
InnateDB 25172 24386 96.8775 0 0 786 2998
intact 125474 121570 96.8886 38 3553 313 50583
MatrixDB 1123 1077 95.9038 0 0 46 222
mint 82312 78661 95.5644 1 3524 126 30382
MPACT 40349 40118 99.4275 0 1 230 4978
MPPI 3628 3457 95.2867 0 47 124 867
All 995817 933684 93.7606 17864 18097 26172 105435

ROG summary (Corresponds to Table 4 in PMID 18823568)

Decimal_score Binary_flag String_score Score_class Proteins Percentage BIND BioGrid DIP MINT HPRD OPHID MPPI MPACT IntAct CORUM BIND_Translation
32834 001000000001000010 SDY -1 193 0.0194% 0 0 0 0 0 0 0 0 0 0 193
770 000000001100000010 SO+ -1 6 0.0006% 0 0 0 0 0 0 0 0 6 0 0
1797 000000011100000101 PUOX+ -1 2 0.0002% 0 0 0 0 0 0 0 0 2 0 0
9 000000000000001001 PV -1 1 0.0001% 0 0 0 0 0 0 0 0 1 0 0
1 000000000000000001 P 1 568766 57.1155% 231951 19013 0 76612 0 0 3023 30666 117787 12916 19
8193 000010000000000001 PI 1 225108 22.6054% 0 0 0 3 0 0 0 0 51 0 225054
2 000000000000000010 S 1 53213 5.3437% 0 20 19310 254 2267 0 0 6935 317 0 24110
8194 000010000000000010 SI 1 12336 1.2388% 12336 0 0 0 0 0 0 0 0 0 0
65 000000000001000001 PD 1 8074 0.8108% 8073 0 0 1 0 0 0 0 0 0 0
554 000000001000101010 SVGO 1 2195 0.2204% 0 0 0 0 2195 0 0 0 0 0 0
41 000000000000101001 PVG 1 1747 0.1754% 0 1747 0 0 0 0 0 0 0 0 0
10 000000000000001010 SV 1 1419 0.1425% 0 0 5 19 0 0 0 0 253 0 1142
129 000000000010000001 PM 1 678 0.0681% 473 0 0 0 0 0 32 0 36 0 0
42 000000000000101010 SVG 1 198 0.0199% 0 0 12 0 186 0 0 0 0 0 0
66 000000000001000010 SD 1 117 0.0117% 0 4 9 0 0 0 0 0 0 0 88
16386 000100000000000010 SE 2 5420 0.5443% 5405 0 0 0 15 0 0 0 0 0 0
5 000000000000000101 PU 2 3858 0.3874% 0 0 0 114 0 0 320 2517 365 0 0
16385 000100000000000001 PE 2 1488 0.1494% 0 0 0 0 0 0 0 0 155 0 1333
6 000000000000000110 SU 2 978 0.0982% 0 1 58 4 2 0 0 0 13 0 900
773 000000001100000101 PUO+ 2 9 0.0009% 0 0 0 0 0 0 0 0 9 0 0
778 000000001100001010 SVO+ 2 1 0.0001% 0 0 0 0 0 0 0 0 1 0 0
774 000000001100000110 SUO+ 2 1 0.0001% 0 0 0 0 0 0 0 0 1 0 0
17 000000000000010001 PT 3 26504 2.6615% 11784 10516 0 1654 0 0 47 0 2500 0 0
8209 000010000000010001 PTI 3 10393 1.0437% 0 0 0 0 0 0 0 0 13 0 10380
18 000000000000010010 ST 3 6878 0.6907% 0 20 892 0 4347 0 0 0 1 0 1618
81 000000000001010001 PTD 3 1587 0.1594% 1496 0 0 0 0 0 0 0 0 0 0
8210 000010000000010010 STI 3 855 0.0859% 855 0 0 0 0 0 0 0 0 0 0
145 000000000010010001 PTM 3 189 0.0190% 132 0 0 0 0 0 35 0 22 0 0
26 000000000000011010 SVT 3 27 0.0027% 0 0 0 0 0 0 0 0 1 0 26
82 000000000001010010 STD 3 6 0.0006% 0 0 0 0 0 0 0 0 0 0 6
16401 000100000000010001 PTE 4 650 0.0653% 0 0 0 0 0 0 0 0 2 0 648
16402 000100000000010010 STE 4 356 0.0357% 316 0 1 0 39 0 0 0 0 0 0
789 000000001100010101 PUTO+ 4 14 0.0014% 0 0 0 0 0 0 0 0 14 0 0
22 000000000000010110 SUT 4 4 0.0004% 0 0 3 0 1 0 0 0 0 0 0
4393 000001000100101001 PVGL+ 5 6628 0.6656% 0 6628 0 0 0 0 0 0 0 0 0
4362 000001000100001010 SVL+ 5 5936 0.5961% 0 0 0 0 0 0 0 0 0 0 5936
4354 000001000100000010 SL+ 5 5004 0.5025% 0 40 610 0 0 0 0 0 0 0 4354
810 000000001100101010 SVGO+ 5 578 0.0580% 0 0 0 0 578 0 0 0 0 0 0
4394 000001000100101010 SVGL+ 5 201 0.0202% 0 0 9 0 192 0 0 0 0 0 0
4374 000001000100010110 SUTL+ 5 52 0.0052% 0 0 52 0 0 0 0 0 0 0 0
5386 000001010100001010 SVXL+ 5 21 0.0021% 0 0 0 1 0 0 0 0 20 0 0
131089 100000000000010001 PTQ 5 12 0.0012% 0 0 0 0 0 0 0 0 12 0 0
5381 000001010100000101 PUXL+ 5 10 0.0010% 0 0 0 0 0 0 0 0 10 0 0
4373 000001000100010101 PUTL+ 5 8 0.0008% 0 0 0 0 0 0 0 0 8 0 0
4357 000001000100000101 PUL+ 5 4 0.0004% 0 0 0 0 0 0 0 0 0 0 0
1802 000000011100001010 SVOX+ 5 4 0.0004% 0 0 0 0 0 0 0 0 4 0 0
131073 100000000000000001 PQ 5 3 0.0003% 0 0 0 0 0 0 0 0 3 0 0
21 000000000000010101 PUT 5 2 0.0002% 0 0 0 0 0 0 0 0 0 0 0
32769 001000000000000001 PY 6 8505 0.8541% 3325 1 0 2793 0 0 5 0 2337 0 0
81922 010100000000000010 SEN 6 4503 0.4522% 4503 0 0 0 0 0 0 0 0 0 0
65537 010000000000000001 PN 6 1532 0.1538% 35 0 122 451 49 0 11 0 864 0 0
81921 010100000000000001 PEN 6 1270 0.1275% 0 0 0 8 0 0 22 0 43 0 1133
32833 001000000001000001 PDY 6 755 0.0758% 755 0 0 0 0 0 0 0 0 0 0
32770 001000000000000010 SY 6 572 0.0574% 0 1 409 25 0 0 0 1 94 0 42
65601 010000000001000001 PDN 6 306 0.0307% 52 0 0 247 0 0 5 0 2 0 0
73729 010010000000000001 PIN 6 200 0.0201% 0 0 0 0 0 0 0 0 200 0 0
81937 010100000000010001 PTEN 6 179 0.0180% 0 0 0 0 0 0 2 0 0 0 177
32785 001000000000010001 PTY 6 31 0.0031% 31 0 0 0 0 0 0 0 0 0 0
81938 010100000000010010 STEN 6 31 0.0031% 31 0 0 0 0 0 0 0 0 0 0
65553 010000000000010001 PTN 6 13 0.0013% 0 0 0 0 0 0 0 0 13 0 0
40961 001010000000000001 PIY 6 4 0.0004% 0 0 0 0 0 0 0 0 0 0 4
32897 001000000010000001 PMY 6 2 0.0002% 0 0 0 0 0 0 2 0 0 0 0
32786 001000000000010010 STY 6 1 0.0001% 0 0 1 0 0 0 0 0 0 0 0
147473 100100000000010001 PTEQ 6 1 0.0001% 0 0 0 0 0 0 0 0 1 0 0
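
The first three columns of the table above carry the same information: Decimal_score is the integer value of Binary_flag, and each set bit contributes one character to String_score. The bit-to-character mapping below is inferred from the rows of this table rather than taken from an iRefIndex specification, so treat it as an assumption. No row in this table sets the bit with value 2048, so it is left unmapped. The name `decode_score` is illustrative.

```python
# Bit value -> feature character, inferred from the table rows (assumption).
BIT_TO_CHAR = {
    1: "P", 2: "S", 4: "U", 8: "V", 16: "T", 32: "G", 64: "D", 128: "M",
    256: "+", 512: "O", 1024: "X", 4096: "L", 8192: "I", 16384: "E",
    32768: "Y", 65536: "N", 131072: "Q",
}
KNOWN_BITS = sum(BIT_TO_CHAR)  # bitwise union of all mapped bits

def decode_score(decimal_score):
    """Return (18-digit binary flag, set of feature characters) for a decimal score."""
    if decimal_score < 0 or decimal_score & ~KNOWN_BITS:
        raise ValueError(f"score {decimal_score} sets bits not seen in the table")
    flag = format(decimal_score, "018b")
    chars = {c for bit, c in BIT_TO_CHAR.items() if decimal_score & bit}
    return flag, chars

flag, chars = decode_score(1797)
print(flag, sorted(chars))
```

The characters are returned as a set because the order of characters in String_score does not follow bit order in every row (compare PUOX+ and SVXL+), so a set comparison is the safer check.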


Scores (Corresponds to Table 2 in PMID 18823568)

Character Description of feature (when the value is 1) Frequency
D The source database (D) listed in the interaction record is different than what is expected for the given accession for the protein. In specific cases, this difference is tolerated and the assignment is made. 11106(1.2054%)
E The protein reference was a retired NCBI Identifier or a UniProt identifier. NCBI's eUtils (E) were used to retrieve the current accession and/or sequence. For identifiers that still had no sequence after going through eUtils, sequence information was obtained from UniProt. 12693(1.3777%)
G The interaction record's reference for the protein was an EntrezGene (G) identifier. The corresponding products of the gene were used to make the assignment. 8348(0.9061%)
L More than one assignment is possible (see +), e.g. isoforms for a geneid. In such a situation, references are picked using a ranking system (RefSeq first, then UniProt). If ambiguity remains after this ranking, the reference with the longest sequence is selected. (Please note that this score class definition differs from the originally published one.) 16937(1.8383%)
M The protein reference listed by the interaction record was a typographical modification (M) of a known accession. In specific cases, this variation is tolerated and the assignment is made. 712(0.0773%)
+ More than one assignment is possible (+). This case may arise in one of three ways. 1) The reference supplied by the interaction record requires updating but more than one possibility exists. For example, Q7XJL8 was found to be a secondary accession in three separate UniProt records (Q3EBZ2, Q6DR20, and Q8GWA9). 2) The secondary references supplied by the interaction record point to more than one unique protein sequence. 3) An EntrezGene identifier is provided in the interaction record as a protein reference. This identifier points to more than one protein product. An attempt is made to resolve this ambiguity as indicated by ROG score features O, X or L (see the corresponding rows). 17164(1.863%)
N The protein reference, taxonomy identifier and sequence for the protein as provided in the interaction record are used to make a new entry in the SEGUID table. The protein interactor is assigned the newly (N) generated ROG identifier. 6423(0.6971%)
O More than one assignment is possible (see + above). The assignment chosen has a SEGUID that is identical to the SEGUID of the original (O) sequence provided in the interaction record. 704(0.0764%)
I The protein reference used was an NCBI GenInfo Identifier (I). 167561(18.1868%)
U The protein reference listed in the interaction record and used to make the assignment was a secondary UniProt accession and was updated (U) to a primary UniProt accession in order to make the assignment. 23872(2.591%)
T The taxonomy (T) identifier for the protein (as supplied by the interaction record) differed from what was found in the protein sequence record. This discrepancy was tolerated and the assignment was made. 29603(3.2131%)
V The protein reference listed by the interaction record contained version (V) information that was ignored. For example, RefSeq accession.version NP_012420.1 was listed but treated as RefSeq accession NP_012420. 15147(1.644%)
Q The protein reference used to make the assignment was of the type 'see-also'. See PSI-MI Path: entrySet/entry/interactorList/interactor/xref/primaryRef/refType = 'see-also'. 43(0.0047%)
P The interaction record's primary (P) reference for the protein was used to make the assignment. 832046(90.309%)
S One of the interaction record's secondary (S) references for the protein was used to make the assignment. 89286(9.691%)
Y The protein reference was an accession that was removed from RefSeq or UniProt after the beta3 build of iRefIndex (March 9th, 2009). 10230(1.1103%)
X More than one assignment is possible (see + above). The assignment chosen has the same taxonomy (X) identifier as listed in the interaction record. 38(0.0041%)
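
The frequencies for P and S sum to 921332, and the percentages in this table are consistent with that sum being the denominator, presumably because every scored assignment uses either a primary or a secondary reference. This is an inference from the numbers, not a documented rule. A quick check, with the illustrative helper `share`:

```python
# Frequencies copied from the table above.
freq = {"P": 832046, "S": 89286, "I": 167561, "T": 29603, "Q": 43}

# Assumed denominator: every scored assignment is flagged either P or S.
total = freq["P"] + freq["S"]

def share(character):
    """Percentage of scored assignments carrying the given feature."""
    return 100.0 * freq[character] / total

for character in freq:
    print(character, f"{share(character):.4f}%")
```

Because the features are not mutually exclusive (an assignment can be both P and T, for example), the percentages across all rows are not expected to sum to 100.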


