Statistics iRefIndex free 6.0

Revision as of 16:01, 14 December 2009 by Sabry

Summary

  • Total interactions: 641,424
  • Total distinct interactions (based on RIGID): 327,736 (51.1% of total interactions)
  • Total distinct proteins (based on ROGID): 76,013 (70,963 of them participate in distinct interactions)

Interactions (Corresponds to Table 6 in PMID 18823568)

BIND 62706
BIOGRID 22310 163090
DIP 0 0 0
HPRD 0 0 0 0
INTACT 24410 31812 0 0 112107
MINT 22293 37478 0 0 44319 76322
MPACT 0 0 0 0 0 0 0
MPPI 386 135 0 0 90 73 0 825
OPHID 2226 8303 0 0 7277 6437 0 183 47331
CORUM 0 0 0 0 0 0 0 0 0 0
I2D 0 0 0 0 0 0 0 0 0 0 0
BIND BIOGRID DIP HPRD INTACT MINT MPACT MPPI OPHID CORUM I2D
(30476) (109784) (0) (0) (57306) (17760) (0) (304) (32821) (0) (0)

Interactors

BIND 40749
BIOGRID 16881 27409
DIP 0 0 0
HPRD 0 0 0 0
INTACT 18199 19405 0 0 41624
MINT 16426 17053 0 0 23360 28589
MPACT 0 0 0 0 0 0 0
MPPI 672 423 0 0 579 511 0 859
OPHID 3253 5217 0 0 5811 4804 0 421 9629
CORUM 0 0 0 0 0 0 0 0 0 0
I2D 0 0 0 0 0 0 0 0 0 0 0
BIND BIOGRID DIP HPRD INTACT MINT MPACT MPPI OPHID CORUM I2D
(19525) (4859) (0) (0) (12981) (3617) (0) (52) (1966) (0) (0)


Summary of mapping interaction records to RIGs (Corresponds to Table 5 in PMID 18823568)

Source Total records Protein-only interactors PPI Assigned to RIGID Unique RIGIDs
bind 193648 93957 90998(96.8507%) 62706(68.9092%)
grid 239485 238211 237796(99.8258%) 163090(68.5840%)
intact 133302 132525 130303(98.3233%) 112107(86.0356%)
mint 110788 110788 107860(97.3571%) 76322(70.7602%)
ophid 73257 73257 72779(99.3475%) 47331(65.0339%)
MPPI 1814 1814 1688(93.0540%) 825(48.8744%)
ALL 752294 650552 641424(98.5969%) 327736(51.0951%)
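The percentages in this table can be reproduced from the counts. The sketch below assumes, based only on the numbers themselves, that the "Assigned to RIGID" percentage is taken relative to the protein-only interactor records and that the "Unique RIGIDs" percentage is taken relative to the assigned records. The names `ROWS`, `pct` and `table5_percentages` are made up for illustration.

```python
# Rows copied from the table above:
# (source, protein-only interactor records, assigned to RIGID, unique RIGIDs)
ROWS = [
    ("bind", 93957, 90998, 62706),
    ("grid", 238211, 237796, 163090),
    ("intact", 132525, 130303, 112107),
    ("mint", 110788, 107860, 76322),
    ("ophid", 73257, 72779, 47331),
    ("MPPI", 1814, 1688, 825),
    ("ALL", 650552, 641424, 327736),
]


def pct(part, whole):
    """Return part/whole as a percentage rounded to four decimals."""
    return round(100.0 * part / whole, 4)


def table5_percentages(rows=ROWS):
    """Map each source to (assigned %, unique %).

    Assumption: assigned % = assigned / protein-only records,
    unique % = unique RIGIDs / assigned records.
    """
    return {
        source: (pct(assigned, protein_only), pct(unique, assigned))
        for source, protein_only, assigned, unique in rows
    }


if __name__ == "__main__":
    for source, (assigned_pct, unique_pct) in table5_percentages().items():
        print(f"{source:8s} assigned {assigned_pct:8.4f}%  unique {unique_pct:8.4f}%")
```

Under these assumptions the ALL row gives roughly 51.1% unique RIGIDs, which matches the figure quoted in the Summary.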


Assignment of protein interactors to ROGs (Corresponds to Table 3 in PMID 18823568)

Source Protein_Interactors Assigned % Arbitrary New Unassigned Unique proteins
bind 285482 275666 96.5616 0 5572 3938 40749
grid 27629 22417 81.1358 5079 0 133 27409
intact 100752 99219 98.4784 19 449 239 41624
mint 77936 76567 98.2434 2 298 228 28589
MPPI 3628 3459 95.3418 0 27 140 859
ophid 146423 145653 99.4741 103 280 6 9629
All 641850 625337 97.4273 5203 6626 4684 76013



ROG summary (Corresponds to Table 4 in PMID 18823568)

Decimal_score Binary_flag String_score Score_class Proteins Percentage BIND BioGrid DIP MINT HPRD OPHID MPPI MPACT IntAct CORUM
1 000000000000000001 P 1 546982 85.2196% 232419 19992 0 72231 0 125686 3023 0 93631 0
8194 000010000000000010 SI 1 12336 1.9219% 12336 0 0 0 0 0 0 0 0 0
65 000000000001000001 PD 1 8086 1.2598% 8083 0 0 3 0 0 0 0 0 0
2 000000000000000010 S 1 2877 0.4482% 0 2410 0 231 0 0 0 0 236 0
129 000000000010000001 PM 1 522 0.0813% 473 0 0 0 0 0 32 0 17 0
10 000000000000001010 SV 1 240 0.0374% 0 0 0 9 0 0 0 0 231 0
8193 000010000000000001 PI 1 48 0.0075% 0 0 0 0 0 0 0 0 48 0
5 000000000000000101 PU 2 20322 3.1662% 0 0 0 293 0 19519 320 0 190 0
16386 000100000000000010 SE 2 5413 0.8433% 5413 0 0 0 0 0 0 0 0 0
16385 000100000000000001 PE 2 884 0.1377% 0 0 0 199 0 0 0 0 685 0
16449 000100000001000001 PDE 2 116 0.0181% 0 0 0 34 0 0 0 0 82 0
6 000000000000000110 SU 2 12 0.0019% 0 0 0 4 0 0 0 0 8 0
773 000000001100000101 PUO+ 2 12 0.0019% 0 0 0 3 0 1 0 0 8 0
778 000000001100001010 SVO+ 2 1 0.0002% 0 0 0 0 0 0 0 0 1 0
774 000000001100000110 SUO+ 2 1 0.0002% 0 0 0 0 0 0 0 0 1 0
17 000000000000010001 PT 3 16115 2.5107% 11871 0 0 1620 0 122 46 0 2456 0
81 000000000001010001 PTD 3 1487 0.2317% 1486 0 0 1 0 0 0 0 0 0
8210 000010000000010010 STI 3 855 0.1332% 855 0 0 0 0 0 0 0 0 0
145 000000000010010001 PTM 3 170 0.0265% 132 0 0 0 0 0 35 0 3 0
8209 000010000000010001 PTI 3 12 0.0019% 0 0 0 0 0 0 0 0 12 0
18 000000000000010010 ST 3 3 0.0005% 0 0 0 0 0 0 0 0 3 0
26 000000000000011010 SVT 3 1 0.0002% 0 0 0 0 0 0 0 0 1 0
16402 000100000000010010 STE 4 315 0.0491% 315 0 0 0 0 0 0 0 0 0
789 000000001100010101 PUTO+ 4 14 0.0022% 0 0 0 0 0 0 0 0 14 0
16401 000100000000010001 PTE 4 4 0.0006% 0 0 0 0 0 0 0 0 4 0
4354 000001000100000010 SL+ 5 5079 0.7913% 0 5079 0 0 0 0 0 0 0 0
4357 000001000100000101 PUL+ 5 84 0.0131% 0 0 0 0 0 84 0 0 0 0
131089 100000000000010001 PTQ 5 38 0.0059% 0 0 0 0 0 0 0 0 38 0
5381 000001010100000101 PUXL+ 5 29 0.0045% 0 0 0 0 0 19 0 0 10 0
4373 000001000100010101 PUTL+ 5 9 0.0014% 0 0 0 1 0 0 0 0 8 0
21 000000000000010101 PUT 5 7 0.0011% 0 0 0 2 0 5 0 0 0 0
1802 000000011100001010 SVOX+ 5 3 0.0005% 0 0 0 0 0 0 0 0 3 0
5386 000001010100001010 SVXL+ 5 2 0.0003% 0 0 0 1 0 0 0 0 1 0
131077 100000000000000101 PUQ 5 1 0.0002% 0 0 0 0 0 0 0 0 1 0
131073 100000000000000001 PQ 5 1 0.0002% 0 0 0 0 0 0 0 0 1 0
32769 001000000000000001 PY 6 5397 0.8409% 1638 15 0 1912 0 320 3 0 1509 0
81922 010100000000000010 SEN 6 4636 0.7223% 4636 0 0 0 0 0 0 0 0 0
65537 010000000000000001 PN 6 1052 0.1639% 750 0 0 47 0 27 4 0 224 0
65601 010000000001000001 PDN 6 674 0.1050% 166 0 0 248 0 253 5 0 2 0
32833 001000000001000001 PDY 6 640 0.0997% 640 0 0 0 0 0 0 0 0 0
32770 001000000000000010 SY 6 59 0.0092% 0 0 0 25 0 0 0 0 34 0
81938 010100000000010010 STEN 6 19 0.0030% 19 0 0 0 0 0 0 0 0 0
81921 010100000000000001 PEN 6 19 0.0030% 0 0 0 3 0 0 16 0 0 0
65553 010000000000010001 PTN 6 9 0.0014% 1 0 0 0 0 0 0 0 8 0
32785 001000000000010001 PTY 6 5 0.0008% 5 0 0 0 0 0 0 0 0 0
81937 010100000000010001 PTEN 6 2 0.0003% 0 0 0 0 0 0 2 0 0 0
147473 100100000000010001 PTEQ 6 1 0.0002% 0 0 0 0 0 0 0 0 1 0
163841 101000000000000001 PYQ 6 1 0.0002% 0 0 0 0 0 0 0 0 1 0
73729 010010000000000001 PIN 6 215 0.0335% 0 0 0 0 0 0 0 0 215 0
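The three score columns above encode the same information: Binary_flag is the 18-bit binary form of Decimal_score, and String_score lists one character per set bit. The sketch below decodes a decimal score. The bit-to-character mapping is not stated on this page; it is inferred from the rows of the table, so bits 5 and 11 (which never appear here) are left unmapped. The ordering rule (ascending bit position, with "+" written last) is likewise an observation from the table rather than a documented rule. The names `BIT_TO_CHAR`, `binary_flag` and `string_score` are hypothetical.

```python
# Bit position (0 = least significant) -> feature character.
# Inferred from the table rows above; bits 5 and 11 do not occur there.
BIT_TO_CHAR = {
    0: "P", 1: "S", 2: "U", 3: "V", 4: "T",
    6: "D", 7: "M", 8: "+", 9: "O", 10: "X",
    12: "L", 13: "I", 14: "E", 15: "Y", 16: "N", 17: "Q",
}
WIDTH = 18  # number of digits in the Binary_flag column


def binary_flag(score):
    """Return the zero-padded 18-digit binary form of a decimal score."""
    return format(score, "018b")


def string_score(score):
    """Return the String_score for a decimal score.

    Characters are emitted in ascending bit order, except that '+'
    is moved to the end, as seen in rows such as PUXL+ and SVOX+.
    """
    chars = []
    for bit in range(WIDTH):
        if score & (1 << bit):
            if bit not in BIT_TO_CHAR:
                raise ValueError(f"bit {bit} has no known character")
            chars.append(BIT_TO_CHAR[bit])
    if "+" in chars:
        chars.remove("+")
        chars.append("+")
    return "".join(chars)


if __name__ == "__main__":
    for score in (1, 8194, 773, 5381, 81938):
        print(score, binary_flag(score), string_score(score))
```

For example, 8194 = 8192 + 2 sets bits 13 and 1, which gives the flag 000010000000000010 and the string SI, as in the second row of the table.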

Scores (Corresponds to Table 2 in PMID 18823568)

Character Description of feature (when the value is 1) Frequency
D The source database (D) listed in the interaction record is different than what is expected for the given accession for the protein. In specific cases, this difference is tolerated and the assignment is made. 11003(1.7333%)
E The protein reference was a retired NCBI identifier or a UniProt identifier. NCBI's eUtils (E) were used to retrieve the current accession and/or sequence. For identifiers that still had no sequence after going through eUtils, sequence information was obtained from UniProt. 11409(1.7972%)
G The interaction record's reference for the protein was an EntrezGene (G) identifier. The corresponding products of the gene were used to make the assignment. 0(0.0%)
L More than one assignment is possible (see + below), e.g. isoforms for a GeneID. In such a situation, references are picked using a ranking system (first RefSeq, then UniProt). If ambiguity remains after this ranking, the reference with the longest sequence is selected. (Please note that this score class definition differs from the originally published one.) 5203(0.8196%)
M The protein reference listed by the interaction record was a typographical modification (M) of a known accession. In specific cases, this variation is tolerated and the assignment is made. 692(0.109%)
+ More than one assignment is possible (+). This case may arise in one of three ways. 1) The reference supplied by the interaction record requires updating but more than one possibility exists. For example, Q7XJL8 was found to be a secondary accession in three separate UniProt records (Q3EBZ2, Q6DR20, and Q8GWA9). 2) The secondary references supplied by the interaction record point to more than one unique protein sequence. 3) An EntrezGene identifier is provided in the interaction record as a protein reference. This identifier points to more than one protein product. An attempt is made to resolve this ambiguity as indicated by ROG score features O, X or L (see the corresponding rows). 5234(0.8245%)
N The protein reference, taxonomy identifier and sequence for the protein as provided in the interaction record are used to make a new entry in the SEGUID table. The protein interactor is assigned the newly (N) generated ROG identifier. 6626(1.0438%)
O More than one assignment is possible (see + above). The assignment chosen has a SEGUID that is identical to the SEGUID of the original (O) sequence provided in the interaction record. 31(0.0049%)
I The protein reference used was an NCBI GenInfo Identifier (I). 13466(2.1213%)
U The protein reference listed in the interaction record and used to make the assignment was a secondary UniProt accession and was updated (U) to a primary UniProt accession in order to make the assignment. 20491(3.2279%)
T The taxonomy (T) identifier for the protein (as supplied by the interaction record) differed from what was found in the protein sequence record. This discrepancy was tolerated and the assignment was made. 19066(3.0034%)
V The protein reference listed by the interaction record contained version (V) information that was ignored. For example, RefSeq accession.version NP_012420.1 was listed but treated as RefSeq accession NP_012420. 247(0.0389%)
Q The protein reference used to make the assignment was of the type 'see-also'. See PSI-MI Path: entrySet/entry/interactorList/interactor/xref/primaryRef/refType = 'see-also'. 42(0.0066%)
P The interaction record's primary (P) reference for the protein was used to make the assignment. 602958(94.9824%)
S One of the interaction record's secondary (S) references for the protein was used to make the assignment. 31852(5.0176%)
Y The protein reference was an accession that was removed from RefSeq or UniProt after the beta3 build of iRefIndex (March 9th, 2009). 6102(0.9612%)
X More than one assignment is possible (see + above). The assignment chosen has the same taxonomy (X) identifier as listed in the interaction record. 34(0.0054%)
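The page does not say what the frequency percentages are relative to. The counts are consistent with a denominator of P + S = 602958 + 31852 = 634810, i.e. every assigned interactor counted once under either its primary or a secondary reference. The sketch below checks that assumption against the listed values; the denominator is an inference from the numbers, and the names `COUNTS`, `LISTED` and `frequency_percentages` are made up for illustration.

```python
# Counts and listed percentages copied from the table above.
COUNTS = {
    "D": 11003, "E": 11409, "G": 0, "L": 5203, "M": 692, "+": 5234,
    "N": 6626, "O": 31, "I": 13466, "U": 20491, "T": 19066, "V": 247,
    "Q": 42, "P": 602958, "S": 31852, "Y": 6102, "X": 34,
}
LISTED = {
    "D": 1.7333, "E": 1.7972, "G": 0.0, "L": 0.8196, "M": 0.109,
    "+": 0.8245, "N": 1.0438, "O": 0.0049, "I": 2.1213, "U": 3.2279,
    "T": 3.0034, "V": 0.0389, "Q": 0.0066, "P": 94.9824, "S": 5.0176,
    "Y": 0.9612, "X": 0.0054,
}


def frequency_percentages(counts=COUNTS):
    """Return each feature's share of the assumed denominator P + S."""
    total = counts["P"] + counts["S"]  # assumed denominator: 634810
    return {char: 100.0 * n / total for char, n in counts.items()}


if __name__ == "__main__":
    computed = frequency_percentages()
    for char, listed in LISTED.items():
        diff = abs(computed[char] - listed)
        print(f"{char}: computed {computed[char]:.4f}%, listed {listed}%, diff {diff:.5f}")
```

Because a single interactor can carry several feature characters (for example PTD), the percentages other than P and S are not expected to sum to 100.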