Statistics iRefIndex 9.0

Revision as of 10:29, 2 September 2011 by PaulBoddie (Updated statistics.)

Interactions

BIND 62884
GRID 24360 263295
DIP 25783 39232 89716
INTACT 24813 37356 38549 155235
MINT 21678 40432 35850 47251 85758
HPRD 1826 8383 1073 5479 4337 40667
OPHID 2418 9274 1462 7537 6911 9927 47479
MPACT 6129 7831 6652 6224 6473 0 0 13328
MPPI 388 153 63 98 93 147 187 0 830
CORUM 247 193 115 244 119 233 237 0 15 2607
BIND_TRANSLATION 56122 24360 24808 24237 21605 2076 2725 5960 344 186 60291
INNATEDB 339 1272 399 783 647 1021 1212 0 52 82 399 6952
MATRIXDB 4 11 2 15 2 14 24 0 2 0 4 5 201
MPILIT 24 0 85 108 32 0 0 0 0 0 24 0 0 745
MPIIMEX 7 0 25 33 14 0 0 0 0 0 7 0 0 30 473
BIND GRID DIP INTACT MINT HPRD OPHID MPACT MPPI CORUM BIND_TRANSLATION INNATEDB MATRIXDB MPILIT MPIIMEX
(5067) (187792) (27732) (83506) (19567) (25104) (28622) (1614) (252) (1935) (3110) (4265) (156) (541) (398)
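
The matrix above can be read as a lower triangle whose column labels sit on the last row; the diagonal entries match the Unique RIGIDs column of Table 5. The sketch below loads the first three rows into a symmetric lookup. It assumes that off-diagonal entries are counts shared between the row source and the column source, which the page does not state explicitly. The parenthesised row is not interpreted here, and the names `ROWS` and `parse_lower_triangle` are hypothetical.

```python
# First three rows of the Interactions matrix, copied from the table above.
ROWS = """\
BIND 62884
GRID 24360 263295
DIP 25783 39232 89716
"""


def parse_lower_triangle(text):
    """Map frozenset({row, column}) to the count found in the triangle."""
    names = []
    shared = {}
    for line in text.strip().splitlines():
        name, *values = line.split()
        names.append(name)
        # The i-th value on a row belongs to the i-th source seen so far;
        # the last value is the diagonal (the row source paired with itself).
        for other, value in zip(names, values):
            shared[frozenset((name, other))] = int(value)
    return shared


shared = parse_lower_triangle(ROWS)
print(shared[frozenset(("GRID", "BIND"))])  # off-diagonal entry
print(shared[frozenset(("DIP",))])          # diagonal entry for DIP
```

Using `frozenset` keys makes the lookup independent of the order in which the two sources are given.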

Interactors

BIND 40843
GRID 17950 33559
DIP 17386 18500 29969
INTACT 18803 23233 24392 52757
MINT 16524 18446 19518 25481 31622
HPRD 2870 5757 3444 5955 4497 9863
OPHID 3398 5928 4220 6976 5398 6165 9577
MPACT 4286 4491 4603 4934 4797 0 1 4978
MPPI 682 476 470 636 564 305 424 0 863
CORUM 2071 2344 2212 3217 2532 1833 2248 0 417 4365
BIND_TRANSLATION 35168 17405 16787 18274 15971 2913 3330 3899 660 1931 37322
INNATEDB 1616 2072 1786 2556 2128 1683 2106 0 358 1144 1617 3387
MATRIXDB 112 111 87 138 116 110 144 0 18 52 111 89 221
MPILIT 88 0 332 441 227 0 0 0 0 0 89 0 0 937
MPIIMEX 33 0 111 128 65 0 0 0 0 0 31 0 0 92 473
BIND GRID DIP INTACT MINT HPRD OPHID MPACT MPPI CORUM BIND_TRANSLATION INNATEDB MATRIXDB MPILIT MPIIMEX
(4329) (6404) (2245) (15612) (4048) (1816) (944) (12) (35) (521) (1457) (264) (35) (367) (282)

Summary of mapping interaction records to RIGs (Table 5)

Source Total records Protein-only interactors PPI Assigned to RIGID Unique RIGIDs
bind 193648 93957 91249(97.1178%) 62884(68.9147%)
grid 392218 386984 386462(99.8651%) 263295(68.1296%)
dip 90994 90994 89910(98.8087%) 89716(99.7842%)
intact 183079 181384 180718(99.6328%) 155235(85.8990%)
mint 122775 122775 122269(99.5879%) 85758(70.1388%)
HPRD 83022 83022 83022(100.0000%) 40667(48.9834%)
ophid 73257 73257 73160(99.8676%) 47479(64.8975%)
MPACT 16504 16504 16293(98.7215%) 13328(81.8020%)
MPPI 1814 1814 1699(93.6604%) 830(48.8523%)
CORUM 2844 2844 2844(100.0000%) 2607(91.6667%)
BIND_Translation 192923 87081 83347(95.7120%) 60291(72.3373%)
InnateDB 14729 11476 11176(97.3858%) 6952(62.2047%)
MatrixDB 846 349 321(91.9771%) 201(62.6168%)
mpilit 745 745 745(100.0000%) 745(100.0000%)
mpiimex 473 473 473(100.0000%) 473(100.0000%)
ALL 1369871 1153659 1143688(99.1357%) 533807(46.6742%)
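
The percentages in Table 5 appear to use two different denominators: the first is the assigned count relative to the protein-only count, and the second is the unique RIGID count relative to the assigned count. This is inferred from the numbers, not stated on the page. A minimal sketch that reproduces the bind and ALL rows under that assumption (the function and parameter names are hypothetical):

```python
def table5_percentages(protein_only, assigned, unique_rigids):
    """Return the two Table 5 percentages as strings with four decimals."""
    # Share of protein-only records that received a RIGID.
    assigned_pct = 100.0 * assigned / protein_only
    # Share of assigned records that correspond to distinct RIGIDs.
    unique_pct = 100.0 * unique_rigids / assigned
    return f"{assigned_pct:.4f}%", f"{unique_pct:.4f}%"


bind = table5_percentages(93957, 91249, 62884)
all_sources = table5_percentages(1153659, 1143688, 533807)
print(bind)
print(all_sources)
```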

Assignment of protein interactors to ROGs (Table 3)

Source Protein_Interactors Assigned % Arbitrary N_and_Y Unassigned Unique proteins
bind 285482 272483 95.4466 0 9055 3924 40843
BIND_Translation 264346 239986 90.7848 70 15384 8902 37322
CORUM 12916 12909 99.9458 7 0 0 4365
dip 30978 29258 94.4477 787 450 483 29969
grid 41420 34125 82.3877 7065 7 223 33559
HPRD 123812 108048 87.2678 15533 231 0 9863
InnateDB 27209 26833 98.6181 0 0 376 3387
intact 150972 148099 98.0970 38 2433 402 52757
MatrixDB 1123 1077 95.9038 0 0 46 221
mint 87509 83387 95.2896 51 3926 145 31622
MPACT 40349 40118 99.4275 0 1 230 4978
mpiimex 946 946 100.0000 0 0 0 473
mpilit 1490 1487 99.7987 3 0 0 937
MPPI 3628 3455 95.2315 0 42 131 863
ophid 146423 145149 99.1299 265 1003 6 9577
All 1218603 1147384 94.1557 23819 32532 14868 96627

ROG summary

Decimal_score Binary_flag String_score Score_class Proteins Percentage bind grid dip intact mint mpiimex mpilit HPRD ophid InnateDB MatrixDB MPACT BIND_Translation MPPI CORUM
786 000000001100010010 STO+ -1 8853 0.7265% 0 0 0 0 0 0 0 8853 0 0 0 0 0 0 0
1938 000000011110010010 STMOX+ -1 496 0.0407% 0 0 0 2 0 0 0 494 0 0 0 0 0 0 0
5506 000001010110000010 SMXL+ -1 79 0.0065% 0 0 0 0 0 0 0 79 0 0 0 0 0 0 0
914 000000001110010010 STMO+ -1 18 0.0015% 0 0 0 16 0 0 0 2 0 0 0 0 0 0 0
217346 110101000100000010 SLENQ+ -1 7 0.0006% 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0
131093 100000000000010101 PUTQ -1 5 0.0004% 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0
163905 101000000001000001 PDYQ -1 2 0.0002% 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0
218370 110101010100000010 SXLENQ+ -1 1 0.0001% 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
163921 101000000001010001 PTDYQ -1 1 0.0001% 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
1 000000000000000001 P 1 666026 54.6549% 127542 28701 0 101092 49820 936 1405 0 124771 26833 828 0 188186 3021 12891
2 000000000000000010 S 1 45077 3.6991% 0 66 22776 12 270 0 0 21388 0 0 0 0 565 0 0
554 000000001000101010 SVGO 1 17263 1.4166% 0 0 0 0 0 0 0 17263 0 0 0 0 0 0 0
8194 000010000000000010 SI 1 12276 1.0074% 12276 0 0 0 0 0 0 0 0 0 0 0 0 0 0
65 000000000001000001 PD 1 7142 0.5861% 7141 0 0 0 1 0 0 0 0 0 0 0 0 0 0
41 000000000000101001 PVG 1 1936 0.1589% 0 1936 0 0 0 0 0 0 0 0 0 0 0 0 0
42 000000000000101010 SVG 1 1123 0.0922% 0 0 122 0 0 0 0 1001 0 0 0 0 0 0 0
131201 100000000010000001 PMQ 1 942 0.0773% 0 0 0 0 0 0 0 0 0 0 0 0 942 0 0
139265 100010000000000001 PIQ 1 295 0.0242% 0 0 0 0 0 0 0 0 0 0 0 0 295 0 0
129 000000000010000001 PM 1 216 0.0177% 0 0 0 47 0 0 0 0 0 0 137 0 0 32 0
10 000000000000001010 SV 1 71 0.0058% 0 0 5 35 31 0 0 0 0 0 0 0 0 0 0
8193 000010000000000001 PI 1 36 0.0030% 0 0 0 28 8 0 0 0 0 0 0 0 0 0 0
66 000000000001000010 SD 1 22 0.0018% 0 4 0 0 0 0 0 0 0 0 18 0 0 0 0
9 000000000000001001 PV 1 5 0.0004% 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0
5 000000000000000101 PU 2 21578 1.7707% 0 0 0 61 235 9 3 0 20246 0 0 0 684 322 18
16386 000100000000000010 SE 2 4866 0.3993% 4866 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6 000000000000000110 SU 2 173 0.0142% 0 0 128 26 5 0 0 13 0 0 0 0 1 0 0
16385 000100000000000001 PE 2 153 0.0126% 0 0 0 144 9 0 0 0 0 0 0 0 0 0 0
147458 100100000000000010 SEQ 2 150 0.0123% 0 0 0 4 0 0 0 0 0 0 0 0 146 0 0
147457 100100000000000001 PEQ 2 52 0.0043% 0 0 0 0 0 0 0 0 0 0 0 0 52 0 0
773 000000001100000101 PUO+ 2 17 0.0014% 0 0 0 7 1 0 0 0 9 0 0 0 0 0 0
16514 000100000010000010 SME 2 9 0.0007% 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1797 000000011100000101 PUOX+ 2 4 0.0003% 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0
774 000000001100000110 SUO+ 2 1 0.0001% 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
17 000000000000010001 PT 3 230967 18.9534% 115853 3418 0 46440 32920 1 79 0 118 0 2 30656 1434 46 0
18 000000000000010010 ST 3 45443 3.7291% 0 0 6223 1 18 0 0 32179 0 0 0 7001 21 0 0
131217 100000000010010001 PTMQ 3 27945 2.2932% 0 0 0 0 0 0 0 0 0 0 0 0 27945 0 0
146 000000000010010010 STM 3 23248 1.9078% 0 0 0 0 0 0 0 23248 0 0 0 0 0 0 0
81 000000000001010001 PTD 3 2506 0.2056% 2414 0 0 0 1 0 0 0 0 0 91 0 0 0 0
8210 000010000000010010 STI 3 915 0.0751% 915 0 0 0 0 0 0 0 0 0 0 0 0 0 0
145 000000000010010001 PTM 3 662 0.0543% 605 0 0 23 0 0 0 0 0 0 0 0 0 34 0
139281 100010000000010001 PTIQ 3 84 0.0069% 0 0 0 0 0 0 0 0 0 0 0 0 84 0 0
163985 101000000010010001 PTMYQ 3 58 0.0048% 0 0 0 0 0 0 0 0 0 0 0 0 58 0 0
8209 000010000000010001 PTI 3 16 0.0013% 0 0 0 16 0 0 0 0 0 0 0 0 0 0 0
82 000000000001010010 STD 3 10 0.0008% 0 0 0 0 9 0 0 0 0 0 1 0 0 0 0
16530 000100000010010010 STME 3 7 0.0006% 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0
26 000000000000011010 SVT 3 1 0.0001% 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
147474 100100000000010010 STEQ 4 2496 0.2048% 0 0 0 2 0 0 0 0 0 0 0 0 2494 0 0
16402 000100000000010010 STE 4 856 0.0702% 855 0 1 0 0 0 0 0 0 0 0 0 0 0 0
22 000000000000010110 SUT 4 137 0.0112% 0 0 3 0 0 0 0 134 0 0 0 0 0 0 0
790 000000001100010110 SUTO+ 4 48 0.0039% 0 0 0 19 27 0 0 2 0 0 0 0 0 0 0
789 000000001100010101 PUTO+ 4 32 0.0026% 0 0 0 28 4 0 0 0 0 0 0 0 0 0 0
16401 000100000000010001 PTE 4 3 0.0002% 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0
131073 100000000000000001 PQ 5 16324 1.3396% 0 0 0 6 0 0 0 0 0 0 0 0 16318 0 0
5378 000001010100000010 SXL+ 5 13950 1.1448% 0 0 0 15 1 0 0 13934 0 0 0 0 0 0 0
4393 000001000100101001 PVGL+ 5 7049 0.5784% 0 7049 0 0 0 0 0 0 0 0 0 0 0 0 0
810 000000001100101010 SVGO+ 5 3471 0.2848% 0 0 0 0 0 0 0 3471 0 0 0 0 0 0 0
21 000000000000010101 PUT 5 2521 0.2069% 0 0 0 28 23 0 0 0 5 0 0 2461 4 0 0
4394 000001000100101010 SVGL+ 5 1621 0.1330% 0 0 112 0 0 0 0 1509 0 0 0 0 0 0 0
131089 100000000000010001 PTQ 5 860 0.0706% 0 0 0 48 0 0 0 0 0 0 0 0 812 0 0
4354 000001000100000010 SL+ 5 670 0.0550% 0 16 652 2 0 0 0 0 0 0 0 0 0 0 0
4357 000001000100000101 PUL+ 5 241 0.0198% 0 0 0 0 0 0 3 0 222 0 0 0 9 0 7
4373 000001000100010101 PUTL+ 5 70 0.0057% 0 0 0 8 3 0 0 0 4 0 0 0 55 0 0
5381 000001010100000101 PUXL+ 5 56 0.0046% 0 0 0 12 5 0 0 0 39 0 0 0 0 0 0
5386 000001010100001010 SVXL+ 5 43 0.0035% 0 0 0 1 42 0 0 0 0 0 0 0 0 0 0
4374 000001000100010110 SUTL+ 5 30 0.0025% 0 0 17 0 0 0 0 7 0 0 0 0 6 0 0
4358 000001000100000110 SUL+ 5 6 0.0005% 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0
5382 000001010100000110 SUXL+ 5 4 0.0003% 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0
32769 001000000000000001 PY 6 15958 1.3095% 3658 5 0 1873 3388 0 0 0 750 0 0 0 6279 5 0
65601 010000000001000001 PDN 6 8727 0.7161% 52 0 0 2 247 0 0 0 253 0 0 0 8168 5 0
81922 010100000000000010 SEN 6 4426 0.3632% 4426 0 0 0 0 0 0 0 0 0 0 0 0 0 0
65537 010000000000000001 PN 6 959 0.0787% 35 0 190 263 256 0 0 204 0 0 0 0 0 11 0
32833 001000000001000001 PDY 6 769 0.0631% 769 0 0 0 0 0 0 0 0 0 0 0 0 0 0
32770 001000000000000010 SY 6 420 0.0345% 0 2 258 88 25 0 0 0 0 0 0 1 46 0 0
163969 101000000010000001 PMYQ 6 396 0.0325% 0 0 0 0 0 0 0 0 0 0 0 0 396 0 0
212993 110100000000000001 PENQ 6 291 0.0239% 0 0 0 0 0 0 0 0 0 0 0 0 291 0 0
73729 010010000000000001 PIN 6 198 0.0162% 0 0 0 198 0 0 0 0 0 0 0 0 0 0 0
32785 001000000000010001 PTY 6 164 0.0135% 104 0 0 2 0 0 0 0 0 0 0 0 58 0 0
65553 010000000000010001 PTN 6 31 0.0025% 0 0 0 4 0 0 0 27 0 0 0 0 0 0 0
196609 110000000000000001 PNQ 6 31 0.0025% 0 0 0 0 0 0 0 0 0 0 0 0 31 0 0
81921 010100000000000001 PEN 6 27 0.0022% 0 0 0 1 10 0 0 0 0 0 0 0 0 16 0
65617 010000000001010001 PTDN 6 23 0.0019% 0 0 0 0 0 0 0 0 0 0 0 0 23 0 0
196625 110000000000010001 PTNQ 6 22 0.0018% 0 0 0 0 0 0 0 0 0 0 0 0 22 0 0
81938 010100000000010010 STEN 6 9 0.0007% 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0
147473 100100000000010001 PTEQ 6 3 0.0002% 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0
213009 110100000000010001 PTENQ 6 2 0.0002% 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0
32786 001000000000010010 STY 6 2 0.0002% 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0
32897 001000000010000001 PMY 6 2 0.0002% 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0
81937 010100000000010001 PTEN 6 2 0.0002% 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0
163857 101000000000010001 PTYQ 6 1 0.0001% 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
32913 001000000010010001 PTMY 6 1 0.0001% 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
40978 001010000000010010 STIY 6 1 0.0001% 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
81986 010100000001000010 SDEN 6 1 0.0001% 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
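
The three score columns in the ROG summary encode the same information: Binary_flag is the 18-bit binary form of Decimal_score, and each letter of String_score corresponds to one set bit. The sketch below decodes a decimal score. The bit positions and the letter order are inferred from the rows of this table, not taken from the iRefIndex source code, so treat them as assumptions; bit 11 is not used by any row above. The feature letters themselves are described in the Scores table (Table 2).

```python
# Bit position of each feature letter, inferred from the table rows
# (for example 1 = P, 2 = S, 5 = PU, 8193 = PI, 131073 = PQ).
FEATURE_BITS = {
    "P": 0, "S": 1, "U": 2, "V": 3, "T": 4, "G": 5, "D": 6, "M": 7,
    "+": 8, "O": 9, "X": 10, "L": 12, "I": 13, "E": 14, "Y": 15,
    "N": 16, "Q": 17,
}

# Order in which letters appear in String_score, as observed in the table.
LETTER_ORDER = "PSUVTGDMOXLIEYNQ+"


def binary_flag(decimal_score, width=18):
    """Zero-padded binary form of the score, most significant bit first."""
    return format(decimal_score, f"0{width}b")


def decode_score(decimal_score):
    """Return the feature letters whose bits are set in the score."""
    return "".join(
        letter
        for letter in LETTER_ORDER
        if (decimal_score >> FEATURE_BITS[letter]) & 1
    )


print(binary_flag(786), decode_score(786))
print(binary_flag(217346), decode_score(217346))
```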

Scores (Table 2)

Character Description of feature (when the value is 1) Frequency
D The source database (D) listed in the interaction record differs from the one expected for the protein's accession. In specific cases, this difference is tolerated and the assignment is made. 19203(1.5953%)
E The protein reference was a retired NCBI identifier or a UniProt identifier. NCBI's eUtils (E) were used to retrieve the current accession and/or sequence. For identifiers that still had no sequence after going through eUtils, sequence information was obtained from UniProt. 13361(1.11%)
G The interaction record's reference for the protein was an EntrezGene (G) identifier. The corresponding products of the gene were used to make the assignment. 32463(2.6969%)
L More than one assignment is possible (see + below), e.g. isoforms for a geneid. In such a situation, a reference is picked using a ranking system (first look for RefSeq, then UniProt). If ambiguity remains even after this ranking, the reference with the longest sequence is selected. (Please note that this score class definition differs from the originally published one.) 23827(1.9795%)
M The protein reference listed by the interaction record was a typographical modification (M) of a known accession. In specific cases, this variation is tolerated and the assignment is made. 54079(4.4927%)
+ More than one assignment is possible (+). This case may arise in one of three ways. 1) The reference supplied by the interaction record requires updating but more than one possibility exists. For example, Q7XJL8 was found to be a secondary accession in three separate UniProt records (Q3EBZ2, Q6DR20, and Q8GWA9). 2) The secondary references supplied by the interaction record point to more than one unique protein sequence. 3) An EntrezGene identifier is provided in the interaction record as a protein reference. This identifier points to more than one protein product. An attempt is made to resolve this ambiguity as indicated by ROG score features O, X or L (see their entries in this table). 36767(3.0545%)
N The protein reference, taxonomy identifier and sequence for the protein as provided in the interaction record are used to make a new entry in the SEGUID table. The protein interactor is assigned the newly (N) generated ROG identifier. 14757(1.226%)
O More than one assignment is possible (see + above). The assignment chosen has a SEGUID that is identical to the SEGUID of the original (O) sequence provided in the interaction record. 30203(2.5092%)
I The protein reference used was an NCBI GenInfo Identifier (I). 13821(1.1482%)
U The protein reference listed in the interaction record and used to make the assignment was a secondary UniProt accession and was updated (U) to a primary UniProt accession in order to make the assignment. 24923(2.0705%)
T The taxonomy (T) identifier for the protein (as supplied by the interaction record) differed from what was found in the protein sequence record. This discrepancy was tolerated and the assignment was made. 348549(28.9562%)
V The protein reference listed by the interaction record contained version (V) information that was ignored. For example, RefSeq accession.version NP_012420.1 was listed but treated as RefSeq accession NP_012420. 32583(2.7069%)
Q The protein reference used to make the assignment was of the type 'see-also'. See PSI-MI Path: entrySet/entry/interactorList/interactor/xref/primaryRef/refType = 'see-also'. 49968(4.1512%)
P The interaction record's primary (P) reference for the protein was used to make the assignment. 1015411(84.3567%)
S One of the interaction record's secondary (S) references for the protein was used to make the assignment. 188300(15.6433%)
Y The protein reference was an accession that was removed from RefSeq or UniProt after the beta3 build of iRefIndex (March 9th, 2009). 17775(1.4767%)
X More than one assignment is possible (see + above). The assignment chosen has the same taxonomy (X) identifier as listed in the interaction record. 14633(1.2157%)
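
Two of the rules in Table 2 are mechanical enough to illustrate: feature V (version information is ignored) and feature L (prefer RefSeq, then UniProt, then the longest sequence). The following is a minimal sketch of those two descriptions only, not the iRefIndex implementation. The `Candidate` class, the database labels, the placeholder accessions and sequences, and the handling of any remaining tie (the first candidate wins) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    accession: str
    database: str  # hypothetical labels: "RefSeq", "UniProt", or anything else
    sequence: str


def strip_version(accession):
    """Feature V: drop a trailing '.<digits>' version suffix if present."""
    base, dot, version = accession.rpartition(".")
    return base if dot and version.isdigit() else accession


# Lower rank is preferred; unknown databases sort after RefSeq and UniProt.
DATABASE_RANK = {"RefSeq": 0, "UniProt": 1}


def pick_reference(candidates):
    """Feature L: rank by database, then prefer the longest sequence."""
    return min(
        candidates,
        key=lambda c: (DATABASE_RANK.get(c.database, 2), -len(c.sequence)),
    )


# Placeholder candidates; the accessions and sequences are made up.
candidates = [
    Candidate("uniprot_long", "UniProt", "MKTAYIAK"),
    Candidate("refseq_short", "RefSeq", "MKT"),
    Candidate("refseq_long", "RefSeq", "MKTAY"),
]
chosen = pick_reference(candidates)
print(strip_version("NP_012420.1"), chosen.accession)
```

Note that the longest UniProt candidate is not chosen here, because the database ranking is applied before sequence length is considered.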