Difference between revisions of "Statistics iRefIndex free 4.0"

Revision as of 11:22, 9 June 2009

This page lists statistics for the freely available version of iRefIndex distributed at ftp://ftp.no.embnet.org/irefindex/data/current/.

These data are a subset of the full iRefIndex build that we can freely redistribute according to the usage policies of the source databases. Specifically, this data set does not include interactions from DIP, CORUM, HPRD or MPact. Please refer to http://irefindex.uio.no/wiki/Statistics_iRefIndex_4.0 for statistics that are applicable to the full build.


Interactions

BIND 62921
BIOGRID 20497 163891
DIP 0 0 0
HPRD 0 0 0 0
INTACT 24241 25650 0 0 111334
MINT 22154 32446 0 0 42185 73739
MPACT 0 0 0 0 0 0 0
MPPI 385 26 0 0 89 105 0 829
OPHID 2210 1333 0 0 7197 7142 0 183 47297
CORUM 0 0 0 0 0 0 0 0 0 0
BIND BIOGRID DIP HPRD INTACT MINT MPACT MPPI OPHID CORUM
(31178) (120963) (0) (0) (58514) (16176) (0) (306) (35730) (0)

Interactors

BIND 40801
BIOGRID 14442 27471
DIP 0 0 0
HPRD 0 0 0 0
INTACT 18121 16830 0 0 41601
MINT 16314 14935 0 0 22895 28055
MPACT 0 0 0 0 0 0 0
MPPI 672 212 0 0 575 550 0 863
OPHID 3242 2300 0 0 5747 4808 0 421 9629
CORUM 0 0 0 0 0 0 0 0 0 0
BIND BIOGRID DIP HPRD INTACT MINT MPACT MPPI OPHID CORUM
(20127) (8594) (0) (0) (13320) (3408) (0) (48) (2285) (0)

Summary of mapping interaction records to RIGs (Table 5)

Source Total records Protein-only interactors (PPI) Assigned to RIGID Unique RIGIDs
bind 193648 93957 91291(97.1625%) 62921(68.9236%)
grid 240501 240501 240197(99.8736%) 163891(68.2319%)
intact 129236 128470 128037(99.6630%) 111334(86.9546%)
mint 104847 104847 103522(98.7363%) 73739(71.2303%)
ophid 73257 73257 72907(99.5222%) 47297(64.8731%)
MPPI 1813 1813 1696(93.5466%) 829(48.8797%)
ALL 743302 642845 637650(99.1919%) 337379(52.9097%)
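
The percentages in Table 5 can be reproduced from the raw counts. The sketch below assumes (this is inferred from the numbers, not stated on this page) that the "Assigned to RIGID" percentage is relative to the protein-only interactions and that the "Unique RIGIDs" percentage is relative to the assigned records. The names `rows` and `pct` are illustrative only.

```python
# Counts copied from Table 5:
# (protein-only interactions, assigned to RIGID, unique RIGIDs).
rows = {
    "bind":   (93957, 91291, 62921),
    "grid":   (240501, 240197, 163891),
    "intact": (128470, 128037, 111334),
    "mint":   (104847, 103522, 73739),
    "ophid":  (73257, 72907, 47297),
    "MPPI":   (1813, 1696, 829),
    "ALL":    (642845, 637650, 337379),
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to four decimals."""
    return round(100.0 * part / whole, 4)


for source, (ppi, assigned, unique) in rows.items():
    # Assumed denominators: PPI for "assigned", assigned for "unique".
    print(f"{source:7s} assigned {pct(assigned, ppi):8.4f}%  "
          f"unique {pct(unique, assigned):8.4f}%")

# Sum of the per-source unique RIGID counts (the ALL row is excluded).
per_source_unique = sum(u for s, (_, _, u) in rows.items() if s != "ALL")
print(per_source_unique, rows["ALL"][2])
```

The per-source counts add up to 460011, which is larger than the 337379 reported for ALL. This is what one would expect if a RIGID found in several source databases is counted once in the ALL row.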

Assignment of protein interactors to ROGs (Table 3)

Source Protein_Interactors Assigned % Arbitrary New Unassigned Unique proteins
bind 285482 273646 95.8540 0 7942 3894 40801
grid 29318 19162 65.3592 10053 5 98 27471
intact 98122 94514 96.3229 18 3354 236 41601
mint 80543 77298 95.9711 6 2760 479 28055
MPPI 3628 3457 95.2867 0 39 132 863
ophid 146423 145362 99.2754 103 699 259 9629
All 643516 613439 95.3261 10180 14799 5098 79750
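
As a quick consistency check on Table 3, the sketch below verifies that, in every row, the Assigned, Arbitrary, New and Unassigned counts add up to the number of protein interactors, and that the per-source columns sum to the "All" row. Reading the four counts as mutually exclusive outcomes is an inference from the numbers, not something this page states. The names `table3` and `check_row` are illustrative.

```python
# Counts copied from Table 3:
# (protein interactors, assigned, arbitrary, new, unassigned).
table3 = {
    "bind":   (285482, 273646, 0, 7942, 3894),
    "grid":   (29318, 19162, 10053, 5, 98),
    "intact": (98122, 94514, 18, 3354, 236),
    "mint":   (80543, 77298, 6, 2760, 479),
    "MPPI":   (3628, 3457, 0, 39, 132),
    "ophid":  (146423, 145362, 103, 699, 259),
}


def check_row(total, assigned, arbitrary, new, unassigned):
    """True when the four outcome counts add up to the row total."""
    return assigned + arbitrary + new + unassigned == total


row_ok = {src: check_row(*counts) for src, counts in table3.items()}

# Column-wise sums, to compare against the "All" row of the table.
all_row = tuple(sum(col) for col in zip(*table3.values()))

# Share of protein interactors assigned, as in the "%" column.
assigned_pct = round(100.0 * all_row[1] / all_row[0], 4)

print(row_ok)
print(all_row, assigned_pct)
```

The "Unique proteins" column is left out on purpose: the same protein can appear in several sources, so that column is not additive.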

ROG summary

Decimal_score Binary_flag String_score Score_class Proteins Percentage BIND BioGrid DIP MINT HPRD OPHID MPPI MPACT
1802 000000011100001010 SVOX+ -1 4 0.0006% 0 0 0 0 0 0 0 0
1 000000000000000001 P 1 535022 83.1404% 232685 7503 0 74838 0 125715 3023 0
130 000000000010000010 SM 1 551 0.0856% 0 551 0 0 0 0 0 0
129 000000000010000001 PM 1 524 0.0814% 473 1 0 0 0 0 32 0
66 000000000001000010 SD 1 2 0.0003% 0 2 0 0 0 0 0 0
65 000000000001000001 PD 1 9549 1.4839% 8084 1446 0 19 0 0 0 0
8193 000010000000000001 PI 1 48 0.0075% 0 2 0 0 0 0 0 0
8194 000010000000000010 SI 1 12395 1.9261% 12336 59 0 0 0 0 0 0
10 000000000000001010 SV 1 7 0.0011% 0 0 0 0 0 0 0 0
2 000000000000000010 S 1 7907 1.2287% 0 7402 0 263 0 0 0 0
774 000000001100000110 SUO+ 2 1 0.0002% 0 0 0 0 0 0 0 0
773 000000001100000101 PUO+ 2 7 0.0011% 0 0 0 1 0 0 0 0
778 000000001100001010 SVO+ 2 1 0.0002% 0 0 0 0 0 0 0 0
5 000000000000000101 PU 2 20462 3.1797% 0 0 0 427 0 19520 320 0
6 000000000000000110 SU 2 672 0.1044% 0 659 0 5 0 0 0 0
16386 000100000000000010 SE 2 5405 0.8399% 5405 0 0 0 0 0 0 0
16385 000100000000000001 PE 2 189 0.0294% 0 0 0 0 0 0 0 0
8209 000010000000010001 PTI 3 13 0.0020% 0 0 0 0 0 0 0 0
8210 000010000000010010 STI 3 903 0.1403% 855 48 0 0 0 0 0 0
146 000000000010010010 STM 3 1 0.0002% 0 1 0 0 0 0 0 0
17 000000000000010001 PT 3 16244 2.5243% 11876 0 0 1739 0 122 47 0
18 000000000000010010 ST 3 1490 0.2315% 0 1487 0 0 0 0 0 0
26 000000000000011010 SVT 3 1 0.0002% 0 0 0 0 0 0 0 0
81 000000000001010001 PTD 3 1486 0.2309% 1486 0 0 0 0 0 0 0
145 000000000010010001 PTM 3 170 0.0264% 132 0 0 0 0 0 35 0
16402 000100000000010010 STE 4 314 0.0488% 314 0 0 0 0 0 0 0
16401 000100000000010001 PTE 4 3 0.0005% 0 0 0 0 0 0 0 0
22 000000000000010110 SUT 4 1 0.0002% 0 1 0 0 0 0 0 0
789 000000001100010101 PUTO+ 4 15 0.0023% 0 0 0 0 0 0 0 0
131089 100000000000010001 PTQ 5 38 0.0059% 0 0 0 0 0 0 0 0
131077 100000000000000101 PUQ 5 1 0.0002% 0 0 0 0 0 0 0 0
131073 100000000000000001 PQ 5 2 0.0003% 0 0 0 0 0 0 0 0
4373 000001000100010101 PUTL+ 5 9 0.0014% 0 0 0 1 0 0 0 0
12546 000011000100000010 SLI+ 5 6660 1.0349% 0 6660 0 0 0 0 0 0
21 000000000000010101 PUT 5 11 0.0017% 0 0 0 6 0 5 0 0
4482 000001000110000010 SML+ 5 409 0.0636% 0 409 0 0 0 0 0 0
4354 000001000100000010 SL+ 5 2984 0.4637% 0 2984 0 0 0 0 0 0
5386 000001010100001010 SVXL+ 5 1 0.0002% 0 0 0 0 0 0 0 0
4357 000001000100000101 PUL+ 5 84 0.0131% 0 0 0 0 0 84 0 0
5378 000001010100000010 SXL+ 5 1 0.0002% 0 0 0 0 0 0 0 0
5381 000001010100000101 PUXL+ 5 32 0.0050% 0 0 0 5 0 19 0 0
86274 010101000100000010 SLEN+ 6 2 0.0003% 0 2 0 0 0 0 0 0
81938 010100000000010010 STEN 6 24 0.0037% 24 0 0 0 0 0 0 0
81937 010100000000010001 PTEN 6 5 0.0008% 3 0 0 0 0 0 2 0
81922 010100000000000010 SEN 6 5456 0.8478% 5452 3 0 1 0 0 0 0
81921 010100000000000001 PEN 6 2868 0.4457% 2462 0 0 54 0 98 37 0
65601 010000000001000001 PDN 6 1 0.0002% 1 0 0 0 0 0 0 0
65553 010000000000010001 PTN 6 10 0.0016% 0 0 0 0 0 0 0 0
65537 010000000000000001 PN 6 6432 0.9995% 0 0 0 2705 0 601 0 0
196625 110000000000010001 PTNQ 6 1 0.0002% 0 0 0 0 0 0 0 0
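
The first three columns of the ROG summary are three views of the same value: the decimal score, its 18-digit binary form, and one character per set bit. The sketch below shows one way to convert between them. The bit-to-character mapping is an assumption inferred from the rows above (it is not taken from the iRefIndex documentation), and bits 5, 11 and 15 are left unmapped because no row in the table sets them. The names `BIT_TO_CHAR`, `binary_flag` and `string_score` are hypothetical.

```python
# Bit position (0 = least significant) -> score character.
# Inferred from the rows of the ROG summary table; an assumption.
BIT_TO_CHAR = {
    0: "P", 1: "S", 2: "U", 3: "V", 4: "T", 6: "D", 7: "M",
    8: "+", 9: "O", 10: "X", 12: "L", 13: "I", 14: "E", 16: "N", 17: "Q",
}


def binary_flag(score, width=18):
    """Zero-padded binary representation of a decimal score."""
    return format(score, f"0{width}b")


def string_score(score):
    """Characters for the set bits, in ascending bit order, with '+' last.

    Bits that are not in BIT_TO_CHAR are silently ignored.
    """
    chars = [c for bit, c in sorted(BIT_TO_CHAR.items())
             if (score >> bit) & 1]
    if "+" in chars:
        # The table prints '+' at the end of the string score.
        chars.remove("+")
        chars.append("+")
    return "".join(chars)


for score in (1, 1802, 12546, 86274, 196625):
    print(score, binary_flag(score), string_score(score))
```

For example, 1802 = 1024 + 512 + 256 + 8 + 2, so bits 1, 3, 8, 9 and 10 are set, which corresponds to the string score SVOX+ in the first row of the table.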

Scores (Table 2)

Character Description of feature (when the value is 1) Frequency
D The source database (D) listed in the interaction record is different than what is expected for the given accession for the protein. In specific cases, this difference is tolerated and the assignment is made. 11038(1.729%)
E The protein reference was a retired NCBI Identifier. NCBI's eUtils (E) were used to retrieve the current accession and/or sequence. 14266(2.2346%)
G The interaction record's reference for the protein was an EntrezGene (G) identifier. The corresponding products of the gene were used to make the assignment. 0(0.0%)
L More than one assignment is possible (see +). The assignment with the largest (L) SEGUID is arbitrarily chosen (see Methods). 10182(1.5949%)
M The protein reference listed by the interaction record was a typographical modification (M) of a known accession. In specific cases, this variation is tolerated and the assignment is made. 1655(0.2592%)
+ More than one assignment is possible (+). This case may arise in one of three ways. 1) The reference supplied by the interaction record requires updating but more than one possibility exists. For example, Q7XJL8 was found to be a secondary accession in three separate UniProt records (Q3EBZ2, Q6DR20, and Q8GWA9). 2) The secondary references supplied by the interaction record point to more than one unique protein sequence. 3) An EntrezGene identifier is provided in the interaction record as a protein reference. This identifier points to more than one protein product. An attempt is made to resolve this ambiguity as indicated by ROG score features O, X or L (see their entries in this table). 10210(1.5993%)
N The protein reference, taxonomy identifier and sequence for the protein as provided in the interaction record are used to make a new entry in the SEGUID table. The protein interactor is assigned the newly (N) generated ROG identifier. 14799(2.3181%)
O More than one assignment is possible (see + above). The assignment chosen has a SEGUID that is identical to the SEGUID of the original (O) sequence provided in the interaction record. 28(0.0044%)
I The protein reference used was an NCBI GenInfo Identifier (I). 20019(3.1357%)
U The protein reference listed in the interaction record and used to make the assignment was a secondary UniProt accession and was updated (U) to a primary UniProt accession in order to make the assignment. 21295(3.3356%)
T The taxonomy (T) identifier for the protein (as supplied by the interaction record) differed from what was found in the protein sequence record. This discrepancy was tolerated and the assignment was made. 20739(3.2485%)
V The protein reference listed by the interaction record contained version (V) information that was ignored. For example, RefSeq accession.version NP_012420.1 was listed but treated as RefSeq accession NP_012420. 14(0.0022%)
Q The protein reference used to make the assignment was of the type 'see-also'. See PSI-MI Path: entrySet/entry/interactorList/interactor/xref/primaryRef/refType = 'see-also'. 42(0.0066%)
P The interaction record's primary (P) reference for the protein was used to make the assignment 593226(92.9213%)
S One of the interaction record's secondary (S) references for the protein was used to make the assignment 45192(7.0787%)
X More than one assignment is possible (see + above). The assignment chosen has the same taxonomy (X) identifier as listed in the interaction record. 38(0.0060%)
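
The frequency percentages in Table 2 appear to share one denominator. The sketch below assumes (inferred from the numbers, not stated on this page) that every scored protein interactor carries exactly one of the P or S features, so that their counts add up to the total used for the percentages. The names `scored` and `freq_pct` are illustrative.

```python
# Frequencies copied from Table 2 for the P and S features.
primary, secondary = 593226, 45192

# Assumed denominator for all percentages in the table.
scored = primary + secondary


def freq_pct(count, total=scored):
    """Frequency of a feature as a percentage, rounded to four decimals."""
    return round(100.0 * count / total, 4)


# N (new SEGUID entries) and U (updated UniProt accessions) as examples.
print(scored, freq_pct(primary), freq_pct(secondary))
print(freq_pct(14799), freq_pct(21295))
```

Under this assumption the total is 638418, which equals the sum of the Assigned, Arbitrary and New counts in the "All" row of Table 3 (613439 + 10180 + 14799).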