Statistics iRefIndex 12.0

Revision as of 09:43, 6 June 2013 by Ian.donaldson

Data source information

| Source | Release date | Release URL | Download files | Version |
|---|---|---|---|---|
| BIND | 2013-06-03 | | 20060525*.txt | |
| BIND_TRANSLATION | 2013-06-03 | | BINDTranslation_v1_xml_AllSpecies.tar.gz | |
| BIOGRID | 2013-05-31 | http://thebiogrid.org/downloads/archives/Release%20Archive/BIOGRID-3.2.101/ | BIOGRID-ALL-3.2.101.psi25.zip | |
| CORUM | 2009-12-02 | http://mips.gsf.de/genre/export/sites/default/corum/ | allComplexes.psimi.zip | |
| DIG | 2013-06-03 | | morbidmap14062010.txt | |
| DIP | 2013-01-31 | http://dip.doe-mbi.ucla.edu/dip/Download.cgi?SM=3 | | |
| FLY | 2013-05-29 | http://www.uniprot.org/docs/ | fly.txt | 2013_06 |
| GENE | 2013-06-02 | ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ | gene2refseq.gz gene_info.gz gene2go.gz gene_history.gz | |
| INNATEDB | 2013-06-02 | http://www.innatedb.com/download/interactions/ | innatedb_all.mitab.gz | |
| INTACT | 2013-05-02 | ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psi25/ | pmidMIF25.zip | |
| IPI | 2013-06-03 | | *.fasta.gz | |
| MATRIXDB | 2012-08-03 | http://matrixdb.ibcp.fr/cgi-bin/download%3C../../ | MatrixDB_20120801.xml.zip | |
| MINT | 2013-06-03 | | *.psi25.zip | |
| MMDB | 2013-05-31 | ftp://ftp.ncbi.nih.gov/mmdb/pdbeast/ | table | |
| MPACT | 2013-06-03 | | mpact-complete.psi25.xml.gz | |
| MPIDB | 2013-06-03 | http://www.jcvi.org/mpidb/download.php?dbsource= | MPI-IMEX MPI-LIT | |
| MPPI | 2013-06-03 | | mppi.gz | |
| OPHID | 2013-06-03 | | ophid*.xml | |
| PDB | 2013-06-01 | ftp://ftp.ncbi.nih.gov/blast/db/FASTA/ | pdbaa.gz | |
| PSI_MI | 2013-06-03 | http://psidev.cvs.sourceforge.net/viewvc/psidev/psi/mi/rel25/data/ | psi-mi25.obo | |
| REFSEQ | 2013-05-01 | ftp://ftp.ncbi.nih.gov/refseq/release//complete/ | complete*.protein.gpff.gz | |
| TAXONOMY | 2013-06-02 | ftp://ftp.ncbi.nih.gov/pub/taxonomy/ | taxdump.tar.gz | |
| UNIPROT | 2013-05-29 | ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/ | uniprot_sprot.dat.gz uniprot_trembl.dat.gz uniprot_sprot_varsplic.fasta.gz reldate.txt | 2013_06 |
| YEAST | 2013-05-29 | http://www.uniprot.org/docs/ | yeast.txt | 2013_06 |

Interactions available from major taxonomies

| NCBI taxonomy identifier | Scientific name | Number of interactions |
|---|---|---|
| 9606 | Homo sapiens | 238350 |
| 559292 | Saccharomyces cerevisiae S288c | 100970 |
| 7227 | Drosophila melanogaster | 59275 |
| 4932 | Saccharomyces cerevisiae | 46791 |
| 40674 | Mammalia | 36341 |
| 10090 | Mus musculus | 31510 |
| 3702 | Arabidopsis thaliana | 21077 |
| 83333 | Escherichia coli K-12 | 15198 |
| 6239 | Caenorhabditis elegans | 15188 |
| 192222 | Campylobacter jejuni subsp. jejuni NCTC 11168 = ATCC 700819 | 11970 |
| 10116 | Rattus norvegicus | 9238 |
| 562 | Escherichia coli | 5370 |
| 4896 | Schizosaccharomyces pombe | 4972 |
| 632 | Yersinia pestis | 3954 |

Interactions available from major taxonomies (corrected)

| NCBI taxonomy identifier | Scientific name | Number of interactions |
|---|---|---|
| 9606 | Homo sapiens | 248233 |
| 559292 | Saccharomyces cerevisiae S288c | 114846 |
| 7227 | Drosophila melanogaster | 59278 |
| 10090 | Mus musculus | 28245 |
| 3702 | Arabidopsis thaliana | 21077 |
| 83333 | Escherichia coli K-12 | 17234 |
| 6239 | Caenorhabditis elegans | 15188 |
| 192222 | Campylobacter jejuni subsp. jejuni NCTC 11168 = ATCC 700819 | 12000 |
| 10116 | Rattus norvegicus | 7755 |
| 284812 | Schizosaccharomyces pombe 972h- | 5189 |
| 632 | Yersinia pestis | 3954 |
| 243276 | Treponema pallidum subsp. pallidum str. Nichols | 3643 |
| 1111708 | Synechocystis sp. PCC 6803 substr. Kazusa | 3229 |
| 1392 | Bacillus anthracis | 3041 |

Interactions

BIND BIND_TRANSLATION BIOGRID CORUM DIP HPRD INNATEDB INTACT MATRIXDB MINT MPACT MPI-IMEX MPI-LIT MPPI OPHID
BIND 62980 52196 22669 221 25157 1981 171 23748 4 22196 6318 6 27 357 2160
BIND_TRANSLATION 60766 24787 195 25164 2741 241 24211 4 23050 6284 6 23 365 2779
BIOGRID 265695 156 29930 11040 660 40539 6 40327 4212 1 122 7387
CORUM 2607 132 160 30 286 128 15 239
DIP 70253 525 212 26612 3 33102 6755 43 192 57 1174
HPRD 40531 472 4496 17 3443 120 7449
INNATEDB 5331 446 3 323 18 694
INTACT 166524 14 44965 6251 290 166 103 7905
MATRIXDB 229 2 1 25
MINT 88926 6484 15 37 90 7220
MPACT 13338
MPI-IMEX 468 30
MPI-LIT 742
MPPI 778 181
OPHID 47499
(Exclusive to source) 8911 4314 187038 1951 23006 24264 3780 98060 186 21088 5343 163 434 221 30013
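
The matrix above appears to be a pairwise overlap table: the first value in each row matches that source's "Unique RIGIDs" count in Table 5, the remaining values count interactions shared with another source, and the last row counts interactions found in only one source. The following is a minimal sketch of how such a table could be derived from per-source sets of interaction identifiers. The function name `overlap_matrix`, the toy sources `A`, `B`, `C`, and the identifiers `r1` to `r5` are hypothetical and are not taken from iRefIndex.

```python
from itertools import combinations


def overlap_matrix(ids_by_source):
    """Return (shared, exclusive) counts for a dict of source -> set of IDs.

    shared[(a, a)] is the number of distinct IDs in source a;
    shared[(a, b)] is the number of IDs present in both a and b;
    exclusive[a] is the number of IDs present in a and in no other source.
    """
    sources = sorted(ids_by_source)
    shared = {(s, s): len(ids_by_source[s]) for s in sources}
    for a, b in combinations(sources, 2):
        shared[(a, b)] = len(ids_by_source[a] & ids_by_source[b])

    exclusive = {}
    for s in sources:
        # Union of every other source's identifiers.
        others = set().union(*(ids_by_source[o] for o in sources if o != s))
        exclusive[s] = len(ids_by_source[s] - others)
    return shared, exclusive


# Toy data only: three made-up sources with made-up identifiers.
toy = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3", "r4"},
    "C": {"r3", "r5"},
}
shared, exclusive = overlap_matrix(toy)
for key in sorted(shared):
    print(key, shared[key])
print(exclusive)
```

Only the upper triangle is stored because the overlap of two sources is symmetric, which is consistent with the rows above getting shorter from top to bottom.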

Interactors

BIND BIND_TRANSLATION BIOGRID CORUM DIP HPRD INNATEDB INTACT MATRIXDB MINT MPACT MPI-IMEX MPI-LIT MPPI OPHID
BIND 37516 30353 17223 2036 15589 2782 1298 18476 99 16890 4360 32 88 663 3095
BIND_TRANSLATION 36143 17918 2010 15757 3146 1428 18783 116 17005 4003 30 97 666 3365
BIOGRID 46054 2557 14793 6704 2016 27864 117 19241 4483 2 494 6003
CORUM 4363 1428 1559 899 3427 51 2676 408 2248
DIP 23368 1470 1047 18530 75 17312 4550 127 385 384 2083
HPRD 9836 1329 5568 103 4102 275 5209
INNATEDB 3487 2534 85 1868 270 1899
INTACT 59229 167 26157 4932 378 529 662 7576
MATRIXDB 249 118 15 145
MINT 32869 4804 69 243 567 5691
MPACT 4982 1
MPI-IMEX 470 91
MPI-LIT 934
MPPI 835 418
OPHID 9533
(Exclusive to source) 5670 3553 12614 388 1898 1735 395 18948 35 3674 18 80 314 23 650

Summary of mapping interaction records to RIGs (Table 5)

| Source | Total records | Protein-related interactions | PPI assigned to RIGID | % | Unique RIGIDs | % |
|---|---|---|---|---|---|---|
| BIND | 157736 | 91767 | 91094 | 99.27 | 62980 | 69.14 |
| BIND_TRANSLATION | 192923 | 84138 | 82037 | 97.50 | 60766 | 74.07 |
| BIOGRID | 684996 | 404834 | 402886 | 99.52 | 265695 | 65.95 |
| CORUM | 2844 | 2844 | 2844 | 100.00 | 2607 | 91.67 |
| DIP | 74086 | 72661 | 72630 | 99.96 | 70253 | 96.73 |
| HPRD | 83022 | 83022 | 82983 | 99.95 | 40531 | 48.84 |
| INNATEDB | 18676 | 18676 | 7853 | 42.05 | 5331 | 67.88 |
| INTACT | 204708 | 198018 | 197976 | 99.98 | 166524 | 84.11 |
| MATRIXDB | 1065 | 392 | 392 | 100.00 | 229 | 58.42 |
| MINT | 127577 | 127022 | 126718 | 99.76 | 88926 | 70.18 |
| MPACT | 16504 | 16504 | 16308 | 98.81 | 13338 | 81.79 |
| MPI-IMEX | 473 | 473 | 468 | 98.94 | 468 | 100.00 |
| MPI-LIT | 745 | 745 | 742 | 99.60 | 742 | 100.00 |
| MPPI | 1814 | 1758 | 1583 | 90.05 | 778 | 49.15 |
| OPHID | 73257 | 73257 | 73257 | 100.00 | 47499 | 64.84 |
| (All) | 1640426 | 1176111 | 1159771 | 98.61 | 826667 | 71.28 |
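
The two "%" columns are not labelled. The rows are consistent with the first being PPIs assigned to a RIGID as a share of protein-related interactions, and the second being unique RIGIDs as a share of assigned PPIs. The sketch below recomputes both for three rows copied from the table. The helper name `pct` is an assumption introduced for illustration.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# (protein-related interactions, PPI assigned to RIGID, unique RIGIDs),
# copied from the table above.
table5_rows = {
    "BIND": (91767, 91094, 62980),
    "INNATEDB": (18676, 7853, 5331),
    "(All)": (1176111, 1159771, 826667),
}

for source, (ppi, assigned, unique) in table5_rows.items():
    # First value: assigned / protein-related; second: unique / assigned.
    print(source, pct(assigned, ppi), pct(unique, assigned))
```

Read this way, the low first percentage for INNATEDB (42.05) reflects records that could not be assigned a RIGID, while a low second percentage such as HPRD's (48.84) reflects many records collapsing onto the same RIGID.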

Assignment of protein interactors to ROGs (Table 3)

| Source | Protein interactors | Assigned | % | Arbitrary | Matching sequence | New or obsolete sequence | Unassigned | Unique proteins |
|---|---|---|---|---|---|---|---|---|
| BIND | 253014 | 251997 | 99.60 | 0 | 0 | 40366 | 1017 | 37516 |
| BIND_TRANSLATION | 257683 | 252224 | 97.88 | 20770 | 0 | 23766 | 5459 | 36143 |
| BIOGRID | 47011 | 46247 | 98.37 | 9960 | 0 | 312 | 764 | 46054 |
| CORUM | 12916 | 12916 | 100.00 | 7 | 0 | 0 | 0 | 4363 |
| DIP | 24142 | 24127 | 99.94 | 570 | 0 | 1272 | 15 | 23368 |
| HPRD | 123812 | 123812 | 100.00 | 13685 | 85465 | 213 | 0 | 9836 |
| INNATEDB | 39725 | 25384 | 63.90 | 0 | 0 | 0 | 14341 | 3487 |
| INTACT | 169815 | 169749 | 99.96 | 69 | 29 | 395 | 66 | 59229 |
| MATRIXDB | 1274 | 1274 | 100.00 | 0 | 0 | 0 | 0 | 249 |
| MINT | 91829 | 91589 | 99.74 | 570 | 11 | 4031 | 240 | 32869 |
| MPACT | 40349 | 40134 | 99.47 | 0 | 0 | 3 | 215 | 4982 |
| MPI-IMEX | 946 | 940 | 99.37 | 2 | 0 | 0 | 6 | 470 |
| MPI-LIT | 1490 | 1487 | 99.80 | 7 | 0 | 0 | 3 | 934 |
| MPPI | 3568 | 3366 | 94.34 | 16 | 0 | 5 | 202 | 835 |
| OPHID | 146514 | 146514 | 100.00 | 405 | 12 | 1014 | 0 | 9533 |
| (All) | 1214088 | 1191760 | 98.16 | 46061 | 85517 | 71377 | 22328 | 108227 |
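
As a quick consistency check on the table above, the sketch below confirms that "Unassigned" equals "Protein interactors" minus "Assigned" in every row, and that the "(All)" row is the column-wise sum of the per-source rows for those three columns. The variable names are illustrative, and the numbers are copied from the table.

```python
# (protein interactors, assigned, unassigned) per source, from the table above.
table3_rows = {
    "BIND": (253014, 251997, 1017),
    "BIND_TRANSLATION": (257683, 252224, 5459),
    "BIOGRID": (47011, 46247, 764),
    "CORUM": (12916, 12916, 0),
    "DIP": (24142, 24127, 15),
    "HPRD": (123812, 123812, 0),
    "INNATEDB": (39725, 25384, 14341),
    "INTACT": (169815, 169749, 66),
    "MATRIXDB": (1274, 1274, 0),
    "MINT": (91829, 91589, 240),
    "MPACT": (40349, 40134, 215),
    "MPI-IMEX": (946, 940, 6),
    "MPI-LIT": (1490, 1487, 3),
    "MPPI": (3568, 3366, 202),
    "OPHID": (146514, 146514, 0),
}
reported_all = (1214088, 1191760, 22328)

# Sources whose unassigned count differs from interactors - assigned.
mismatches = [
    source
    for source, (interactors, assigned, unassigned) in table3_rows.items()
    if interactors - assigned != unassigned
]

# Column-wise totals over all sources.
totals = tuple(sum(column) for column in zip(*table3_rows.values()))
overall_pct = round(100.0 * totals[1] / totals[0], 2)

print(mismatches)
print(totals == reported_all)
print(overall_pct)
```

The "Unique proteins" column is deliberately left out of the sum: a protein can appear in several sources, so the "(All)" value of 108227 is smaller than the total of the per-source values.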

ROG summary

BIND BIND_TRANSLATION BIOGRID CORUM DIP HPRD INNATEDB INTACT MATRIXDB MINT MPACT MPI-IMEX MPI-LIT MPPI OPHID
P 185770 31422 12877 25384 168920 51762 616 1071
P+IN 2
P+LY 160 2
P+N 9 223
PD 124956 1271 2 2996 124479
PD+IN 1
PD+LQ 10230
PD+LYQ 67
PD+N 22
PD+XQ 26
PDIQ 219
PDIYQ 513
PDQ 15773
PDY 4437 1 5 992
PDYQ 15454
PGD 657 2163 308
PGD+L 6256 9933 493
PGD+X 13
PI 2 17
PIY 373 49
PT 2671 1851 34333 30579 320 405
PTD 86561 3 2 44 114
PTD+LQ 3992
PTD+LYQ 12
PTDIYQ 13
PTDQ 2159
PTDY 220
PTDYQ 138
PTGD 21 1
PTGD+L 17 3
PTI 11
PTIY 1
PTY 1
PU 15 32 306 366 2
PU+L 17 7 42 39 7
PU+O 14 3
PU+X 610 2 15 1
PUD 6 143 16955
PUD+L 13 265
PUD+O 12
PUD+X 82 162 3526
PUT 4 14 170 2527 2 1
PUT+L 19 27 37 2
PUT+O 15 8
PUTD 4 9
PUTD+L 3 140
PV 3
PV+LY 1
PVY 7
PY 7409 269 8 3747
S 2 543 11535 168 1
S+L 6 167 752
S+LY 16 66 6
S+N 2
S+O 223
S+X 88
S+XY 175
SD 3418 5709
SD+L 220 465
SD+LY 29
SD+N 124
SD+O 8626
SD+OY 14
SD+X 1112
SD+XY 3
SDY 233
SGD 1126
SGD+L 1669
SGD+O 14940
SI 17
SIY 32240
ST 4768 216 7025
ST+L 25 4628
ST+LY 61
ST+O 748
STD 627 14013
STD+L 4 1162
STD+O 22808
STD+OY 6
STDY 2 2
STGD 3296
STGD+L 4927
STGD+O 37967
STI 5
STIY 3469
STY 2 3
SUD 44
SUD+L 32 8
SUD+O 2
SUD+X 777
SUTD 11 8
SUTD+L 27 7
SUTD+O 131
SY 24 762 2