Statistics iRefIndex 17.0

Interactions available from major taxonomies (corrected)

The taxon of each protein interactor has been corrected to match the taxon given in the protein sequence record, regardless of the taxon listed in the interaction record. See PMID 18823568 for details.

NCBI taxonomy identifier	Scientific name	Number of interactions
9606	Homo sapiens	732100
559292	Saccharomyces cerevisiae S288C	139956
7227	Drosophila melanogaster	76585
10090	Mus musculus	70996
3702	Arabidopsis thaliana	59726
6239	Caenorhabditis elegans	32382
83333	Escherichia coli K-12	16968
10116	Rattus norvegicus	14516
316407	Escherichia coli str. K-12 substr. W3110	12822
4896	Schizosaccharomyces pombe	12432
192222	Campylobacter jejuni subsp. jejuni NCTC 11168 = ATCC 700819	11930
632	Yersinia pestis	4166
243276	Treponema pallidum subsp. pallidum str. Nichols	3643
1111708	Synechocystis sp. PCC 6803 substr. Kazusa	3275

Summary of mapping interaction records to RIGs (redundant interaction groups)

Source: Interaction data source. Total records: Total number of interaction records found in the source. Protein-only interactors: Total number of interactions involving only protein interactors. PPI assigned to RIGID: Number of interactions where all protein interactors were assigned to a ROG. %: Column 4 expressed as a percentage of column 3. Unique RIGIDs (interactions): Number of unique protein interactions and complexes (RIGIDs) found in the data source (also expressed as a percentage of column 4). For a description of RIGs, see README_MITAB2.6_for_iRefIndex#Understanding_the_iRefIndex_MITAB_format and the original paper, PMID 18823568.

Source	Total records	Protein-related interactions	PPI assigned to RIGID	%	Unique RIGIDs	%
BAR	10396	10396	10383	99.87	10369	99.87
BHF_UCL	2341	2327	2327	100.00	1514	65.06
BIND	157736	91309	68064	74.54	49513	72.74
BIND_TRANSLATION	192923	84138	82233	97.74	60855	74.00
BIOGRID	1760395	873924	869987	99.55	646657	74.33
CORUM	4274	4274	4270	99.91	4018	94.10
DIP	81731	80134	79878	99.68	77468	96.98
HPIDB	6038	5769	5769	100.00	2432	42.16
HPRD	83022	83022	82983	99.95	40530	48.84
HURI	171545	168756	168750	100.00	51482	30.51
INNATEDB	18408	18408	6903	37.50	4815	69.75
INTACT	651130	597534	597404	99.98	345369	57.81
INTCOMPLEX	2821	2261	2261	100.00	2226	98.45
MATRIXDB	37217	36866	36866	100.00	22361	60.65
MBINFO	1084	1057	1057	100.00	539	50.99
MINT	165946	165136	165118	99.99	61283	37.11
MPACT	16504	16504	16373	99.21	13398	81.83
MPIDB	1505	1504	1425	94.75	893	62.67
MPPI	1814	1758	1578	89.76	776	49.18
QUICKGO	75574	60741	58630	96.52	29763	50.76
REACTOME	141996	141996	141844	99.89	126328	89.06
SPIKE	29686	29686	28327	95.42	27828	98.24
UNIPROTPP	12863	12775	12775	100.00	7233	56.62
VIRUSHOST	15000	15000	15000	100.00	9397	62.65
(All)	3641949	2505275	2460205	98.20	1185907	48.20
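
The two percentage columns can be recomputed from the counts in each row. The following is a minimal sketch in Python using three rows copied from the table above; the function name `rig_percentages` and the `ROWS` dictionary are illustrative and are not part of iRefIndex.

```python
def rig_percentages(protein_only, assigned, unique_rigids):
    """Recompute the two '%' columns of the RIG summary table.

    protein_only  -- column 3 (interactions with only protein interactors)
    assigned      -- column 4 (PPI assigned to RIGID)
    unique_rigids -- column 6 (unique RIGIDs)
    """
    pct_assigned = round(100.0 * assigned / protein_only, 2)  # column 4 / column 3
    pct_unique = round(100.0 * unique_rigids / assigned, 2)   # column 6 / column 4
    return pct_assigned, pct_unique


# Counts copied from the table: (column 3, column 4, column 6).
ROWS = {
    "BIND": (91309, 68064, 49513),
    "INNATEDB": (18408, 6903, 4815),
    "(All)": (2505275, 2460205, 1185907),
}

for source, counts in ROWS.items():
    pct_assigned, pct_unique = rig_percentages(*counts)
    print(f"{source}\t{pct_assigned:.2f}\t{pct_unique:.2f}")
```

A low value in the second percentage column indicates that many records in that source collapse onto the same RIGID.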

Assignment of protein interactors to ROGs (redundant object groups)

Source: Interaction data source (see methods). Protein interactors: Total number of interactors found in all interaction records. Assigned: Number of proteins assigned unambiguously to a ROG. Assignments listed in columns 5 and 6 are not included here. %: Column 3 expressed as a percentage of column 2. Arbitrary: Total number of ROG assignments that were ambiguous and resolved with an arbitrary method (see ROG scores with 'L'). Matching sequence: Total number of assignments made where a sequence in the interaction record matched a known sequence. Unassigned: Total number of protein interactors that could not be assigned to a ROG. Unique: Total number of unique proteins (ROGs). For a description of ROGs, see README_MITAB2.6_for_iRefIndex#Understanding_the_iRefIndex_MITAB_format and the original paper, PMID 18823568.

Source	Protein interactors	Assigned	%	Arbitrary	Matching sequence	New or obsolete sequence	Unassigned	Unique proteins
BAR	20792	20779	99.94	0	0	0	13	3267
BHF_UCL	6185	6185	100.00	0	0	0	0	1792
BIND	252251	215207	85.31	17	0	6491	37044	30237
BIND_TRANSLATION	257681	254882	98.91	20507	0	10469	2799	36881
BIOGRID	70364	69209	98.36	3194	0	6748	1155	68896
CORUM	17317	17313	99.98	2	0	7	4	6125
DIP	28066	27916	99.47	643	0	1398	150	27166
HPIDB	12529	12529	100.00	0	0	0	0	2481
HPRD	123812	123812	100.00	16325	87744	169	0	9837
HURI	340295	340289	100.00	33	0	438	6	8181
INNATEDB	42658	25503	59.78	0	0	0	17155	3742
INTACT	541927	541743	99.97	202	60	480	184	95845
INTCOMPLEX	10287	10287	100.00	0	0	2	0	5753
MATRIXDB	180936	180936	100.00	0	0	27	0	21540
MBINFO	1746	1746	100.00	0	0	0	0	274
MINT	480615	480588	99.99	312	0	31	27	26859
MPACT	40349	40199	99.63	0	0	0	150	4995
MPIDB	3238	3090	95.43	0	0	1	148	930
MPPI	3568	3361	94.20	16	0	0	207	833
QUICKGO	136308	134148	98.42	0	0	0	2160	26894
REACTOME	283992	283839	99.95	704	0	0	153	5860
SPIKE	65934	64565	97.92	889	0	17	1369	8809
UNIPROTPP	35584	35584	100.00	1	0	0	0	8132
VIRUSHOST	30000	30000	100.00	0	0	378	0	3882
(All)	2986434	2923710	97.90	42845	87804	26656	62724	153527
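
Rows of this table can be checked mechanically once they are read as tab-separated text. The sketch below parses three rows copied from the table, recomputes the % column, and tests whether Assigned plus Unassigned equals Protein interactors, which holds for the rows sampled here. The names `ROG_TABLE` and `check_rog_rows` are hypothetical and only serve the example.

```python
import csv
import io

# Header and three rows copied from the table above (tab-separated).
ROG_TABLE = (
    "Source\tProtein interactors\tAssigned\t%\tArbitrary\tMatching sequence\t"
    "New or obsolete sequence\tUnassigned\tUnique proteins\n"
    "BIND\t252251\t215207\t85.31\t17\t0\t6491\t37044\t30237\n"
    "INNATEDB\t42658\t25503\t59.78\t0\t0\t0\t17155\t3742\n"
    "(All)\t2986434\t2923710\t97.90\t42845\t87804\t26656\t62724\t153527\n"
)


def check_rog_rows(tsv_text):
    """Return (source, recomputed %, counts_add_up) for each data row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    results = []
    for row in reader:
        total = int(row["Protein interactors"])
        assigned = int(row["Assigned"])
        unassigned = int(row["Unassigned"])
        pct = round(100.0 * assigned / total, 2)  # column 3 / column 2
        results.append((row["Source"], pct, assigned + unassigned == total))
    return results


for source, pct, consistent in check_rog_rows(ROG_TABLE):
    print(f"{source}\t{pct:.2f}\t{consistent}")
```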

Mapping score summary

See below for definitions of the mapping score codes.

	BAR	BHF_UCL	BIND	BIND_TRANSLATION	BIOGRID	CORUM	DIP	HPIDB	HPRD	HURI	INNATEDB	INTACT	INTCOMPLEX	MATRIXDB	MBINFO	MINT	MPACT	MPIDB	MPPI	QUICKGO	REACTOME	SPIKE	UNIPROTPP	VIRUSHOST
P	20640	6185		180023	45936	17272		12521		339113	25501	540427	10270	180764	1746	479471		3075		131370	266602	54981	35563	29622
P+IN												385
P+N												28
PD			127837		7245							3							2994
PD+LQ				10123
PD+LYQ				43
PD+XQ				26
PDQ				31341
PDY			5447		1
PDYQ				16
PE					1114
PGD				675	1844							1										397
PGD+L				6237	3164							3										876
PGD+X					1
PT				8450	3076	19						1					30579	2		2778
PTD			80723		2							2							44
PTD+LQ				4036
PTD+LYQ				8
PTDQ				2696
PTDY			1044
PTDYQ				6
PTGD				18	1
PTGD+L				19	2
PTM												3
PTY				1	1	3																		378
PU	139			118		13		8		705	2	539	15	145		739		6			16533	8281	20
PU+L				17		2				33		157				293					704	13
PU+O												46
PU+X				604								1
PUD			81		9														145
PUD+L			7		9														13
PUD+X			54																162
PUT				4								12				8	2527	6
PUT+L				24								42				19							1
PUT+O												14
PUTD			4
PUTD+L			10																3
PV												9				27
PY				10395	6729	4				438		54	2	27		31		1				17
S				2	45		12982		147			3
S+IN												1
S+L					10		214		732
S+LE					1
S+LY					8		63
S+N												4
S+O									243
S+X							215
S+XY							216
SD							5384		4286
SD+L							245		607
SD+N									133
SD+O									10167
SD+X							1268
SDY							10
SE					2
SGD									936
SGD+L									2215
SGD+O									14365
ST							4699		136								7093
ST+L							26		4614
ST+LY							4		36
ST+O									829
STD							733		11603
STD+L							9		1149
STD+O									26380
STGD									2502
STGD+L									6922
STGD+O									35594
STY							25
SUD							70
SUD+L							50		27
SUD+O									7
SUD+X							568
SUTD							23
SUTD+L							32		23
SUTD+O									159
SY					9		1080					8

Mapping score code definitions

Character	Description of feature (when the value is 1)
D	The source database (D) listed in the interaction record differs from what is expected for the given protein accession. In specific cases, this difference is tolerated and the assignment is made.
E	The protein reference was a retired NCBI identifier or a UniProt identifier. NCBI's eUtils (E) were used to retrieve the current accession and/or sequence. For identifiers that still had no sequence after going through eUtils, sequence information was obtained from UniProt.
G	The interaction record's reference for the protein was an EntrezGene (G) identifier. The corresponding products of the gene were used to make the assignment.
L	More than one assignment is possible (see +), e.g. isoforms for a gene identifier. In such a situation, references are picked using a ranking system (RefSeq first, then UniProt). If ambiguity remains after this ranking, the reference with the longest sequence is selected. (Please note that this score class definition differs from the originally published one.)
M	The protein reference listed by the interaction record was a typographical modification (M) of a known accession. In specific cases, this variation is tolerated and the assignment is made.
+	More than one assignment is possible (+). This case may arise in one of three ways. 1) The reference supplied by the interaction record requires updating but more than one possibility exists. For example, Q7XJL8 was found to be a secondary accession in three separate UniProt records (Q3EBZ2, Q6DR20, and Q8GWA9). 2) The secondary references supplied by the interaction record point to more than one unique protein sequence. 3) An EntrezGene identifier is provided in the interaction record as a protein reference, and this identifier points to more than one protein product. An attempt is made to resolve this ambiguity as indicated by ROG score features O, X or L (see their entries in this table).
N	The protein reference, taxonomy identifier and sequence for the protein as provided in the interaction record are used to make a new entry in the SEGUID table. The protein interactor is assigned the newly (N) generated ROG identifier.
O	More than one assignment is possible (see + above). The assignment chosen has a SEGUID that is identical to the SEGUID of the original (O) sequence provided in the interaction record.
I	The protein reference used was an NCBI GenInfo Identifier (I).
U	The protein reference listed in the interaction record and used to make the assignment was a secondary UniProt accession and was updated (U) to a primary UniProt accession in order to make the assignment.
T	The taxonomy (T) identifier for the protein (as supplied by the interaction record) differed from what was found in the protein sequence record. This discrepancy was tolerated and the assignment was made.
V	The protein reference listed by the interaction record contained version (V) information that was ignored. For example, RefSeq accession.version NP_012420.1 was listed but treated as RefSeq accession NP_012420.
Q	The protein reference used to make the assignment was of the type 'see-also'. See PSI-MI Path: entrySet/entry/interactorList/interactor/xref/primaryRef/refType = 'see-also'.
P	The interaction record's primary (P) reference for the protein was used to make the assignment.
S	One of the interaction record's secondary (S) references for the protein was used to make the assignment.
Y	The protein reference was an accession that was removed from RefSeq or UniProt after the beta3 build of iRefIndex (March 9th, 2009).
X	More than one assignment is possible (see + above). The assignment chosen has the same taxonomy (X) identifier as listed in the interaction record.
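
Each mapping score in the summary table (for example `PTD+LQ`) is a string of the single-character features defined above. The following is a minimal sketch in Python of how such a string could be expanded into its features. The short labels in `SCORE_FEATURES` are paraphrases of the definitions above, and the names `decode_score` and `is_ambiguous` are hypothetical, not part of any iRefIndex tool.

```python
# Short paraphrases of the feature definitions above (assumed labels).
SCORE_FEATURES = {
    "P": "primary reference used",
    "S": "secondary reference used",
    "U": "secondary UniProt accession updated to primary",
    "V": "version information ignored",
    "T": "taxonomy identifier differed from sequence record",
    "G": "EntrezGene identifier used",
    "D": "source database differed from expected",
    "M": "typographical modification of a known accession",
    "+": "more than one assignment possible",
    "O": "resolved by matching the original sequence SEGUID",
    "X": "resolved by matching the taxonomy identifier",
    "L": "resolved by ranking, then longest sequence",
    "I": "NCBI GenInfo Identifier used",
    "E": "resolved through NCBI eUtils",
    "Y": "accession removed from RefSeq or UniProt",
    "N": "new SEGUID entry created",
    "Q": "'see-also' reference used",
}


def decode_score(score):
    """Expand a mapping score string into a list of feature labels."""
    unknown = [ch for ch in score if ch not in SCORE_FEATURES]
    if unknown:
        raise ValueError(f"unknown score character(s): {unknown}")
    return [SCORE_FEATURES[ch] for ch in score]


def is_ambiguous(score):
    """True when the score carries '+', i.e. several assignments were possible."""
    return "+" in score


for label in decode_score("PTD+LQ"):
    print(label)
```

Scores containing `+` are the ones where an ambiguity had to be resolved, so `is_ambiguous` gives a quick way to separate those rows of the summary table from the unambiguous ones.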
