Difference between revisions of "iRefIndex/PBR Notes"

From irefindex
m (→‎Results: Added table borders.)
(→‎Retrieving BIND Identifiers: Added queries for interaction identifiers, clarifying what the existing queries actually retrieve.)
Line 205: Line 205:
 
The following queries determine the appropriate set of interactions and then derive the BIND identifiers from them.
 
The following queries determine the appropriate set of interactions and then derive the BIND identifiers from them.
  
Find common BIND identifiers:
+
Find all BIND identifiers in iRefIndex 5.1:
 +
 
 +
create table export_all_i5_bind as
 +
select distinct i5_mitab_interaction_identifiers.uid
 +
from i5_mitab_sources
 +
inner join i5_mitab_interactions
 +
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
 +
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
 +
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
 +
  and i5_mitab_interactions.taxA = 9606
 +
  and i5_mitab_interactions.taxB = 9606
 +
inner join i5_mitab_interaction_identifiers
 +
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
 +
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
 +
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
 +
where name = 'bind' and uid like 'bind:%';
 +
 
 +
Find all BIND identifiers in BIND PSI-XML:
 +
 
 +
create table export_all_xml_bind as
 +
select distinct mitab_interaction_identifiers.uid
 +
from mitab_sources
 +
inner join mitab_interactions
 +
  on mitab_sources.uidA = mitab_interactions.uidA
 +
  and mitab_sources.uidB = mitab_interactions.uidB
 +
  and mitab_sources.rigid = mitab_interactions.rigid
 +
  and mitab_interactions.taxA = 9606
 +
  and mitab_interactions.taxB = 9606
 +
inner join mitab_interaction_identifiers
 +
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
 +
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
 +
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
 +
where name = 'bind' and uid like 'bind:%';
 +
 
 +
Find BIND identifiers '''for common interactions''' where those interactions are known by the '''same identifier''':
  
 
  create table export_common_bind as
 
  create table export_common_bind as
Line 237: Line 271:
 
   and i5_mitab_interaction_identifiers.uid like 'bind:%';
 
   and i5_mitab_interaction_identifiers.uid like 'bind:%';
  
Find BIND identifiers only in iRefIndex 5.1:
+
Find BIND identifiers '''for interactions''' only in iRefIndex 5.1:
  
 
  create table export_i5_bind as
 
  create table export_i5_bind as
Line 270: Line 304:
 
     );
 
     );
  
Find BIND identifiers only in BIND PSI-XML:
+
Find BIND identifiers '''for interactions''' only in BIND PSI-XML:
  
 
  create table export_xml_bind as
 
  create table export_xml_bind as
Line 307: Line 341:
 
There are 183614 BIND identifiers in the iRefIndex 5.1 release, 188690 in the BIND PSI-XML dataset.
 
There are 183614 BIND identifiers in the iRefIndex 5.1 release, 188690 in the BIND PSI-XML dataset.
  
The following distribution was observed (with each cell representing an intersection of the listed sets):
+
The following distribution of BIND identifiers '''for interactions''' was observed (with each cell representing an intersection of the listed sets):
  
 
{| border="1" cellpadding="5" cellspacing="0"
 
{| border="1" cellpadding="5" cellspacing="0"

Revision as of 12:55, 2 October 2009

Preliminary BIND PSI-XML parsing results.

Results

Importing MITAB output from the iRefIndex 5.1 build and a build done using the BIND PSI-XML data, then comparing the interaction details for BIND, the following distributions are observed:

Interactions

iRefIndex 5.1
(Total) (7122)
1324
PSI-XML (8706) 2908 5798

To summarise:

  • Common interactions: 5798
  • Interactions only in iRefIndex 5.1: 1324
  • Interactions only in BIND PSI-XML: 2908

Interactors

iRefIndex 5.1
(Total) (5669)
1450
PSI-XML (5152) 933 4219

To summarise:

  • Common interactors: 4219
  • Interactors only in iRefIndex 5.1: 1450
  • Interactors only in BIND PSI-XML: 933

Queries

Common interactions:

select count(distinct mitab_sources.rigid)
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
inner join mitab_sources
  on i5_mitab_sources.rigid = mitab_sources.rigid
  and i5_mitab_sources.name = mitab_sources.name
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
where mitab_sources.name = 'bind';

Interactions only in iRefIndex 5.1:

select count(distinct i5_mitab_sources.rigid)
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
where i5_mitab_sources.name = 'bind'
  and i5_mitab_sources.rigid not in (
    select distinct mitab_sources.rigid
    from mitab_sources
    inner join mitab_interactions
      on mitab_sources.uidA = mitab_interactions.uidA
      and mitab_sources.uidB = mitab_interactions.uidB
      and mitab_sources.rigid = mitab_interactions.rigid
      and mitab_interactions.taxA = 9606
      and mitab_interactions.taxB = 9606
    where mitab_sources.name = 'bind'
    );

Interactions only in BIND PSI-XML:

select count(distinct mitab_sources.rigid)
from mitab_sources
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
where mitab_sources.name = 'bind'
  and mitab_sources.rigid not in (
    select distinct i5_mitab_sources.rigid
    from i5_mitab_sources
    inner join i5_mitab_interactions
      on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
      and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
      and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
      and i5_mitab_interactions.taxA = 9606
      and i5_mitab_interactions.taxB = 9606
    where i5_mitab_sources.name = 'bind'
    );

Common interactors:

select count(distinct X.uid)
from (
  select i5_mitab_sources.uidA as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select i5_mitab_sources.uidB as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as X
inner join (
  select mitab_sources.uidA as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidA = mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select mitab_sources.uidB as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidB = mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as Y
  on X.uid = Y.uid;

Interactors only in iRefIndex 5.1:

select count(distinct X.uid)
from (
  select i5_mitab_sources.uidA as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select i5_mitab_sources.uidB as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as X
where X.uid not in (
  select mitab_sources.uidA as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidA = mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select mitab_sources.uidB as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidB = mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  );

Interactors only in BIND PSI-XML:

select count(distinct X.uid)
from (
  select mitab_sources.uidA as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidA = mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select mitab_sources.uidB as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidB = mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as X
where X.uid not in (
  select i5_mitab_sources.uidA as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select i5_mitab_sources.uidB as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  );

Retrieving BIND Identifiers

The following queries determine the appropriate set of interactions and then derive the BIND identifiers from them.

Find all BIND identifiers in iRefIndex 5.1:

create table export_all_i5_bind as
select distinct i5_mitab_interaction_identifiers.uid
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
inner join i5_mitab_interaction_identifiers
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
where name = 'bind' and uid like 'bind:%';

Find all BIND identifiers in BIND PSI-XML:

create table export_all_xml_bind as
select distinct mitab_interaction_identifiers.uid
from mitab_sources
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
inner join mitab_interaction_identifiers
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
where name = 'bind' and uid like 'bind:%';

Find BIND identifiers for common interactions where those interactions are known by the same identifier:

create table export_common_bind as
select distinct i5_mitab_interaction_identifiers.uid
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
inner join i5_mitab_interaction_identifiers
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
inner join mitab_sources
  on i5_mitab_sources.rigid = mitab_sources.rigid
  and i5_mitab_sources.name = mitab_sources.name
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
inner join mitab_interaction_identifiers
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
  and i5_mitab_interaction_identifiers.uid = mitab_interaction_identifiers.uid
where mitab_sources.name = 'bind'
  and i5_mitab_interaction_identifiers.uid like 'bind:%';

Find BIND identifiers for interactions only in iRefIndex 5.1:

create table export_i5_bind as
select distinct i5_mitab_interaction_identifiers.uid
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
inner join i5_mitab_interaction_identifiers
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
where i5_mitab_sources.name = 'bind'
  and i5_mitab_interaction_identifiers.uid like 'bind:%'
  and i5_mitab_sources.rigid not in (
    select distinct mitab_sources.rigid
    from mitab_sources
    inner join mitab_interactions
      on mitab_sources.uidA = mitab_interactions.uidA
      and mitab_sources.uidB = mitab_interactions.uidB
      and mitab_sources.rigid = mitab_interactions.rigid
      and mitab_interactions.taxA = 9606
      and mitab_interactions.taxB = 9606
    inner join mitab_interaction_identifiers
      on mitab_sources.uidA = mitab_interaction_identifiers.uidA
      and mitab_sources.uidB = mitab_interaction_identifiers.uidB
      and mitab_sources.rigid = mitab_interaction_identifiers.rigid
    where mitab_sources.name = 'bind'
    );

Find BIND identifiers for interactions only in BIND PSI-XML:

create table export_xml_bind as
select distinct mitab_interaction_identifiers.uid
from mitab_sources
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
inner join mitab_interaction_identifiers
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
where mitab_sources.name = 'bind'
  and mitab_interaction_identifiers.uid like 'bind:%'
  and mitab_sources.rigid not in (
    select distinct i5_mitab_sources.rigid
    from i5_mitab_sources
    inner join i5_mitab_interactions
      on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
      and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
      and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
      and i5_mitab_interactions.taxA = 9606
      and i5_mitab_interactions.taxB = 9606
    inner join i5_mitab_interaction_identifiers
      on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
      and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
      and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
    where i5_mitab_sources.name = 'bind'
    );

Observations

There are 183614 BIND identifiers in the iRefIndex 5.1 release, 188690 in the BIND PSI-XML dataset.

The following distribution of BIND identifiers for interactions was observed (with each cell representing an intersection of the listed sets):

Common iRefIndex 5.1 only BIND PSI-XML only
Common (8639)
iRefIndex 5.1 only 76 (1613)
BIND PSI-XML only 0 1088 (3248)

In other words:

  • There are 8639 common BIND identifiers in total.
  • There are 1613 BIND identifiers appearing for interactions only in iRefIndex 5.1.
  • There are 3248 BIND identifiers appearing for interactions only in the BIND PSI-XML dataset.
  • Although the intersection between "iRefIndex 5.1 only" and "BIND PSI-XML only" should yield zero interactions, since by definition an interaction in one of these sets should not appear in another, 1088 BIND identifiers appear in both.
  • Of the interactions that should only be referenced by iRefIndex 5.1, 76 BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset.
  • Of the interactions that should only be referenced by the BIND PSI-XML dataset, 0 BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset. This is as expected.

In fact, some BIND identifiers appear to refer to more than one RIG identifier. For example:

  • BIND identifier 100771 refers to 5TlbBV7VB0MFa/R1cmyeByxSFjw in only the iRefIndex 5.1 release, but also to pXZtvCDYsluwWrmdXf5FJmQ0y54 in that release and in the BIND PSI-XML dataset.
  • BIND identifier 100321 refers to VwWuNAmhR/FDx6FtF6FacwAmzr0 in only the iRefIndex 5.1 release, but to Dr5ri7vTj02leEUMKr35jj1fP/c in only the BIND PSI-XML dataset.

See bug #133 for further details and discussion.

Other Stuff

This is what I did for the unnecessary "new BIND vs. other sources" experiment...

Downloaded the following:

  • BioGrid
  • CORUM
  • DIP
  • HPRD
  • Intact
  • MINT
  • MIPS (MPPI)
  • OPHID

Modified the following:

  • BioGrid - removed all but human data

Parsed all sources mentioned above.

Removed all non-9606 records.

Did ROG assignment and so on.

Made the reports.