Difference between revisions of "iRefIndex/PBR Notes"

From irefindex
(Restricted interactors to 9606 for the "human" BIND PSI-XML data.)
(→‎Queries: Added BIND identifier extraction.)
Line 200: Line 200:
 
   where name = 'bind' and taxB = 9606
 
   where name = 'bind' and taxB = 9606
 
   );
 
   );
 +
 +
== Retrieving BIND Identifiers ==
 +
 +
The following queries determine the appropriate set of interactions and then derive the BIND identifiers from them.
 +
 +
Find common BIND identifiers:
 +
 +
create table export_common_bind as
 +
select distinct i5_mitab_interaction_identifiers.uid
 +
from i5_mitab_sources
 +
inner join i5_mitab_interactions
 +
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
 +
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
 +
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
 +
  and i5_mitab_interactions.taxA = 9606
 +
  and i5_mitab_interactions.taxB = 9606
 +
inner join i5_mitab_interaction_identifiers
 +
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
 +
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
 +
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
 +
inner join mitab_sources
 +
  on i5_mitab_sources.rigid = mitab_sources.rigid
 +
  and i5_mitab_sources.name = mitab_sources.name
 +
inner join mitab_interactions
 +
  on mitab_sources.uidA = mitab_interactions.uidA
 +
  and mitab_sources.uidB = mitab_interactions.uidB
 +
  and mitab_sources.rigid = mitab_interactions.rigid
 +
  and mitab_interactions.taxA = 9606
 +
  and mitab_interactions.taxB = 9606
 +
inner join mitab_interaction_identifiers
 +
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
 +
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
 +
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
 +
  and i5_mitab_interaction_identifiers.uid = mitab_interaction_identifiers.uid
 +
where mitab_sources.name = 'bind'
 +
  and i5_mitab_interaction_identifiers.uid like 'bind:%';
 +
 +
Find BIND identifiers only in iRefIndex 5.1:
 +
 +
create table export_i5_bind as
 +
select distinct i5_mitab_interaction_identifiers.uid
 +
from i5_mitab_sources
 +
inner join i5_mitab_interactions
 +
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
 +
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
 +
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
 +
  and i5_mitab_interactions.taxA = 9606
 +
  and i5_mitab_interactions.taxB = 9606
 +
inner join i5_mitab_interaction_identifiers
 +
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
 +
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
 +
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
 +
where i5_mitab_sources.name = 'bind'
 +
  and i5_mitab_interaction_identifiers.uid like 'bind:%'
 +
  and i5_mitab_sources.rigid not in (
 +
    select distinct mitab_sources.rigid
 +
    from mitab_sources
 +
    inner join mitab_interactions
 +
      on mitab_sources.uidA = mitab_interactions.uidA
 +
      and mitab_sources.uidB = mitab_interactions.uidB
 +
      and mitab_sources.rigid = mitab_interactions.rigid
 +
      and mitab_interactions.taxA = 9606
 +
      and mitab_interactions.taxB = 9606
 +
    inner join mitab_interaction_identifiers
 +
      on mitab_sources.uidA = mitab_interaction_identifiers.uidA
 +
      and mitab_sources.uidB = mitab_interaction_identifiers.uidB
 +
      and mitab_sources.rigid = mitab_interaction_identifiers.rigid
 +
    where mitab_sources.name = 'bind'
 +
    );
 +
 +
Find BIND identifiers only in BIND PSI-XML:
 +
 +
create table export_xml_bind as
 +
select distinct mitab_interaction_identifiers.uid
 +
from mitab_sources
 +
inner join mitab_interactions
 +
  on mitab_sources.uidA = mitab_interactions.uidA
 +
  and mitab_sources.uidB = mitab_interactions.uidB
 +
  and mitab_sources.rigid = mitab_interactions.rigid
 +
  and mitab_interactions.taxA = 9606
 +
  and mitab_interactions.taxB = 9606
 +
inner join mitab_interaction_identifiers
 +
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
 +
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
 +
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
 +
where mitab_sources.name = 'bind'
 +
  and mitab_interaction_identifiers.uid like 'bind:%'
 +
  and mitab_sources.rigid not in (
 +
    select distinct i5_mitab_sources.rigid
 +
    from i5_mitab_sources
 +
    inner join i5_mitab_interactions
 +
      on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
 +
      and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
 +
      and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
 +
      and i5_mitab_interactions.taxA = 9606
 +
      and i5_mitab_interactions.taxB = 9606
 +
    inner join i5_mitab_interaction_identifiers
 +
      on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
 +
      and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
 +
      and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
 +
    where i5_mitab_sources.name = 'bind'
 +
    );
 +
 +
=== Observations ===
 +
 +
There are 183614 BIND identifiers in the iRefIndex 5.1 release, 188690 in the BIND PSI-XML dataset.
 +
 +
The following distribution was observed (with each cell representing an intersection of the listed sets):
 +
 +
{|
 +
|                        || Common  || iRefIndex 5.1 ''only'' || BIND PSI-XML ''only''
 +
|-
 +
| Common                || (8639)  ||                        ||
 +
|-
 +
| iRefIndex 5.1 ''only'' || '''76''' || (1613)                ||   
 +
|-
 +
| BIND PSI-XML ''only''  || '''0'''  || '''1088'''            || (3248)
 +
|}
 +
 +
In other words:
 +
 +
* There are 8639 common BIND identifiers in total.
 +
* There are 1613 BIND identifiers appearing for interactions only in iRefIndex 5.1.
 +
* There are 3248 BIND identifiers appearing for interactions only in the BIND PSI-XML dataset.
 +
* Although the intersection between "iRefIndex 5.1 only" and "BIND PSI-XML only" should yield zero interactions, since by definition an interaction in one of these sets should not appear in another, '''1088''' BIND identifiers appear in both.
 +
* Of the interactions that should only be referenced by iRefIndex 5.1, '''76''' BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset.
 +
* Of the interactions that should only be referenced by the BIND PSI-XML dataset, '''0''' BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset. This is as expected.
 +
 +
In fact, some BIND identifiers appear to refer to more than one RIG identifier. For example:
 +
 +
* BIND identifier 100771 refers to <tt>5TlbBV7VB0MFa/R1cmyeByxSFjw</tt> in only the iRefIndex 5.1 release, but also to <tt>pXZtvCDYsluwWrmdXf5FJmQ0y54</tt> in the BIND PSI-XML dataset.
 +
* BIND identifier 100321 refers to <tt>VwWuNAmhR/FDx6FtF6FacwAmzr0</tt> in only the iRefIndex 5.1 release, but to <tt>Dr5ri7vTj02leEUMKr35jj1fP/c</tt> in only the BIND PSI-XML dataset.
  
 
== Other Stuff ==
 
== Other Stuff ==

Revision as of 16:26, 25 September 2009

Preliminary BIND PSI-XML parsing results.

Results

Importing MITAB output from the iRefIndex 5.1 build and a build done using the BIND PSI-XML data, then comparing the interaction details for BIND, the following distributions are observed:

Interactions

iRefIndex 5.1
(Total) (7122)
1324
PSI-XML (8706) 2908 5798

To summarise:

  • Common interactions: 5798
  • Interactions only in iRefIndex 5.1: 1324
  • Interactions only in BIND PSI-XML: 2908

Interactors

iRefIndex 5.1
(Total) (5669)
1450
PSI-XML (5152) 933 4219

To summarise:

  • Common interactors: 4219
  • Interactors only in iRefIndex 5.1: 1450
  • Interactors only in BIND PSI-XML: 933

Queries

Common interactions:

select count(distinct mitab_sources.rigid)
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
inner join mitab_sources
  on i5_mitab_sources.rigid = mitab_sources.rigid
  and i5_mitab_sources.name = mitab_sources.name
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
where mitab_sources.name = 'bind';

Interactions only in iRefIndex 5.1:

select count(distinct i5_mitab_sources.rigid)
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
where i5_mitab_sources.name = 'bind'
  and i5_mitab_sources.rigid not in (
    select distinct mitab_sources.rigid
    from mitab_sources
    inner join mitab_interactions
      on mitab_sources.uidA = mitab_interactions.uidA
      and mitab_sources.uidB = mitab_interactions.uidB
      and mitab_sources.rigid = mitab_interactions.rigid
      and mitab_interactions.taxA = 9606
      and mitab_interactions.taxB = 9606
    where mitab_sources.name = 'bind'
    );

Interactions only in BIND PSI-XML:

select count(distinct mitab_sources.rigid)
from mitab_sources
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
where mitab_sources.name = 'bind'
  and mitab_sources.rigid not in (
    select distinct i5_mitab_sources.rigid
    from i5_mitab_sources
    inner join i5_mitab_interactions
      on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
      and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
      and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
      and i5_mitab_interactions.taxA = 9606
      and i5_mitab_interactions.taxB = 9606
    where i5_mitab_sources.name = 'bind'
    );

Common interactors:

select count(distinct X.uid)
from (
  select i5_mitab_sources.uidA as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select i5_mitab_sources.uidB as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as X
inner join (
  select mitab_sources.uidA as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidA = mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select mitab_sources.uidB as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidB = mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as Y
  on X.uid = Y.uid;

Interactors only in iRefIndex 5.1:

select count(distinct X.uid)
from (
  select i5_mitab_sources.uidA as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select i5_mitab_sources.uidB as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as X
where X.uid not in (
  select mitab_sources.uidA as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidA = mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select mitab_sources.uidB as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidB = mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  );

Interactors only in BIND PSI-XML:

select count(distinct X.uid)
from (
  select mitab_sources.uidA as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidA = mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select mitab_sources.uidB as uid
  from mitab_sources
  inner join mitab_interactions
    on mitab_sources.uidB = mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  ) as X
where X.uid not in (
  select i5_mitab_sources.uidA as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  where name = 'bind' and taxA = 9606
  union
  select i5_mitab_sources.uidB as uid
  from i5_mitab_sources
  inner join i5_mitab_interactions
    on i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  where name = 'bind' and taxB = 9606
  );

Retrieving BIND Identifiers

The following queries determine the appropriate set of interactions and then derive the BIND identifiers from them.

Find common BIND identifiers:

create table export_common_bind as
select distinct i5_mitab_interaction_identifiers.uid
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
inner join i5_mitab_interaction_identifiers
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
inner join mitab_sources
  on i5_mitab_sources.rigid = mitab_sources.rigid
  and i5_mitab_sources.name = mitab_sources.name
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
inner join mitab_interaction_identifiers
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
  and i5_mitab_interaction_identifiers.uid = mitab_interaction_identifiers.uid
where mitab_sources.name = 'bind'
  and i5_mitab_interaction_identifiers.uid like 'bind:%';

Find BIND identifiers only in iRefIndex 5.1:

create table export_i5_bind as
select distinct i5_mitab_interaction_identifiers.uid
from i5_mitab_sources
inner join i5_mitab_interactions
  on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
  and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
  and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
  and i5_mitab_interactions.taxA = 9606
  and i5_mitab_interactions.taxB = 9606
inner join i5_mitab_interaction_identifiers
  on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
  and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
  and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
where i5_mitab_sources.name = 'bind'
  and i5_mitab_interaction_identifiers.uid like 'bind:%'
  and i5_mitab_sources.rigid not in (
    select distinct mitab_sources.rigid
    from mitab_sources
    inner join mitab_interactions
      on mitab_sources.uidA = mitab_interactions.uidA
      and mitab_sources.uidB = mitab_interactions.uidB
      and mitab_sources.rigid = mitab_interactions.rigid
      and mitab_interactions.taxA = 9606
      and mitab_interactions.taxB = 9606
    inner join mitab_interaction_identifiers
      on mitab_sources.uidA = mitab_interaction_identifiers.uidA
      and mitab_sources.uidB = mitab_interaction_identifiers.uidB
      and mitab_sources.rigid = mitab_interaction_identifiers.rigid
    where mitab_sources.name = 'bind'
    );

Find BIND identifiers only in BIND PSI-XML:

create table export_xml_bind as
select distinct mitab_interaction_identifiers.uid
from mitab_sources
inner join mitab_interactions
  on mitab_sources.uidA = mitab_interactions.uidA
  and mitab_sources.uidB = mitab_interactions.uidB
  and mitab_sources.rigid = mitab_interactions.rigid
  and mitab_interactions.taxA = 9606
  and mitab_interactions.taxB = 9606
inner join mitab_interaction_identifiers
  on mitab_sources.uidA = mitab_interaction_identifiers.uidA
  and mitab_sources.uidB = mitab_interaction_identifiers.uidB
  and mitab_sources.rigid = mitab_interaction_identifiers.rigid
where mitab_sources.name = 'bind'
  and mitab_interaction_identifiers.uid like 'bind:%'
  and mitab_sources.rigid not in (
    select distinct i5_mitab_sources.rigid
    from i5_mitab_sources
    inner join i5_mitab_interactions
      on i5_mitab_sources.uidA = i5_mitab_interactions.uidA
      and i5_mitab_sources.uidB = i5_mitab_interactions.uidB
      and i5_mitab_sources.rigid = i5_mitab_interactions.rigid
      and i5_mitab_interactions.taxA = 9606
      and i5_mitab_interactions.taxB = 9606
    inner join i5_mitab_interaction_identifiers
      on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA
      and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB
      and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid
    where i5_mitab_sources.name = 'bind'
    );

Observations

There are 183614 BIND identifiers in the iRefIndex 5.1 release, 188690 in the BIND PSI-XML dataset.

The following distribution was observed (with each cell representing an intersection of the listed sets):

Common iRefIndex 5.1 only BIND PSI-XML only
Common (8639)
iRefIndex 5.1 only 76 (1613)
BIND PSI-XML only 0 1088 (3248)

In other words:

  • There are 8639 common BIND identifiers in total.
  • There are 1613 BIND identifiers appearing for interactions only in iRefIndex 5.1.
  • There are 3248 BIND identifiers appearing for interactions only in the BIND PSI-XML dataset.
  • Although the intersection between "iRefIndex 5.1 only" and "BIND PSI-XML only" should yield zero interactions, since by definition an interaction in one of these sets should not appear in another, 1088 BIND identifiers appear in both.
  • Of the interactions that should only be referenced by iRefIndex 5.1, 76 BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset.
  • Of the interactions that should only be referenced by the BIND PSI-XML dataset, 0 BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset. This is as expected.

In fact, some BIND identifiers appear to refer to more than one RIG identifier. For example:

  • BIND identifier 100771 refers to 5TlbBV7VB0MFa/R1cmyeByxSFjw in only the iRefIndex 5.1 release, but also to pXZtvCDYsluwWrmdXf5FJmQ0y54 in the BIND PSI-XML dataset.
  • BIND identifier 100321 refers to VwWuNAmhR/FDx6FtF6FacwAmzr0 in only the iRefIndex 5.1 release, but to Dr5ri7vTj02leEUMKr35jj1fP/c in only the BIND PSI-XML dataset.

Other Stuff

This is what I did for the unnecessary "new BIND vs. other sources" experiment...

Downloaded the following:

  • BioGrid
  • CORUM
  • DIP
  • HPRD
  • Intact
  • MINT
  • MIPS (MPPI)
  • OPHID

Modified the following:

  • BioGrid - removed all but human data

Parsed all sources mentioned above.

Removed all non-9606 records.

Did ROG assignment and so on.

Made the reports.