Difference between revisions of "iRefIndex/PBR Notes"
PaulBoddie (talk | contribs) (Restricted interactors to 9606 for the "human" BIND PSI-XML data.) |
PaulBoddie (talk | contribs) (→Queries: Added BIND identifier extraction.) |
||
Line 200: | Line 200: | ||
where name = 'bind' and taxB = 9606 | where name = 'bind' and taxB = 9606 | ||
); | ); | ||
+ | |||
+ | == Retrieving BIND Identifiers == | ||
+ | |||
+ | The following queries determine the appropriate set of interactions and then derive the BIND identifiers from them. | ||
+ | |||
+ | Find common BIND identifiers: | ||
+ | |||
+ | create table export_common_bind as | ||
+ | select distinct i5_mitab_interaction_identifiers.uid | ||
+ | from i5_mitab_sources | ||
+ | inner join i5_mitab_interactions | ||
+ | on i5_mitab_sources.uidA = i5_mitab_interactions.uidA | ||
+ | and i5_mitab_sources.uidB = i5_mitab_interactions.uidB | ||
+ | and i5_mitab_sources.rigid = i5_mitab_interactions.rigid | ||
+ | and i5_mitab_interactions.taxA = 9606 | ||
+ | and i5_mitab_interactions.taxB = 9606 | ||
+ | inner join i5_mitab_interaction_identifiers | ||
+ | on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA | ||
+ | and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB | ||
+ | and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid | ||
+ | inner join mitab_sources | ||
+ | on i5_mitab_sources.rigid = mitab_sources.rigid | ||
+ | and i5_mitab_sources.name = mitab_sources.name | ||
+ | inner join mitab_interactions | ||
+ | on mitab_sources.uidA = mitab_interactions.uidA | ||
+ | and mitab_sources.uidB = mitab_interactions.uidB | ||
+ | and mitab_sources.rigid = mitab_interactions.rigid | ||
+ | and mitab_interactions.taxA = 9606 | ||
+ | and mitab_interactions.taxB = 9606 | ||
+ | inner join mitab_interaction_identifiers | ||
+ | on mitab_sources.uidA = mitab_interaction_identifiers.uidA | ||
+ | and mitab_sources.uidB = mitab_interaction_identifiers.uidB | ||
+ | and mitab_sources.rigid = mitab_interaction_identifiers.rigid | ||
+ | and i5_mitab_interaction_identifiers.uid = mitab_interaction_identifiers.uid | ||
+ | where mitab_sources.name = 'bind' | ||
+ | and i5_mitab_interaction_identifiers.uid like 'bind:%'; | ||
+ | |||
+ | Find BIND identifiers only in iRefIndex 5.1: | ||
+ | |||
+ | create table export_i5_bind as | ||
+ | select distinct i5_mitab_interaction_identifiers.uid | ||
+ | from i5_mitab_sources | ||
+ | inner join i5_mitab_interactions | ||
+ | on i5_mitab_sources.uidA = i5_mitab_interactions.uidA | ||
+ | and i5_mitab_sources.uidB = i5_mitab_interactions.uidB | ||
+ | and i5_mitab_sources.rigid = i5_mitab_interactions.rigid | ||
+ | and i5_mitab_interactions.taxA = 9606 | ||
+ | and i5_mitab_interactions.taxB = 9606 | ||
+ | inner join i5_mitab_interaction_identifiers | ||
+ | on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA | ||
+ | and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB | ||
+ | and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid | ||
+ | where i5_mitab_sources.name = 'bind' | ||
+ | and i5_mitab_interaction_identifiers.uid like 'bind:%' | ||
+ | and i5_mitab_sources.rigid not in ( | ||
+ | select distinct mitab_sources.rigid | ||
+ | from mitab_sources | ||
+ | inner join mitab_interactions | ||
+ | on mitab_sources.uidA = mitab_interactions.uidA | ||
+ | and mitab_sources.uidB = mitab_interactions.uidB | ||
+ | and mitab_sources.rigid = mitab_interactions.rigid | ||
+ | and mitab_interactions.taxA = 9606 | ||
+ | and mitab_interactions.taxB = 9606 | ||
+ | inner join mitab_interaction_identifiers | ||
+ | on mitab_sources.uidA = mitab_interaction_identifiers.uidA | ||
+ | and mitab_sources.uidB = mitab_interaction_identifiers.uidB | ||
+ | and mitab_sources.rigid = mitab_interaction_identifiers.rigid | ||
+ | where mitab_sources.name = 'bind' | ||
+ | ); | ||
+ | |||
+ | Find BIND identifiers only in BIND PSI-XML: | ||
+ | |||
+ | create table export_xml_bind as | ||
+ | select distinct mitab_interaction_identifiers.uid | ||
+ | from mitab_sources | ||
+ | inner join mitab_interactions | ||
+ | on mitab_sources.uidA = mitab_interactions.uidA | ||
+ | and mitab_sources.uidB = mitab_interactions.uidB | ||
+ | and mitab_sources.rigid = mitab_interactions.rigid | ||
+ | and mitab_interactions.taxA = 9606 | ||
+ | and mitab_interactions.taxB = 9606 | ||
+ | inner join mitab_interaction_identifiers | ||
+ | on mitab_sources.uidA = mitab_interaction_identifiers.uidA | ||
+ | and mitab_sources.uidB = mitab_interaction_identifiers.uidB | ||
+ | and mitab_sources.rigid = mitab_interaction_identifiers.rigid | ||
+ | where mitab_sources.name = 'bind' | ||
+ | and mitab_interaction_identifiers.uid like 'bind:%' | ||
+ | and mitab_sources.rigid not in ( | ||
+ | select distinct i5_mitab_sources.rigid | ||
+ | from i5_mitab_sources | ||
+ | inner join i5_mitab_interactions | ||
+ | on i5_mitab_sources.uidA = i5_mitab_interactions.uidA | ||
+ | and i5_mitab_sources.uidB = i5_mitab_interactions.uidB | ||
+ | and i5_mitab_sources.rigid = i5_mitab_interactions.rigid | ||
+ | and i5_mitab_interactions.taxA = 9606 | ||
+ | and i5_mitab_interactions.taxB = 9606 | ||
+ | inner join i5_mitab_interaction_identifiers | ||
+ | on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA | ||
+ | and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB | ||
+ | and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid | ||
+ | where i5_mitab_sources.name = 'bind' | ||
+ | ); | ||
+ | |||
+ | === Observations === | ||
+ | |||
+ | There are 183614 BIND identifiers in the iRefIndex 5.1 release, 188690 in the BIND PSI-XML dataset. | ||
+ | |||
+ | The following distribution was observed (with each cell representing an intersection of the listed sets): | ||
+ | |||
+ | {| | ||
+ | | || Common || iRefIndex 5.1 ''only'' || BIND PSI-XML ''only'' | ||
+ | |- | ||
+ | | Common || (8639) || || | ||
+ | |- | ||
+ | | iRefIndex 5.1 ''only'' || '''76''' || (1613) || | ||
+ | |- | ||
+ | | BIND PSI-XML ''only'' || '''0''' || '''1088''' || (3248) | ||
+ | |} | ||
+ | |||
+ | In other words: | ||
+ | |||
+ | * There are 8639 common BIND identifiers in total. | ||
+ | * There are 1613 BIND identifiers appearing for interactions only in iRefIndex 5.1. | ||
+ | * There are 3248 BIND identifiers appearing for interactions only in the BIND PSI-XML dataset. | ||
+ | * Although the intersection between "iRefIndex 5.1 only" and "BIND PSI-XML only" should yield zero interactions, since by definition an interaction in one of these sets should not appear in another, '''1088''' BIND identifiers appear in both. | ||
+ | * Of the interactions that should only be referenced by iRefIndex 5.1, '''76''' BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset. | ||
+ | * Of the interactions that should only be referenced by the BIND PSI-XML dataset, '''0''' BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset. This is as expected. | ||
+ | |||
+ | In fact, some BIND identifiers appear to refer to more than one RIG identifier. For example: | ||
+ | |||
+ | * BIND identifier 100771 refers to <tt>5TlbBV7VB0MFa/R1cmyeByxSFjw</tt> in only the iRefIndex 5.1 release, but also to <tt>pXZtvCDYsluwWrmdXf5FJmQ0y54</tt> in the BIND PSI-XML dataset. | ||
+ | * BIND identifier 100321 refers to <tt>VwWuNAmhR/FDx6FtF6FacwAmzr0</tt> in only the iRefIndex 5.1 release, but to <tt>Dr5ri7vTj02leEUMKr35jj1fP/c</tt> in only the BIND PSI-XML dataset. | ||
== Other Stuff == | == Other Stuff == |
Revision as of 16:26, 25 September 2009
Preliminary BIND PSI-XML parsing results.
Contents
Results
Importing MITAB output from the iRefIndex 5.1 build and a build done using the BIND PSI-XML data, then comparing the interaction details for BIND, the following distributions are observed:
Interactions
iRefIndex 5.1 | |||
(Total) | (7122) | ||
1324 | |||
PSI-XML | (8706) | 2908 | 5798 |
To summarise:
- Common interactions: 5798
- Interactions only in iRefIndex 5.1: 1324
- Interactions only in BIND PSI-XML: 2908
Interactors
iRefIndex 5.1 | |||
(Total) | (5669) | ||
1450 | |||
PSI-XML | (5152) | 933 | 4219 |
To summarise:
- Common interactors: 4219
- Interactors only in iRefIndex 5.1: 1450
- Interactors only in BIND PSI-XML: 933
Queries
Common interactions:
select count(distinct mitab_sources.rigid) from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA and i5_mitab_sources.uidB = i5_mitab_interactions.uidB and i5_mitab_sources.rigid = i5_mitab_interactions.rigid and i5_mitab_interactions.taxA = 9606 and i5_mitab_interactions.taxB = 9606 inner join mitab_sources on i5_mitab_sources.rigid = mitab_sources.rigid and i5_mitab_sources.name = mitab_sources.name inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA and mitab_sources.uidB = mitab_interactions.uidB and mitab_sources.rigid = mitab_interactions.rigid and mitab_interactions.taxA = 9606 and mitab_interactions.taxB = 9606 where mitab_sources.name = 'bind';
Interactions only in iRefIndex 5.1:
select count(distinct i5_mitab_sources.rigid) from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA and i5_mitab_sources.uidB = i5_mitab_interactions.uidB and i5_mitab_sources.rigid = i5_mitab_interactions.rigid and i5_mitab_interactions.taxA = 9606 and i5_mitab_interactions.taxB = 9606 where i5_mitab_sources.name = 'bind' and i5_mitab_sources.rigid not in ( select distinct mitab_sources.rigid from mitab_sources inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA and mitab_sources.uidB = mitab_interactions.uidB and mitab_sources.rigid = mitab_interactions.rigid and mitab_interactions.taxA = 9606 and mitab_interactions.taxB = 9606 where mitab_sources.name = 'bind' );
Interactions only in BIND PSI-XML:
select count(distinct mitab_sources.rigid) from mitab_sources inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA and mitab_sources.uidB = mitab_interactions.uidB and mitab_sources.rigid = mitab_interactions.rigid and mitab_interactions.taxA = 9606 and mitab_interactions.taxB = 9606 where mitab_sources.name = 'bind' and mitab_sources.rigid not in ( select distinct i5_mitab_sources.rigid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA and i5_mitab_sources.uidB = i5_mitab_interactions.uidB and i5_mitab_sources.rigid = i5_mitab_interactions.rigid and i5_mitab_interactions.taxA = 9606 and i5_mitab_interactions.taxB = 9606 where i5_mitab_sources.name = 'bind' );
Common interactors:
select count(distinct X.uid) from ( select i5_mitab_sources.uidA as uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA where name = 'bind' and taxA = 9606 union select i5_mitab_sources.uidB as uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidB = i5_mitab_interactions.uidB where name = 'bind' and taxB = 9606 ) as X inner join ( select mitab_sources.uidA as uid from mitab_sources inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA where name = 'bind' and taxA = 9606 union select mitab_sources.uidB as uid from mitab_sources inner join mitab_interactions on mitab_sources.uidB = mitab_interactions.uidB where name = 'bind' and taxB = 9606 ) as Y on X.uid = Y.uid;
Interactors only in iRefIndex 5.1:
select count(distinct X.uid) from ( select i5_mitab_sources.uidA as uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA where name = 'bind' and taxA = 9606 union select i5_mitab_sources.uidB as uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidB = i5_mitab_interactions.uidB where name = 'bind' and taxB = 9606 ) as X where X.uid not in ( select mitab_sources.uidA as uid from mitab_sources inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA where name = 'bind' and taxA = 9606 union select mitab_sources.uidB as uid from mitab_sources inner join mitab_interactions on mitab_sources.uidB = mitab_interactions.uidB where name = 'bind' and taxB = 9606 );
Interactors only in BIND PSI-XML:
select count(distinct X.uid) from ( select mitab_sources.uidA as uid from mitab_sources inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA where name = 'bind' and taxA = 9606 union select mitab_sources.uidB as uid from mitab_sources inner join mitab_interactions on mitab_sources.uidB = mitab_interactions.uidB where name = 'bind' and taxB = 9606 ) as X where X.uid not in ( select i5_mitab_sources.uidA as uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA where name = 'bind' and taxA = 9606 union select i5_mitab_sources.uidB as uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidB = i5_mitab_interactions.uidB where name = 'bind' and taxB = 9606 );
Retrieving BIND Identifiers
The following queries determine the appropriate set of interactions and then derive the BIND identifiers from them.
Find common BIND identifiers:
create table export_common_bind as select distinct i5_mitab_interaction_identifiers.uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA and i5_mitab_sources.uidB = i5_mitab_interactions.uidB and i5_mitab_sources.rigid = i5_mitab_interactions.rigid and i5_mitab_interactions.taxA = 9606 and i5_mitab_interactions.taxB = 9606 inner join i5_mitab_interaction_identifiers on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid inner join mitab_sources on i5_mitab_sources.rigid = mitab_sources.rigid and i5_mitab_sources.name = mitab_sources.name inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA and mitab_sources.uidB = mitab_interactions.uidB and mitab_sources.rigid = mitab_interactions.rigid and mitab_interactions.taxA = 9606 and mitab_interactions.taxB = 9606 inner join mitab_interaction_identifiers on mitab_sources.uidA = mitab_interaction_identifiers.uidA and mitab_sources.uidB = mitab_interaction_identifiers.uidB and mitab_sources.rigid = mitab_interaction_identifiers.rigid and i5_mitab_interaction_identifiers.uid = mitab_interaction_identifiers.uid where mitab_sources.name = 'bind' and i5_mitab_interaction_identifiers.uid like 'bind:%';
Find BIND identifiers only in iRefIndex 5.1:
create table export_i5_bind as select distinct i5_mitab_interaction_identifiers.uid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA and i5_mitab_sources.uidB = i5_mitab_interactions.uidB and i5_mitab_sources.rigid = i5_mitab_interactions.rigid and i5_mitab_interactions.taxA = 9606 and i5_mitab_interactions.taxB = 9606 inner join i5_mitab_interaction_identifiers on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid where i5_mitab_sources.name = 'bind' and i5_mitab_interaction_identifiers.uid like 'bind:%' and i5_mitab_sources.rigid not in ( select distinct mitab_sources.rigid from mitab_sources inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA and mitab_sources.uidB = mitab_interactions.uidB and mitab_sources.rigid = mitab_interactions.rigid and mitab_interactions.taxA = 9606 and mitab_interactions.taxB = 9606 inner join mitab_interaction_identifiers on mitab_sources.uidA = mitab_interaction_identifiers.uidA and mitab_sources.uidB = mitab_interaction_identifiers.uidB and mitab_sources.rigid = mitab_interaction_identifiers.rigid where mitab_sources.name = 'bind' );
Find BIND identifiers only in BIND PSI-XML:
create table export_xml_bind as select distinct mitab_interaction_identifiers.uid from mitab_sources inner join mitab_interactions on mitab_sources.uidA = mitab_interactions.uidA and mitab_sources.uidB = mitab_interactions.uidB and mitab_sources.rigid = mitab_interactions.rigid and mitab_interactions.taxA = 9606 and mitab_interactions.taxB = 9606 inner join mitab_interaction_identifiers on mitab_sources.uidA = mitab_interaction_identifiers.uidA and mitab_sources.uidB = mitab_interaction_identifiers.uidB and mitab_sources.rigid = mitab_interaction_identifiers.rigid where mitab_sources.name = 'bind' and mitab_interaction_identifiers.uid like 'bind:%' and mitab_sources.rigid not in ( select distinct i5_mitab_sources.rigid from i5_mitab_sources inner join i5_mitab_interactions on i5_mitab_sources.uidA = i5_mitab_interactions.uidA and i5_mitab_sources.uidB = i5_mitab_interactions.uidB and i5_mitab_sources.rigid = i5_mitab_interactions.rigid and i5_mitab_interactions.taxA = 9606 and i5_mitab_interactions.taxB = 9606 inner join i5_mitab_interaction_identifiers on i5_mitab_sources.uidA = i5_mitab_interaction_identifiers.uidA and i5_mitab_sources.uidB = i5_mitab_interaction_identifiers.uidB and i5_mitab_sources.rigid = i5_mitab_interaction_identifiers.rigid where i5_mitab_sources.name = 'bind' );
Observations
There are 183614 BIND identifiers in the iRefIndex 5.1 release, 188690 in the BIND PSI-XML dataset.
The following distribution was observed (with each cell representing an intersection of the listed sets):
Common | iRefIndex 5.1 only | BIND PSI-XML only | |
Common | (8639) | ||
iRefIndex 5.1 only | 76 | (1613) | |
BIND PSI-XML only | 0 | 1088 | (3248) |
In other words:
- There are 8639 common BIND identifiers in total.
- There are 1613 BIND identifiers appearing for interactions only in iRefIndex 5.1.
- There are 3248 BIND identifiers appearing for interactions only in the BIND PSI-XML dataset.
- Although the intersection between "iRefIndex 5.1 only" and "BIND PSI-XML only" should yield zero interactions, since by definition an interaction in one of these sets should not appear in another, 1088 BIND identifiers appear in both.
- Of the interactions that should only be referenced by iRefIndex 5.1, 76 BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset.
- Of the interactions that should only be referenced by the BIND PSI-XML dataset, 0 BIND identifiers also refer to interactions in both iRefIndex 5.1 and the BIND PSI-XML dataset. This is as expected.
In fact, some BIND identifiers appear to refer to more than one RIG identifier. For example:
- BIND identifier 100771 refers to 5TlbBV7VB0MFa/R1cmyeByxSFjw in only the iRefIndex 5.1 release, but also to pXZtvCDYsluwWrmdXf5FJmQ0y54 in the BIND PSI-XML dataset.
- BIND identifier 100321 refers to VwWuNAmhR/FDx6FtF6FacwAmzr0 in only the iRefIndex 5.1 release, but to Dr5ri7vTj02leEUMKr35jj1fP/c in only the BIND PSI-XML dataset.
Other Stuff
This is what I did for the unnecessary "new BIND vs. other sources" experiment...
Downloaded the following:
- BioGrid
- CORUM
- DIP
- HPRD
- Intact
- MINT
- MIPS (MPPI)
- OPHID
Modified the following:
- BioGrid - removed all but human data
Parsed all sources mentioned above.
Removed all non-9606 records.
Did ROG assignment and so on.
Made the reports.