Difference between revisions of "iRefIndex Development"

From irefindex
(Added link to a new page.)
(→‎Adding Sources to iRefIndex: Added description of XML element path analysis.)

Revision as of 16:10, 1 October 2010

See iRefIndex Issues and Notes for details of ongoing work to improve the iRefIndex software.

Adding Sources to iRefIndex

  1. Identify the location of the downloaded data.
  2. Evaluate the form of the data:
    1. For PSI MI XML (Molecular Interaction XML) documents, check the version of the format employed by the data documents.
    2. For the specific version, review the format's schema and how the data uses the schema. For example, PSI MI XML permits the specification of interactors within interaction descriptions as well as in a separate interactor list.
  3. Review existing, similar mapper definition files.
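The format checks in step 2 can be sketched in Python. This is a minimal illustration, not part of the iRefIndex code: the function name `describe_psi_mi` is hypothetical, and it assumes that the root `entrySet` element carries `level` and `version` attributes, as PSI MI XML 2.5 files commonly do. It also counts how interactors are specified: inline within participants, by reference, or in a separate interactor list.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]

def describe_psi_mi(root):
    """Summarise version attributes and interactor usage (hypothetical helper).

    Assumes 'level' and 'version' attributes on the root element; if they
    are absent, the corresponding values are None.
    """
    info = {
        "root": local_name(root.tag),
        "level": root.get("level"),
        "version": root.get("version"),
        "inline_interactors": 0,   # participant/interactor
        "interactor_refs": 0,      # participant/interactorRef
        "listed_interactors": 0,   # interactorList/interactor
    }
    for parent in root.iter():
        parent_name = local_name(parent.tag)
        for child in parent:
            child_name = local_name(child.tag)
            if parent_name == "participant" and child_name == "interactor":
                info["inline_interactors"] += 1
            elif parent_name == "participant" and child_name == "interactorRef":
                info["interactor_refs"] += 1
            elif parent_name == "interactorList" and child_name == "interactor":
                info["listed_interactors"] += 1
    return info
```

For a data file on disk, the root element can be obtained with `ET.parse(filename).getroot()` and passed to the function. A non-zero count of inline interactors suggests that a mapper must handle interactors described within interactions.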

The show_xml_paths.py script in the iRef_PSI_XML2RDBMS directory lists the distinct element paths that hold data items in an XML data file. For example:

python show_xml_paths.py /home/irefindex/data/MINT/2010-09-14/10023771.psi25.xml

The resulting list shows which paths in the element hierarchy of a PSI MI XML file actually hold information. For example:

entrySet/entry/experimentList/experimentDescription/attributeList/attribute
entrySet/entry/experimentList/experimentDescription/hostOrganismList/hostOrganism/names/fullName
entrySet/entry/experimentList/experimentDescription/hostOrganismList/hostOrganism/names/shortLabel
entrySet/entry/experimentList/experimentDescription/interactionDetectionMethod/names/alias
entrySet/entry/experimentList/experimentDescription/interactionDetectionMethod/names/fullName
entrySet/entry/experimentList/experimentDescription/interactionDetectionMethod/names/shortLabel
entrySet/entry/experimentList/experimentDescription/names/fullName
entrySet/entry/experimentList/experimentDescription/names/shortLabel
entrySet/entry/interactionList/interaction/attributeList/attribute
entrySet/entry/interactionList/interaction/confidenceList/confidence/unit/names/fullName
entrySet/entry/interactionList/interaction/confidenceList/confidence/unit/names/shortLabel
entrySet/entry/interactionList/interaction/confidenceList/confidence/value
entrySet/entry/interactionList/interaction/experimentList/experimentRef
entrySet/entry/interactionList/interaction/interactionType/names/fullName
entrySet/entry/interactionList/interaction/interactionType/names/shortLabel
entrySet/entry/interactionList/interaction/intraMolecular
entrySet/entry/interactionList/interaction/modelled
entrySet/entry/interactionList/interaction/names/shortLabel
entrySet/entry/interactionList/interaction/negative
entrySet/entry/interactionList/interaction/participantList/participant/biologicalRole/names/fullName
entrySet/entry/interactionList/interaction/participantList/participant/biologicalRole/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/experimentalPreparationList/experimentalPreparation/names/fullName
entrySet/entry/interactionList/interaction/participantList/participant/experimentalPreparationList/experimentalPreparation/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/experimentalRoleList/experimentalRole/names/fullName
entrySet/entry/interactionList/interaction/participantList/participant/experimentalRoleList/experimentalRole/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/featureRangeList/featureRange/endStatus/names/fullName
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/featureRangeList/featureRange/endStatus/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/featureRangeList/featureRange/isLink
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/featureRangeList/featureRange/startStatus/names/fullName
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/featureRangeList/featureRange/startStatus/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/featureType/names/fullName
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/featureType/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/featureList/feature/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/interactorRef
entrySet/entry/interactionList/interaction/participantList/participant/names/shortLabel
entrySet/entry/interactionList/interaction/participantList/participant/participantIdentificationMethodList/participantIdentificationMethod/names/alias
entrySet/entry/interactionList/interaction/participantList/participant/participantIdentificationMethodList/participantIdentificationMethod/names/fullName
entrySet/entry/interactionList/interaction/participantList/participant/participantIdentificationMethodList/participantIdentificationMethod/names/shortLabel
entrySet/entry/interactorList/interactor/attributeList/attribute
entrySet/entry/interactorList/interactor/interactorType/names/fullName
entrySet/entry/interactorList/interactor/interactorType/names/shortLabel
entrySet/entry/interactorList/interactor/names/alias
entrySet/entry/interactorList/interactor/names/fullName
entrySet/entry/interactorList/interactor/names/shortLabel
entrySet/entry/interactorList/interactor/organism/names/fullName
entrySet/entry/interactorList/interactor/organism/names/shortLabel
entrySet/entry/interactorList/interactor/sequence
entrySet/entry/source/attributeList/attribute
entrySet/entry/source/names/fullName
entrySet/entry/source/names/shortLabel

With this information, a suitable mapper file can be identified for the conversion of the XML-encoded data into tabular data to be stored in a database.
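The kind of analysis described above can be approximated with a short Python sketch. This is not the actual show_xml_paths.py script, whose implementation is not shown here. The function names are hypothetical, and the sketch assumes that a path "holds data" when its element contains non-whitespace text, with namespace prefixes dropped from element names.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]

def collect_paths(element, prefix="", paths=None):
    """Return the set of element paths whose elements contain text.

    Assumption: only elements with non-whitespace text count as holding
    data; attribute-only elements are ignored.
    """
    if paths is None:
        paths = set()
    name = local_name(element.tag)
    path = prefix + "/" + name if prefix else name
    if element.text and element.text.strip():
        paths.add(path)
    for child in element:
        collect_paths(child, path, paths)
    return paths

def print_paths(filename):
    """Print each distinct data-bearing path in the file, sorted."""
    root = ET.parse(filename).getroot()
    for path in sorted(collect_paths(root)):
        print(path)
```

Collecting the paths in a set means that each path is reported once, however many times it occurs in the file, which keeps the output short enough to compare against existing mapper definitions.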

All iRefIndex Pages

Follow this link for a listing of all iRefIndex related pages (archived and current).