Difference between revisions of "Bioscape Methods"

From irefindex
m (Added category.)
(Added notes about result scoring.)

Revision as of 13:06, 5 March 2009


Please note that this documentation covers an unreleased product and is for internal use only.


This document describes the role of methods in Bioscape.

Processing, Methods and Scoring

The processing pipeline of Bioscape can be summarised as follows:

  1. Import information about biological entities (genes), also known as bioentities.
  2. Build a lexicon consisting of names associated with the imported entities.
  3. Search biomedical literature using the contents of the lexicon, subject to filtering.
  4. Assign bioentities to the text search results.
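The four stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Bioscape's implementation: the function names (`build_lexicon`, `search_text`, `assign_bioentities`), the in-memory dictionaries and the sample sentence are all hypothetical. Only gene #1434 and its names CSE1 and CAS are taken from the example later in this document.

```python
import re
from collections import defaultdict

# Stage 1: imported bioentities (gene identifier -> names).
# Gene 1434 and its names come from the example in this document;
# the data structure itself is an assumption.
entities = {1434: ["CSE1", "CAS"]}

def build_lexicon(entities):
    """Stage 2: map each name to the set of genes it may refer to."""
    lexicon = defaultdict(set)
    for gene_id, names in entities.items():
        for name in names:
            lexicon[name].add(gene_id)
    return lexicon

def search_text(text, lexicon, name_filter=lambda name: True):
    """Stage 3: find whole-word occurrences of lexicon names, subject to a filter."""
    hits = []
    for name in lexicon:
        if not name_filter(name):
            continue
        for match in re.finditer(r"\b%s\b" % re.escape(name), text):
            hits.append((match.start(), name))
    return sorted(hits)

def assign_bioentities(hits, lexicon):
    """Stage 4: attach the candidate genes to each search hit."""
    return [(pos, name, sorted(lexicon[name])) for pos, name in hits]

lexicon = build_lexicon(entities)
hits = search_text("CSE1 is also called CAS.", lexicon)  # hypothetical sentence
results = assign_bioentities(hits, lexicon)
```

Each entry in `results` pairs a text position and a matched name with its candidate genes; a real system would keep these candidates for scoring rather than treating them as final.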

At each stage in the pipeline, Bioscape applies methods that assess the value or suitability of the information in use by assigning it scores based on particular criteria. The following kinds of methods are applied:

  1. Name scoring: assessing whether a name should be used in text searches.
  2. Search scoring: assessing whether a bioentity should be assigned to a text search result.
  3. Sentence scoring: assessing whether a sentence has a particular importance.
  4. Result scoring: assessing whether a result (combining bioentity and textual information) is genuine.
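As a rough illustration of what a scoring method might look like, the sketch below treats a method as a function that returns a score and accepts an item when the score reaches a threshold. Everything here is an assumption made for the example: the criterion (name length), the function names `score_name` and `accept`, and the threshold value are hypothetical and do not describe Bioscape's actual criteria.

```python
def score_name(name):
    """Hypothetical name-scoring method: very short names score lower."""
    return min(len(name), 5) / 5.0

def accept(item, method, threshold):
    """Accept an item if the given scoring method rates it at or above the threshold."""
    return method(item) >= threshold

# CSE1 and CAS come from the example in this document; "A" is a made-up short name.
names = ["CSE1", "CAS", "A"]
usable = [n for n in names if accept(n, score_name, threshold=0.5)]
```

The same `accept` helper could in principle be reused with search, sentence or result scoring methods, since each kind only differs in what it scores and which criteria it applies.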

Examples of methods are given below.

Name Scoring

Search Scoring

Sentence Scoring

Result Scoring

The scoring of results involves inspecting the proposed bioentities and assessing their suitability in a particular document location. Such assessment methods include the following:

  • Comparing information about proposed bioentities directly with other "contextual" information in the document.
  • Comparing information about proposed bioentities with similar information for other proposed bioentities.

Competing Names

PubMed #7479798: gene #1434 is referenced by the names CSE1 and CAS, but CAS is ambiguous because it also refers to other genes. Since none of those other genes is supported by any other name in the document, CAS is interpreted as a reference to gene #1434 as well.
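The reasoning in this example can be sketched as follows. This is a minimal illustration, assuming that "support" means a gene is proposed by at least one other name found in the same document; the function name `resolve_competing_names` and the gene identifiers 9001 and 9002 are hypothetical placeholders, since the other genes referenced by CAS are not listed here.

```python
from collections import defaultdict

def resolve_competing_names(name_to_genes):
    """Narrow each name's candidate genes to those supported by other names."""
    # Record which names in the document propose each gene.
    support = defaultdict(set)
    for name, genes in name_to_genes.items():
        for gene in genes:
            support[gene].add(name)

    resolved = {}
    for name, genes in name_to_genes.items():
        # A candidate is supported if some *other* name also proposes it.
        supported = {g for g in genes if support[g] - {name}}
        # Keep the supported candidates; if there are none, leave the name ambiguous.
        resolved[name] = supported if supported else set(genes)
    return resolved

# Names found in one document and their candidate genes.
# 1434 comes from the example; 9001 and 9002 are placeholder identifiers.
mentions = {"CSE1": {1434}, "CAS": {1434, 9001, 9002}}
result = resolve_competing_names(mentions)
```

With this input, `result["CAS"]` is narrowed to gene 1434 because CSE1 also proposes it, while a name with no supported candidates keeps all of them.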