Bioscape Result Assessment

Revision as of 13:47, 14 July 2010 by PaulBoddie (talk | contribs) (Added status note.)
Note: this documentation covers an unreleased product and is for internal use only.

The suggestions produced by Bioscape's search activities can be assessed wherever "gold standard" data is available to confirm whether each result can be regarded as genuine.

BioCreative 2 Gene Normalisation

In the bsindex distribution, a script is available to export filtered results from Bioscape for assessment against the BioCreative gold standard:

python scripts/ --bionames <generation> --results <generation> --methods human_gene --min-score 1 --output <output>

Once result data is available, this data can be scored through comparison to the gold standard file:

python scripts/ gold <output>
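The scoring script itself is not reproduced here, but document-level scoring of this kind can be sketched as a set comparison. The sketch below assumes that both the exported results and the gold standard reduce to (document identifier, gene identifier) pairs; the function name and the example identifiers are hypothetical and are not taken from the bsindex distribution.

```python
def score_documents(predicted, gold):
    """Count document-level true positives, false positives and false negatives.

    Both arguments are iterables of (document_id, gene_id) pairs.
    This pair-based representation is an assumption made for illustration.
    """
    predicted = set(predicted)
    gold = set(gold)
    true_positives = len(predicted & gold)    # suggested and in the gold standard
    false_positives = len(predicted - gold)   # suggested but not in the gold standard
    false_negatives = len(gold - predicted)   # in the gold standard but not suggested
    return true_positives, false_positives, false_negatives


# Made-up example data: three gold standard pairs, two exported suggestions.
gold = [("doc1", "1017"), ("doc1", "7157"), ("doc2", "672")]
predicted = [("doc1", "1017"), ("doc2", "675")]

counts = score_documents(predicted, gold)
print(counts)  # (1, 1, 2)
```

Using sets means that a gene suggested several times for the same document is counted once, which matches assessment at the document level rather than at the mention level.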

A number of options to the scoring script help compare different sets of results:

python scripts/ gold <output files> --pretty

The --pretty option provides a table with the following columns:

  1. Output filename
  2. Number of true positive results
  3. Number of false positive results
  4. Number of false negative results
  5. Precision
  6. Recall
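The precision and recall columns presumably follow the usual definitions in terms of the three counts. As a minimal sketch, assuming those standard definitions (the function name is hypothetical, and returning 0.0 for an empty denominator is a convention chosen here, not necessarily what the scoring script does):

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall from document-level counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN).
    An empty denominator yields 0.0; this is an assumed convention.
    """
    suggested = true_positives + false_positives
    expected = true_positives + false_negatives
    precision = true_positives / suggested if suggested else 0.0
    recall = true_positives / expected if expected else 0.0
    return precision, recall


# Made-up counts: 80 correct suggestions, 20 incorrect, 40 missed.
precision, recall = precision_recall(80, 20, 40)
print(precision)  # 0.8
```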

Combining the output of this script with other Unix commands can be convenient:

python scripts/ gold <output files> --pretty | sort -n -k 5

The above combination should sort the rows numerically on the fifth column (precision), in order of increasing precision.

Comparing BioCreative Results and Bioscape Results

In order to compare results from BioCreative and Bioscape in the Web interface, the gold standard data must be imported; this involves the following processes:

  1. Import of the gene identifiers and names referenced in the gold standard data file.
  2. Text searching using these names in the appropriate documents, so that regions of text may be shown to provide results.
  3. Propagation of region and gene name information in order to produce specific gene references.

With this information available to Bioscape, it becomes possible to see each result set in the same document and to perform further analysis on the accuracy of Bioscape results.

Isolating Correct and Incorrect Bioscape Results

Using BioCreative results, it is possible to take a selection of Bioscape results and to assess them according to a number of criteria:

  • Correctness: whether each Bioscape result is correct or not - this can already be assessed using the export and scoring scripts described above, but only at the document level.
  • Correspondence: whether each BioCreative result corresponds to any Bioscape results - although this can be done using the scripts at the document level, it now becomes possible to consider the correspondence at the mention level.
  • Ambiguity: the ambiguity of Bioscape suggestions for each BioCreative result - many Bioscape suggestions for a result indicate ambiguity, whereas a single suggestion is unambiguous.
  • Whether Bioscape results appear in places not associated with BioCreative results, and whether these happen to correspond to BioCreative suggestions for a particular document.

Thus, each Bioscape result can be classified as follows:

| Class | At known location | Predicts correct gene at location | Predicts correct gene for document |
|---|---|---|---|
| True positive at "true" BioCreative mention location | Yes | Yes | Yes |
| False positive at "true" BioCreative mention location | Yes | No (may co-exist with correct suggestion) | No |
| True positive at wrong "true" BioCreative mention location | Yes | No (may co-exist with correct suggestion) | Yes |
| True positive at "false" unknown-to-BioCreative mention location | No | No | Yes |
| False positive at "false" unknown-to-BioCreative mention location | No | No | No |
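The classification above can be expressed as a small decision procedure. The following is only a sketch under stated assumptions: gold standard mentions are held as a mapping from document identifier to a mapping from location to gene identifier, locations are (start, end) character offsets that must match exactly, and each location carries a single gene. None of these structures or names comes from Bioscape itself.

```python
def classify_result(doc_id, location, gene_id, gold_mentions):
    """Classify one Bioscape result against gold standard mentions.

    gold_mentions maps document_id -> {location: gene_id}; this layout and
    the exact-match comparison of locations are assumptions for illustration.
    """
    mentions = gold_mentions.get(doc_id, {})
    document_genes = set(mentions.values())
    at_known_location = location in mentions
    correct_for_document = gene_id in document_genes

    if at_known_location:
        if mentions[location] == gene_id:
            return "true positive at true mention location"
        if correct_for_document:
            return "true positive at wrong mention location"
        return "false positive at true mention location"
    if correct_for_document:
        return "true positive at unknown location"
    return "false positive at unknown location"


# Made-up gold standard: two mentions in one document.
gold_mentions = {"doc1": {(10, 14): "1017", (52, 55): "7157"}}

print(classify_result("doc1", (10, 14), "1017", gold_mentions))
print(classify_result("doc1", (90, 94), "7157", gold_mentions))
```

A real comparison would probably need to tolerate overlapping rather than identical text regions; exact matching is used here only to keep the example short.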

Another way of expressing these result categories is as follows:

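| | At "true" known location | At "wrong" known location | At "false" unknown location |
|---|---|---|---|
| True positive | Bioscape suggestion matches (true positive for mention) | Bioscape suggestion matches a suggestion for the document ("accidental" true positive for document) | Bioscape suggestion matches a suggestion for the document ("accidental" true positive for document) |
| False positive | Bioscape suggestion does not match (and is inappropriate for the document) | Bioscape suggestion does not match (and is inappropriate for the document) | Bioscape suggestion neither appears at a recognised place nor is appropriate for the document |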

Assessing Ambiguous and Unambiguous Suggestions

For each BioCreative result, zero, one or many suggestions may have been made by Bioscape. For various purposes, we may wish to divide the BioCreative gold standard data into a number of sets of gene mentions:

  1. Those for which results are unambiguously suggested by Bioscape - this can be used to assess the reliability of unambiguous suggestions (albeit at genuine mention locations)
  2. Those for which results are ambiguously suggested by Bioscape - this can be used to assess disambiguation method performance (at least at locations where a correct suggestion has been made)
  3. Those for which no results are suggested by Bioscape - this can be used to assess improved detection techniques
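The division into these three sets can be sketched by counting the distinct Bioscape suggestions recorded at each gold standard mention. The sketch assumes that mentions are identified by (document identifier, location) keys and that suggestions are available as a mapping from such keys to sets of gene identifiers; both structures and the function name are hypothetical.

```python
def partition_mentions(gold_mentions, suggestions):
    """Split gold standard mentions by the number of Bioscape suggestions.

    gold_mentions is an iterable of mention keys; suggestions maps a mention
    key to the set of gene identifiers suggested there. Both representations
    are assumptions made for illustration.
    """
    unambiguous, ambiguous, undetected = [], [], []
    for mention in gold_mentions:
        count = len(suggestions.get(mention, ()))
        if count == 0:
            undetected.append(mention)    # set 3: no results suggested
        elif count == 1:
            unambiguous.append(mention)   # set 1: a single suggestion
        else:
            ambiguous.append(mention)     # set 2: several competing suggestions
    return unambiguous, ambiguous, undetected


# Made-up example: three gold standard mentions in one document.
mentions = [("doc1", (10, 14)), ("doc1", (52, 55)), ("doc1", (80, 86))]
suggestions = {
    ("doc1", (10, 14)): {"1017"},
    ("doc1", (52, 55)): {"7157", "7158"},
}

unambiguous, ambiguous, undetected = partition_mentions(mentions, suggestions)
print(len(unambiguous), len(ambiguous), len(undetected))  # 1 1 1
```

Note that an unambiguous suggestion is not necessarily a correct one; the first set still has to be checked against the gold standard gene for each mention.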