Bioscape Data Types
Bioscape handles a number of different data types. The most important is the bioentity, but other kinds of data are also supported. These data types are defined in the bioscape.constants module in the bsadmin distribution:
- bioentities - principally genes and proteins (defined by text_bioentitytype constants)
- organisms
- chemicals or molecules
In textual form, these data types appear as term types (defined by the text_termtype constants). The term types also cover the following kinds of data:
- Gene Ontology terms
- natural language keywords
- adjective information
- specific keywords ("disqualifier", "uninformative" and "interaction")
Much of this information is predefined: Bioscape knows the range of values that belong to each such data type. Other kinds of information may be found on a speculative basis:
- maplocation information
- chromosome information
- parenthesis regions (text in brackets)
Such information may have a practically unlimited range of values that cannot be enumerated in advance but which can be detected in text.
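As a rough illustration of detecting such open-ended information in text, the sketch below finds parenthesis regions with a regular expression. It is not Bioscape's own detection code: the pattern, the function name find_parenthesis_regions and the sample sentence are all assumptions made for this example, and nested brackets are deliberately ignored.

```python
import re

# Hypothetical pattern (not taken from Bioscape): text enclosed in round
# brackets, with no nested brackets inside.
PAREN_REGION = re.compile(r"\(([^()]*)\)")


def find_parenthesis_regions(text):
    """Return (start, end, content) for each bracketed region in text."""
    return [(m.start(), m.end(), m.group(1)) for m in PAREN_REGION.finditer(text)]


# Illustrative sentence only.
sample = "BRCA1 (breast cancer 1) maps to 17q21 (human)."
regions = find_parenthesis_regions(sample)

for start, end, content in regions:
    print(start, end, content)
```

The values found this way cannot be listed in advance, which is the sense in which such information is speculative rather than predefined.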
Relationships between Data Types
Bioterms and searches are of particular significance when connecting textual search results (involving terms) to concrete suggestions about the nature or meaning of those results (involving the data types described above). A bioterm suggests a search term for a particular piece of data and references the underlying data type; a search connects a globally unique search term to a bioterm.
For example, a bioentity representing a gene will be referenced by a bioterm that itself represents that bioentity. Consequently, the bioterm's type will be that of something that "represents" another thing (defined by the text_biotermtype_represents constant), and it will refer to a bioentity (defined by the text_reftype_bioentity constant).
Similarly, a chemical or molecule will be referenced by a bioterm that itself represents that chemical or molecule. The bioterm's type will again be that of something that "represents" another thing (defined by the text_biotermtype_represents constant), but it will refer to a chemical (defined by the text_reftype_chemical constant).
Some bioterms cannot be considered to actually represent other things, but they may indicate a connection to such things. A Gene Ontology term may be associated with a particular gene, and a bioterm representing that Gene Ontology term may maintain a connection to that gene. Consequently, the bioterm's type will be that of something that "references" another thing (defined by the text_biotermtype_references constant), and it will refer to a bioentity (defined by the text_reftype_bioentity constant).
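The three cases above can be sketched in Python as follows. This is only an illustration under stated assumptions: the constant names come from the text, but their values here are placeholders, not the values in bioscape.constants. The Bioterm class, its fields, the describe helper, the identifiers such as "gene:BRCA1" and the dictionary used for searches are all hypothetical.

```python
from dataclasses import dataclass

# Placeholder values: the real constants are defined in bioscape.constants
# and their actual values are not given in this document.
text_biotermtype_represents = "represents"
text_biotermtype_references = "references"
text_reftype_bioentity = "bioentity"
text_reftype_chemical = "chemical"


@dataclass(frozen=True)
class Bioterm:
    """Hypothetical record pairing a search term with the data it concerns."""
    term: str         # the suggested search term
    biotermtype: str  # whether the term represents or merely references the data
    reftype: str      # the kind of data referred to
    ref: str          # made-up identifier of the underlying piece of data


# A gene is represented by its bioterm, which refers to a bioentity.
gene_term = Bioterm("BRCA1", text_biotermtype_represents,
                    text_reftype_bioentity, "gene:BRCA1")

# A chemical is represented by its bioterm, which refers to a chemical.
chemical_term = Bioterm("aspirin", text_biotermtype_represents,
                        text_reftype_chemical, "chemical:aspirin")

# A Gene Ontology term only references the gene it is associated with.
go_term = Bioterm("DNA repair", text_biotermtype_references,
                  text_reftype_bioentity, "gene:BRCA1")

# Assumed model of searches: each unique search term maps to one bioterm.
searches = {bt.term: bt for bt in (gene_term, chemical_term, go_term)}


def describe(bioterm):
    """Summarise a bioterm as '<term> <type> <reftype> <ref>'."""
    return f"{bioterm.term!r} {bioterm.biotermtype} {bioterm.reftype} {bioterm.ref}"


for search_term in searches:
    print(describe(searches[search_term]))
```

Note that the gene bioterm and the Gene Ontology bioterm share a reference type and differ only in bioterm type, which is what distinguishes representing a bioentity from merely being connected to one.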