Bioscape Workflow

From irefindex
Revision as of 11:49, 2 October 2009 by PaulBoddie (talk | contribs) (Added category.)

The preparation of a working Bioscape system involves a number of activities in a workflow or schedule. These activities are performed in the following general order (with annotations referring to functions in the bsindex_quickstart.py script, found in the scripts directory of the bsindex distribution):

  1. Initialise basic resources (quickstart, init_database).
  2. Import essential data and initialise data sources (update_sources, update_source).
  3. Update derived information such as lexicon tables, scores for essential data (update_derived_sources).
  4. Import textual data and initialise textual data sources (update_text_source).
  5. Update text search results and related information such as result scores (update_text).
  6. Initialise the Web database in order to present a coherent view of the system (init_web_database).

The bioscape/sql directory (in the bsadmin distribution) provides a reasonable overview of the different activities, containing activity-specific directories which each contain templates for manipulating the database. The activities involved include the following:

  1. Basic resource initialisation: dictionaries, score
  2. Data source initialisation: chebi, gene, go, taxonomy
  3. Derived data preparation: bioentities, searchscore, termscore
  4. Textual data source initialisation: text
  5. Search results and related data preparation: search, sentencescore, results, resultscore, evidence, evidencescore
  6. Web data preparation: web-bioentities, web-index, web-search, web-results, web-evidence, web-resultscore