Note: This documentation covers an unreleased product and is for internal use only.
Installation
Bioscape consists of three separate applications which must be combined to provide all the facilities of a functional Bioscape installation:
- The administrative application: bsadmin
- The text-indexing application: bsindex
- The Web front-end application: bsweb
Before installing, review the dependencies described in the "Installation of Dependencies" section below. This document does not give precise instructions for installing them; where possible, use your system's package management tools, perhaps installing Bioscape itself from suitable packages, to save the time and effort of a manual installation. For those who wish to install Bioscape from the source code distribution of each application, the procedure is given below.
Installation from Source Code
First, nominate a common directory to hold the Bioscape application directories. For example:
/home/bioscape/apps
Then, acquire each application's source code distribution (details to be provided) and unpack the archives in this common directory:
cd /home/bioscape/apps
tar zxf bsadmin-x.y.tar.gz
tar zxf bsindex-x.y.tar.gz
tar zxf bsweb-x.y.tar.gz
Since these applications contain Python libraries, the environment must be configured so that Python can find them. To do this, create a short configuration file resembling the one provided as docs/configuration/env.sh in the bsadmin distribution, then load it from your .bashrc or equivalent file as follows:
source /home/bioscape/apps/bsadmin/docs/configuration/env.sh
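The contents of env.sh are not reproduced in this document. As a rough sketch only, a file of this kind might extend PYTHONPATH with each application directory. The BIOSCAPE_APPS variable name, and the assumption that each unpacked application directory is the root of its Python packages, are illustrative guesses rather than details taken from the bsadmin distribution:

```shell
# Hypothetical env.sh-style fragment; compare with the real
# docs/configuration/env.sh before relying on it.
# BIOSCAPE_APPS is an assumed variable name for the common directory.
BIOSCAPE_APPS="${BIOSCAPE_APPS:-/home/bioscape/apps}"

# Append each application directory to PYTHONPATH, avoiding a leading
# colon when PYTHONPATH is initially empty or unset.
for app in bsadmin bsindex bsweb; do
    PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$BIOSCAPE_APPS/$app"
done
export PYTHONPATH
```

If the unpacked directories carry version suffixes, or keep their libraries in a subdirectory, the paths above would need adjusting accordingly.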
Installation of Dependencies
See the Bioscape Dependencies page for a list of the dependencies.
The docs/dependencies/download.sh file in the bsadmin distribution contains commands intended to download the source distributions of various dependencies. Run this file, or a modified version of it, in a nominated directory, which will then hold copies of the dependencies' archive files.
The docs/dependencies/build.sh file in the bsadmin distribution contains commands that can be run to build each of the dependencies from the previously downloaded archive files.
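The two steps above might be combined along the following lines. This is a sketch under stated assumptions: the /home/bioscape/deps directory, the BSADMIN_DIR and DEPS_DIR variable names, and the helper functions are hypothetical, and it is assumed that both scripts can be run with sh from the directory holding the archives.

```shell
# Assumed locations; adjust to your own system.
BSADMIN_DIR="${BSADMIN_DIR:-/home/bioscape/apps/bsadmin}"
DEPS_DIR="${DEPS_DIR:-/home/bioscape/deps}"

# Run one of the dependency scripts from within the nominated directory.
# Returns non-zero if the script is missing or fails.
run_step() {
    script="$BSADMIN_DIR/docs/dependencies/$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script" >&2
        return 1
    fi
    # Subshell so that the caller's working directory is left unchanged.
    (cd "$DEPS_DIR" && sh "$script")
}

# Download the archives, then build from them; stop at the first failure.
install_deps() {
    mkdir -p "$DEPS_DIR" && run_step download.sh && run_step build.sh
}
```

Calling install_deps then performs both steps in order. Running the build only after a successful download avoids building from an incomplete set of archives.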
For some of the dependencies, even with pre-installed packages, you will need to do some preparatory work in order to use Bioscape. This is documented on the Bioscape Configuration page.