README DiG 1.0
Last edited: April 6, 2009
Applies to Disease Groups (DiG) release: 1.0
Release date: April 6, 2009
Download location: currently not available for download. Contact ian.donaldson@biotek.uio.no
Authors: Katerina Michalickova and Ian Donaldson
Database: DiG (http://donaldson.uio.no/wiki/DiG:_Disease_groups)
Organization: Biotechnology Centre of Oslo, University of Oslo (http://www.biotek.uio.no/)
Description
This file describes the contents of the tab-delimited format of the Disease Groups list.
Details on the build process are available from http://donaldson.uio.no/wiki/DiG:_Disease_groups
Contact ian.donaldson at biotek.uio.no if you are interested in using DiG.
Directory contents
README | pointer to this file at http://donaldson.uio.no/wiki/README_DiG_1.0 |
diseasegroups.mmddyyyy.txt.zip | the DiG list |
DiG data consists of one tab-delimited text file with the name diseasegroups.mmddyyyy.txt.zip where mmddyyyy represents the file's creation date.
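As a minimal sketch of the naming convention above, the creation date can be recovered from the file name. The function name creation_date and the file name used in the usage line are hypothetical and are not part of the DiG distribution.

```python
import re
from datetime import date, datetime


def creation_date(filename: str) -> date:
    """Extract the mmddyyyy creation date from a DiG file name.

    Accepts both the zipped name (diseasegroups.mmddyyyy.txt.zip) and
    the unzipped name (diseasegroups.mmddyyyy.txt).
    """
    match = re.fullmatch(r"diseasegroups\.(\d{8})\.txt(?:\.zip)?", filename)
    if match is None:
        raise ValueError(f"not a DiG file name: {filename!r}")
    # %m%d%Y mirrors the mmddyyyy layout described in this README.
    return datetime.strptime(match.group(1), "%m%d%Y").date()


# Hypothetical file name, used only to illustrate the pattern.
print(creation_date("diseasegroups.04062009.txt.zip"))
```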
Changes from last version
Not applicable
Source data used for this build | http://donaldson.uio.no/wiki/Sources_DiG_1.0 |
Statistics for this release | Not available |
Known Issues
None. First release.
Understanding the DiG format
License
Copyright © 2008, 2009 Ian Donaldson
Citation
DiG is not yet published.
Disclaimer
Data is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Description of the DiG file
Each line in the DiG list represents a single gene and its association with some disease (taken from Morbid Map). Each gene (line) has been assigned a disease group number (column 7). Each group represents a set of phenotypically related diseases as determined by their Morbid Map titles.
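The following is a minimal sketch of reading the list into one record per line, keyed by the eight column names documented in this file. It assumes that the file has no header row and that fields are never quoted; neither point is stated in this README. The names COLUMNS and read_dig are hypothetical.

```python
import csv
from typing import Dict, Iterable, Iterator

# Column names in file order, as documented in this README.
COLUMNS = [
    "title", "genesymbol", "locus", "diseaseomimid",
    "diseasetag", "geneid", "digid", "mantitle",
]


def read_dig(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per DiG line (assumes no header row, no quoting)."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(COLUMNS)} columns, "
                f"got {len(fields)}"
            )
        yield dict(zip(COLUMNS, fields))
```

Any iterable of lines works, for example an open text file. All values are returned as strings; converting diseaseomimid, geneid and digid to integers is left to the caller.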
Column number: 1
Column name: | title |
Column type: | text, contains multiple fields |
Description: | title column as listed in Morbid Map |
Example: | 17,20-lyase deficiency, isolated, 202110 (3) |
Notes
"17,20-lyase deficiency, isolated" is the disease title
"202110" is the OMIM identifier
(3) is the evidence code (diseasetag)
See http://donaldson.uio.no/wiki/DiG:_Disease_groups for more information.
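The three fields of the title column can be separated with a regular expression. This is a sketch only: it assumes that the OMIM identifier is always six digits and that every title ends with one of the tags (1), (2) or (3). Titles that do not fit this shape return None rather than a guess. The names ParsedTitle and parse_title are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ParsedTitle(NamedTuple):
    disease: str   # disease title
    omim_id: int   # OMIM identifier
    tag: str       # evidence code (diseasetag), parentheses included


# Assumed shape: "<disease title>, <6-digit OMIM id> (<tag>)"
_TITLE_RE = re.compile(
    r"^(?P<disease>.+?),\s*(?P<omim>\d{6})\s*(?P<tag>\([123]\))\s*$"
)


def parse_title(title: str) -> Optional[ParsedTitle]:
    """Split a title into its fields, or return None if it does not match."""
    match = _TITLE_RE.match(title)
    if match is None:
        return None
    return ParsedTitle(match["disease"], int(match["omim"]), match["tag"])


print(parse_title("17,20-lyase deficiency, isolated, 202110 (3)"))
```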
Column number: 2
Column name: | genesymbol |
Column type: | text, multiple values comma delimited |
Description: | gene symbols as originally listed in Morbid Map. |
Example: | CYP17A1, CYP17, P450C17 |
Notes
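Because this column holds several comma-delimited values, it is usually split before use. A minimal sketch follows, assuming that gene symbols never contain commas themselves; the function name split_gene_symbols is hypothetical.

```python
from typing import List


def split_gene_symbols(field: str) -> List[str]:
    """Split the genesymbol column into individual, trimmed symbols."""
    return [symbol.strip() for symbol in field.split(",") if symbol.strip()]


print(split_gene_symbols("CYP17A1, CYP17, P450C17"))
```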
Column number: 3
Column name: | locus |
Column type: | text |
Description: | gene locus as originally listed in Morbid Map |
Example: | 10q24.3 |
Notes
Column number: 4
Column name: | diseaseomimid |
Column type: | integer |
Description: | OMIMID extracted from title column in Morbid Map (see Column 1 above). |
Example: | 202110 |
Notes
See [1].
Column number: 5
Column name: | diseasetag |
Column type: | string ((1), (2) or (3)) |
Description: | evidence tag extracted from title column in Morbid Map (see Column 1 above) |
Example: | (3) |
Notes
Only entries with (3) in this column have been mapped to a disease group (see column 7). For an explanation of disease tags, see [2].
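Since only entries tagged (3) carry a disease group, a typical first step is to filter on this column. The sketch below assumes each row is a dict keyed by the column names in this file; the function name mapped_entries is hypothetical.

```python
from typing import Dict, Iterable, List


def mapped_entries(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the rows whose diseasetag is "(3)"."""
    return [row for row in rows if row["diseasetag"].strip() == "(3)"]
```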
Column number: 6
Column name: | geneid |
Column type: | integer |
Description: | EntrezGene identifier |
Example: | 64087 |
Notes
Cross-reference to EntrezGene (see [3]). In some cases, a zero will appear in this column if a mapping to an EntrezGene identifier could not be made using the gene names provided by Morbid Map (see Column 2 above).
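The zero placeholder is easy to mistake for a real identifier, so it may help to convert it to an explicit missing value on input. This is a sketch of one way to do that; the function name parse_geneid is hypothetical.

```python
from typing import Optional


def parse_geneid(field: str) -> Optional[int]:
    """Return the EntrezGene identifier, or None when the column holds 0."""
    value = int(field)
    return value if value != 0 else None


print(parse_geneid("64087"))
print(parse_geneid("0"))
```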
Column number: 7
Column name: | digid |
Column type: | integer |
Description: | Disease Group identifier |
Example: | 1 |
Notes
This is the key column of the table. The identifier is not stable between releases of DiG. Entries with identical DiG identifiers are deemed to belong to a set of phenotypically related diseases (and genes).
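Collecting the genes that share a DiG identifier could look like the sketch below. It assumes each row is a dict keyed by the column names in this file, keeps only entries tagged (3) because only those are mapped to a group, and leaves out gene identifiers equal to zero. The function name genes_by_group is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def genes_by_group(rows: Iterable[Dict[str, str]]) -> Dict[int, Set[int]]:
    """Map each digid to the set of EntrezGene identifiers in that group."""
    groups: Dict[int, Set[int]] = defaultdict(set)
    for row in rows:
        if row["diseasetag"] != "(3)":
            continue  # only (3) entries are mapped to a disease group
        members = groups[int(row["digid"])]
        geneid = int(row["geneid"])
        if geneid != 0:  # 0 means no EntrezGene mapping was found
            members.add(geneid)
    return dict(groups)
```

Because the identifier is not stable between releases, the keys of the returned mapping are only meaningful within a single DiG file.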
Column number: 8
Column name: | mantitle |
Column type: | text |
Description: | manually created title |
Example: | 17,20-lyase deficiency, isolated, 202110 (3) |
Notes
In rare cases, titles provided by Morbid Map could not be handled correctly by the text-matching process. These titles were rewritten manually to avoid the problem. In most cases, the text in this column is identical to that in column 1. See http://donaldson.uio.no/wiki/DiG:_Disease_groups for details. All of these manual changes can be propagated from release to release.