README DiG 1.0

Last edited: April 6, 2009

Applies to Disease Groups (DiG) release: 1.0

Release date: April 6, 2009

Download location: currently not available for download. Contact ian.donaldson@biotek.uio.no

Authors: Katerina Michalickova and Ian Donaldson

Database: DiG (http://donaldson.uio.no/wiki/DiG:_Disease_groups)

Organization: Biotechnology Centre of Oslo, University of Oslo (http://www.biotek.uio.no/)


Description

This file describes the contents of the tab-delimited format of the Disease Groups list.

Details on the build process are available from http://donaldson.uio.no/wiki/DiG:_Disease_groups


Contact ian.donaldson at biotek.uio.no if you are interested in using DiG.

Directory contents

README: pointer to this file at http://donaldson.uio.no/wiki/README_DiG_1.0
diseasegroups.mmddyyyy.txt.zip: the DiG list

DiG data consists of one tab-delimited text file with the name diseasegroups.mmddyyyy.txt.zip where mmddyyyy represents the file's creation date.
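
As an illustration, the creation date can be recovered from a file name of this form. The following Python sketch assumes the name follows the diseasegroups.mmddyyyy.txt.zip pattern exactly; the helper name creation_date is hypothetical and not part of DiG.

```python
import re
from datetime import date, datetime

# Assumed pattern: diseasegroups.mmddyyyy.txt.zip, as described above.
_NAME_RE = re.compile(r"^diseasegroups\.(\d{8})\.txt\.zip$")


def creation_date(filename: str) -> date:
    """Return the creation date encoded in a DiG file name (hypothetical helper)."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected DiG file name: {filename!r}")
    # strptime raises ValueError if the eight digits are not a valid mmddyyyy date.
    return datetime.strptime(match.group(1), "%m%d%Y").date()
```

For a made-up name such as diseasegroups.04062009.txt.zip, the helper returns the date April 6, 2009.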

Changes from last version

Not applicable

Source data used for this build: http://donaldson.uio.no/wiki/Sources_DiG_1.0
Statistics for this release: Not available

Known Issues

None. First release.

Understanding the DiG format

License

Copyright © 2008, 2009 Ian Donaldson

Citation

DiG is not yet published.

Disclaimer

Data is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Description of the DiG file

Each line in the DiG list represents a single gene and its association with a disease (taken from Morbid Map). Each gene (line) has been assigned a disease group number. Each group represents a set of phenotypically related diseases, as determined by their Morbid Map titles.


Column number: 1

Column name: title
Column type: text, contains multiple fields
Description: title column as listed in Morbid Map
Example: 17,20-lyase deficiency, isolated, 202110 (3)

Notes "17,20-lyase deficiency, isolated" is the disease title, "202110" is the OMIM identifier, and "(3)" is the evidence code. See http://irefindex/wiki/DiG:_Disease_groups for more information.
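
The three fields in the title column can be separated with a regular expression. The sketch below is only an illustration: it assumes every well-formed title ends with a comma, a six-digit OMIM identifier, and a one-digit evidence code in parentheses, as in the example above. Titles that do not fit this shape are returned unchanged with the other fields set to None. The names ParsedTitle and split_title are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ParsedTitle(NamedTuple):
    disease: str
    omim_id: Optional[int]
    evidence_code: Optional[int]


# Assumed shape: "<disease title>, <6-digit OMIM id> (<evidence code>)".
# The greedy first group keeps commas inside the disease title intact.
_TITLE_RE = re.compile(r"^(.*),\s*(\d{6})\s*\((\d)\)\s*$")


def split_title(title: str) -> ParsedTitle:
    """Split a Morbid Map title into its three fields (hypothetical helper)."""
    text = title.strip()
    match = _TITLE_RE.match(text)
    if match is None:
        # Title does not follow the assumed shape; leave it untouched.
        return ParsedTitle(text, None, None)
    return ParsedTitle(match.group(1).strip(), int(match.group(2)), int(match.group(3)))
```

Applied to the example title, this yields the disease title "17,20-lyase deficiency, isolated", the OMIM identifier 202110, and the evidence code 3.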


Column number: 2

Column name: genesymbol
Column type: text, multiple values comma delimited
Description: gene symbols as originally listed in Morbid Map.
Example: CYP17A1, CYP17, P450C17

Notes

Column number: 3

Column name: locus
Column type: text
Description: gene locus as originally listed in Morbid Map
Example: 10q24.3

Notes

Column number: 4

Column name: diseaseomimid
Column type: integer
Description: OMIM identifier extracted from the title column in Morbid Map (see Column 1 above).
Example: 202110

Notes

Column number: 5

Column name: diseasetag
Column type: integer (0, -1, -2, -3)
Description: evidence tag extracted from title column in Morbid Map (see Column 1 above)
Example: -3

Notes Only entries with -3 in this column have been mapped to a disease group (see Column 7).

Column number: 6

Column name: geneid
Column type: integer
Description: EntrezGene identifier
Example: 64087

Notes In some cases, a zero will appear in this column if a mapping to an EntrezGene identifier could not be made using the gene names provided by Morbid Map (see Column 2 above).
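
When loading the file, it may be convenient to turn the zero placeholder into an explicit missing value so that it is not mistaken for a real identifier. A minimal Python sketch, with a hypothetical helper name:

```python
from typing import Optional


def parse_geneid(value: str) -> Optional[int]:
    """Convert the geneid column to an int, mapping the 0 placeholder to None."""
    geneid = int(value)
    # 0 means no EntrezGene mapping could be made for this row.
    return None if geneid == 0 else geneid
```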

Column number: 7

Column name: digid
Column type: integer
Description: Disease Group identifier
Example: 1

Notes This is the key column of the table. This identifier is not stable between releases of DiG. Entries with identical DiG identifiers are deemed to belong to a set of phenotypically related diseases (and genes).
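
Because rows that share a digid belong to the same disease group, a common first step is to collect rows by this column. The Python sketch below assumes each row is already split into its eight tab-delimited fields in the documented order; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def group_by_digid(rows: Iterable[Sequence[str]]) -> Dict[int, List[Sequence[str]]]:
    """Collect rows under their Disease Group identifier (hypothetical helper)."""
    groups: Dict[int, List[Sequence[str]]] = defaultdict(list)
    for row in rows:
        digid = int(row[6])  # column 7 (digid) is at index 6
        groups[digid].append(row)
    return dict(groups)
```

Since the identifier is not stable between releases, groups built this way should only be compared within a single DiG release.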


Column number: 8

Column name: mantitle
Column type: text
Description: manually created title
Example:

Notes In some rare cases, titles provided by Morbid Map were not properly processed. These titles were manually re-written to avoid these problems. In most cases, the text in this column is identical to that in column 1. See http://irefindex/wiki/DiG:_Disease_groups for details.
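
The column descriptions above can be put together in a small reader. The following Python sketch is an illustration under stated assumptions, not a reference implementation: it assumes the file has been unzipped, has no header row, contains exactly eight tab-delimited columns per line in the documented order, and always has values in the integer columns. The names DigRecord and read_dig are hypothetical.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class DigRecord:
    title: str              # column 1
    genesymbols: List[str]  # column 2, split on commas
    locus: str              # column 3
    diseaseomimid: int      # column 4
    diseasetag: int         # column 5
    geneid: int             # column 6 (0 means no EntrezGene mapping)
    digid: int              # column 7
    mantitle: str           # column 8


def read_dig(lines: Iterable[str]) -> Iterator[DigRecord]:
    """Yield one DigRecord per non-empty line (assumes no header row)."""
    # QUOTE_NONE keeps quote characters inside titles as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) != 8:
            raise ValueError(f"expected 8 columns, got {len(fields)}")
        title, symbols, locus, omimid, tag, geneid, digid, mantitle = fields
        yield DigRecord(
            title=title,
            genesymbols=[s.strip() for s in symbols.split(",") if s.strip()],
            locus=locus,
            diseaseomimid=int(omimid),
            diseasetag=int(tag),
            geneid=int(geneid),
            digid=int(digid),
            mantitle=mantitle,
        )
```

Any iterable of text lines can be passed in, for example a file object opened with newline="". Rows can then be filtered on diseasetag == -3 to keep only the entries that were mapped to a disease group.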