README DiG 1.0

Revision as of 07:53, 15 May 2009

Last edited: April 6, 2009

Applies to Disease Groups (DiG) release: 1.0

Release date: April 6, 2009

Download location: currently not available for download. Contact ian.donaldson@biotek.uio.no

Authors: Katerina Michalickova and Ian Donaldson

Database: DiG (http://donaldson.uio.no/wiki/DiG:_Disease_groups)

Organization: Biotechnology Centre of Oslo, University of Oslo (http://www.biotek.uio.no/)


Description

This file describes the contents of the tab-delimited format of the Disease Groups list.

Details on the build process are available from http://donaldson.uio.no/wiki/DiG:_Disease_groups


Contact ian.donaldson at biotek.uio.no if you are interested in using DiG.

Directory contents

README: pointer to this file at http://donaldson.uio.no/wiki/README_DiG_1.0
diseasegroups.mmddyyyy.txt.zip: the DiG list

DiG data consists of one tab-delimited text file, distributed as diseasegroups.mmddyyyy.txt.zip, where mmddyyyy represents the file's creation date.
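The naming scheme above can be checked programmatically. The following is a minimal sketch in Python, not part of the DiG distribution; the helper name creation_date is hypothetical, and the file name in the usage note is an illustration built from the release date, not a confirmed file name.

```python
import re
from datetime import date, datetime

# Pattern for the naming scheme described above: diseasegroups.mmddyyyy.txt.zip
_NAME_RE = re.compile(r"^diseasegroups\.(\d{8})\.txt\.zip$")


def creation_date(filename: str) -> date:
    """Return the creation date encoded in a DiG file name (hypothetical helper)."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a DiG file name: {filename!r}")
    # mmddyyyy: two-digit month, two-digit day, four-digit year
    return datetime.strptime(match.group(1), "%m%d%Y").date()
```

For example, creation_date("diseasegroups.04062009.txt.zip") would give 6 April 2009. A name that does not follow the scheme, or that encodes an impossible date, raises ValueError.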

Changes from last version

Not applicable

Source data used for this build: http://donaldson.uio.no/wiki/Sources_DiG_1.0
Statistics for this release: Not available

Known Issues

None. First release.

Understanding the DiG format

License

Copyright © 2008, 2009 Ian Donaldson

Citation

DiG is not yet published.

Disclaimer

Data is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Description of the DiG file

Each line in the DiG list represents a single gene and its association with a disease (taken from Morbid Map). Genes are assigned a disease group number (column 7); only entries with evidence tag (3) in column 5 have been mapped to a group. Each group represents a set of phenotypically related diseases as determined by their Morbid Map titles.
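As a rough illustration of the layout, the sketch below reads such lines into dictionaries keyed by the eight column names documented in this file. It is written in Python and is not part of DiG. It assumes that the file has no header row and that every line carries exactly eight tab-separated fields; neither point is stated in this README. The names DIG_COLUMNS and read_dig are hypothetical.

```python
import csv
from typing import Dict, Iterable, Iterator

# Column names in the order documented in this README (columns 1 to 8).
DIG_COLUMNS = [
    "title", "genesymbol", "locus", "diseaseomimid",
    "diseasetag", "geneid", "digid", "mantitle",
]


def read_dig(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per non-blank line of tab-delimited DiG text.

    Assumption: no header row and exactly eight fields per line.
    """
    # QUOTE_NONE keeps any quote characters in titles as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(DIG_COLUMNS):
            raise ValueError(
                f"expected {len(DIG_COLUMNS)} fields, got {len(fields)}"
            )
        yield dict(zip(DIG_COLUMNS, fields))
```

Any iterable of text lines can be passed in, including an open text file handle for the unzipped list. All values are returned as strings; converting the integer columns is left to the caller.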


Column number: 1

Column name: title
Column type: text, contains multiple fields
Description: title column as listed in Morbid Map
Example: 17,20-lyase deficiency, isolated, 202110 (3)

Notes

"17,20-lyase deficiency, isolated" is the disease title

"202110" is the OMIM identifier

(3) is the evidence code

See http://irefindex/wiki/DiG:_Disease_groups for more information.
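
The three fields in the example above can be separated with a regular expression. This is a minimal sketch in Python, not the procedure used to build DiG. It assumes that a title ends with a comma, a six-digit OMIM identifier and a one-digit evidence code in parentheses, as in the example; titles that do not fit this shape return None. The helper name parse_title is hypothetical.

```python
import re
from typing import Optional, Tuple

# Disease title, then ", <six-digit OMIM id> (<evidence code>)" at the end.
_TITLE_RE = re.compile(r"^(?P<title>.*), (?P<omim>\d{6}) \((?P<tag>\d)\)$")


def parse_title(text: str) -> Optional[Tuple[str, int, str]]:
    """Split a title into (disease title, OMIM id, evidence tag).

    Returns None when the text does not match the assumed shape.
    """
    match = _TITLE_RE.match(text.strip())
    if match is None:
        return None
    # The tag is returned with its parentheses, as it appears in column 5.
    return match.group("title"), int(match.group("omim")), f"({match.group('tag')})"
```

Applied to the example above, this gives the disease title "17,20-lyase deficiency, isolated", the OMIM identifier 202110 and the evidence tag "(3)". Commas inside the disease title are preserved because only the final comma-separated field is treated as the identifier.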

Column number: 2

Column name: genesymbol
Column type: text, multiple values comma delimited
Description: gene symbols as originally listed in Morbid Map.
Example: CYP17A1, CYP17, P450C17

Notes
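
Because this column holds several comma-delimited values, it usually needs splitting before use. A minimal sketch in Python follows; the helper name split_symbols is hypothetical, and the sketch assumes that gene symbols themselves never contain commas.

```python
from typing import List


def split_symbols(value: str) -> List[str]:
    """Split a comma-delimited genesymbol field into individual symbols."""
    # Strip the spaces that follow each comma and drop empty entries.
    return [part.strip() for part in value.split(",") if part.strip()]
```

For the example above, this returns the three symbols CYP17A1, CYP17 and P450C17 in their original order.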

Column number: 3

Column name: locus
Column type: text
Description: gene locus as originally listed in Morbid Map
Example: 10q24.3

Notes

Column number: 4

Column name: diseaseomimid
Column type: integer
Description: OMIM identifier extracted from the title column in Morbid Map (see Column 1 above).
Example: 202110

Notes

See [1].

Column number: 5

Column name: diseasetag
Column type: string ((1), (2) or (3))
Description: evidence tag extracted from title column in Morbid Map (see Column 1 above)
Example: (3)

Notes

Only entries with (3) in this column have been mapped to a disease group (see column 7). For an explanation of disease tags see http://www.ncbi.nlm.nih.gov/Omim/omimfaq.html#gene_map_symbols

Column number: 6

Column name: geneid
Column type: integer
Description: EntrezGene identifier
Example: 64087

Notes

Cross-reference to EntrezGene (see [3]). A zero appears in this column when no mapping to an EntrezGene identifier could be made from the gene names provided by Morbid Map (see Column 2 above).
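
Since zero is a placeholder and not a real identifier, it is safer to convert it to a missing value when loading the column. A minimal sketch in Python, with the hypothetical helper name parse_geneid:

```python
from typing import Optional


def parse_geneid(value: str) -> Optional[int]:
    """Convert a geneid field to int, mapping the placeholder 0 to None."""
    geneid = int(value)
    # Zero means no EntrezGene mapping could be made for this entry.
    return geneid if geneid != 0 else None
```

With this helper, the example value 64087 is returned as an integer, and a zero is returned as None so that unmapped entries are not mistaken for a gene.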

Column number: 7

Column name: digid
Column type: integer
Description: Disease Group identifier
Example: 1

Notes

This identifier is the main purpose of the table. It is not stable between releases of DiG. Entries with identical DiG identifiers are deemed to belong to a set of phenotypically-related diseases (and genes).
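
Collecting the entries that share a DiG identifier can be sketched as follows. The Python code assumes that each row is a mapping with a "digid" key, named after this column. The function name group_by_digid is hypothetical. This README does not say how entries without a disease group appear in this column, so the sketch does not treat them specially.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_by_digid(
    rows: Iterable[Mapping[str, str]],
) -> Dict[str, List[Mapping[str, str]]]:
    """Group rows by the value of their digid field, keeping input order."""
    groups: Dict[str, List[Mapping[str, str]]] = defaultdict(list)
    for row in rows:
        groups[row["digid"]].append(row)
    return dict(groups)
```

Because the identifier is not stable between releases, groupings built this way should be recomputed for each release and not stored against the identifier.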


Column number: 8

Column name: mantitle
Column type: text
Description: manually created title
Example: 17,20-lyase deficiency, isolated, 202110 (3)

Notes

In rare cases, titles provided by Morbid Map could not be processed properly and were rewritten manually. In most cases, the text in this column is identical to that in column 1. See http://irefindex/wiki/DiG:_Disease_groups for details.