Difference between revisions of "README DiG 1.0"

Revision as of 09:25, 30 June 2010

Last edited: June 30, 2010

Applies to Disease Groups (DiG) release: 2.0

Release date: June 14, 2010

Download location: currently not available for download. Contact ian.donaldson@biotek.uio.no

Authors: Katerina Michalickova and Ian Donaldson

Database: DiG (http://donaldson.uio.no/wiki/DiG:_Disease_groups)

Organization: Biotechnology Centre of Oslo, University of Oslo (http://www.biotek.uio.no/)

Description

This file describes the tab-delimited format of the Disease Groups (DiG) list.

Details on the build process are available from http://donaldson.uio.no/wiki/DiG:_Disease_groups

Contact ian.donaldson at biotek.uio.no if you are interested in using DiG.

Directory contents

`README`	pointer to this file at http://donaldson.uio.no/wiki/README_DiG_1.0
`diseasegroups.mmddyyyy.txt.zip`	the DiG list

DiG data consist of a single tab-delimited text file, distributed as a zip archive named diseasegroups.mmddyyyy.txt.zip, where mmddyyyy is the file's creation date.
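The naming scheme above can be checked mechanically. The following is a minimal sketch, assuming the archive name follows the pattern exactly as written; the helper name `creation_date` and the file name in the usage line are hypothetical and are not taken from an actual release.

```python
import re
from datetime import date, datetime


def creation_date(filename: str) -> date:
    """Extract the mmddyyyy creation date from a DiG archive name."""
    match = re.fullmatch(r"diseasegroups\.(\d{8})\.txt\.zip", filename)
    if match is None:
        raise ValueError(f"unexpected DiG file name: {filename!r}")
    # %m%d%Y reads two digits of month, two of day and four of year.
    return datetime.strptime(match.group(1), "%m%d%Y").date()


# Hypothetical file name, used only to illustrate the pattern.
print(creation_date("diseasegroups.06142010.txt.zip"))
```

A name that does not fit the pattern, or whose digits do not form a valid date, raises `ValueError` rather than being guessed at.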

Changes from last version

Not applicable

Source data used for this build	http://donaldson.uio.no/wiki/Sources_DiG_1.0
Statistics for this release	Not available

Known Issues

None. First release.

Understanding the DiG format

License

Citation

DiG is not yet published.

Disclaimer

Data is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Description of the DiG file

Each line in the DiG list represents a single gene and its association with some disease (taken from Morbid Map). Each gene (line) has been assigned a disease group number (column 7). Each group represents a set of phenotypically related diseases as determined by their Morbid Map titles.

Column number: 1

Column name:	title
Column type:	text, contains multiple fields
Description:	title column as listed in Morbid Map
Example:	17,20-lyase deficiency, isolated, 202110 (3)

Notes

"17,20-lyase deficiency, isolated" is the disease title

"202110" is the OMIM identifier

(3) is the evidence code (diseasetag)

See http://donaldson.uio.no/wiki/DiG:_Disease_groups for more information.
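The three fields of the title column can be separated with a regular expression. This is a sketch only: it assumes the layout shown in the example (disease title, then a six-digit OMIM identifier, then a one-digit evidence code in parentheses) and treats the identifier and the code as optional. The names `ParsedTitle` and `parse_title` are hypothetical and are not part of DiG.

```python
import re
from typing import NamedTuple, Optional


class ParsedTitle(NamedTuple):
    disease: str
    omim_id: Optional[int]  # None when the title carries no OMIM identifier
    tag: Optional[str]      # evidence code such as "(3)", or None


# Assumed layout: "<disease title>, <six-digit OMIM id> (<digit>)",
# where the identifier and the evidence code may each be absent.
TITLE_RE = re.compile(
    r"^(?P<disease>.*?)(?:,\s*(?P<omim>\d{6}))?\s*(?P<tag>\(\d\))?\s*$"
)


def parse_title(title: str) -> ParsedTitle:
    match = TITLE_RE.match(title.strip())
    if match is None:
        raise ValueError(f"cannot parse title: {title!r}")
    omim = match.group("omim")
    return ParsedTitle(
        disease=match.group("disease").strip(),
        omim_id=int(omim) if omim else None,
        tag=match.group("tag"),
    )


print(parse_title("17,20-lyase deficiency, isolated, 202110 (3)"))
```

Because the disease title itself may contain commas and digits (as in "17,20-lyase"), the pattern anchors on the end of the string instead of splitting on commas.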

Column number: 2

Column name:	genesymbols
Column type:	text, multiple values comma delimited
Description:	gene symbols as originally listed in Morbid Map
Example:	CYP17A1, CYP17, P450C17

Notes

Column number: 3

Column name:	locus
Column type:	text
Description:	gene locus as originally listed in Morbid Map
Example:	10q24.3

Notes

Column number: 4

Column name:	diseaseomimid
Column type:	integer
Description:	OMIM identifier extracted from the title column in Morbid Map (see Column 1 above)
Example:	202110

Notes

This OMIM identifier usually refers to a record describing a disease phenotype; it is a descriptive entry that does not refer to a unique locus.

See [1].

Column number: 5

Column name:	diseasetag
Column type:	string ((1), (2) or (3))
Description:	evidence code extracted from title column in Morbid Map (see Column 1 above)
Example:	(3)

Notes

Only entries with (3) in this column have been mapped to a disease group (see column 7). For an explanation of disease tags, see [2].

Column number: 6

Column name:	geneid
Column type:	integer
Description:	EntrezGene identifier
Example:	64087

Notes

Cross-reference to EntrezGene (see http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene). Gene identifiers were mined from the gene_info file at ftp://ftp.ncbi.nih.gov/gene/DATA/gene_info.gz. A zero appears in this column when the gene names provided by Morbid Map (see Column 2 above) could not be mapped to an EntrezGene identifier.

Column number: 7

Column name:	digid
Column type:	integer
Description:	Disease Group identifier
Example:	1

Notes

This is the key column of the table. The identifier is not stable between releases of DiG. Entries with identical DiG identifiers are deemed to belong to a set of phenotypically related diseases (and genes).
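Since entries that share a digid belong to the same disease group, collecting the genes of each group amounts to a simple grouping step. The sketch below assumes the rows have already been reduced to (digid, geneid) pairs; the function name `genes_by_group` and the sample values are hypothetical and do not come from a real DiG release.

```python
from collections import defaultdict


def genes_by_group(pairs):
    """Map each digid to the set of EntrezGene ids that share it."""
    groups = defaultdict(set)
    for digid, geneid in pairs:
        if geneid != 0:  # 0 marks a gene that could not be mapped (column 6)
            groups[digid].add(geneid)
    return dict(groups)


# Hypothetical (digid, geneid) pairs, for illustration only.
sample = [(1, 1001), (1, 1002), (2, 1003), (2, 0)]
groups = genes_by_group(sample)
print(sorted(groups))
```

A group whose entries all have a geneid of zero does not appear in the result. Because the identifiers are not stable between releases, such a mapping should not be compared across different DiG files.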

Column number: 8

Column name:	mantitle
Column type:	text
Description:	manually created title
Example:	17,20-lyase deficiency, isolated, 202110 (3)

Notes

In some rare cases, titles provided by Morbid Map could not be properly processed by the text matching process. These titles were manually rewritten to avoid these problems. In most cases, the text in this column is identical to that in column 1. See http://donaldson.uio.no/wiki/DiG:_Disease_groups for details. These manual changes can be propagated from release to release. Note that the manual titles are not guaranteed to contain OMIM identifiers and evidence codes.

Column number: 9

Column name:	geneomimid
Column type:	integer
Description:	OMIM identifier as originally listed in Morbid Map
Example:	609300

Notes

This OMIM identifier usually refers to a record describing a gene.
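The nine columns described above can be read into records keyed by column name. The following is a sketch under several assumptions that this README does not confirm: the file has no header line, empty integer fields are possible and are read as missing, and lines without exactly nine fields are skipped. The names `COLUMNS`, `read_dig` and `SAMPLE_LINE` are hypothetical. The sample line is assembled from the per-column examples in this README, which may not all come from the same record.

```python
import csv
import io

COLUMNS = ["title", "genesymbols", "locus", "diseaseomimid", "diseasetag",
           "geneid", "digid", "mantitle", "geneomimid"]
INT_COLUMNS = ("diseaseomimid", "geneid", "digid", "geneomimid")


def read_dig(handle):
    """Yield one dict per DiG line, keyed by the column names above."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # assumption: skip blank or malformed lines
        row = dict(zip(COLUMNS, fields))
        for name in INT_COLUMNS:
            # assumption: an empty field means "no value"
            row[name] = int(row[name]) if row[name].strip() else None
        # Column 2 holds several comma-delimited gene symbols.
        row["genesymbols"] = [s.strip() for s in row["genesymbols"].split(",")]
        # Column 6 uses zero for "no EntrezGene mapping".
        if row["geneid"] == 0:
            row["geneid"] = None
        yield row


# Built from the per-column examples above; not a verified DiG record.
SAMPLE_LINE = "\t".join([
    "17,20-lyase deficiency, isolated, 202110 (3)",
    "CYP17A1, CYP17, P450C17", "10q24.3", "202110", "(3)",
    "64087", "1", "17,20-lyase deficiency, isolated, 202110 (3)", "609300",
])

for row in read_dig(io.StringIO(SAMPLE_LINE + "\n")):
    print(row["digid"], row["geneid"], row["genesymbols"])
```

Quoting is disabled because the titles contain commas and may contain quote characters that should be kept verbatim. To read the zipped file directly, the same function could presumably be given a text handle obtained from Python's `zipfile` module.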

@@ Line 173: / Line 173: @@
 '''Notes'''
-Cross-reference to EntrezGene (see [http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene]).
+Cross-reference to EntrezGene (see [http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene]).  The file used to mine for geneids is gene_info file at [ftp://ftp.ncbi.nih.gov/gene/DATA/gene_info.gz].
 In some cases, a zero will appear in this column if a mapping to an EntrezGene identifier could not be made using the gene names provided by Morbid Map (see Column 2 above).
 === Column number: 7 ===
