dChip: Parametric Linkage Analysis

 

Call MERLIN for analysis                  Specify consanguineous relationship

 

See Leykin et al. 2005 for reference. An older version of the user guide by Carsten Rosenow is also available (PowerPoint, Word). Also see the CompareLinkage software, which automatically converts SNP array genotype data into the "Linkage" format and calls Merlin or Allegro for linkage analysis.

 

Applying 10K and 100K SNP arrays to linkage analysis has demonstrated improvement in genome coverage and information content over microsatellite-based assays (Janecke et al. 2004, Middleton et al. 2004). However, linkage analysis currently relies mostly on command line programs, and users have to correlate peak linkage score regions with genes and genotype data manually through software such as Excel. dChip linkage analysis implements a variant of the Lander-Green algorithm to perform multi-point parametric linkage analysis and haplotyping of SNP array data of small families (bit ≤ 18). The resulting linkage score curves can be visualized in dChip together with the genotype data, haplotyping results and genome annotations, which helps in finding disease loci.

 

Allele frequencies are specified through a SNP information file at "Open group" or "Get external data". At “Analysis/Chromosome”, specify the files, select “Analysis Method: Linkage analysis”, and click “OK”. In the chromosome view, press Home or End to go to a chromosome. In this view, select the menu “Chromosome/Linkage analysis”, specify the pedigree file and click OK. The pedigree file (such as “example_ped.xls”) is a tab-delimited text file; it can be edited in Excel but must be saved in text format. Its format is similar to the MERLIN pedigree file format, but it has an additional “Array” column to match an individual to a sample name, and an “Affected” status column (1 for unaffected, 2 for affected, and 0 for unknown or uncertain disease status). A person’s parents should either both be in the pedigree file (not 0), or both be 0 (this person is a founder). The first column (family name) in the pedigree file can be an integer (such as 1) or a string (use an integer if calling MERLIN is needed). [Version 07/06+] Multiple families with the same inheritance mode may be specified and analyzed together, and the LOD score will be the sum of the LOD scores of all families.
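The pedigree rules above can be checked before loading the file into dChip. The following is a minimal sketch, not part of dChip: the header names "Family", "Person", "Father", "Mother" and "Sex" are assumptions made for illustration (only "Array" and "Affected" are named in this guide), and the function check_pedigree is hypothetical.

```python
import csv
import io

# Assumed header names; only "Array" and "Affected" are named in this guide,
# so the other column names are hypothetical and may differ in a real file.
COLUMNS = ["Family", "Person", "Father", "Mother", "Sex", "Array", "Affected"]


def check_pedigree(text):
    """Return a list of problems found in tab-delimited pedigree text."""
    problems = []
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    # IDs are compared as strings because family names may be strings.
    known = {(r["Family"], r["Person"]) for r in rows}
    for r in rows:
        who = f'{r["Family"]}/{r["Person"]}'
        father, mother = r["Father"], r["Mother"]
        # Parents must be both present or both 0 (founder).
        if (father == "0") != (mother == "0"):
            problems.append(f"{who}: only one parent given")
        # A named parent must itself appear in the same family.
        for parent in (father, mother):
            if parent != "0" and (r["Family"], parent) not in known:
                problems.append(f"{who}: parent {parent} not in pedigree")
        # 1 = unaffected, 2 = affected, 0 = unknown or uncertain.
        if r["Affected"] not in {"0", "1", "2"}:
            problems.append(f"{who}: Affected must be 0, 1 or 2")
    return problems
```

Calling check_pedigree on the text of a pedigree file returns an empty list when none of these three rules is violated; it does not check anything else about the file.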

 

Currently the practical limit on pedigree size is 18 bits (2*number of non-founders – number of founders). If the bit count exceeds the limit, one may remove individuals who are not parents and who either have no array data or are unaffected. At present, linkage analysis is valid only for autosomes. Mendelian inheritance errors are checked before the analysis is performed. Genotypes inconsistent with one or more parents are reported and set to NO_CALL in the LOD score computation. The “Apply inheritance vector reduction” option may lead to the "Vector sum is 0 in MakeProb()" error, so this option can be unchecked when the bit count is small.
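The bit count can be worked out before running the analysis. The sketch below applies the formula stated above (2*number of non-founders – number of founders); the function name, the BIT_LIMIT constant and the dictionary layout are illustrative assumptions, not dChip code.

```python
BIT_LIMIT = 18  # practical limit stated in this guide


def pedigree_bits(parents):
    """Compute 2 * non-founders - founders.

    parents maps each person ID to a (father, mother) pair,
    with "0" marking an absent parent (founders have both "0").
    """
    founders = sum(1 for f, m in parents.values() if f == "0" and m == "0")
    non_founders = len(parents) - founders
    return 2 * non_founders - founders


# A nuclear family: two founders and three children.
family = {
    "1": ("0", "0"),
    "2": ("0", "0"),
    "3": ("1", "2"),
    "4": ("1", "2"),
    "5": ("1", "2"),
}
bits = pedigree_bits(family)  # 2 * 3 - 2
within_limit = bits <= BIT_LIMIT
```

For this family the count is 2*3 – 2 = 4 bits, well within the limit.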

 

After the analysis is finished, the LOD curve is displayed on the right side. Click a data point in the peak score region, use the Down arrow to zoom in, and press 'P' to display gene and cytoband names. After setting a proper threshold at "Tools/Options/Chromosome", one can export regions of interest whose LOD score exceeds the threshold by using "Chromosome/Export SNP data". The exported file contains SNP and gene names (example output file). Press ‘D’ to go to the haplotype view, in which the different colors represent different ancestor alleles of founders. After pressing “I”, the blue and red colors represent haplotype genotypes A and B (left side for the father allele and right side for the mother allele). To save or load LOD scores, specify a curve file at “Chromosome/Linkage analysis/options/Linkage curve file”.
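The idea of selecting regions whose LOD score exceeds a threshold can be illustrated outside dChip. This is a minimal sketch under the assumption that the LOD curve is available as two parallel lists (marker positions and scores); it is not the format of the dChip curve file or the method dChip uses internally, and regions_above is a hypothetical helper.

```python
def regions_above(positions, lod, threshold):
    """Return (start, end) position pairs for runs where LOD > threshold."""
    regions = []
    start = None
    end = None
    for pos, score in zip(positions, lod):
        if score > threshold:
            if start is None:
                start = pos  # a new run begins here
            end = pos  # extend the current run
        elif start is not None:
            regions.append((start, end))  # the run just ended
            start = None
    if start is not None:
        regions.append((start, end))  # run reaching the last marker
    return regions


# Illustrative values only, not real data.
positions = [1, 2, 3, 4, 5, 6]
scores = [0.5, 3.2, 3.5, 1.0, 3.1, 0.2]
peaks = regions_above(positions, scores, 3.0)
```

With these illustrative values, peaks holds two regions: positions 2 to 3 and the single marker at position 5.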

 

 

The haplotype view can also be reached through "Chromosome/Next data type" (press "I" to toggle to the inferred view). This view is useful for correlating the LOD score with the haplotype inferred for each individual, with different ancestor alleles represented by different colors (see figure below). For more details see p13 of Leykin et al. 2005. (data courtesy of Martin Pollak)

 

Call MERLIN for analysis

 

The MERLIN software can be downloaded and called from dChip to perform non-parametric linkage analysis. Extract MERLIN to a directory, set this directory at “Chromosome/Linkage analysis/MERLIN directory”, and then check “Use Merlin for linkage analysis” to perform linkage analysis for the current chromosome. dChip will export "dat, map, ped" files for MERLIN to use and read back the NPL linkage scores to display with genes and chromosomes as above. These exported files can also be used manually with MERLIN outside dChip. For unknown reasons, MERLIN may work for some chromosomes but not for others in this setup.
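For running MERLIN manually on the exported files, the command line can be assembled as in the sketch below. It uses MERLIN's -d, -p, -m and --npl options; the shared file prefix and the executable name "merlin" inside the MERLIN directory are assumptions, since this guide does not state how dChip names the exported files or how dChip itself invokes MERLIN.

```python
from pathlib import Path


def merlin_npl_command(merlin_dir, prefix):
    """Build an argument list for a MERLIN NPL run.

    Assumes the exported files are named <prefix>.dat, <prefix>.ped and
    <prefix>.map (hypothetical naming; check the actual exported names).
    """
    exe = str(Path(merlin_dir) / "merlin")
    return [
        exe,
        "-d", f"{prefix}.dat",  # data file
        "-p", f"{prefix}.ped",  # pedigree file
        "-m", f"{prefix}.map",  # map file
        "--npl",                # non-parametric linkage analysis
    ]


# The list can be passed to subprocess.run(cmd, check=True) to start MERLIN.
cmd = merlin_npl_command("merlin_dir", "chr1")
```

Building the arguments as a list rather than a single string avoids quoting problems when the directory name contains spaces.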

 

Specify consanguineous relationship

 

[Use version 6/1/06+] A consanguineous family includes a marriage between relatives, such as persons 1 and 2 in the pedigree below (from Leykin et al. 2005, courtesy of Richard JH Smith). The disease in such a family is usually recessive and inherited from the same ancestor, which makes even a small pedigree powerful for finding the disease locus. Such families can be analyzed as usual. For example, the pedigree displayed below (pedigree file) has 3 founders and 8 non-founders, and therefore 13 bits (2n – f). However, a consanguineous family often traces its common ancestor back several generations, which adds many non-genotyped individuals and enlarges the pedigree dramatically. In dChip we can specify such consanguineous information without having to include all the individuals. In the reduced pedigree file for the same family below, we exclude persons 10 and 11, but specify in the "Sharing" column that person 13 shares with person 1 for the father allele with probability 0.5 and for the mother allele with probability 0.5. The format is "1|0.5|0.5", representing "Sharing target person | Father sharing probability | Mother sharing probability". Since the genotype of person 13 is now dependent on that of person 1, we lose 1 bit of the bit reduction due to founder phase. But the overall bit count is smaller (10, or 2*6 – 2), and the computation is faster while producing a similar LOD curve.
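The "Sharing" field format described above can be read as in the following sketch. The Sharing tuple and the parse_sharing function are hypothetical names used for illustration; the range check on the probabilities is an added assumption, not a rule stated by dChip.

```python
from typing import NamedTuple


class Sharing(NamedTuple):
    target: str         # sharing target person
    father_prob: float  # father sharing probability
    mother_prob: float  # mother sharing probability


def parse_sharing(field):
    """Parse 'target|father probability|mother probability'."""
    parts = field.strip().split("|")
    if len(parts) != 3:
        raise ValueError(f"expected 3 parts separated by '|': {field!r}")
    target = parts[0].strip()
    father_prob = float(parts[1])
    mother_prob = float(parts[2])
    # Assumed sanity check: probabilities must lie in [0, 1].
    for p in (father_prob, mother_prob):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range in {field!r}")
    return Sharing(target, father_prob, mother_prob)


# The example from the reduced pedigree file described above.
example = parse_sharing("1|0.5|0.5")
```

Here example.target is "1", and both example.father_prob and example.mother_prob are 0.5.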

 

       

 

(Updated 10/20/07)