
Thursday, April 21, 2011

Experiments with relatedness

Yesterday I was investigating methodologies for pairwise identity-by-descent (IBD) estimation in the PLINK software. PLINK allows you to estimate genome-wide IBD-sharing coefficients between seemingly unrelated individuals from whole-genome data. In a homogeneous sample, it is possible to calculate genome-wide IBD from identity-by-state (IBS) information, as long as a large number of SNPs is available (probably 1000 independent SNPs at a bare minimum; ideally 100K or more). The basic PLINK command for IBD calculations is


plink --file mydata --genome --min 0.05

which yields the following fields, useful for IBD estimation:


     FID1      Family ID for first individual
     IID1      Individual ID for first individual
     FID2      Family ID for second individual
     IID2      Individual ID for second individual
     RT        Relationship type given PED file
     EZ        Expected IBD sharing given PED file
     Z0        P(IBD=0)
     Z1        P(IBD=1)
     Z2        P(IBD=2)
     PI_HAT    P(IBD=2)+0.5*P(IBD=1) ( proportion IBD )
     PHE       Pairwise phenotypic code (1,0,-1 = AA, AU and UU pairs)
     DST       IBS distance (IBS2 + 0.5*IBS1) / ( N SNP pairs )
     PPC       IBS binomial test
     RATIO     Of HETHET : IBS 0 SNPs (expected value is 2)
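To make the field definitions concrete, here is a minimal Python sketch that reads a table with these columns and checks that PI_HAT equals P(IBD=2) + 0.5*P(IBD=1) for each pair. The function names (read_genome, pi_hat, related_pairs) and the three sample rows are made up for illustration and are not real PLINK output. The filter at the end mimics what I understand --min 0.05 to do, namely keep only pairs with PI_HAT above the threshold.

```python
import io

# Illustrative rows only (invented values), laid out like the fields listed above.
SAMPLE = """\
FID1 IID1 FID2 IID2 RT EZ Z0 Z1 Z2 PI_HAT PHE DST PPC RATIO
F1 A F2 B UN NA 1.0000 0.0000 0.0000 0.0000 -1 0.72 0.50 2.00
F1 A F3 C UN NA 0.2500 0.5000 0.2500 0.5000 -1 0.85 1.00 9.50
F2 B F3 C UN NA 0.7400 0.2600 0.0000 0.1300 -1 0.75 0.90 2.40
"""

def read_genome(handle):
    """Parse a whitespace-delimited table with a header row into dicts."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        row = dict(zip(header, parts))
        # Convert only the IBD columns; others (e.g. EZ) may hold "NA".
        for name in ("Z0", "Z1", "Z2", "PI_HAT"):
            row[name] = float(row[name])
        rows.append(row)
    return rows

def pi_hat(z1, z2):
    """Proportion IBD as defined in the table: P(IBD=2) + 0.5*P(IBD=1)."""
    return z2 + 0.5 * z1

def related_pairs(rows, threshold=0.05):
    """Keep pairs whose PI_HAT exceeds the threshold (assumed --min behaviour)."""
    return [r for r in rows if r["PI_HAT"] > threshold]

rows = read_genome(io.StringIO(SAMPLE))
for r in rows:
    # The reported PI_HAT should match the value recomputed from Z1 and Z2.
    assert abs(pi_hat(r["Z1"], r["Z2"]) - r["PI_HAT"]) < 1e-6
for r in related_pairs(rows):
    print(r["IID1"], r["IID2"], r["PI_HAT"])
```

For a real run you would pass an open file handle for the PLINK output instead of the in-memory sample.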


Following the instructions in the eMERGE Network article "Visualizing relatedness" and using R graphics libraries (such as ggplot2), one can easily visualize Z1 and Z0, the proportions of markers sharing one and zero alleles identical by descent respectively, for every pair of individuals in the dataset.
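When reading such a plot, it helps to know where the textbook relationships sit. The sketch below lists the theoretical (Z0, Z1, Z2) values for a few common relationships in an outbred population and labels a pair by the nearest of those points. The nearest-point rule and the name nearest_relationship are my own simplification for illustration; this is not how PLINK or the eMERGE article assign relationships, and real estimates are noisy enough that borderline pairs need a closer look.

```python
import math

# Theoretical IBD-sharing probabilities (Z0, Z1, Z2) in an outbred population.
EXPECTED = {
    "duplicate or MZ twin": (0.0, 0.0, 1.0),
    "parent-offspring": (0.0, 1.0, 0.0),
    "full sibling": (0.25, 0.5, 0.25),
    "second degree": (0.5, 0.5, 0.0),    # half siblings, grandparent, avuncular
    "third degree": (0.75, 0.25, 0.0),   # e.g. first cousins
    "unrelated": (1.0, 0.0, 0.0),
}

def nearest_relationship(z0, z1, z2):
    """Label a pair by the closest theoretical point (Euclidean distance)."""
    observed = (z0, z1, z2)
    return min(EXPECTED, key=lambda name: math.dist(EXPECTED[name], observed))

# Hypothetical estimates for three pairs, just to exercise the function.
for z in [(0.98, 0.02, 0.0), (0.24, 0.52, 0.24), (0.01, 0.99, 0.0)]:
    print(z, nearest_relationship(*z))
```

On a Z0 versus Z1 scatter plot, unrelated pairs cluster near (1, 0), parent-offspring pairs near (0, 1), and full siblings near (0.25, 0.5).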

Example: Visualizing relatedness of project's "unrelated" sample (N=159)
