Monday, May 2, 2011

The output of the PLINK and ADMIXTURE runs presented here is supposed to reflect "genetic substructures" in both the subsample of project participants and the reference populations. I think the best practice for analyzing admixture is to dissect the populations step by step, starting with K=3 or K=4 assumed ancestral populations and ascending to K=7 or K=8 (although the latter is doubtless overkill and can lead to the formation of "spurious" clusters).

I used Razib Khan's script to visualize the output of the ADMIXTURE run for the MDL sample (please don't read too much into these first experimental results; I disclaim all liability for harm done by misinterpretation of the results :) ).

MDS plot of genetic distances between populations, based on ~500,000 SNPs
MDS plot made with GNUplot (read Davidski's post explaining how datasheets (read: Plink matrix files*) can be turned into interactive genetic maps with just a couple of simple commands in GNUplot). In Plink, to perform multidimensional scaling analysis on the N x N matrix of genome-wide IBS pairwise distances, use the --mds-plot option in conjunction with --cluster. The --mds-plot option takes a single parameter, the number of dimensions to be extracted. For example, assuming we have already calculated the plink.genome file,
plink --file mydata --read-genome plink.genome --cluster --mds-plot 4
creates the file
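
For readers who want to go from the Plink MDS output to something GNUplot can draw, here is a minimal sketch of one possible conversion (not Davidski's commands, which I can't reproduce here). It assumes the MDS table is whitespace-delimited with a header of the form FID IID SOL C1 C2 ..., and that the FID column holds the population label; both function names are hypothetical helpers made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_mds(lines: Iterable[str]) -> List[Dict]:
    """Parse a Plink-style MDS table (assumed header: FID IID SOL C1 C2 ...)."""
    rows: List[Dict] = []
    header = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-blank line is the header
            continue
        record = dict(zip(header, fields))
        # Coordinate columns are the ones named C1, C2, ... in the header.
        coords = [
            float(record[name])
            for name in header
            if name[0] == "C" and name[1:].isdigit()
        ]
        rows.append({"fid": record["FID"], "iid": record["IID"], "coords": coords})
    return rows


def to_gnuplot_blocks(rows: List[Dict], dims: Tuple[int, int] = (0, 1)) -> str:
    """Emit one data block per FID (assumed here to be the population label)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["fid"]].append(row)
    chunks = []
    for fid in sorted(groups):
        body = [f"# {fid}"]  # comment line naming the group
        for row in groups[fid]:
            x, y = (row["coords"][d] for d in dims)
            body.append(f"{x:.6f} {y:.6f} {row['iid']}")
        chunks.append("\n".join(body))
    # Two blank lines between blocks, so each block is a separate
    # data set that GNUplot can address with `index`.
    return "\n\n\n".join(chunks) + "\n"
```

If the result is written to a file (say mds.dat, a name chosen here only for illustration), each population can then be drawn as its own series, e.g. `plot 'mds.dat' index 0 using 1:2 with points`. Passing `dims=(0, 2)` would plot the first dimension against the third instead.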

I didn't have enough time to find informative viewing angles, so please take the plot as it is. The legend is the same as in previous posts.

The same MDS plot with population ID labels

1 comment:

  1. Hi,
    Unfortunately, Davidski's blog post is no longer available. Could you kindly help me by sending the code to plot the MDS matrix? Thank you