
Monday, May 16, 2011

Admixture results for V159-V165 and V201-V202

Here are the results of the admixture analysis for participants V159-V165 and V201-V202, compared to reference populations.

One of the most important problems in admixture analysis, how to choose a reasonable number of clusters (K), is still unresolved. David H. Alexander, John Novembre and Kenneth Lange suggest using ADMIXTURE's cross-validation procedure: a good value of K will exhibit a low cross-validation error compared to other K values. Cross-validation is enabled by simply adding the --cv flag to the ADMIXTURE command line. In this default setting, the cross-validation procedure does 10 repetitions, each time holding out 10% of the genotypes at random. The cross-validation error is reported in the screen output; the larger the cross-validation error, the less appropriate the corresponding choice of K.
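For readers who want to repeat this for several values of K, here is a minimal sketch in Python that only builds the command lines (it does not run them). The input file name participants.bed and the helper name admixture_cv_commands are my own placeholders, not something from the actual runs; the only ADMIXTURE-specific parts are the --cv flag, the input file and the value of K.

```python
def admixture_cv_commands(bed_file, k_values):
    """Build one ADMIXTURE command line per K, with cross-validation enabled.

    bed_file is a hypothetical input path; each command has the form
    `admixture --cv <input> <K>`.
    """
    return [["admixture", "--cv", bed_file, str(k)] for k in k_values]


# The K values compared in this post; the file name is a placeholder.
commands = admixture_cv_commands("participants.bed", [5, 7, 8, 9, 10])
for cmd in commands:
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run, with the screen output saved to a log file per K so that the CV error lines can be collected afterwards.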

In our runs, K=5 gives the lowest cross-validation error of the values tested, so it is the most reasonable modeling choice:



CV error (K=5): 0.61144 (0.00006)
CV error (K=7): 0.61429 (0.00008)
CV error (K=8): 0.61555 (0.00007)
CV error (K=9): 0.61845 (0.00007)
CV error (K=10): 0.62159 (0.00006)
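Picking the K with the lowest error by eye is easy with five runs, but it can also be done mechanically. The sketch below parses lines in the format shown above and returns the K with the smallest CV error. The function names parse_cv_errors and best_k are my own, and the second value in parentheses is simply ignored.

```python
import re

# Matches lines such as "CV error (K=5): 0.61144 (0.00006)".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(text):
    """Return a dict mapping K to its reported cross-validation error."""
    errors = {}
    for line in text.splitlines():
        match = CV_LINE.search(line)
        if match:
            errors[int(match.group(1))] = float(match.group(2))
    return errors


def best_k(errors):
    """Return the K with the lowest cross-validation error."""
    return min(errors, key=errors.get)


log_text = """\
CV error (K=10): 0.62159 (0.00006)
CV error (K=5): 0.61144 (0.00006)
CV error (K=7): 0.61429 (0.00008)
CV error (K=8): 0.61555 (0.00007)
CV error (K=9): 0.61845 (0.00007)
"""

errors = parse_cv_errors(log_text)
print(best_k(errors))  # prints 5
```

Note that this only compares the K values that were actually run; a K that was not tested cannot be ruled in or out this way.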



K=5


K=8


K=9


K=10
