Here are the results of the admixture analysis for participants V159-V165 and V201-V202, compared to reference populations.
[Admixture plots for K=5, K=8, K=9 and K=10]
One of the most important problems in admixture analysis, how to choose a reasonable number of clusters (K), is still unresolved. David H. Alexander, John Novembre and Kenneth Lange suggest using ADMIXTURE's cross-validation procedure: a good value of K will exhibit a low cross-validation error compared to other K values. Cross-validation is enabled by simply adding the --cv flag to the ADMIXTURE command line. In its default setting the procedure performs 5-fold cross-validation; 10-fold cross-validation, which does 10 repetitions, each time holding out 10% of the genotypes at random, is requested with --cv=10. The cross-validation error is reported in the screen output: the larger the cross-validation error, the less appropriate the corresponding choice of K.
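As a rough illustration of that command line, the sketch below only builds (and prints) one ADMIXTURE invocation per candidate K; it does not run anything. The input file name data.bed and the helper name admixture_cv_command are hypothetical, and the list of K values is simply the set tested in this post.

```python
import shlex

def admixture_cv_command(bed_path, k, folds=None):
    """Build an ADMIXTURE command line with cross-validation enabled.

    bed_path is a hypothetical PLINK .bed input file; folds=None uses the
    plain --cv flag, folds=10 produces --cv=10.
    """
    cv_flag = "--cv" if folds is None else f"--cv={folds}"
    return ["admixture", cv_flag, bed_path, str(k)]

# K values tested in this post; "data.bed" is a placeholder file name.
candidate_ks = [5, 7, 8, 9, 10]
commands = [admixture_cv_command("data.bed", k, folds=10) for k in candidate_ks]

for cmd in commands:
    # Print the shell form; each list could be passed to subprocess.run instead.
    print(shlex.join(cmd))
```

Saving the screen output of each run to its own log file makes it easy to collect the "CV error" lines afterwards.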
In our examples, K=5 has the lowest cross-validation error of the values tested, which makes it the most reasonable modeling choice:
CV error (K=5): 0.61144 (0.00006)
CV error (K=7): 0.61429 (0.00008)
CV error (K=8): 0.61555 (0.00007)
CV error (K=9): 0.61845 (0.00007)
CV error (K=10): 0.62159 (0.00006)
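The comparison above can also be done programmatically. The following is a minimal sketch, assuming the "CV error" lines have been collected from the screen output into one string; the helper names parse_cv_errors and best_k are made up for this example.

```python
import re

# Matches lines such as "CV error (K=5): 0.61144 (0.00006)".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")

def parse_cv_errors(text):
    """Return a {K: cv_error} dict for every 'CV error' line found in text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(text)}

def best_k(cv_errors):
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)

# The CV error lines reported in this post.
log_text = """\
CV error (K=10): 0.62159 (0.00006)
CV error (K=5): 0.61144 (0.00006)
CV error (K=7): 0.61429 (0.00008)
CV error (K=8): 0.61555 (0.00007)
CV error (K=9): 0.61845 (0.00007)
"""

errors = parse_cv_errors(log_text)
for k in sorted(errors):
    print(f"K={k}: {errors[k]:.5f}")
print("Lowest CV error at K =", best_k(errors))
```

With the values listed here the lowest error is at K=5, and the error rises steadily as K increases.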