Saturday, May 28, 2011

Experimental test III: (De)constructing ancestry with LAMP

In my previous attempt to find statistical criteria by which Lithuanians and Belarusians might be separated from the rest of the dataset, I outlined some of the most important incongruities and ambiguities between the output of LAMP and the results of the supervised ADMIXTURE run. These ambiguities can arguably be explained, at least in part, by the evident differences between the population-stratification algorithms implemented in LAMP and ADMIXTURE.

1) ADMIXTURE implements a model-based approach that estimates ancestry coefficients as the parameters of a statistical model. It is also important to add that this approach rests on the global ancestry paradigm (i.e., the goal of an ADMIXTURE/STRUCTURE analysis is to estimate the proportion of ancestry from each contributing population, taken as an average over the individual's entire genome).

2) LAMP is built upon WINPOP, an efficient dynamic-programming algorithm that infers locus-specific ancestries. The genome is partitioned into chromosome segments of definite ancestral origin (overlapping, contiguous windows of SNPs), and a likelihood model is optimized over each window. The goal then is to find the segment boundaries and to assign an origin to each segment.
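To make the link between the two paradigms concrete, here is a minimal Python sketch (not code from LAMP or ADMIXTURE) showing how a set of locus-specific segment calls can be collapsed into a global ancestry proportion by weighting each segment by its length. The function name `global_ancestry`, the segment tuples and the labels `pop_A`, `pop_B`, `pop_C` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def global_ancestry(segments):
    """Collapse local-ancestry segments into length-weighted global proportions.

    segments: list of (start_bp, end_bp, ancestry_label) tuples describing
    non-overlapping segments along one chromosome copy (hypothetical format).
    """
    totals = defaultdict(float)
    for start, end, label in segments:
        if end <= start:
            raise ValueError("segment end must be greater than segment start")
        totals[label] += end - start  # accumulate base pairs per ancestry
    covered = sum(totals.values())
    return {label: length / covered for label, length in totals.items()}


# Made-up segment calls, for illustration only (not real results).
example_segments = [
    (0, 4_000_000, "pop_A"),
    (4_000_000, 10_000_000, "pop_B"),
    (10_000_000, 16_000_000, "pop_A"),
    (16_000_000, 20_000_000, "pop_C"),
]

proportions = global_ancestry(example_segments)
for label in sorted(proportions):
    print(label, round(proportions[label], 3))
```

The reverse direction is not possible: a global proportion says nothing about where along the chromosome each ancestry sits, which is exactly the information a locus-specific method adds.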

As we have already seen, for the particular purposes of our project (i.e., estimating definite ancestry proportions for very closely related populations), LAMP's model has a number of advantages over ADMIXTURE's. I fully share Andres Palsen's confidence in LAMP's ability to separate the closely related populations of Scandinavia (Norwegians, Swedes, Danes).
Another advantage of LAMP's model is that it takes into account the physical positions of SNPs, the genetic distance in cM (centimorgans) between SNPs, and recombination events. Moreover, it allows the ancestral origin of segments to be estimated for different numbers of generations elapsed since the admixture event.
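The role of the generation parameter can be illustrated with a back-of-the-envelope calculation. The Python sketch below assumes a simple single-pulse admixture model in which recombination breakpoints accumulate at roughly one per Morgan per generation, so that the mean distance between breakpoints after G generations is about 100/G cM. This is a textbook approximation and not LAMP's internal model. The helper names and the 1.5 cM/Mb rate in the example are hypothetical.

```python
def expected_tract_length_cm(generations):
    """Approximate mean distance (cM) between recombination breakpoints
    accumulated over the given number of generations since admixture."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / generations


def genetic_distance_cm(pos_bp_a, pos_bp_b, rate_cm_per_mb):
    """Genetic distance between two SNPs, assuming a constant local
    recombination rate (cM per megabase) between them."""
    return abs(pos_bp_b - pos_bp_a) / 1_000_000 * rate_cm_per_mb


# Expected tract lengths for the three values of G used in this post.
tract_lengths = {g: expected_tract_length_cm(g) for g in (5, 10, 25)}
for g, length in tract_lengths.items():
    print(f"G={g}: about {length:.1f} cM between breakpoints")

# Two SNPs 2 Mb apart with a hypothetical local rate of 1.5 cM/Mb.
example_distance = genetic_distance_cm(1_000_000, 3_000_000, 1.5)
print(f"example distance: {example_distance:.1f} cM")
```

Under this approximation, a larger G implies shorter expected segments, which is why the choice of G affects how finely the chromosome is broken up.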

I used HapMap's list of recombination rates for Chromosome 22 and estimated allele frequencies with PLINK on a training set of 6 reference populations (Orcadians, Romanians, Lithuanians, Belarusians and Russians). I then ran the LAMP analysis for G=5, 10 and 25 (the number of generations since the admixture event) and visualised the output (click on the attached pictures to view the original images).
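For readers unfamiliar with the allele-frequency step, the arithmetic is simple enough to sketch in a few lines of Python. This is only an illustration of the calculation, not PLINK's implementation. The genotype coding (0, 1 or 2 copies of the counted allele, `None` for a missing call), the function names and the population labels `pop_1` and `pop_2` are assumptions, and the genotypes are made up.

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele at one SNP in one population.

    genotypes: iterable of 0, 1 or 2 (copies of the counted allele per
    individual), with None marking a missing call (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no data for this SNP in this population
    # Each diploid individual contributes two alleles.
    return sum(called) / (2 * len(called))


def frequencies_by_population(genotypes_by_pop):
    """Apply allele_frequency to every population in a dict."""
    return {pop: allele_frequency(g) for pop, g in genotypes_by_pop.items()}


# Made-up genotypes at a single SNP, for illustration only.
example_genotypes = {
    "pop_1": [0, 1, 2, 1, None],
    "pop_2": [0, 0, 1, 1],
}

freqs = frequencies_by_population(example_genotypes)
print(freqs)
```

Per-population frequencies of this kind, computed for every SNP on the chromosome, are what the reference panel contributes to the analysis.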

I suggest checking the results of the LAMP analysis of Chromosome 22 for all project participants. It would be interesting to compare these results with AncestryFinder's visualizations or with the ADMIXTURE results of the Dodecad, Eurogenes and Diogenes projects.

As always, I am open to comments and criticism.

G=5 (5 generations from the admixture events)
Ancestral segments on Chromosome 22

Average ancestry for G=5

G=10 (10 generations from the admixture events)

Ancestral segments on Chromosome 22
Average ancestry for G=10

G=25 (25 generations from the admixture events)

Ancestral segments on Chromosome 22
Average ancestry for G=25
