Thursday, May 26, 2011

Some comments on experimental test

Some participants of the project have requested the MDS source files to produce their own plots.
I have uploaded the Plink MDS, CLUSTER and NEAREST files to GoogleDocs, so please feel free to download them (all files are in text format, so you can open them with a text editor or import them into Excel/Rgui). NEAREST is essentially one of the files (the other is the MIBS matrix) used to produce the IBS Rdata object files distributed by Dienekes or Zack. I haven't had time to create the same object file, but you can import NEAREST into Excel and filter for the participants who are closest to you.
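
If you prefer a script to Excel, the same filtering can be sketched in Python. This is only a rough illustration: it assumes the NEAREST file is whitespace-delimited with a header row containing the columns IID, NN, MIN_DST and IID2 (as in Plink's usual .nearest output; please check the header of the file you downloaded), and the function names and the sample IDs are made up for the example.

```python
import io


def load_nearest(handle):
    """Parse a whitespace-delimited NEAREST table into a list of dicts.

    Assumes the first line is a header naming the columns.
    """
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(header, fields)))
    return rows


def closest_to(rows, my_id, top=5):
    """Return (rank, neighbour ID, MIN_DST) for the `top` nearest neighbours of my_id.

    Rows are ordered by the NN column (assumed: 1 = nearest neighbour).
    """
    mine = [r for r in rows if r["IID"] == my_id]
    mine.sort(key=lambda r: int(r["NN"]))
    return [(int(r["NN"]), r["IID2"], float(r["MIN_DST"])) for r in mine[:top]]


if __name__ == "__main__":
    # Tiny made-up table standing in for the downloaded file.
    demo = io.StringIO(
        "FID IID NN MIN_DST Z FID2 IID2 PROP_DIFF\n"
        "F1 A 2 0.70 -0.1 F3 C 0.1\n"
        "F1 A 1 0.75 0.5 F2 B 0.1\n"
    )
    for rank, neighbour, dst in closest_to(load_nearest(demo), "A"):
        print(rank, neighbour, dst)
```

With the real file, replace the demo table with `open("your_downloaded_file.nearest")` (the file name here is a placeholder) and pass your own participant ID.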


Several blog readers pointed out the incorrect representation of the Belarusian/Lithuanian components in the ADMIXTURE plot I published yesterday. I have fixed the plot by swapping the Belarusian and Lithuanian components so that the visualization now displays them correctly.

A couple of the project's participants have expressed concern about the high % of the Hungarian component in their results. As indicated earlier, this component is not associated exclusively with the Hungarian population, and having a high % of it doesn't necessarily imply distant Hungarian roots. Here, the Hungarian component is used as a suitable proxy for genetic Central European (or, more specifically, Circum-Carpathian) ancestry. Personally, I am inclined to believe that this "Hungarian" component is high in individuals with Ukrainian ancestry or ancestry from Slovakia or southern Poland (when I have enough Ukrainian/Polish samples, I'll test this scenario).

As for the LAMP/STRUCTURE analysis, I would warn against reading too much into it. Since the analysis was performed on a single chromosome (Chromosome 22), without thinning out the set of SNPs in linkage disequilibrium etc., the results were shown for evaluation purposes only. The estimated components seem to pick up a more distant signal of Neolithic migration events (it makes sense to compare those results to Davidski's or Dienekes' estimates of African/Asian admixture levels in European populations).


  1. The Hungarian set from Behar et al. shows a lot of diversity. Some of the individuals are basically like North Germans or even Dutch, while a couple are similar to Romanians.

    So they cover a lot of ground in Europe as a reference set in a supervised ADMIXTURE analysis; all the way from Northern Germany to Poland, and down to the Northern Balkans.

  2. I'm unable to download MDLP-adm.cluster1. Can you please check the link. Thanks.

  3. Very cool thanks. The plink MDS plot w/ gnuplot is pretty cool. A few more Polish and Ukrainian samples appear needed. Any new ones in the pipe?

    Can you explain the interpretation of the cluster data? I'm guessing cluster1 is the file I'm interested in, listing the clusters, and all the other cluster files are temp or log files?
    I'm a multiple decade Linux/UNIX user/admin so you may be very technical if need be.

  4. Hi, Bergep

    I am having trouble with MDS visualizations in Gplot (some issues with RAM). Could you be so kind as to generate some visualizations of the MDS and send them to me?

    And yes, you are right in your assumption regarding the underrepresented Polish and Ukrainian samples. We need more of them. Yesterday I got 2 admixed Belarusian-Ukrainian samples, so the project's database keeps slowly growing.