Looking back at the goals the project set six months ago, we would like to recapitulate the most important ones and analyze how far each has been achieved:
1. Performing a comprehensive PLINK analysis, including the estimation of runs of homozygosity (ROH; shared clusters and groups of homozygosity), possible Mendelian errors, extended LD haplotypes (based on r² values), shared IBD segments and the IBS matrix (PLINK format).
Although we performed all of the PLINK analyses described and even shared some results on the project's blog, we did not consider these results worth extensive coverage. Likewise, the project's members showed no interest in those analyses.
Experiments with relatedness
Graphoanalytical approach to visualizing relatedness
IBD sharing
IBS similarity matrix in R
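To make the IBS matrix less abstract, here is a minimal Python sketch of how a pairwise identity-by-state similarity can be computed from genotype calls. It is an illustration only, not the code used in the project: the 0/1/2 minor-allele coding, the function names and the sample IDs V1 to V3 are all assumptions made for the example.

```python
from itertools import combinations


def ibs_similarity(g1, g2):
    """Proportion of alleles shared identical-by-state between two samples.

    Genotypes are assumed to be coded as minor-allele counts (0, 1, 2);
    None marks a missing call.
    """
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs where either call is missing
        shared += 2 - abs(a - b)  # 2 alleles shared, 1 shared, or 0 shared
        compared += 2
    return shared / compared if compared else float("nan")


def ibs_matrix(genotypes):
    """Build a symmetric IBS similarity matrix as a dict of dicts."""
    ids = sorted(genotypes)
    matrix = {i: {j: 1.0 for j in ids} for i in ids}
    for i, j in combinations(ids, 2):
        s = ibs_similarity(genotypes[i], genotypes[j])
        matrix[i][j] = s
        matrix[j][i] = s
    return matrix


# Hypothetical toy genotypes (four SNPs per sample), made up for illustration.
toy_genotypes = {
    "V1": [0, 1, 2, 1],
    "V2": [0, 1, 2, 2],
    "V3": [2, 1, 0, None],
}
toy_matrix = ibs_matrix(toy_genotypes)
```

With real data the same idea applies to hundreds of thousands of SNPs per pair, which is why dedicated tools are used in practice.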
2. Phasing the genotype files, i.e. establishing the haploid phase (this is a separate analysis that requires the genotypes of your parents, so it will not be performed on a regular basis) (Beagle or Merlin output format).
We performed ad hoc phasing of the genotypes in our project (MDLP) and, to assess possible discrepancies between phased and unphased data, ran ADMIXTURE (with K=4 assumed clusters) separately on the original unphased dataset and on the BEAGLE-phased dataset.
Analyzing admixture in phased v.unphased dataset
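One way to quantify the discrepancy between two such runs is sketched below in Python. It assumes each run produced a Q file with one row per individual and K whitespace-separated ancestry proportions, with individuals in the same order in both files. Because cluster columns can come out in a different order in separate runs, the sketch first searches for the column matching that minimises the total difference (a brute-force search over 24 permutations for K=4). The function names and the numbers are invented for illustration and are not project results.

```python
from itertools import permutations


def parse_q(text):
    """Parse Q-matrix text: one row per individual, K proportions per row."""
    return [[float(x) for x in line.split()]
            for line in text.strip().splitlines()]


def best_column_match(q_a, q_b):
    """Return the column permutation of q_b that best matches q_a.

    perm[c] is the column of q_b matched to column c of q_a. The cost is
    the summed absolute difference over all individuals and clusters.
    """
    k = len(q_a[0])
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        cost = sum(abs(ra[c] - rb[perm[c]])
                   for ra, rb in zip(q_a, q_b)
                   for c in range(k))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


def per_individual_discrepancy(q_a, q_b, perm):
    """Mean absolute difference per individual after aligning the clusters."""
    k = len(perm)
    return [sum(abs(ra[c] - rb[perm[c]]) for c in range(k)) / k
            for ra, rb in zip(q_a, q_b)]


# Made-up K=4 proportions for three individuals (not project data).
unphased_q = parse_q("""
0.70 0.20 0.05 0.05
0.10 0.60 0.20 0.10
0.25 0.25 0.25 0.25
""")
phased_q = parse_q("""
0.21 0.69 0.05 0.05
0.60 0.10 0.10 0.20
0.25 0.25 0.25 0.25
""")

perm, total_cost = best_column_match(unphased_q, phased_q)
discrepancies = per_individual_discrepancy(unphased_q, phased_q, perm)
```

For larger K the exhaustive search grows quickly, and an assignment algorithm would be a better fit than trying every permutation.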
3. Using AISconvert (based on HIRsearch) and Germline software to detect IBD segments.
Used only occasionally, in combination with other analyses.
Analyzing admixture in phased v.unphased dataset
Grapho-analytical approach to the visualisation of IBD shared segments
IBD sharing
4. Using ADMIXTURE/STRUCTURE software for detecting admixture clusters and calculating allele frequencies.
We performed plenty of ADMIXTURE and STRUCTURE runs (using different a priori numbers of assumed clusters under different models of admixture). Discussions of ADMIXTURE results made up the most significant part of the MDLP blog.
The allele frequencies estimated in the K=7 ADMIXTURE run were provided for creating a custom modification of the DIYDodecad calculator (MDLP).
Analyzing admixture in phased v.unphased dataset
First results: Admixture unsupervised run
Admixture analysis: sorted after Baltic-Slavic component
Admixture results: Baltic-Slavic
Admixture analysis: the rest of groupings
The output of the PLINK and ADMIXTURE algorithms
Admixture clusters, Mclust and populations concordance
DIYDodecad calculator v2.0 for my BGA project (MDLP).
Root Means Square Comparison Excel 2007 Macro Enabled XLSM spreadsheet for the Magnus Ducatus Lithuaniae Project data
and many more...
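For readers curious about the root mean square comparison mentioned in the list above, the general idea can be sketched in a few lines of Python. This is not a port of the spreadsheet. It assumes that a participant's admixture proportions are compared against per-population average proportions and that populations are ranked by the root mean square of the component-wise differences. The population names and all numbers are hypothetical.

```python
import math


def rms_difference(a, b):
    """Root mean square of the component-wise differences between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def rank_populations(individual, population_means):
    """Return (rms, population) pairs sorted from closest to most distant."""
    return sorted((rms_difference(individual, means), name)
                  for name, means in population_means.items())


# Hypothetical three-component proportions, invented for illustration.
individual = [0.5, 0.3, 0.2]
population_means = {
    "PopA": [0.5, 0.3, 0.2],
    "PopB": [0.2, 0.3, 0.5],
}
ranking = rank_populations(individual, population_means)
```

A smaller value means the participant's component profile is closer to that population's average. It says nothing by itself about genealogical relatedness.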
5. Creating MDS and PCA plots
PCA plots (Eigensoft)
PCA plots for reference populations and project participants
MDS and PCA plots: for V157-V247
A close-up on "the core" of the MDL project
6. Creating RHHmapper schemes showing the location of rare heterozygous and homozygous genotypes
RHH mapper: results for V158-V165 and V201-V202