Yesterday I used the Mclust R package on the entire MDL dataset to split it into 5 partially intersecting subsets (see legend; each row represents a single participant, identified by its row number, e.g. V+1=V1; ...; V+164=V164). The 5 clusters were obtained with 4 dimensions retained.
From Mclust's page
MCLUST is an R package for normal mixture modeling via EM, model-based clustering, discriminant analysis and density estimation.
The method as explained by Dienekes Pontikos
In short, this method exploits the clusteredness of individuals along different dimensions of the MDS representation of dense genotypic data. It uses a powerful model-based clustering algorithm (MCLUST) that can infer the existence of clusters of different size, shape, and orientation in the MDS space, and which automatically optimizes for the Bayes Information Criterion, balancing off detail with parsimony. The only parameter that I need to specify to MCLUST is the number of MDS dimensions to retain (for a more detailed analysis, see here), as extra dimensions may add "clusteredness" but also noise. In order to decide on how many dimensions to retain, I empirically run MCLUST with a different number of dimensions (from 2 to 50).
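To make the procedure above concrete, here is a rough R sketch of running Mclust on a varying number of retained MDS dimensions. It is only an illustration under assumptions, not the code used for the MDL analysis or by Dienekes: the data matrix `x` is simulated, the Euclidean distance and the range of 2 to 6 dimensions are arbitrary choices, and the names `x`, `mds`, `max_dims` and `results` are made up for the example. `cmdscale()` and `dist()` come from base R, and `Mclust()` from the mclust package, which picks the number of clusters and covariance model by BIC for each run.

```r
library(mclust)  # provides Mclust()

set.seed(1)

# Hypothetical stand-in data: 164 participants x 20 variables,
# simulated from 3 shifted groups (NOT the MDL dataset).
n <- 164
groups <- sample(1:3, n, replace = TRUE)
x <- matrix(rnorm(n * 20), nrow = n) + groups * 2  # row i is shifted by 2 * groups[i]

# Classical MDS on Euclidean distances, keeping up to max_dims dimensions.
max_dims <- 6
mds <- cmdscale(dist(x), k = max_dims)

# Run Mclust on the first k MDS dimensions, for k = 2..max_dims.
# With default settings Mclust tries 1 to 9 clusters and keeps the best BIC.
results <- do.call(rbind, lapply(2:max_dims, function(k) {
  fit <- Mclust(mds[, 1:k, drop = FALSE], verbose = FALSE)
  data.frame(dims = k,
             clusters = fit$G,          # number of clusters selected
             model = fit$modelName,     # covariance model selected
             bic = fit$bic)             # BIC of the selected model
}))

print(results)
```

The table shows how the number of inferred clusters changes as dimensions are added. Note that BIC values are computed on different data for each value of `dims`, so they are not directly comparable across rows; how to pick the final number of dimensions is left open here, as in the quoted description.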