Aglomera.NET

A hierarchical agglomerative clustering (HAC) library written in C#

Source repository: https://github.com/pedrodbs/Aglomera

Aglomera is a .NET open-source library written entirely in C# that implements hierarchical clustering (HC) algorithms. Currently, Aglomera.NET implements the AGNES (AGglomerative NESting) program of [Kaufman & Rousseeuw, 1990], i.e., the bottom-up approach. It supports several linkage criteria and provides metrics for both internal and external evaluation of clustering results. Clustering results can be exported to a JSON file and visualized as a dendrogram in Dendrogram Viewer, an interactive web application built with D3.js.

A cluster is a set of instances (data-points). HC can be either agglomerative (bottom-up approach) or divisive (top-down approach). The distance between two instances is calculated using a dissimilarity function, and the distance between two clusters is calculated using a linkage criterion. Each step of HC produces a new cluster-set, i.e., a set of clusters, from the cluster-set of the previous step; in the agglomerative case, this is done by merging the two closest clusters.
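The bottom-up procedure described above can be sketched in a few lines. The following is a minimal, language-agnostic illustration written in Python, not Aglomera's API: the names `euclidean`, `LINKAGES` and `agglomerate` are hypothetical, Euclidean distance is assumed as the dissimilarity function, and only three of the linkage criteria are shown. It recomputes every pairwise distance at each step, so it is meant for reading rather than for large datasets.

```python
from itertools import combinations


def euclidean(a, b):
    """Dissimilarity between two instances given as equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Each linkage criterion reduces the list of instance-to-instance distances
# between two clusters to a single cluster-to-cluster distance.
LINKAGES = {
    "single": min,                            # nearest neighbor
    "complete": max,                          # farthest neighbor
    "average": lambda ds: sum(ds) / len(ds),  # UPGMA
}


def agglomerate(points, linkage="single", dissimilarity=euclidean):
    """Return every cluster-set produced by bottom-up clustering.

    Clusters are frozensets of indices into `points`. The first cluster-set
    has one singleton per instance; the last has a single cluster.
    """
    link = LINKAGES[linkage]
    clusters = [frozenset([i]) for i in range(len(points))]
    history = [list(clusters)]

    def cluster_distance(pair):
        a, b = pair
        return link([dissimilarity(points[i], points[j]) for i in a for j in b])

    while len(clusters) > 1:
        # Find the two closest clusters under the chosen linkage and merge them.
        a, b = min(combinations(clusters, 2), key=cluster_distance)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        history.append(list(clusters))
    return history
```

For example, `agglomerate([(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)])` returns five cluster-sets, one per step, going from five singletons down to one cluster that contains every instance.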

Features

  • Supports the following linkage criteria, used to measure the dissimilarity between clusters:
    • Complete (farthest neighbor), average (UPGMA), centroid, minimum energy, single (nearest neighbor), Ward’s minimum variance method.
  • Provides the following external clustering evaluation criteria, used to evaluate the quality of a given cluster-set when each data-point has an associated label / class:
    • Purity, normalized mutual information, accuracy, precision, recall, F-measure.
  • Provides the following internal clustering evaluation criteria, used to select the optimal number of clusters when no ground truth is available:
    • Silhouette coefficient, Dunn index, Davies-Bouldin index, Calinski-Harabasz index, modified Gamma statistic, Xie-Beni index, within-between ratio, I-index, Xu index, RMSSD, R-squared.
  • CSV export
    • Exports the result of clustering to a comma-separated values (CSV) file.
  • D3.js export
    • Exports the result of clustering to a JSON file containing the hierarchical structure of the clustering procedure, which can be loaded into Dendrogram Viewer to produce a dendrogram.
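
To make the idea of external evaluation concrete, here is a minimal sketch of purity, the first external criterion listed above. It is a generic Python illustration rather than Aglomera's implementation: the function name `purity` and its argument layout are hypothetical, and every cluster is assumed to be non-empty.

```python
from collections import Counter


def purity(clusters, labels):
    """Fraction of instances that carry the majority label of their cluster.

    `clusters` is a list of non-empty collections of instance indices and
    `labels` maps each index to its ground-truth class.
    """
    total = sum(len(c) for c in clusters)
    # For each cluster, count how many instances share its most common label.
    majority = sum(
        Counter(labels[i] for i in c).most_common(1)[0][1] for c in clusters
    )
    return majority / total
```

A purity of 1 means every cluster contains instances of a single class. The measure favors cluster-sets with many small clusters, which is one reason to read it alongside the other listed criteria rather than alone.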