Stability Analysis in Unsupervised Learning
Speaker: Dr. Derek Greene, UCD School of Computer Science & Informatics
Date: Thursday, February 5th
Time: 4.00PM
Location: Science East, Room E0.01
Abstract: When a set of clusters is generated using an unsupervised learning algorithm that includes a random element or requires the selection of key parameter values, it is important to determine whether the clustering represents a definitive solution. The "stability" of an algorithm refers to its ability to consistently replicate similar solutions on data originating from the same source. Cluster validation techniques based on this concept have been shown to be effective in helping to choose a suitable number of clusters for data in a range of applications. Here I review previous work on stability analysis for clustering, and present recent work on applying stability-based techniques in conjunction with matrix factorization to help identify the number of topics in text corpora.
Series: Statistics