California State University San Bernardino
Department of Computer Science and Engineering
Master's Thesis Defense

Date: Wednesday, May 20, 2009
Time: 3:30pm – 4:30pm
Location: JB-360

Title: An Extendable General Platform for Cluster Analysis and Validation
Candidate: Brandon Edwards
Advisor: Dr. Haiyan Qiao
Committee Members: Dr. Keith Schubert, Dr. Ernesto Gomez

Abstract

Cluster analysis plays an important role in data analysis and knowledge discovery. It is used in a wide range of fields, including market research, city planning, earthquake studies, and scientific research areas such as bioinformatics. As the complexity and amount of data increase, improvements to existing techniques and the development of novel algorithms are needed. While cluster analysis is a useful unsupervised technique for analyzing data, no clustering algorithm can be applied uniformly to all data sets. Many algorithms need input parameters that require the user to have some pre-existing knowledge about the data, such as the number of clusters it contains. This thesis addresses current problems in the area of cluster analysis, such as estimating the number of clusters, detecting outliers, and offering useful visualization of multi-dimensional data. It presents solutions to these problems, along with a user-friendly clustering platform that helps the user obtain useful clustering results.