Staff View: A web-based implementation of k-means algorithms

A web-based implementation of k-means algorithms

The K-means algorithm has been around for over half a century. Although simple and dated, it remains widely used and taught to this day. Applying K-means to a data set requires two inputs: the value K and a proximity measure. Picking the right inputs is...

Bibliographic Details
Main Author:	Lee, Quan
Format:	Final Year Project / Dissertation / Thesis
Published:	2022
Subjects:	QA76 Computer software
Online Access:	http://eprints.utar.edu.my/5010/1/1801846_LEE_QUAN.pdf http://eprints.utar.edu.my/5010/

id	my-utar-eprints.5010
record_format	eprints
spelling	my-utar-eprints.5010 2022-12-26T14:19:36Z A web-based implementation of k-means algorithms Lee, Quan QA76 Computer software The K-means algorithm has been around for over half a century. Although simple and dated, it remains widely used and taught to this day. Applying K-means to a data set requires two inputs: the value K and a proximity measure. Choosing these inputs well, the proximity measure in particular, is essential to achieving good results. Many proximity measures exist, each best suited to different applications and data sets. Despite this, most modern data mining tools offer users only a handful of proximity measures, the most common being Euclidean distance and Manhattan distance. This narrow selection limits how well the algorithm can perform. This is where k-luster comes in. k-luster, the web application developed in this project, implements the K-means and K-means++ algorithms along with ten proximity measures: seven distance measures and three similarity measures. The project was planned using the Kanban development methodology and built using HTML, CSS, JavaScript, Django, NumPy and pandas. The completed web application is hosted on Heroku. k-luster allows users to upload their own data set or choose one of three samples if they just want to try out the application. Experimenting with different settings and comparing the results shows how large an impact the choice of proximity measure can have. In conclusion, this project has accomplished what it set out to achieve. However, there is still much room for improvement. Firstly, k-luster could incorporate additional clustering algorithms, or even classification algorithms, in the future. Furthermore, the web application could save users’ past work so that they can resume it later without skipping a beat. 2022 Final Year Project / Dissertation / Thesis NonPeerReviewed application/pdf http://eprints.utar.edu.my/5010/1/1801846_LEE_QUAN.pdf Lee, Quan (2022) A web-based implementation of k-means algorithms. Final Year Project, UTAR. http://eprints.utar.edu.my/5010/
institution	Universiti Tunku Abdul Rahman
building	UTAR Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Tunku Abdul Rahman
content_source	UTAR Institutional Repository
url_provider	http://eprints.utar.edu.my
topic	QA76 Computer software
spellingShingle	QA76 Computer software Lee, Quan A web-based implementation of k-means algorithms
description	The K-means algorithm has been around for over half a century. Although simple and dated, it remains widely used and taught to this day. Applying K-means to a data set requires two inputs: the value K and a proximity measure. Choosing these inputs well, the proximity measure in particular, is essential to achieving good results. Many proximity measures exist, each best suited to different applications and data sets. Despite this, most modern data mining tools offer users only a handful of proximity measures, the most common being Euclidean distance and Manhattan distance. This narrow selection limits how well the algorithm can perform. This is where k-luster comes in. k-luster, the web application developed in this project, implements the K-means and K-means++ algorithms along with ten proximity measures: seven distance measures and three similarity measures. The project was planned using the Kanban development methodology and built using HTML, CSS, JavaScript, Django, NumPy and pandas. The completed web application is hosted on Heroku. k-luster allows users to upload their own data set or choose one of three samples if they just want to try out the application. Experimenting with different settings and comparing the results shows how large an impact the choice of proximity measure can have. In conclusion, this project has accomplished what it set out to achieve. However, there is still much room for improvement. Firstly, k-luster could incorporate additional clustering algorithms, or even classification algorithms, in the future. Furthermore, the web application could save users’ past work so that they can resume it later without skipping a beat.
format	Final Year Project / Dissertation / Thesis
author	Lee, Quan
author_facet	Lee, Quan
author_sort	Lee, Quan
title	A web-based implementation of k-means algorithms
title_short	A web-based implementation of k-means algorithms
title_full	A web-based implementation of k-means algorithms
title_fullStr	A web-based implementation of k-means algorithms
title_full_unstemmed	A web-based implementation of k-means algorithms
title_sort	web-based implementation of k-means algorithms
publishDate	2022
url	http://eprints.utar.edu.my/5010/1/1801846_LEE_QUAN.pdf http://eprints.utar.edu.my/5010/
_version_	1753793017444040704
score	13.211869
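
The abstract describes K-means as taking two inputs, the value K and a proximity measure, with K-means++ as an alternative way to seed the initial centroids. The following is a minimal sketch of that idea using NumPy. It is not the k-luster source code: the function names (`euclidean`, `manhattan`, `kmeans_pp_init`, `kmeans`) and their parameters are hypothetical, chosen only for illustration, and only two of the ten proximity measures mentioned in the record are shown.

```python
import numpy as np


def euclidean(a, b):
    """Pairwise Euclidean distances between rows of a (n, d) and b (k, d)."""
    return np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2))


def manhattan(a, b):
    """Pairwise Manhattan distances between rows of a (n, d) and b (k, d)."""
    return np.abs(a[:, None, :] - b[None, :, :]).sum(axis=2)


def kmeans_pp_init(points, k, distance, rng):
    """K-means++ style seeding: later centroids are sampled with
    probability proportional to the squared distance to the nearest
    centroid chosen so far."""
    centroids = [points[rng.integers(len(points))]]
    for _ in range(1, k):
        nearest = distance(points, np.array(centroids)).min(axis=1)
        weights = nearest ** 2
        total = weights.sum()
        if total == 0:
            # All points coincide with a chosen centroid: sample uniformly.
            probs = np.full(len(points), 1.0 / len(points))
        else:
            probs = weights / total
        centroids.append(points[rng.choice(len(points), p=probs)])
    return np.array(centroids)


def kmeans(points, k, distance=euclidean, max_iter=100, seed=0):
    """Lloyd-style iterations with a pluggable distance function
    (hypothetical helper, not taken from k-luster)."""
    rng = np.random.default_rng(seed)
    centroids = kmeans_pp_init(points, k, distance, rng)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = distance(points, centroids).argmin(axis=1)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j)
            else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Recompute labels so they match the returned centroids.
    labels = distance(points, centroids).argmin(axis=1)
    return labels, centroids


if __name__ == "__main__":
    data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                     [100.0, 100.0], [100.0, 101.0], [101.0, 100.0]])
    for measure in (euclidean, manhattan):
        labels, centroids = kmeans(data, k=2, distance=measure)
        print(measure.__name__, labels, centroids)
```

Swapping the `distance` argument is the only change needed to try a different measure. One simplification to keep in mind: the update step always uses the arithmetic mean, which is the natural centroid for Euclidean distance but is not guaranteed to minimise the objective under other measures.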


Similar Items