Clustering chemical data set using particle swarm optimization based algorithm

Bibliographic Details
Main Author: Triyono, Triyono
Format: Thesis
Language: English
Published: 2008
Online Access: http://eprints.utm.my/id/eprint/9867/1/TriyonoMFKM2008.pdf
http://eprints.utm.my/id/eprint/9867/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:1277
Description
Summary: Clustering is the process of organizing similar objects into groups; its main objective is to organize a collection of data items into meaningful groups. Clustering is generally well suited to very large data sets whose items closely resemble one another, such as chemical databases. Chemical data sets contain a huge number of compounds together with knowledge of their physicochemical properties. The biological activities of these compounds are highly significant in the process of designing and discovering new drugs. Many algorithms, such as Ward’s algorithm, have been applied to cluster chemical data sets. In this study, a Particle Swarm Optimization (PSO) based clustering algorithm is used to optimize the results of another clustering algorithm such as K-means. Two chemical data sets downloaded from MDDR (MDL Drug Data Report) were used. The main difference between the two data sets is the degree of similarity in bioactivity among their active compounds. The results are compared with those of Ward’s algorithm in terms of the proportion of actives in active clusters. We found that the PSO algorithm performs better than Ward’s algorithm on the continuous data format; for the binary data format, however, Ward’s algorithm clearly outperforms it.
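
The idea of refining a K-means style result with PSO can be illustrated with a minimal sketch. It is not the thesis implementation. The function names, the quantization-error fitness, the parameter values (w, c1, c2) and the seed_centroids argument are all assumptions made for illustration. Each particle encodes one complete set of k centroids, and the swarm searches for the set that minimizes the mean distance from each data point to its nearest centroid.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantization_error(centroids, data):
    """Mean distance from each point to its nearest centroid (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) ** 0.5 for p in data) / len(data)

def pso_cluster(data, k, seed_centroids=None, n_particles=10, n_iter=50,
                w=0.72, c1=1.49, c2=1.49, rng_seed=0):
    """Hypothetical PSO clustering sketch; all parameter values are assumptions.

    seed_centroids: optional centroids from another algorithm (e.g. K-means),
    used as the starting position of the first particle.
    """
    rng = random.Random(rng_seed)
    dim = len(data[0])
    size = k * dim  # a particle is a flat list holding k centroids

    def unpack(pos):
        return [pos[i * dim:(i + 1) * dim] for i in range(k)]

    def fitness(pos):
        return quantization_error(unpack(pos), data)

    # Initialise every particle from k randomly chosen data points.
    positions = [[float(x) for p in rng.sample(data, k) for x in p]
                 for _ in range(n_particles)]
    if seed_centroids is not None:
        positions[0] = [float(x) for c in seed_centroids for x in c]
    velocities = [[0.0] * size for _ in range(n_particles)]

    pbest = [pos[:] for pos in positions]
    pbest_fit = [fitness(pos) for pos in positions]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                                    + c2 * r2 * (gbest[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            f = fitness(positions[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = positions[i][:], f

    centroids = unpack(gbest)
    labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in data]
    return centroids, labels, gbest_fit

if __name__ == "__main__":
    # Toy continuous descriptors standing in for compound feature vectors.
    toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (10.0, 10.1), (10.2, 10.0), (10.1, 10.2)]
    rough = [(1.0, 1.0), (9.0, 9.0)]  # stand-in for a K-means result
    cents, labels, err = pso_cluster(toy, 2, seed_centroids=rough)
    print(labels, round(err, 3))
```

Because the seeded particle's starting position is recorded as a personal best before any movement, the returned error can never be worse than that of the seed centroids. The sketch uses Euclidean distance, which suits continuous descriptors; binary fingerprints would probably call for a different similarity measure, and the sketch does not attempt that.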