A clustering technique using single pass clustering algorithm for search engine

Bibliographic Details
Main Authors: Indra, Z., Zamin, N., Jaafar, J.
Format: Conference or Workshop Item
Published: Institute of Electrical and Electronics Engineers Inc. 2014
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84946689998&doi=10.1109%2fWICT.2014.7077325&partnerID=40&md5=377706a40ceef04905367b76de1ac08f
http://eprints.utp.edu.my/31285/
Description
Summary: Internet users rely heavily on search engines to explore and find useful information buried in websites. The results returned by search engines are still far from satisfactory, because a long list of results in practice contains a mix of relevant and irrelevant information. Manually filtering out the irrelevant information is daunting and time-consuming. Clustering is one of the popular solutions for this cumbersome task. However, our literature study revealed that research on document clustering for Asian languages is relatively limited compared to English, and applications of document clustering in search engines are less commonly available. In this research, a clustering technique for search engines using the Single Pass Clustering (SPC) algorithm is proposed. The technique is tested on a set of Indonesian news documents to support the limited research on document clustering for the Indonesian language. An experiment on 200 Indonesian news documents produced a number of satisfactory labelled clusters, and the application of the algorithm is demonstrated on a simulated search engine. © 2014 IEEE.
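
For readers unfamiliar with the technique named in the summary, the following is a minimal sketch of a generic single-pass clustering loop, not the authors' implementation. The record does not describe the paper's preprocessing, similarity measure, threshold, or cluster labelling, so everything concrete here is an assumption for illustration: whitespace tokenization, raw term-frequency vectors, cosine similarity, a threshold of 0.3, and the function names `vectorize`, `cosine`, and `single_pass_cluster`.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a document into a term-frequency vector (assumed: lowercased whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(docs, threshold=0.3):
    """Visit each document once, in order: join the most similar cluster or open a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [document indices]}
    for i, text in enumerate(docs):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Keep a summed vector as the centroid; cosine similarity ignores its scale.
            best["centroid"].update(vec)
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return clusters
```

Because each document is visited only once, the resulting clusters depend on both the input order and the chosen threshold; a higher threshold tends to produce more, smaller clusters.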