Malay documents clustering algorithm based on singular value decomposition.

Bibliographic Details
Main Authors: Ab Samat, Nordianah; Azmi Murad, Masrah Azrifah; Abdullah, Muhamad Taufik; Atan, Rodziah
Format: Article
Language: English
Published: Asian Research Publishing Network (ARPN) 2009
Online Access: http://psasir.upm.edu.my/id/eprint/15515/1/Malay%20documents%20clustering%20algorithm%20based%20on%20singular%20value%20decomposition.pdf
http://psasir.upm.edu.my/id/eprint/15515/
Description
Summary: Document categorization is a widely researched area of information retrieval. Research on Malay natural language processing has reached the level of document retrieval but not that of automatic semantic categorization. This paper therefore proposes an approach for clustering Malay documents based on semantic relations between words. The method uses Singular Value Decomposition (SVD) to build a vector representation of each document, so that familiar clustering techniques can be applied in the resulting space. The experimental results indicate that taking the semantics of the documents into account produces good clusterings, with relevant subjects appearing together in a cluster.
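
The general idea in the summary (represent each document as a vector via SVD, then cluster in that space) can be illustrated with a minimal sketch. This is not the authors' implementation: the four-snippet toy corpus, the raw term-count weighting, the choice of k = 2 dimensions, and the use of a small k-means routine as the "familiar clustering technique" are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical toy corpus (not from the paper): two finance snippets
# and two football snippets, whitespace-tokenized, no stemming.
docs = [
    "harga saham naik",
    "saham bank jatuh",
    "pasukan bola menang",
    "pemain bola cedera",
]

def term_document_matrix(texts):
    """Build a terms x documents matrix of raw counts (assumed weighting)."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(texts)))
    for j, t in enumerate(texts):
        for w in t.split():
            A[index[w], j] += 1.0
    return A, vocab

def svd_document_vectors(A, k):
    """Return one k-dimensional vector per document from a rank-k SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Row i holds document i's coordinates, scaled by the singular values.
    return (Vt[:k, :] * s[:k, None]).T

def kmeans(X, k, iters=20):
    """Small k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

A, vocab = term_document_matrix(docs)
X = svd_document_vectors(A, k=2)
# Normalize to unit length so Euclidean distance reflects cosine similarity.
X = X / np.linalg.norm(X, axis=1, keepdims=True)
labels = kmeans(X, k=2)
print(labels)
```

In this toy case the two snippets sharing "saham" land in one cluster and the two sharing "bola" in the other, because documents with overlapping vocabulary point in the same direction in the reduced space. A real corpus would likely need Malay-specific preprocessing (stop-word removal, stemming) and a term weighting scheme, none of which is shown here.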