Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification

Subject area classification allows researchers to identify publications by discipline or research domain. When the number of documents is large, classifying publication documents becomes increasingly difficult. Besides, manually covering the granularity of a broad range of subject areas is a cr...

Bibliographic Details
Main Authors: Hossain, Rajan, Ibrahim, Roliana, Dollah @ Md. Zain, Rozilawati, Mohamed Khaidzir, Khairul Anwar
Format: Article
Published: Journal of Information Systems Research and Innovation (JISRI) 2017
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/80618/
https://seminar.utmspace.edu.my/jisri/download/Volume%2011-3/Paper%202%20Rajan%20CR%207-13.pdf
id my.utm.80618
record_format eprints
spelling my.utm.80618 2019-06-27T06:10:16Z http://eprints.utm.my/id/eprint/80618/ QA75 Electronic computers. Computer science Article PeerReviewed Hossain, Rajan and Ibrahim, Roliana and Dollah @ Md. Zain, Rozilawati and Mohamed Khaidzir, Khairul Anwar (2017) Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification. Journal of Information Systems Research and Innovation, 11 (3). pp. 7-13. ISSN 2289-1358 https://seminar.utmspace.edu.my/jisri/download/Volume%2011-3/Paper%202%20Rajan%20CR%207-13.pdf
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA75 Electronic computers. Computer science
description Subject area classification allows researchers to identify publications by discipline or research domain. When the number of documents is large, classifying publication documents becomes increasingly difficult. Besides, manually covering the granularity of a broad range of subject areas is a critical problem. In recent years, machine learning has emerged as an effective approach to automated classification in domains such as text, images and videos. The problem of classifying a large number of publications can be addressed by automating subject area classification with supervised machine learning. This paper presents an experimental study that used support vector machines and Naïve Bayes for automated classification of subject areas. A text classification method is used to find the probability that a document belongs to a given category, based on co-words and their frequency in the document. The experiment consisted of two phases. In the first phase, a list of co-words was generated from a collection of documents in each selected subject area using text pre-processing techniques. In the second phase, both Support Vector Machines (SVM) and Naïve Bayes classifiers were used to run the experiment, and the performance of each method was observed. SVM was found to perform better than the Naïve Bayes classifier in multi-label classification.
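The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "co-words" can be approximated by individual tokens left after simple pre-processing, it covers only the Naïve Bayes side of the comparison (a multinomial model with add-one smoothing), and it assigns a single subject area per document. The stopword list, the toy training documents, and every function name are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list used for pre-processing.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is", "on", "with"}


def tokenize(text):
    """Phase 1 (assumed): lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def train_naive_bayes(labelled_docs):
    """Count documents and term frequencies per subject area.

    labelled_docs is a list of (text, subject_area) pairs.
    """
    doc_counts = Counter()              # documents seen per subject area
    term_counts = defaultdict(Counter)  # term frequencies per subject area
    vocabulary = set()
    for text, area in labelled_docs:
        tokens = tokenize(text)
        doc_counts[area] += 1
        term_counts[area].update(tokens)
        vocabulary.update(tokens)
    return doc_counts, term_counts, vocabulary


def classify(text, model):
    """Phase 2 (assumed): return the subject area with the highest log-posterior."""
    doc_counts, term_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    # Terms never seen in training carry no evidence, so they are ignored.
    tokens = [t for t in tokenize(text) if t in vocabulary]
    scores = {}
    for area in doc_counts:
        total_terms = sum(term_counts[area].values())
        score = math.log(doc_counts[area] / total_docs)  # log prior
        for t in tokens:
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (term_counts[area][t] + 1) / (total_terms + len(vocabulary))
            )
        scores[area] = score
    return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
TRAINING_DOCS = [
    ("neural network training for image recognition", "computer science"),
    ("sorting algorithm complexity and data structures", "computer science"),
    ("protein folding and gene expression in cells", "biology"),
    ("cell membrane transport and enzyme activity", "biology"),
]

model = train_naive_bayes(TRAINING_DOCS)
print(classify("gene expression in yeast cells", model))  # biology
print(classify("algorithm for sorting data", model))      # computer science
```

An SVM could be trained on the same term-frequency features for a side-by-side comparison; the sketch leaves that out to stay dependency-free.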
format Article
author Hossain, Rajan
Ibrahim, Roliana
Dollah @ Md. Zain, Rozilawati
Mohamed Khaidzir, Khairul Anwar
title Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification
publisher Journal of Information Systems Research and Innovation (JISRI)
publishDate 2017