Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification
Subject area classification allows researchers to identify publications by discipline or research domain. When the number of documents is large, classifying publications becomes increasingly difficult. Besides, manually covering the granularity of a broad range of subject areas is a cr...
Main Authors: | Hossain, Rajan; Ibrahim, Roliana; Dollah @ Md. Zain, Rozilawati; Mohamed Khaidzir, Khairul Anwar |
---|---|
Format: | Article |
Published: | Journal of Information Systems Research and Innovation (JISRI), 2017 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/id/eprint/80618/ https://seminar.utmspace.edu.my/jisri/download/Volume%2011-3/Paper%202%20Rajan%20CR%207-13.pdf |
id |
my.utm.80618 |
---|---|
record_format |
eprints |
spelling |
my.utm.80618 2019-06-27T06:10:16Z http://eprints.utm.my/id/eprint/80618/ Article PeerReviewed Hossain, Rajan and Ibrahim, Roliana and Dollah @ Md. Zain, Rozilawati and Mohamed Khaidzir, Khairul Anwar (2017) Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification. Journal of Information Systems Research and Innovation, 11 (3). pp. 7-13. ISSN 2289-1358 https://seminar.utmspace.edu.my/jisri/download/Volume%2011-3/Paper%202%20Rajan%20CR%207-13.pdf |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
topic |
QA75 Electronic computers. Computer science |
description |
Subject area classification allows researchers to identify publications by discipline or research domain. When the number of documents is large, classifying publications becomes increasingly difficult. Besides, manually covering the granularity of a broad range of subject areas is a critical problem. In recent years, machine learning has emerged as an effective approach to automated classification in domains such as text, images and videos. The problem of classifying a large number of publications can be addressed by automating subject area classification with supervised machine learning. This paper presents an experimental study that used support vector machines and Naïve Bayes for automated classification of subject areas. A text classification method is used to estimate the probability that a document belongs to a given category, based on co-words and their frequency in the document. The experiment consisted of two phases. In the first phase, a list of co-words was generated from a collection of documents in each of the selected subject areas using text pre-processing techniques. In the second phase, both Support Vector Machines (SVM) and Naïve Bayes classifiers were used to conduct the experiment, and the performance of each method was observed. It was found that SVM performs better than the Naïve Bayes classifier in multi-label classification. |
format |
Article |
author |
Hossain, Rajan; Ibrahim, Roliana; Dollah @ Md. Zain, Rozilawati; Mohamed Khaidzir, Khairul Anwar |
author_sort |
Hossain, Rajan |
title |
Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification |
publisher |
Journal of Information Systems Research and Innovation (JISRI) |
publishDate |
2017 |
url |
http://eprints.utm.my/id/eprint/80618/ https://seminar.utmspace.edu.my/jisri/download/Volume%2011-3/Paper%202%20Rajan%20CR%207-13.pdf |
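The two-phase procedure summarised in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: co-words are taken to be adjacent word pairs left after stop-word removal, the stop-word list, the toy training texts and the labels `CS` and `BIO` are invented, and both classifiers are small from-scratch versions with hypothetical hyperparameters. The sketch also assigns a single label per document, whereas the paper reports multi-label results.

```python
import math
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def co_words(text):
    """Phase 1 (assumed form): count adjacent word pairs after stop-word removal."""
    tokens = [t for t in text.lower().split() if t.isalpha() and t not in STOP_WORDS]
    return Counter(zip(tokens, tokens[1:]))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.pair_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.pair_counts[label].update(doc)
        self.vocab = {pair for counts in self.pair_counts.values() for pair in counts}
        return self

    def log_posterior(self, doc, label):
        counts = self.pair_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        score = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        for pair, freq in doc.items():
            if pair in self.vocab:  # pairs never seen in training are ignored
                score += freq * math.log((counts[pair] + 1) / denom)
        return score

    def predict(self, doc):
        return max(sorted(self.class_counts), key=lambda c: self.log_posterior(doc, c))


class LinearSVM:
    """One-vs-rest linear SVM (no bias term), trained by hinge-loss subgradient steps."""

    def fit(self, docs, labels, epochs=50, lr=0.1, reg=0.01):
        self.weights = {}
        for label in sorted(set(labels)):
            w = defaultdict(float)
            for _ in range(epochs):
                for doc, y in zip(docs, labels):
                    target = 1.0 if y == label else -1.0
                    margin = target * sum(w[p] * v for p, v in doc.items())
                    for p in w:  # L2 regularisation shrinks every weight
                        w[p] *= 1.0 - lr * reg
                    if margin < 1.0:  # hinge loss is active: step towards the target
                        for p, v in doc.items():
                            w[p] += lr * target * v
            self.weights[label] = dict(w)
        return self

    def predict(self, doc):
        def score(label):
            w = self.weights[label]
            return sum(w.get(p, 0.0) * v for p, v in doc.items())
        return max(sorted(self.weights), key=score)


# Phase 2 on invented toy data: train both classifiers on co-word frequencies.
train = [
    ("support vector machines for text classification", "CS"),
    ("naive bayes text classification of documents", "CS"),
    ("protein folding in cell biology", "BIO"),
    ("gene expression in cell biology", "BIO"),
]
docs = [co_words(text) for text, _ in train]
labels = [label for _, label in train]
nb = NaiveBayes().fit(docs, labels)
svm = LinearSVM().fit(docs, labels)

query = co_words("text classification with machines")
print(nb.predict(query), svm.predict(query))
```

On a corpus this small both classifiers agree, so the sketch says nothing about their relative performance; comparing them as the study does would need a labelled collection of publications and a held-out test set.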