Using KNN algorithm for classification of textual documents

Nowadays the exponential growth of generation of textual documents and the emergent need to structure them increase the attention to the automated classification of documents into predefined categories. There is wide range of supervised learning algorithms that deal with text classification. This pa...

Full description

Saved in:
Bibliographic Details
Main Authors: Moldagulova, A., Sulaiman, R.B.
Format: Conference Paper
Language:English
Published: 2018
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Nowadays the exponential growth of generation of textual documents and the emergent need to structure them increase the attention to the automated classification of documents into predefined categories. There is wide range of supervised learning algorithms that deal with text classification. This paper deals with an approach for building a machine learning system in R that uses K-Nearest Neighbors (KNN) method for the classification of textual documents. The experimental part of the research was done on collected textual documents from two sources: http://egov.kz and http://www.government.kz. The experiment was devoted to challenging thing of the KNN algorithm that to find the proper value of k which represents the number of neighbors. © 2017 IEEE.