Staff View: A deep autoencoder-based representation for Arabic text categorization

A deep autoencoder-based representation for Arabic text categorization

Arabic text representation is a challenging assignment for several applications such as text categorization and clustering since the Arabic language is known for its variety, richness and complex morphology. Until recently, the Bag-of-Words remains the most common method for Arabic text representati...

Bibliographic Details
Main Authors:	El-Alami, Fatima-Zahra, El Mahdaouy, Abdelkader, El Alaoui, Said Ouatik, En-Nahnahi, Noureddine
Format:	Article
Language:	English
Published:	Universiti Utara Malaysia Press 2020
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://repo.uum.edu.my/28135/1/JICT%2019%203%202020%20381-398.pdf http://repo.uum.edu.my/28135/ http://jict.uum.edu.my/index.php/previous-issues/172-journal-of-information-and-communication-technology-jict-vol-19-no-3-july-2020#a4

id	my.uum.repo.28135
record_format	eprints
spelling	my.uum.repo.28135 2021-02-02T02:52:19Z http://repo.uum.edu.my/28135/ A deep autoencoder-based representation for Arabic text categorization El-Alami, Fatima-Zahra El Mahdaouy, Abdelkader El Alaoui, Said Ouatik En-Nahnahi, Noureddine QA75 Electronic computers. Computer science Arabic text representation is a challenging assignment for several applications such as text categorization and clustering since the Arabic language is known for its variety, richness and complex morphology. Until recently, the Bag-of-Words remains the most common method for Arabic text representation. However, it suffers from several shortcomings such as semantics deficiency and high dimensionality of feature space. Moreover, most existing methods ignore the explicit knowledge contained in semantic vocabularies such as Arabic WordNet. To overcome these shortcomings, we proposed a deep Autoencoder based representation for Arabic text categorization. It consisted of three stages: (1) Extracting from Arabic WordNet the most relevant concepts based on feature selection processes (2) Features learning via an unsupervised algorithm for text representation (3) Categorizing text using deep Autoencoder. Our method allowed for the consideration of document semantics by combining both implicit and explicit semantics and reducing feature space dimensionality. To evaluate our method, we conducted several experiments on the standard Arabic dataset, OSAC. The obtained results showed the effectiveness of the proposed method compared to state-of-the-art ones. Universiti Utara Malaysia Press 2020 Article PeerReviewed application/pdf en http://repo.uum.edu.my/28135/1/JICT%2019%203%202020%20381-398.pdf El-Alami, Fatima-Zahra and El Mahdaouy, Abdelkader and El Alaoui, Said Ouatik and En-Nahnahi, Noureddine (2020) A deep autoencoder-based representation for Arabic text categorization. Journal of Information and Communication Technology, 19 (3). pp. 381-398. ISSN 2180-3862 http://jict.uum.edu.my/index.php/previous-issues/172-journal-of-information-and-communication-technology-jict-vol-19-no-3-july-2020#a4
institution	Universiti Utara Malaysia
building	UUM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Utara Malaysia
content_source	UUM Institutional Repository
url_provider	http://repo.uum.edu.my/
language	English
topic	QA75 Electronic computers. Computer science
spellingShingle	QA75 Electronic computers. Computer science El-Alami, Fatima-Zahra El Mahdaouy, Abdelkader El Alaoui, Said Ouatik En-Nahnahi, Noureddine A deep autoencoder-based representation for Arabic text categorization
description	Arabic text representation is a challenging assignment for several applications such as text categorization and clustering since the Arabic language is known for its variety, richness and complex morphology. Until recently, the Bag-of-Words remains the most common method for Arabic text representation. However, it suffers from several shortcomings such as semantics deficiency and high dimensionality of feature space. Moreover, most existing methods ignore the explicit knowledge contained in semantic vocabularies such as Arabic WordNet. To overcome these shortcomings, we proposed a deep Autoencoder based representation for Arabic text categorization. It consisted of three stages: (1) Extracting from Arabic WordNet the most relevant concepts based on feature selection processes (2) Features learning via an unsupervised algorithm for text representation (3) Categorizing text using deep Autoencoder. Our method allowed for the consideration of document semantics by combining both implicit and explicit semantics and reducing feature space dimensionality. To evaluate our method, we conducted several experiments on the standard Arabic dataset, OSAC. The obtained results showed the effectiveness of the proposed method compared to state-of-the-art ones.
format	Article
author	El-Alami, Fatima-Zahra El Mahdaouy, Abdelkader El Alaoui, Said Ouatik En-Nahnahi, Noureddine
author_facet	El-Alami, Fatima-Zahra El Mahdaouy, Abdelkader El Alaoui, Said Ouatik En-Nahnahi, Noureddine
author_sort	El-Alami, Fatima-Zahra
title	A deep autoencoder-based representation for Arabic text categorization
title_short	A deep autoencoder-based representation for Arabic text categorization
title_full	A deep autoencoder-based representation for Arabic text categorization
title_fullStr	A deep autoencoder-based representation for Arabic text categorization
title_full_unstemmed	A deep autoencoder-based representation for Arabic text categorization
title_sort	deep autoencoder-based representation for arabic text categorization
publisher	Universiti Utara Malaysia Press
publishDate	2020
url	http://repo.uum.edu.my/28135/1/JICT%2019%203%202020%20381-398.pdf http://repo.uum.edu.my/28135/ http://jict.uum.edu.my/index.php/previous-issues/172-journal-of-information-and-communication-technology-jict-vol-19-no-3-july-2020#a4
_version_	1691735342086881280
score	13.149126
