NUWT: Jawi-Specific Buckwalter Corpus for Malay Word Tokenization

This paper describes the design and creation of a monolingual parallel corpus for the Malay language written in Jawi. It proposes a new corpus called the National University of Malaysia Word Tokenization (NUWT) corpora. To the best of our knowledge, there is currently no sufficiently compreh...

Bibliographic Details
Main Authors: Abu Bakar, Juhaida; Omar, Khairuddin; Nasrudin, Mohammad Faidzul; Murah, Mohd Zamri
Format: Article
Language: English
Published: Universiti Utara Malaysia Press, 2016
Online Access: https://repo.uum.edu.my/id/eprint/30364/1/JICT%2015%2001%202016%20107-131.pdf
https://repo.uum.edu.my/id/eprint/30364/
https://e-journal.uum.edu.my/index.php/jict/article/view/8172