Semi-automated construction of English-Malay machine readable dictionary for technical terms

Access is limited to UniMAP community.

Bibliographic Details
Main Author: Nurfathiah, Abd Ghani
Other Authors: Dr. Nik Adilah Hanin Zahri
Published: Universiti Malaysia Perlis (UniMAP) 2016
Online Access: http://dspace.unimap.edu.my:80/xmlui/handle/123456789/42067
id my.unimap-42067
record_format dspace
spelling my.unimap-42067 2016-06-16T03:48:28Z Semi-automated construction of English-Malay machine readable dictionary for technical terms Nurfathiah, Abd Ghani Dr. Nik Adilah Hanin Zahri Machine readable dictionary English-Malay dictionary Technical term Machine readable dictionary -- Design and construction Keyword density Access is limited to UniMAP community. This project presents a method for the semi-automated construction of an English-Malay machine-readable dictionary for technical terms. We propose using keyword density to classify each term into a category by measuring the weight of the term; this step is implemented in Visual Basic using Visual Studio. Cosine similarity, implemented in C, is used to measure the similarity between two sentences: the definition of a term and a sentence from a journal. To calculate the categories, we collected 523 training data items, a set of journal texts for each term, and preprocessed the journals using Brill's tagger with the Penn Treebank tag set. We assigned 50 terms to test the algorithm. Using word extraction, we counted the occurrences of each term and the total number of words in the journals of each category, then calculated the keyword density to categorize the term. For example-sentence extraction, the system measures the cosine similarity between the definition and each journal sentence and extracts the sentence with the highest value as the example sentence. With this algorithm, example-sentence extraction achieves a precision of 79%, a recall of 90% and an F-measure of 84%. Based on these results, the proposed method shows strong potential, with room for further improvement. 2016-06-16T03:48:28Z 2016-06-16T03:48:28Z 2015-06 http://dspace.unimap.edu.my:80/xmlui/handle/123456789/42067 Universiti Malaysia Perlis (UniMAP) School of Computer and Communication Engineering
institution Universiti Malaysia Perlis
building UniMAP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Perlis
content_source UniMAP Library Digital Repository
url_provider http://dspace.unimap.edu.my/
topic Machine readable dictionary
English-Malay dictionary
Technical term
Machine readable dictionary -- Design and construction
Keyword density
spellingShingle Machine readable dictionary
English-Malay dictionary
Technical term
Machine readable dictionary -- Design and construction
Keyword density
Nurfathiah, Abd Ghani
Semi-automated construction of English-Malay machine readable dictionary for technical terms
description Access is limited to UniMAP community.
author2 Dr. Nik Adilah Hanin Zahri
author_facet Dr. Nik Adilah Hanin Zahri
Nurfathiah, Abd Ghani
author Nurfathiah, Abd Ghani
author_sort Nurfathiah, Abd Ghani
title Semi-automated construction of English-Malay machine readable dictionary for technical terms
title_short Semi-automated construction of English-Malay machine readable dictionary for technical terms
title_full Semi-automated construction of English-Malay machine readable dictionary for technical terms
title_fullStr Semi-automated construction of English-Malay machine readable dictionary for technical terms
title_full_unstemmed Semi-automated construction of English-Malay machine readable dictionary for technical terms
title_sort semi-automated construction of english-malay machine readable dictionary for technical terms
publisher Universiti Malaysia Perlis (UniMAP)
publishDate 2016
url http://dspace.unimap.edu.my:80/xmlui/handle/123456789/42067
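The two measures named in the abstract, keyword density for categorizing a term and cosine similarity for picking an example sentence, can be illustrated with a minimal sketch. This is not the authors' implementation (the record states that theirs was written in Visual Basic and C); it is a Python illustration under stated assumptions: keyword density is taken as term occurrences divided by the total word count of a category's text, sentences are compared as simple word-count vectors, and all function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_density(term, tokens):
    """Assumed definition: occurrences of a single-word term / total words."""
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens)


def classify(term, category_texts):
    """Return the category whose text has the highest density for the term."""
    densities = {
        category: keyword_density(term, tokenize(text))
        for category, text in category_texts.items()
    }
    return max(densities, key=densities.get)


def cosine_similarity(a, b):
    """Cosine of the angle between the word-count vectors of two sentences."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_example(definition, sentences):
    """Pick the sentence most similar to the definition."""
    return max(sentences, key=lambda s: cosine_similarity(definition, s))


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the project's corpus.
    texts = {
        "networking": "the router forwards packets to another router",
        "electronics": "the resistor limits current",
    }
    print(classify("router", texts))
    print(best_example(
        "a router forwards packets between networks",
        ["the resistor limits current",
         "each router forwards packets between two networks"],
    ))
    print(round(f_measure(0.79, 0.90), 2))
```

As a consistency check on the reported figures, the balanced F-measure of precision 0.79 and recall 0.90 is 2 × 0.79 × 0.90 / (0.79 + 0.90) ≈ 0.84, which agrees with the 84% stated in the abstract.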