Fine-tuning Borneo Corpus Management System

Bibliographic Details
Main Author: Nursakinah, George
Format: Final Year Project Report
Language: English
Published: Universiti Malaysia Sarawak (UNIMAS), 2013
Online Access: http://ir.unimas.my/id/eprint/39278/2/NURSAKINAH%20%28fulltext%29.pdf
http://ir.unimas.my/id/eprint/39278/
Description
Summary: Many natural language processing (NLP) applications and systems have been developed and are available on the market. Some of these applications and technologies are open source and some are not. Linguists use them to process corpora faster than they could by hand. However, most of the applications on the market process only English and other languages. Only a few applications process Malay, and none processes Borneo languages except the Borneo Corpus Management System (BCMS). However, some BCMS functionalities do not work well, and one of its functions, KWIC (keyword in context), was developed by Satoru Tsukamoto and integrated into BCMS. The idea of this project is to remove the KWIC function and replace it with existing functions from Corplus, a natural language processing tool developed by Ranaivo-Malancon Bali, the supervisor of this project. Since both applications are written in Java, the project documents the methodology for extracting the tools from Corplus and integrating them into BCMS. Ranaivo-Malancon Bali also acts as the client for the project and will test the application.