The development of the indexing prototype considering tags into the inverted file: case study on FTMSK’s official letter / Mohd Sharizan Mohd Shariff

Bibliographic Details
Main Author: Mohd Shariff, Mohd Sharizan
Format: Thesis
Language: English
Published: 2005
Online Access: http://ir.uitm.edu.my/id/eprint/18274/2/TD_MOHD%20SHARIZAN%20MOHD%20SHARIFF%20CS%2005_5.pdf
http://ir.uitm.edu.my/id/eprint/18274/
Description
Summary: Combining IR with a structured format (XML) makes the retrieval process more powerful than before. As far as the effectiveness of document retrieval is concerned, each segment (part) of a letter (document) has its own meaning or usage. Thus, term weights must be taken into consideration so that each segment of the document is more meaningful and the retrieval process produces more relevant output for the user. This idea is the basis for the prototype development. The prototype was built on the Visual Basic platform, with MS Access providing the data storage and structure. The inverted file technique was chosen as the basis for the prototype's data structure. Retrieval effectiveness is measured using redefined recall (R) and precision (P) measures intended for structured documents. The evaluation compares CAS retrieval (the prototype) with CO retrieval (the benchmark). The results show that, for structured documents, taking term weights into account helps produce output more relevant to the user's query than ignoring them. With term weights inserted in the tags, each part of a segment in the structured document becomes more identifiable during query processing.
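The idea of an inverted file whose postings carry tag-dependent term weights can be illustrated with a minimal Python sketch. It is not the thesis's Visual Basic and MS Access implementation. The tag names, the weight values in TAG_WEIGHTS, the sample letters, and the function names are all hypothetical. The precision and recall shown are the standard set-based definitions, not the redefined measures the thesis uses for structured documents.

```python
from collections import defaultdict

# Hypothetical per-tag weights (assumption: the record does not state
# which tags or weight values the prototype actually uses).
TAG_WEIGHTS = {"subject": 3.0, "recipient": 2.0, "body": 1.0}


def build_index(letters):
    """Build an inverted file: term -> {doc_id: accumulated tag weight}."""
    index = defaultdict(lambda: defaultdict(float))
    for doc_id, segments in letters.items():
        for tag, text in segments.items():
            weight = TAG_WEIGHTS.get(tag, 1.0)  # unknown tags get weight 1
            for term in text.lower().split():
                index[term][doc_id] += weight
    return index


def search(index, query):
    """Rank documents by the summed weights of the matching query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def precision_recall(retrieved, relevant):
    """Standard set-based precision and recall (not the redefined ones)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sample letters, each split into tagged segments.
letters = {
    "L1": {"subject": "exam schedule", "body": "the meeting is moved"},
    "L2": {"subject": "meeting room", "body": "exam papers are ready"},
}
index = build_index(letters)
ranked = search(index, "exam")
```

In this sketch the query "exam" ranks L1 ahead of L2, because the term occurs in L1's more heavily weighted subject segment and only in L2's body. An index that ignores tags would score the two letters equally.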