Side effects recognition as implicit opinion words in drug reviews

Bibliographic Details
Main Author: Ebrahimi, Monireh
Format: Thesis
Language: English
Published: 2013
Subjects:
Online Access: http://eprints.utm.my/id/eprint/37035/5/MonirehEbrahimiMFSKSM2013.pdf
http://eprints.utm.my/id/eprint/37035/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:70018?site_name=Restricted Repository
Description
Summary: Many opinion mining systems and tools have been developed to report people's attitudes toward entities and their attributes, or the overall polarity of a document. Unlike explicit opinion mining, implicit opinion mining has received limited attention. Similarly, little work has been done on opinion mining in the medical domain, even though it is a domain-dependent task, especially for implicit opinions. Moreover, side effects are one of the critical measures for evaluating a patient's opinion about a drug. However, side effect recognition is a challenging task, since side effects coincide with disease symptoms lexically and syntactically. This study therefore extracts drug side effects from drug reviews, as an implicit opinion word detection algorithm that can be integrated into a medical opinion mining system, using a rule-based algorithm and an SVM algorithm. Developing each of these techniques requires different preprocessing steps, including corpus text segmentation, mapping medical terms to concepts, trigger term list construction, and SVM feature extraction. Because the problem is novel, a corpus was also constructed. The corpus used in this study contains 225 drug reviews manually annotated by a medication expert as a reference standard. After corpus preprocessing, the two proposed techniques were run. In the rule-based algorithm, regular expressions and a trigger term list were used to detect adverse drug side effects and discriminate them from disease symptoms. On the other hand, a combination of lexical, syntactic, contextual, and semantic features led to the best results for the SVM technique. The results show that SVM performs significantly better than the rule-based algorithm. However, the results of both algorithms are encouraging and provide a good foundation for future research. Addressing the limitations and exploiting combined approaches would improve the results.
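
The rule-based idea in the summary (regular expressions plus a trigger term list that separate side effects from disease symptoms) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the trigger phrases, the small term lexicon, the "nearest preceding trigger" rule, and the function name `classify_terms` are all assumptions made for illustration, since the record does not list the actual rules or terms.

```python
import re

# Hypothetical trigger phrases; the thesis's real trigger list is not given here.
SIDE_EFFECT_TRIGGERS = ["caused", "gave me", "made me", "side effects include"]
SYMPTOM_TRIGGERS = ["for my", "to treat", "diagnosed with", "suffering from"]

# Hypothetical stand-in for the output of mapping medical terms to concepts.
MEDICAL_TERMS = ["nausea", "headache", "insomnia", "dizziness", "depression"]


def _compile(phrases):
    """Build a case-insensitive whole-word regex matching any of the phrases."""
    ordered = sorted(phrases, key=len, reverse=True)  # prefer longer phrases
    body = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"\b(" + body + r")\b", re.IGNORECASE)


TERM_RE = _compile(MEDICAL_TERMS)
SIDE_EFFECT_RE = _compile(SIDE_EFFECT_TRIGGERS)
SYMPTOM_RE = _compile(SYMPTOM_TRIGGERS)


def classify_terms(sentence):
    """Label each medical term by the nearest trigger that precedes it.

    Returns a list of (term, label) pairs, where label is "side_effect",
    "symptom", or "unknown" when no trigger precedes the term.
    """
    results = []
    for match in TERM_RE.finditer(sentence):
        left_context = sentence[:match.start()]
        last_side_effect = max(
            (t.end() for t in SIDE_EFFECT_RE.finditer(left_context)), default=-1
        )
        last_symptom = max(
            (t.end() for t in SYMPTOM_RE.finditer(left_context)), default=-1
        )
        if last_side_effect == -1 and last_symptom == -1:
            label = "unknown"
        elif last_side_effect > last_symptom:
            label = "side_effect"
        else:
            label = "symptom"
        results.append((match.group(1).lower(), label))
    return results


if __name__ == "__main__":
    review = "I took it for my depression but it gave me nausea and insomnia."
    for term, label in classify_terms(review):
        print(term, label)
```

A rule of this kind is easy to inspect but brittle, which is in line with the summary's finding that the SVM, using richer lexical, syntactic, contextual, and semantic features, performed better.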