Pornography web pages classification with textual content analysis using entropy term weighting scheme for small class dataset

Bibliographic Details
Main Authors: Sam, Lee Zhi; Maarof, Mohd. Aizaini; Selamat, Ali; Shamsuddin, Siti Mariyam
Format: Conference or Workshop Item
Published: 2007
Online Access: http://eprints.utm.my/id/eprint/14359/
Description
Summary: The rapid growth of the internet has made objectionable web content such as pornography and violence easily accessible to web users, especially children and teenagers. Because popular web filtering techniques such as Uniform Resource Locator (URL) blocking and Platform for Internet Content Selection (PICS) checking are of limited use against today's dynamic web content, content-based analysis techniques with an effective model are highly desirable. In this paper we propose a textual content analysis model that uses an entropy term weighting scheme to classify pornography and sex education web pages. We compare the entropy scheme with two other common term weighting schemes, TFIDF and Glasgow. All three schemes are evaluated extensively with an artificial neural network on a small class dataset. We found that our proposed model achieves better performance in terms of accuracy, convergence speed and stability.
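
As a rough illustration of what an entropy term weighting scheme computes, the sketch below implements the widely used log-entropy form: a local weight log(1 + tf) multiplied by a global weight 1 + (sum over documents of p log p) / log N, where p is the share of a term's total occurrences that falls in a given document. This record does not state the exact formula used in the paper, so the formula choice, the function name log_entropy_weights and the toy counts are all assumptions made for illustration.

```python
import math

def log_entropy_weights(tf):
    """Log-entropy term weights (illustrative form, assumed; not taken from the paper).

    tf: list of documents, each a list of raw term counts in the same
        vocabulary order. Returns a matrix of weights with the same shape.
    """
    n_docs = len(tf)
    if n_docs < 2:
        raise ValueError("need at least two documents (log N must be non-zero)")
    n_terms = len(tf[0])
    log_n = math.log(n_docs)

    # Global weight per term: 1 + sum_j(p_ij * log p_ij) / log N.
    # It is about 0 for a term spread evenly over all documents
    # and 1 for a term concentrated in a single document.
    global_w = []
    for i in range(n_terms):
        total = sum(doc[i] for doc in tf)
        entropy_sum = 0.0
        for doc in tf:
            if doc[i] > 0:
                p = doc[i] / total
                entropy_sum += p * math.log(p)
        global_w.append(1.0 + entropy_sum / log_n)

    # Local weight log(1 + tf_ij), scaled by the term's global weight.
    return [[math.log(1 + doc[i]) * global_w[i] for i in range(n_terms)]
            for doc in tf]

# Hypothetical toy counts: 3 documents, 3 vocabulary terms.
counts = [
    [2, 3, 0],
    [2, 0, 0],
    [2, 0, 1],
]
weights = log_entropy_weights(counts)
```

In this toy input the first term occurs equally often in every document, so its weight is close to zero everywhere, while the other two terms each occur in a single document and keep their full local weight. The resulting vectors could then be fed to a classifier such as the artificial neural network mentioned in the summary.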