Genetic Algorithm for Web Data Mining

The use of various search engines could influence the number of search results in the World Wide Web. Therefore, this study attempted to discover any association between the word types or the information types used to search through the World Wide Web using the available search engines. By doing...

Bibliographic Details
Main Author: Loo, Kevin Teow Aik
Format: Project Paper Report
Language: English
Published: 2001
Online Access: http://psasir.upm.edu.my/id/eprint/8675/1/FSKTM_2001_19_A.pdf
http://psasir.upm.edu.my/id/eprint/8675/
id my.upm.eprints.8675
record_format eprints
spelling my.upm.eprints.8675 2012-12-12T06:39:44Z http://psasir.upm.edu.my/id/eprint/8675/ Genetic Algorithm for Web Data Mining Loo, Kevin Teow Aik 2001 Project Paper Report NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/8675/1/FSKTM_2001_19_A.pdf Loo, Kevin Teow Aik (2001) Genetic Algorithm for Web Data Mining. [Project Paper Report] English
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description The choice of search engine can influence the number of results returned for a search on the World Wide Web. This study therefore attempted to discover any association between the word types or information types used as search terms and the available search engines. Such an association could assist the process of mining the World Wide Web for information. The study used a prototype program based on a genetic algorithm to manipulate the initial set of data. Three sets of inputs were used to generate new populations based on individual fitness. New strains of individuals from a new population were used to test the results obtained from the World Wide Web. The eight search engines used in this study were tested with two groups of words. All eight words were used as keyword searches in all eight search engines, and the number of web pages returned by each search engine was collected. The total number of web pages for each selected new individual was calculated and tabulated. To find any association between the search words and the search engine combinations, the individuals were ranked from the most web pages to the least for each of the eight words. The populations created by the prototype program showed that the average fitness of each population improved as new populations, and with them new strains of individuals, were created through this evolutionary process. The test on results obtained from the Internet showed that certain classes of words could be associated with certain combinations of search engines.
format Project Paper Report
author Loo, Kevin Teow Aik
title Genetic Algorithm for Web Data Mining
publishDate 2001
url http://psasir.upm.edu.my/id/eprint/8675/1/FSKTM_2001_19_A.pdf
http://psasir.upm.edu.my/id/eprint/8675/
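
The description above outlines a genetic algorithm in which individuals stand for combinations of search engines and are ranked by the number of web pages they return. The record does not give the encoding, fitness function, or operators used in the prototype, so the following is only a minimal sketch under stated assumptions: each individual is an 8-bit tuple (one bit per engine), fitness is the total page count of the engines switched on, selection is roulette-wheel, and the hit counts in HITS are invented for illustration. All function names are hypothetical.

```python
import random

# Hypothetical page counts for ONE search word across eight engines.
# These numbers are illustrative only; they are not data from the study.
HITS = [120, 45, 300, 80, 10, 95, 210, 60]
N_ENGINES = len(HITS)


def fitness(individual):
    """Assumed fitness: total pages returned by the engines whose bit is 1."""
    return sum(h for bit, h in zip(individual, HITS) if bit)


def select(population, rng):
    """Roulette-wheel selection of one parent (an assumed selection scheme)."""
    # +1 keeps every weight positive even if an individual has fitness 0.
    weights = [fitness(ind) + 1 for ind in population]
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover: head of parent a joined to tail of parent b."""
    point = rng.randint(1, N_ENGINES - 1)
    return a[:point] + b[point:]


def mutate(individual, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return tuple(bit ^ 1 if rng.random() < rate else bit for bit in individual)


def evolve(pop_size=20, generations=15, seed=0):
    """Run the sketch; return individuals ranked by fitness and the
    average fitness recorded for each population."""
    rng = random.Random(seed)
    population = [
        tuple(rng.randint(0, 1) for _ in range(N_ENGINES))
        for _ in range(pop_size)
    ]
    history = [sum(map(fitness, population)) / pop_size]
    for _ in range(generations):
        population = [
            mutate(crossover(select(population, rng),
                             select(population, rng), rng), rng)
            for _ in range(pop_size)
        ]
        history.append(sum(map(fitness, population)) / pop_size)
    # Rank distinct individuals from most pages to least.
    ranked = sorted(set(population), key=fitness, reverse=True)
    return ranked, history


if __name__ == "__main__":
    ranked, history = evolve()
    print("average fitness per population:", history)
    print("top engine combination:", ranked[0], "pages:", fitness(ranked[0]))
```

With this assumed fitness, switching on every engine always scores highest, so the sketch only illustrates the mechanics of creating and ranking populations. Repeating the run with a separate HITS list for each search word would give one ranking per word, which could then be compared across words.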