A new ant based rule extraction algorithm for web classification
Attribute reduction and discretization are two important data pre-processing steps before data can be used for classification. Web documents contain an enormous number of attributes compared to other types of data. The Ant-Miner algorithm is also still lacking in effic...
Main Authors: | Ku-Mahamud, Ku Ruhana; Saian, Rizauddin |
---|---|
Format: | Monograph |
Language: | English |
Published: | Universiti Utara Malaysia 2011 |
Subjects: | QA76 Computer software |
Online Access: | http://repo.uum.edu.my/8136/1/Ku.pdf http://repo.uum.edu.my/8136/3/1.KU%20RUHANA%20KU%20MAHAMUD.pdf http://repo.uum.edu.my/8136/ http://lintas.uum.edu.my:8080/elmu/index.jsp?module=webopac-l&action=fullDisplayRetriever.jsp&szMaterialNo=0000780133 |
id |
my.uum.repo.8136 |
---|---|
record_format |
eprints |
spelling |
my.uum.repo.8136 2014-07-06T04:34:23Z http://repo.uum.edu.my/8136/ A new ant based rule extraction algorithm for web classification Ku-Mahamud, Ku Ruhana Saian, Rizauddin QA76 Computer software Attribute reduction and discretization are two important data pre-processing steps before data can be used for classification. Web documents contain an enormous number of attributes compared to other types of data. The Ant-Miner algorithm is also still lacking in efficiency, accuracy and rule simplicity because of the local minima problem. Therefore, the Ant-Miner algorithm needs to be improved by taking the accuracy and rule simplicity criteria into consideration so that it can be used to classify Web document data sets or any large data sets. The best attribute selection method for Web text categorization is the combination of correlation-based evaluation with random search as the search method. However, this attribute selection method will not give the best performance in attribute reduction. Using classifier-based attribute subset selection will reduce more attributes, but sacrifices the performance of the classifier. A hybrid ant colony optimization with simulated annealing algorithm to discover rules from data is proposed. The simulated annealing technique will minimize the problem of low-quality rules discovered by an ant in a colony. The best rule for a colony will then be chosen, and later the best rule among the colonies will be included in the rule set. The rule set is arranged in decreasing order of generation. Thirteen data sets consisting of discrete and continuous data were used to evaluate the performance of the proposed algorithm in terms of accuracy, number of rules and number of terms in the rules. Experimental results obtained from the proposed algorithm are comparable to the results of the Ant-Miner algorithm in terms of rule accuracy but are better in terms of rule simplicity. Universiti Utara Malaysia 2011 Monograph NonPeerReviewed application/pdf en http://repo.uum.edu.my/8136/1/Ku.pdf application/pdf en http://repo.uum.edu.my/8136/3/1.KU%20RUHANA%20KU%20MAHAMUD.pdf Ku-Mahamud, Ku Ruhana and Saian, Rizauddin (2011) A new ant based rule extraction algorithm for web classification. Project Report. Universiti Utara Malaysia. (Unpublished) http://lintas.uum.edu.my:8080/elmu/index.jsp?module=webopac-l&action=fullDisplayRetriever.jsp&szMaterialNo=0000780133 |
institution |
Universiti Utara Malaysia |
building |
UUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Utara Malaysia |
content_source |
UUM Institutional Repository |
url_provider |
http://repo.uum.edu.my/ |
language |
English |
topic |
QA76 Computer software |
spellingShingle |
QA76 Computer software Ku-Mahamud, Ku Ruhana Saian, Rizauddin A new ant based rule extraction algorithm for web classification |
description |
Attribute reduction and discretization are two important data pre-processing steps before data can be used for classification. Web documents contain an enormous number of attributes compared to other types of data. The Ant-Miner algorithm is also still lacking in efficiency, accuracy and rule simplicity because of the local minima problem. Therefore, the Ant-Miner algorithm needs to be improved by taking the accuracy and rule simplicity criteria into consideration so that it can be used to classify Web document data sets or any large data sets. The best attribute selection method for Web text categorization is the combination of correlation-based evaluation with random search as the search method. However, this attribute selection method will not give the best performance in attribute reduction. Using classifier-based attribute subset selection will reduce more attributes, but sacrifices the performance of the classifier. A hybrid ant colony optimization with simulated annealing algorithm to discover rules from data is proposed. The simulated annealing technique will minimize the problem of low-quality rules discovered by an ant in a colony. The best rule for a colony will then be chosen, and later the best rule among the colonies will be included in the rule set. The rule set is arranged in decreasing order of generation. Thirteen data sets consisting of discrete and continuous data were used to evaluate the performance of the proposed algorithm in terms of accuracy, number of rules and number of terms in the rules. Experimental results obtained from the proposed algorithm are comparable to the results of the Ant-Miner algorithm in terms of rule accuracy but are better in terms of rule simplicity. |
format |
Monograph |
author |
Ku-Mahamud, Ku Ruhana; Saian, Rizauddin |
author_facet |
Ku-Mahamud, Ku Ruhana; Saian, Rizauddin |
author_sort |
Ku-Mahamud, Ku Ruhana |
title |
A new ant based rule extraction algorithm for web classification |
title_short |
A new ant based rule extraction algorithm for web classification |
title_full |
A new ant based rule extraction algorithm for web classification |
title_fullStr |
A new ant based rule extraction algorithm for web classification |
title_full_unstemmed |
A new ant based rule extraction algorithm for web classification |
title_sort |
new ant based rule extraction algorithm for web classification |
publisher |
Universiti Utara Malaysia |
publishDate |
2011 |
url |
http://repo.uum.edu.my/8136/1/Ku.pdf http://repo.uum.edu.my/8136/3/1.KU%20RUHANA%20KU%20MAHAMUD.pdf http://repo.uum.edu.my/8136/ http://lintas.uum.edu.my:8080/elmu/index.jsp?module=webopac-l&action=fullDisplayRetriever.jsp&szMaterialNo=0000780133 |
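The description above says that simulated annealing is used to reduce the chance of an ant settling on a low-quality rule, that the best rule of each colony is chosen, and that the best rule among the colonies is added to the rule set. The record gives no implementation details, so the following Python is only a minimal sketch of those two ideas under stated assumptions. The `Rule` class, the function names, the quality score in [0, 1], and the tie-break in favour of shorter rules are all hypothetical and are not taken from the report. The acceptance test is the standard Metropolis criterion commonly used in simulated annealing, which may differ from what the authors implemented.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a tuple of (attribute, value) terms plus a class label."""
    terms: tuple
    label: str
    quality: float  # assumed to lie in [0, 1]; higher is better


def sa_accept(current_q, candidate_q, temperature, rng):
    """Standard Metropolis test: always accept an improvement, and accept a
    worse candidate with probability exp(delta / temperature)."""
    if candidate_q >= current_q:
        return True
    if temperature <= 0:
        return False
    return rng.random() < math.exp((candidate_q - current_q) / temperature)


def best_rule(rules):
    """Highest quality wins; ties go to the rule with fewer terms
    (an assumption made here to favour rule simplicity)."""
    return max(rules, key=lambda r: (r.quality, -len(r.terms)))


def select_rule(colonies):
    """Pick the best rule in each colony, then the best among the colonies."""
    return best_rule([best_rule(colony) for colony in colonies])


if __name__ == "__main__":
    rng = random.Random(0)
    colonies = [
        [Rule((("a", 1),), "x", 0.6), Rule((("a", 1), ("b", 2)), "x", 0.8)],
        [Rule((("c", 3),), "y", 0.8)],
    ]
    print(select_rule(colonies))
    print(sa_accept(0.5, 0.7, 1.0, rng))
```

In this sketch a high temperature lets an ant keep a temporarily worse rule, which is one common way to escape the local minima mentioned in the description. As the temperature approaches zero, the test accepts only improvements.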