Prime-based method for interactive mining of frequent patterns

Over the past decade, an increasing number of efficient algorithms have been proposed to mine frequent patterns that satisfy a user-specified threshold called minimum support (minsup). However, determining an appropriate value of minsup that yields proper frequent patterns in different appl...

Bibliographic Details
Main Author: Nadimi-Shahraki, Mohammad-Hossein
Format: Thesis
Language: English
Published: 2010
Online Access: http://psasir.upm.edu.my/id/eprint/19628/1/FSKTM_2010_10.pdf
http://psasir.upm.edu.my/id/eprint/19628/
id my.upm.eprints.19628
record_format eprints
spelling 2013-05-27T08:02:41Z 2010-12 Thesis NonPeerReviewed application/pdf en Nadimi-Shahraki, Mohammad-Hossein (2010) Prime-based method for interactive mining of frequent patterns. PhD thesis, Universiti Putra Malaysia.
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description Over the past decade, an increasing number of efficient algorithms have been proposed to mine frequent patterns that satisfy a user-specified threshold called minimum support (minsup). However, determining an appropriate value of minsup that yields proper frequent patterns in different applications is extremely difficult. Because rerunning a mining algorithm from scratch can be very time consuming, researchers have introduced interactive mining, which finds proper patterns by reusing the current mining model with various values of minsup. Thus far, a few efficient interactive mining algorithms have been proposed. However, their runtimes do not meet the need for short runtimes in real-time applications, especially where the data is sparse and proper frequent patterns are mined with very low values of minsup.
In response to these challenges, this study develops an interactive mining method based on prime numbers and their special characteristic, uniqueness, by which the content of the relevant data is transformed into a compact layout. First, a general architecture for interactive mining is proposed, consisting of two isolated components: the mining model and the mining process. The proposed method is then developed on this architecture such that the mining model is constructed once and can be mined repeatedly with various values of minsup. During mining model construction, the content of the relevant data is captured with one database scan by a novel tree structure called the PC-tree, and the mining materials are formed as a result. The PC-tree is a well-organized tree structure that is built systematically using descendant making, a technique introduced in this study.
Moreover, this study introduces a mining algorithm called PC-miner to mine the mining model repeatedly with various values of minsup. It grows an effective candidate head set, also introduced in this study, starting from the longest candidate patterns and using the Apriori principle. In each round of growing the candidate head set, the longest candidate patterns are used to find the maximal frequent patterns, from which the frequent patterns can be derived. PC-miner also reduces the number of candidate patterns and comparisons by using several pruning techniques.
A comprehensive experimental analysis, comprising several experiments and scenarios, evaluates the correctness and effectiveness of the proposed method, especially for interactive mining. The experimental results verify that the proposed method constructs the mining model once, independently of minsup, which enables the model to be mined repeatedly. The results also show that the proposed method mines frequent patterns correctly and efficiently. Finally, the results verify that the proposed method speeds up interactive mining of frequent patterns over both sparse and dense datasets, with a more scalable total runtime for very low values of minsup over sparse datasets compared with previous work.
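The abstract does not describe the PC-tree layout or the PC-miner candidate head set in detail, so the following Python sketch only illustrates the general idea behind the description: a prime-based encoding of transactions and a model that is built once and then mined with several minsup values. It assumes that "uniqueness" refers to unique prime factorization (each item gets a distinct prime, and a transaction becomes the product of its items' primes), that minsup is an absolute count, and that a plain level-wise Apriori search stands in for PC-miner. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import prod  # Python 3.8+


def first_primes(n):
    """Return the first n primes by trial division (enough for small item sets)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def build_model(transactions):
    """Scan the data once; the resulting model does not depend on minsup."""
    items = sorted({item for t in transactions for item in t})
    prime_of = dict(zip(items, first_primes(len(items))))
    # Assumption: a transaction is encoded as the product of its items' primes.
    # Unique factorization makes the encoding reversible and collision-free.
    values = Counter(prod(prime_of[i] for i in set(t)) for t in transactions)
    return prime_of, values


def support(pattern, prime_of, values):
    """Count transactions containing the pattern via a divisibility test."""
    code = prod(prime_of[i] for i in pattern)
    return sum(count for value, count in values.items() if value % code == 0)


def frequent_patterns(prime_of, values, minsup):
    """Level-wise Apriori search over the prebuilt model (a stand-in, not PC-miner)."""
    result = {}
    level = [frozenset([item]) for item in prime_of]
    while level:
        size = len(level[0]) + 1
        kept = []
        for pattern in level:
            count = support(pattern, prime_of, values)
            if count >= minsup:
                result[pattern] = count
                kept.append(pattern)
        # Join frequent patterns, then prune candidates with an infrequent subset.
        joined = {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
        level = [
            c for c in joined
            if all(frozenset(s) in result for s in combinations(c, size - 1))
        ]
    return result


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
prime_of, values = build_model(transactions)           # built once
print(len(frequent_patterns(prime_of, values, 3)))     # prints 6
print(len(frequent_patterns(prime_of, values, 2)))     # prints 7, same model reused
```

In this sketch the second query reuses `prime_of` and `values` without rescanning the transactions, which is the property the description attributes to the mining model. The products grow quickly with transaction length, so a practical structure would need a more compact arrangement than a flat counter.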