Staff View: The Design of Pre-Processing Multidimensional Data Based on Component Analysis

The increasing deployment of new databases for multidimensional data, together with techniques that support efficient query processing, creates opportunities for more extensive research. Pre-processing is required because of missing attribute values, noisy data, errors, inconsistencies or outliers...

Bibliographic Details
Main Authors:	Jasni, Mohamad Zain; Rahmat Widia, Sembiring
Format:	Article
Language:	English
Published:	Canadian Center of Science and Education 2011
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://umpir.ump.edu.my/id/eprint/2067/1/The_Design_of_Pre-Processing_Multidimensional_Data_Based_on_Component_Analysis-Journal-.pdf http://umpir.ump.edu.my/id/eprint/2067/ http://dx.doi.org/10.5539/cis.v4n3p106

id	my.ump.umpir.2067
record_format	eprints
spelling	my.ump.umpir.20672018-05-21T05:27:07Z http://umpir.ump.edu.my/id/eprint/2067/ The Design of Pre-Processing Multidimensional Data Based on Component Analysis Jasni, Mohamad Zain Rahmat Widia, Sembiring QA75 Electronic computers. Computer science Increased implementation of new databases related to multidimensional data involving techniques to support efficient query process, create opportunities for more extensive research. Pre-processing is required because of lack of data attribute values, noisy data, errors, inconsistencies or outliers and differences in coding. Several types of pre-processing based on component analysis will be carried out for cleaning, data integration and transformation, as well as to reduce the dimensions. Component analysis can be done by statistical methods, with the aim to separate the various sources of data into a statistical pattern independent. This paper aims to improve the quality of pre-processed data based on component analysis. RapidMiner is used for data pre-processing using FastICA algorithm. Kernel K-mean is used to cluster the pre-processed data and Expectation Maximization (EM) is used to model. The model was tested using wisconsin breast cancer datasets, lung cancer datasets and prostate cancer datasets. The result shows that the performance of the cluster vector value is higher and the processing time is shorter. Canadian Center of Science and Education 2011 Article PeerReviewed application/pdf en http://umpir.ump.edu.my/id/eprint/2067/1/The_Design_of_Pre-Processing_Multidimensional_Data_Based_on_Component_Analysis-Journal-.pdf Jasni, Mohamad Zain and Rahmat Widia, Sembiring (2011) The Design of Pre-Processing Multidimensional Data Based on Component Analysis. Computer and Information Science, 4 (3). pp. 106-115. ISSN 1913-8989 (Print); 1913-8997 (Online) http://dx.doi.org/10.5539/cis.v4n3p106
institution	Universiti Malaysia Pahang
building	UMP Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Pahang
content_source	UMP Institutional Repository
url_provider	http://umpir.ump.edu.my/
language	English
topic	QA75 Electronic computers. Computer science
spellingShingle	QA75 Electronic computers. Computer science Jasni, Mohamad Zain Rahmat Widia, Sembiring The Design of Pre-Processing Multidimensional Data Based on Component Analysis
description	The increasing deployment of new databases for multidimensional data, together with techniques that support efficient query processing, creates opportunities for more extensive research. Pre-processing is required because of missing attribute values, noisy data, errors, inconsistencies or outliers, and differences in coding. Several types of pre-processing based on component analysis are carried out for data cleaning, integration and transformation, as well as for dimensionality reduction. Component analysis can be performed with statistical methods, with the aim of separating the various data sources into statistically independent patterns. This paper aims to improve the quality of pre-processed data based on component analysis. RapidMiner is used for data pre-processing with the FastICA algorithm. Kernel K-means is used to cluster the pre-processed data, and Expectation Maximization (EM) is used to build the model. The model was tested on the Wisconsin breast cancer, lung cancer and prostate cancer datasets. The results show that the performance of the cluster vector value is higher and the processing time is shorter.
format	Article
author	Jasni, Mohamad Zain; Rahmat Widia, Sembiring
author_facet	Jasni, Mohamad Zain; Rahmat Widia, Sembiring
author_sort	Jasni, Mohamad Zain
title	The Design of Pre-Processing Multidimensional Data Based on Component Analysis
title_short	The Design of Pre-Processing Multidimensional Data Based on Component Analysis
title_full	The Design of Pre-Processing Multidimensional Data Based on Component Analysis
title_fullStr	The Design of Pre-Processing Multidimensional Data Based on Component Analysis
title_full_unstemmed	The Design of Pre-Processing Multidimensional Data Based on Component Analysis
title_sort	design of pre-processing multidimensional data based on component analysis
publisher	Canadian Center of Science and Education
publishDate	2011
url	http://umpir.ump.edu.my/id/eprint/2067/1/The_Design_of_Pre-Processing_Multidimensional_Data_Based_on_Component_Analysis-Journal-.pdf http://umpir.ump.edu.my/id/eprint/2067/ http://dx.doi.org/10.5539/cis.v4n3p106
_version_	1643664533598765056
score	13.160551

Similar Items