Tools in data science for better processing

Bibliographic Details
Main Authors: Hussien, Nur Syahela, Sulaiman, Sarina, Shamsuddin, Siti Mariyam
Format: Conference or Workshop Item
Language: English
Published: 2015
Online Access: http://eprints.utm.my/id/eprint/62121/1/SarinaSulaiman2015_ToolsinDataScienceforBetterProcessing.pdf
http://eprints.utm.my/id/eprint/62121/
Description
Summary: Analysing data is an important part of research in data science. Many tools can be used to analyse a data set and obtain experimental results for classification, clustering and other tasks. However, researchers are concerned with how to increase the efficiency of analysing a data set. In this paper, three open source tools, namely the Waikato Environment for Knowledge Analysis (WEKA), Konstanz Information Miner (KNIME) and Salford Predictive Modular (SPM), were compared to identify the better processing tool for evaluating the presented data. Each of these tools has its own characteristics. WEKA can handle pre-processing of data and then analyse it using different algorithms. It is suitable for classification, regression, clustering, association rules and visualisation. Its algorithms can be applied directly to a data set or called from the user's own Java code. KNIME is more inclined towards producing graphical views, while SPM is a highly accurate and ultra-fast analytics and data mining platform for data of any size, complexity or organisation. The results illustrate the tools' capability to analyse data sets and evaluators efficiently and effectively.