Social media and stock market prediction: A big data approach

Big data is the collection of large datasets from traditional and digital sources to identify trends and patterns. The quantity and variety of computer data are growing exponentially for many reasons. For example, retailers are building vast databases of customer sales activity. Organizations are wo...

Bibliographic Details
Main Authors: Awan, Mazhar Javed; Mohd. Rahim, Mohd. Shafry; Nobanee, Haitham; Munawar, Ashna; Yasin, Awais; Mohd. Zain, Azlan
Format: Article
Language: English
Published: Tech Science Press 2021
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/94648/1/MohdShafry2021_SocialMediaandStockMarketPrediction.pdf
http://eprints.utm.my/id/eprint/94648/
http://dx.doi.org/10.32604/cmc.2021.014253
id my.utm.94648
record_format eprints
spelling my.utm.94648 2022-03-31T15:51:51Z http://eprints.utm.my/id/eprint/94648/ Article PeerReviewed application/pdf en
Awan, Mazhar Javed and Mohd. Rahim, Mohd. Shafry and Nobanee, Haitham and Munawar, Ashna and Yasin, Awais and Mohd. Zain, Azlan (2021) Social media and stock market prediction: A big data approach. Computers, Materials and Continua, 67 (2). pp. 2569-2583. ISSN 1546-2218 http://dx.doi.org/10.32604/cmc.2021.014253
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description Big data is the collection of large datasets from traditional and digital sources to identify trends and patterns. The quantity and variety of computer data are growing exponentially for many reasons. For example, retailers are building vast databases of customer sales activity. Organizations are working on logistics and financial services, and public social media are sharing a vast quantity of sentiments related to sales prices and products. Challenges of big data include volume and variety in both structured and unstructured data. In this paper, we implemented several machine learning models through Spark MLlib using PySpark, which is scalable, fast, easily integrated with other tools, and has better performance than the traditional models. We studied the stocks of 10 top companies, whose data include historical stock prices, with MLlib models such as linear regression, generalized linear regression, random forest, and decision tree. We also implemented naive Bayes and logistic regression classification models. Experimental results suggest that linear regression, random forest, and generalized linear regression provide an accuracy of 80%–98%. The decision tree did not predict share price movements in the stock market well.
format Article
author Awan, Mazhar Javed
Mohd. Rahim, Mohd. Shafry
Nobanee, Haitham
Munawar, Ashna
Yasin, Awais
Mohd. Zain, Azlan
title Social media and stock market prediction: A big data approach
publisher Tech Science Press
publishDate 2021