Leveraging Web Scraping to Gather Tourism Information Data

The influence of Information and Communication Technologies (ICT) on both individuals' daily lives and the economy is of significant importance. In this context, the tourism industry plays a crucial role, and it is essential to recognise the contributions of tourists in terms of sharing their e...

Bibliographic Details
Main Authors: Kamarazaman, Nadzirah; Mohamad Ali, Nazlena; Arshad, Haslina
Format: Article
Language: English
Published: UUM PRESS 2024
Subjects: HV Social pathology. Social and public welfare
Online Access: https://repo.uum.edu.my/id/eprint/32088/1/JETH%2004%202024%2016-29.pdf
https://repo.uum.edu.my/id/eprint/32088/
https://e-journal.uum.edu.my/index.php/jeth/
id my.uum.repo.32088
record_format eprints
spelling my.uum.repo.32088 2025-02-20T11:57:21Z https://repo.uum.edu.my/id/eprint/32088/ Leveraging Web Scraping to Gather Tourism Information Data Kamarazaman, Nadzirah Mohamad Ali, Nazlena Arshad, Haslina HV Social pathology. Social and public welfare UUM PRESS 2024-07 Article PeerReviewed application/pdf en cc4_by https://repo.uum.edu.my/id/eprint/32088/1/JETH%2004%202024%2016-29.pdf Kamarazaman, Nadzirah and Mohamad Ali, Nazlena and Arshad, Haslina (2024) Leveraging Web Scraping to Gather Tourism Information Data. Journal of Event, Tourism and Hospitality Studies (JETH), 4. pp. 16-29. eISSN 2805-4423 https://e-journal.uum.edu.my/index.php/jeth/
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic HV Social pathology. Social and public welfare
description Information and Communication Technologies (ICT) significantly influence both individuals' daily lives and the economy. In this context, the tourism industry plays a crucial role, and it is essential to recognise the contributions tourists make by sharing their experiences on tourism websites. Analysing this data is key to improving future tourists' experiences. The objective of this study is therefore to employ web scraping to gather data on places of interest (POI) and user attributes in the state of Melaka from the TripAdvisor website. Melaka was chosen because it is one of the places recognised by the United Nations Educational, Scientific and Cultural Organization (UNESCO). The study focuses on the 200 POI locations on the UNESCO map, encompassing both Melaka's core and buffer zones. These POIs are categorised into four heritage types: built heritage, natural heritage, personal heritage, and living heritage, with some belonging to more than one category. A total of 14 attributes were extracted from the TripAdvisor website. Specifically, 27,282 user data entries were collected from 163 POIs in the core zone, and 8,305 entries from 37 POIs in the buffer zone. The data is stored in the repository in several formats, including CSV, JSON, and Excel files. The data supports the development of a tourism application. Furthermore, the tourism industry can benefit from this study by enhancing its services and conserving cultural heritage.
format Article
author Kamarazaman, Nadzirah
Mohamad Ali, Nazlena
Arshad, Haslina
title Leveraging Web Scraping to Gather Tourism Information Data
publisher UUM PRESS
publishDate 2024
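
The description in this record states that the scraped entries are stored as CSV, JSON, and Excel files and that some POIs belong to more than one heritage type. The following is a minimal sketch of how such records might be serialised, not the authors' implementation. The field names, the sample rows, and the ";" separator for multi-valued heritage types are all assumptions made for illustration; the record does not list the 14 attributes. Only the Python standard library is used, so the Excel export is omitted.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical field names: the abstract mentions 14 attributes but does not list them.
FIELDS = ["poi_name", "zone", "heritage_types", "rating", "review_text"]


def to_csv(records):
    """Serialise records to CSV text; multi-valued heritage types are joined with ';'."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        row = dict(rec)  # copy so the caller's record is not modified
        row["heritage_types"] = ";".join(rec["heritage_types"])
        writer.writerow(row)
    return buf.getvalue()


def to_json(records):
    """Serialise records to JSON text, keeping heritage types as a list."""
    return json.dumps(records, ensure_ascii=False, indent=2)


def entries_per_zone(records):
    """Count user data entries in each zone (core or buffer)."""
    return Counter(rec["zone"] for rec in records)


# Made-up sample rows for illustration only, not data from the study.
sample = [
    {"poi_name": "Example Museum", "zone": "core", "heritage_types": ["built"],
     "rating": 4, "review_text": "Worth a visit."},
    {"poi_name": "Example Museum", "zone": "core", "heritage_types": ["built"],
     "rating": 5, "review_text": "Great exhibits."},
    {"poi_name": "Example Market", "zone": "buffer", "heritage_types": ["living", "built"],
     "rating": 3, "review_text": "Crowded, lively."},
]

csv_text = to_csv(sample)
json_text = to_json(sample)
zone_counts = entries_per_zone(sample)

# Totals implied by the figures reported in the abstract.
total_pois = 163 + 37          # core zone POIs + buffer zone POIs
total_entries = 27282 + 8305   # core zone entries + buffer zone entries
```

Keeping the heritage types as a list in JSON and as a delimited string in CSV is one way to handle POIs that fall into more than one category; the core and buffer figures quoted in the description sum to 200 POIs and 35,587 entries.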