Accelerating data retrieval using index prioritization approach / Shatha Ali Mohammed Al-Ashwal
The last few decades have witnessed huge growth in the volume of generated data; the total amount of information that can be stored by all of the world's technical devices has been doubling about every 40 months since the 1980s. From 2012 to the present, 2.5 exabytes (2.5 × 10^18 bytes) of information are...
Main Author: | Shatha Ali , Mohammed Al-Ashwal |
---|---|
Format: | Thesis |
Published: | 2019 |
Subjects: | QA75 Electronic computers. Computer science QA76 Computer software |
Online Access: | http://studentsrepo.um.edu.my/13338/1/Shatha_Ali.pdf http://studentsrepo.um.edu.my/13338/2/Shatha_Ali_Mohammed_Al%2DAshwal.pdf http://studentsrepo.um.edu.my/13338/ |
id |
my.um.stud.13338 |
---|---|
record_format |
eprints |
spelling |
my.um.stud.13338 2022-03-31T18:01:49Z Accelerating data retrieval using index prioritization approach / Shatha Ali Mohammed Al-Ashwal Shatha Ali , Mohammed Al-Ashwal QA75 Electronic computers. Computer science QA76 Computer software 2019-04 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/13338/1/Shatha_Ali.pdf application/pdf http://studentsrepo.um.edu.my/13338/2/Shatha_Ali_Mohammed_Al%2DAshwal.pdf Shatha Ali , Mohammed Al-Ashwal (2019) Accelerating data retrieval using index prioritization approach / Shatha Ali Mohammed Al-Ashwal. Masters thesis, Universiti Malaya. http://studentsrepo.um.edu.my/13338/ |
institution |
Universiti Malaya |
building |
UM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaya |
content_source |
UM Student Repository |
url_provider |
http://studentsrepo.um.edu.my/ |
topic |
QA75 Electronic computers. Computer science QA76 Computer software |
description |
The last few decades have witnessed huge growth in the volume of generated data; the total amount of information that can be stored by all of the world's technical devices has been doubling about every 40 months since the 1980s. From 2012 to the present, 2.5 exabytes (2.5 × 10^18 bytes) of information have been produced daily. Database systems have to adjust to this rapid data growth. The capacity to store the generated data is available; the remaining concern is how to retrieve the stored data when needed, in a timely and accurate manner.
Many researchers have studied different approaches to data retrieval, producing techniques that serve different scenarios. However, the most common way to speed up data retrieval is indexing. There are multiple types of database indexes, but the most widely used in relational databases are the B-Tree and Bitmap indexes. These indexes speed up query response time, but at a cost in storage and performance, as they need to be stored and maintained after each delete and write operation. Moreover, they index one or two attributes rather than the whole record, which limits them to queries that contain these attributes in the ‘where’ clause.
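To illustrate the limitation described above, here is a minimal Python sketch of a single-attribute bitmap index. It is not taken from the thesis: the class `BitmapIndex`, the helper `select`, and the sample rows are hypothetical names chosen for illustration, and a real relational database implements its indexes very differently.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index over one attribute; bit i is set when row i holds the value."""

    def __init__(self, rows, attribute):
        self.attribute = attribute
        self.bitmaps = defaultdict(int)
        for position, row in enumerate(rows):
            self.add(position, row)

    def add(self, position, row):
        # Index maintenance: every write has to touch the index as well.
        self.bitmaps[row[self.attribute]] |= 1 << position

    def lookup(self, value):
        # Return the positions of rows whose indexed attribute equals `value`.
        bitmap = self.bitmaps.get(value, 0)
        return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


def select(rows, index, attribute, value):
    """Use the index when the filter is on the indexed attribute, else scan."""
    if attribute == index.attribute:
        return [rows[i] for i in index.lookup(value)]
    return [row for row in rows if row[attribute] == value]  # full table scan


# Hypothetical sample data, assumed only for this sketch.
rows = [
    {"id": 1, "status": "open", "region": "north"},
    {"id": 2, "status": "closed", "region": "south"},
    {"id": 3, "status": "open", "region": "south"},
]
status_index = BitmapIndex(rows, "status")
open_rows = select(rows, status_index, "status", "open")    # served by the index
south_rows = select(rows, status_index, "region", "south")  # falls back to a scan
```

The filter on `status` is answered from the bitmap, while the filter on `region` gets no help from the index and reads every row, which is the restriction to indexed attributes that the abstract points out.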
This research proposes a covering index that depends on the priority of the records. The data in a table are not all at the same level of importance: some records are more important than others and need to be fetched in a timely manner, while others do not need to be retrieved as quickly. Each company knows its own criteria for important records, so it can decide how the records are ranked.
Records can be ranked by using triggers or procedures. A procedure or trigger should be created to match the company’s definition of record priority. Once the records are prioritized, they are sorted according to the rank field. When a query is run, the records are scanned in rank order; the higher a record's rank, the earlier it is scanned. The Priority index overcomes the limitations of the classic indexes, as it does not need maintenance on each write or delete operation. Maintenance can instead be scheduled for nights or weekends. Moreover, it is useful for a variety of bounded queries, as it indexes the whole record and not a single attribute. In addition, it is faster than the common indexes when querying highly ranked records. The Priority index is also smaller than the common indexes.
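The rank-ordered scan described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the thesis's implementation: the class `PriorityIndex`, its methods, the `vip` ranking rule, and the sample orders are all hypothetical, and the rank function stands in for the trigger or procedure a company would define. Deletes are not modelled.

```python
from typing import Callable

Record = dict


class PriorityIndex:
    """Toy priority index: row positions ordered by a caller-supplied rank."""

    def __init__(self, rows: list, rank: Callable[[Record], int]):
        self.rows = rows
        self.rank = rank
        self.order = []
        self.rebuild()

    def rebuild(self) -> None:
        # Scheduled maintenance: re-sort all row positions, highest rank first.
        self.order = sorted(
            range(len(self.rows)),
            key=lambda i: self.rank(self.rows[i]),
            reverse=True,
        )

    def insert(self, row: Record) -> None:
        # No re-sort on write: the new row is scanned last until the next rebuild.
        self.rows.append(row)
        self.order.append(len(self.rows) - 1)

    def query(self, predicate: Callable[[Record], bool], limit: int) -> list:
        # Scan in rank order and stop as soon as `limit` matches are found.
        hits = []
        for i in self.order:
            if predicate(self.rows[i]):
                hits.append(self.rows[i])
                if len(hits) == limit:
                    break
        return hits


# Hypothetical sample data and ranking rule, assumed only for this sketch.
orders = [
    {"id": 1, "customer": "acme", "status": "open", "vip": False},
    {"id": 2, "customer": "globex", "status": "open", "vip": True},
    {"id": 3, "customer": "acme", "status": "closed", "vip": True},
]
index = PriorityIndex(orders, rank=lambda r: 1 if r["vip"] else 0)
first_open = index.query(lambda r: r["status"] == "open", limit=1)
```

Because the predicate is applied to the whole record, the same ordering serves filters on any attribute, and a bounded query over highly ranked records can stop early. The trade-off in this sketch is that rows written since the last `rebuild()` sit at the end of the scan order regardless of their rank.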
This work involved multiple experiments that ran different types of queries on three tables: one indexed by a B-Tree index, another by a Bitmap index, and the third by the proposed index. The results of the experiments show that the Priority index is faster when retrieving highly ranked records, and that it is also smaller in size.
|
format |
Thesis |
author |
Shatha Ali , Mohammed Al-Ashwal |
title |
Accelerating data retrieval using index prioritization approach / Shatha Ali Mohammed Al-Ashwal |
publishDate |
2019 |
url |
http://studentsrepo.um.edu.my/13338/1/Shatha_Ali.pdf http://studentsrepo.um.edu.my/13338/2/Shatha_Ali_Mohammed_Al%2DAshwal.pdf http://studentsrepo.um.edu.my/13338/ |