Review and empirical analysis of software effort estimation

The average software company spends a substantial portion of its revenue on R&D aimed at delivering software on time. Accurate software effort estimation is critical for successful project planning, resource allocation, and on-time, within-budget delivery in sustainable software development. However, bot...

Bibliographic Details
Main Authors: Rahman, Mizanur, Sarwar, Hasan, Kader, Md Abdul, Gonagalves, Teresa, Tin, Ting Tin
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2024
Online Access: http://umpir.ump.edu.my/id/eprint/41601/1/Review%20and%20Empirical%20Analysis%20of%20Software%20Effort%20Estimation.pdf
http://umpir.ump.edu.my/id/eprint/41601/
https://doi.org/10.1109/ACCESS.2024.3404879
id my.ump.umpir.41601
record_format eprints
spelling my.ump.umpir.41601 2024-07-31T01:53:36Z Article PeerReviewed pdf en cc_by_nc_nd_4
Rahman, Mizanur and Sarwar, Hasan and Kader, Md Abdul and Gonagalves, Teresa and Tin, Ting Tin (2024) Review and empirical analysis of software effort estimation. IEEE Access. p. 1. ISSN 2169-3536. (In Press / Online First) https://doi.org/10.1109/ACCESS.2024.3404879
institution Universiti Malaysia Pahang Al-Sultan Abdullah
building UMPSA Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang Al-Sultan Abdullah
content_source UMPSA Institutional Repository
url_provider http://umpir.ump.edu.my/
topic QA75 Electronic computers. Computer science
QA76 Computer software
T Technology (General)
TA Engineering (General). Civil engineering (General)
description The average software company spends a substantial portion of its revenue on R&D aimed at delivering software on time. Accurate software effort estimation is critical for successful project planning, resource allocation, and on-time, within-budget delivery in sustainable software development. However, both overestimation and underestimation pose significant challenges, necessitating continuous improvement in estimation techniques. This study reviews recent machine learning approaches used to enhance software effort estimation (SEE) accuracy, focusing on research published between 2020 and 2023. The literature review identified pertinent research on machine learning techniques for software effort estimation. Additionally, comparative experiments were conducted with five commonly used ML methods: K-Nearest Neighbor, Support Vector Machine, Random Forest, Logistic Regression, and LASSO Regression. These techniques were assessed using five widely used accuracy metrics, namely Mean Squared Error (MSE), Mean Magnitude of Relative Error (MMRE), R-squared, Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE), on seven benchmark datasets (Albrecht, Desharnais, China, Kemerer, Miyazaki94, Maxwell, COCOMO). By reviewing study quality, analyzing results across the literature, and evaluating experimental outcomes, conclusions were drawn about the most promising techniques for achieving state-of-the-art accuracy in estimating software effort. This study makes three key contributions to the field: first, it provides a thorough overview of recent machine learning research in SEE; second, it offers data-driven guidance to help researchers and practitioners select methods for accurate effort estimation; and third, it demonstrates the performance of these methods on publicly available datasets through experimental analysis.
Enhanced estimation supports the development of better predictive models for software project time, cost, and staffing needs. The findings aim to focus future research and tool development on the most accurate machine learning approaches for modeling software development effort, costs, and delivery schedules.
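
The five accuracy metrics named in the description can be illustrated with a minimal sketch. It uses the commonly cited textbook definitions (MMRE as the mean of |actual - predicted| / actual, MAPE as the same quantity expressed as a percentage), which are an assumption here and may differ from the exact formulas used in the article. The function name `effort_metrics` and the sample effort values are hypothetical and are not taken from the paper or its datasets.

```python
import math


def effort_metrics(actual, predicted):
    """Return MSE, RMSE, MMRE, MAPE and R-squared for paired effort values.

    Assumes every actual effort is non-zero and that the actual values
    are not all identical (otherwise MMRE or R-squared is undefined).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Mean squared error and its square root.
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # Magnitude of relative error per project, then its mean.
    mre = [abs(e) / abs(a) for e, a in zip(errors, actual)]
    mmre = sum(mre) / n

    # R-squared: 1 minus residual sum of squares over total sum of squares.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {
        "MSE": mse,
        "RMSE": rmse,
        "MMRE": mmre,
        "MAPE": 100.0 * mmre,  # same quantity as MMRE, as a percentage
        "R2": r2,
    }


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
metrics = effort_metrics(actual, predicted)
```

Under these definitions MMRE and MAPE differ only by a factor of 100, so they rank models identically; MSE and RMSE are scale-dependent and are not directly comparable across datasets with different effort units.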