Threading optimization of the AEMB microprocessor

This paper explains the architecture of the AEMB and optimizes its threading model. AEMB is an open-source soft-core processor designed for FPGA implementation. It uses a fine-grained threading model that interleaves threads one instruction at a time, with a separate register set for each thread. T...

Bibliographic Details
Main Authors: Mohamed, M., Sebastian, P., Hiung, L.H., Ngiap, S.T.S.
Format: Conference or Workshop Item
Published: Institute of Electrical and Electronics Engineers Inc. 2014
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84946686657&doi=10.1109%2fICCSCE.2014.7072785&partnerID=40&md5=7097e0ca4ddb80a25ee68922d96b4e35
http://eprints.utp.edu.my/31306/
id my.utp.eprints.31306
record_format eprints
spelling my.utp.eprints.31306 2022-03-25T09:05:41Z NonPeerReviewed Mohamed, M. and Sebastian, P. and Hiung, L.H. and Ngiap, S.T.S. (2014) Threading optimization of the AEMB microprocessor. In: UNSPECIFIED. http://eprints.utp.edu.my/31306/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description This paper explains the architecture of the AEMB and optimizes its threading model. AEMB is an open-source soft-core processor designed for FPGA implementation. It uses a fine-grained threading model that interleaves threads one instruction at a time, with a separate register set for each thread. The chosen optimization changes this to a coarse-grained model that switches threads on branch instructions. The advantage of this approach is that control hazards are avoided in most cases, so pipeline stalls while branch targets are calculated are drastically reduced. This improves on the original core, where the processor stalls for one cycle on every branch instruction. The disadvantage of the coarse-grained model is that data hazards that cannot be resolved by forwarding can now stall the processor for up to three cycles in the worst case, compared with only one stall in the original model. As for area on the FPGA, synthesis showed that the modified core uses twice as many LUTs as the original AEMB, with no significant increase in the number of registers. Further quantitative analysis, running suitable benchmarks on both versions of the processor, is needed to determine the overall performance gain. © 2014 IEEE.
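The difference between the two threading models named in the description can be illustrated with a toy issue-order simulation. This is a minimal sketch and not the AEMB implementation: the function names `fine_grained` and `coarse_grained`, the instruction labels, and the rule "switch to the next thread in round-robin order after a branch" are assumptions made for illustration. It models only the order in which instructions are issued, not pipeline stages, stalls, or forwarding.

```python
from collections import deque


def fine_grained(threads):
    """Issue one instruction per thread in round-robin order (hypothetical model)."""
    queues = [deque(t) for t in threads]
    order = []
    while any(queues):  # an empty deque is falsy
        for tid, q in enumerate(queues):
            if q:
                order.append((tid, q.popleft()))
    return order


def coarse_grained(threads):
    """Stay on one thread until it issues a branch, then switch (hypothetical model)."""
    queues = [deque(t) for t in threads]
    order = []
    tid = 0
    while any(queues):
        if not queues[tid]:
            # Current thread has finished; move on to the next one.
            tid = (tid + 1) % len(queues)
            continue
        instr = queues[tid].popleft()
        order.append((tid, instr))
        if instr == "branch":
            # Assumed policy: a branch triggers a switch to the next thread.
            tid = (tid + 1) % len(queues)
    return order


# Two illustrative threads; the instruction names are placeholders.
demo_threads = [
    ["add", "branch", "sub"],
    ["load", "branch", "store"],
]
fine_order = fine_grained(demo_threads)
coarse_order = coarse_grained(demo_threads)
```

In the fine-grained order the two threads alternate on every instruction, while in the coarse-grained order each thread runs up to and including its branch before the other thread is scheduled.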
format Conference or Workshop Item
author Mohamed, M.
Sebastian, P.
Hiung, L.H.
Ngiap, S.T.S.
title Threading optimization of the AEMB microprocessor
publisher Institute of Electrical and Electronics Engineers Inc.
publishDate 2014
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-84946686657&doi=10.1109%2fICCSCE.2014.7072785&partnerID=40&md5=7097e0ca4ddb80a25ee68922d96b4e35
http://eprints.utp.edu.my/31306/