Threading optimization of the AEMB microprocessor
This paper aims to explain the architecture of the AEMB and to optimize its threading model. AEMB is an open-source soft-core processor designed for FPGA implementation. AEMB uses a fine-grained model that interleaves threads one instruction at a time, with a separate register set for each thread. T...
Main Authors: | Mohamed, M.; Sebastian, P.; Hiung, L.H.; Ngiap, S.T.S. |
---|---|
Format: | Conference or Workshop Item |
Published: | Institute of Electrical and Electronics Engineers Inc., 2014 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-84946686657&doi=10.1109%2fICCSCE.2014.7072785&partnerID=40&md5=7097e0ca4ddb80a25ee68922d96b4e35 http://eprints.utp.edu.my/31306/ |
id |
my.utp.eprints.31306 |
---|---|
record_format |
eprints |
spelling |
my.utp.eprints.31306 2022-03-25T09:05:41Z Conference or Workshop Item NonPeerReviewed Mohamed, M. and Sebastian, P. and Hiung, L.H. and Ngiap, S.T.S. (2014) Threading optimization of the AEMB microprocessor. In: UNSPECIFIED. http://eprints.utp.edu.my/31306/ |
institution |
Universiti Teknologi Petronas |
building |
UTP Resource Centre |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Petronas |
content_source |
UTP Institutional Repository |
url_provider |
http://eprints.utp.edu.my/ |
description |
This paper aims to explain the architecture of the AEMB and to optimize its threading model. AEMB is an open-source soft-core processor designed for FPGA implementation. It uses a fine-grained threading model that interleaves threads one instruction at a time, with a separate register set for each thread. The chosen optimization changes the threading model to a coarse-grained one that switches threads on branch instructions. The advantage of this approach is that control hazards are avoided in most cases, so the pipeline stalls far less often while waiting for branch targets to be calculated. This improves on the original core, which stalls for one cycle on every branch instruction it encounters. The disadvantage of the coarse-grained model is that data hazards that cannot be resolved by forwarding can now stall the processor for up to three cycles in the worst case, compared with only one in the original model. As for area on the FPGA, synthesis showed that the modified core uses twice as many LUTs as the original AEMB, with no significant increase in the number of registers. Further quantitative analysis, running suitable benchmarks on both versions of the processor, is needed to determine the overall performance gain. © 2014 IEEE. |
format |
Conference or Workshop Item |
author |
Mohamed, M. Sebastian, P. Hiung, L.H. Ngiap, S.T.S. |
title |
Threading optimization of the AEMB microprocessor |
publisher |
Institute of Electrical and Electronics Engineers Inc. |
publishDate |
2014 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84946686657&doi=10.1109%2fICCSCE.2014.7072785&partnerID=40&md5=7097e0ca4ddb80a25ee68922d96b4e35 http://eprints.utp.edu.my/31306/ |
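The two threading models named in the description can be contrasted with a small toy scheduler. This is only an illustrative sketch, not the AEMB implementation: the function name `issue_order`, the two-thread example, and the `"br"` marker for a branch are all assumptions made here, and the model tracks issue order only, so it does not reproduce the stall counts reported in the abstract.

```python
from collections import deque


def issue_order(threads, policy):
    """Return the order in which a toy core issues instructions.

    threads: list of instruction lists; the string "br" stands for any branch.
    policy:  "fine" switches threads after every instruction,
             "coarse" switches only after a branch has been issued.
    """
    if policy not in ("fine", "coarse"):
        raise ValueError(f"unknown policy: {policy}")
    # Each hardware thread keeps its own queue, standing in for its own
    # program counter and register set (hypothetical simplification).
    pending = deque(
        (tid, deque(instrs)) for tid, instrs in enumerate(threads) if instrs
    )
    order = []
    while pending:
        tid, instrs = pending[0]
        op = instrs.popleft()
        order.append((tid, op))
        if not instrs:
            pending.popleft()      # thread finished; the next one takes over
        elif policy == "fine" or op == "br":
            pending.rotate(-1)     # hand the pipeline to the next thread
    return order


if __name__ == "__main__":
    t0 = ["a0", "br", "a1"]
    t1 = ["b0", "b1", "br"]
    print(issue_order([t0, t1], "fine"))
    print(issue_order([t0, t1], "coarse"))
```

In the fine-grained run the two threads alternate on every instruction, while in the coarse-grained run thread 0 keeps the pipeline until its branch issues, so another thread's instructions occupy the slots that follow the branch.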