Evaluation and performance analysis of heterogeneous multicore cluster processor architecture

Bibliographic Details
Main Authors: On, O.J., Hussin, F.A.B.
Format: Conference or Workshop Item
Published: Institution of Engineering and Technology 2014
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84939429529&doi=10.1049%2fcp.2014.1414&partnerID=40&md5=d32d1674217fff2a6f79f58c5ca85375
http://eprints.utp.edu.my/31738/
Description
Summary: Advances in silicon and wafer technology scaling in recent years have enabled different types of processor cores to be clustered on a single die; one example is ARM's big.LITTLE, which combines two quad-core clusters into an octa-core single chip. The success of this architecture has encouraged further development of manycore systems built on it. Despite the anticipation of much-improved manycore "big.LITTLE" designs, researchers have found limitations in big.LITTLE and equivalent architectures, including inefficient load balancing, an inefficient scheduler, and a restriction to same-ISA cores within a cluster, with a maximum of four cores per cluster. In this paper we analyse the performance of different manycore clustering methods, aiming to show the impact of different mixtures of multicore clusters in a single-chip processor architecture. We run five benchmark applications selected from the PARSEC-2.1 and SPLASH-2 benchmark suites, which resemble popular applications for mobile devices and cover signal and media processing, graphics, data mining, general and engineering, and high-performance computing segments. The simulation results show that the asymmetric multicore cluster architecture has the highest speedup for most of the benchmark programs tested. This demonstrates the asymmetric multicore cluster's ability to use its mixed-core processing strength to improve task or workload processing. Although the homogeneous cluster shows better throughput, we observed this for only two programs; the remaining programs show similar performance across all three cluster configurations. The experimental results in this paper can serve as a research reference in design space exploration, helping processor designers make the necessary optimal design choices and thus potentially reducing design cost and time.