A general model of concurrency and its implementation as many-core dynamic RISC processors

Open Access
Authors
  • M. Lankamp
  • M.W. van Tol
  • L. Zhang
Publication date 2008
Host editors
  • W. Najjar
  • H. Blume
Book title 2008 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation
Book subtitle IC-SAMOS 2008, July 21-24, 2008, Samos, Greece : proceedings
ISBN
  • 9781424419852
Event International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (IC-SAMOS 2008), Samos, Greece
Pages (from-to) 1-9
Number of pages 9
Publisher Piscataway, NJ: IEEE
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This paper presents a concurrent execution model and its micro-architecture based on in-order RISC processors, which schedules instructions from large pools of contextualised threads. The model admits a strategy for programming chip multiprocessors using parallelising compilers based on existing languages. The model is supported in the ISA by a number of instructions to create and manage abstract concurrency. The paper estimates the cost of supporting these instructions in silicon. The model and its implementation use dynamic parameterisation of concurrency creation, where a single instruction captures asynchronous remote function execution, mutual exclusion and the execution of a general concurrent loop structure and all associated communication. Concurrent loops may be dependent or independent, bounded or unbounded and may be nested arbitrarily. Hierarchical concurrency allows compilers to restructure and parallelise sequential code to meet the strict constraints on the model, which provide its freedom from deadlock and locality of communication. Communication is implicit in both the model and micro-architecture, due to the dynamic distribution of concurrency. The result is location-independent binary code that may execute on any number of processors. Simulation and analysis of the micro-architecture indicate that the model is a strong candidate for the exploitation of many-core processors. The results show near-linear speedup over two orders of magnitude of processor scaling, good energy efficiency and tolerance to large latencies in asynchronous operations. This holds both for independent threads and for reductions.
Document type Conference contribution
Language English
Published at https://doi.org/10.1109/ICSAMOS.2008.4664840
Downloads
293369.pdf (Final published version)