Lifetime Estimation for Core-Failure Resilient Multi-Core Processors

Open Access
Authors
Publication date 2023
Book title MCSoC 2023
Book subtitle proceedings : 2023 16th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip : 18-21 December 2023, Singapore, Singapore
ISBN
  • 9798350393620
ISBN (electronic)
  • 9798350393613
Event IEEE 16th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)
Pages (from-to) 293-300
Publisher Los Alamitos, California: IEEE Computer Society
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Multi-core processors integrate several cores on a single die. They often run continuously under high thermal stress, which leads to severe wear-out. Server-class multi-cores already include a mechanism for surviving a core failure, called Core Failure Resilience (CFR), and embedded multi-cores with CFR are on the horizon. Under CFR, the surviving cores must take on additional workload from the failed core(s) and must operate at higher frequencies to keep meeting the target performance. The additional heat from higher-frequency operation, however, further accelerates the wear-out of the surviving cores. Existing lifetime estimation frameworks rely on detailed simulations, which lead to long simulation times; they are therefore unsuitable for the early stages of the design process, where many design points must be evaluated quickly. They also cannot estimate the Mean Time to Failure (MTTF) of multi-cores with CFR capabilities. We introduce SLICER, the first framework for estimating the MTTF of CFR multi-cores. SLICER integrates with the state-of-the-art tools HotSniper and MatEx for fast and accurate MTTF estimation.
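The wear-out effect described in the abstract can be illustrated with a deliberately simplified toy model. This is not SLICER's method, which builds on thermal simulation with HotSniper and MatEx. The sketch assumes exponentially distributed (memoryless) core lifetimes, an even redistribution of the workload over the surviving cores, and a per-core failure rate that grows with the load factor raised to a hypothetical exponent `alpha`. The function name `cfr_mttf` and all of its parameters are illustrative assumptions.

```python
def cfr_mttf(n_cores, min_alive, base_mttf_years, alpha=2.0):
    """Toy MTTF estimate for a multi-core that tolerates core failures.

    Assumptions (illustrative only, not taken from the paper):
      * each core's lifetime is exponential, with mean `base_mttf_years`
        while all `n_cores` cores are alive;
      * with `alive` cores left, each core carries n_cores / alive times
        its original load, and its failure rate scales with that load
        factor raised to the power `alpha`;
      * the system fails once fewer than `min_alive` cores remain.
    """
    if not 1 <= min_alive <= n_cores:
        raise ValueError("min_alive must be between 1 and n_cores")
    if base_mttf_years <= 0:
        raise ValueError("base_mttf_years must be positive")

    base_rate = 1.0 / base_mttf_years  # failures per core per year
    total_years = 0.0
    # Walk through the stages: n_cores alive, then n_cores - 1, and so on.
    for alive in range(n_cores, min_alive - 1, -1):
        load_factor = n_cores / alive
        per_core_rate = base_rate * load_factor ** alpha
        # The first failure among `alive` independent exponential cores
        # arrives at rate alive * per_core_rate, so the expected time
        # spent in this stage is the reciprocal of that rate.
        total_years += 1.0 / (alive * per_core_rate)
    return total_years


if __name__ == "__main__":
    # Four cores, system survives one core failure, 10-year per-core MTTF.
    print(cfr_mttf(4, 3, 10.0, alpha=0.0))  # no wear-out acceleration
    print(cfr_mttf(4, 3, 10.0, alpha=2.0))  # accelerated wear-out
```

Comparing the two calls shows the qualitative point of the abstract under these assumptions: tolerating a core failure extends system lifetime, but the gain shrinks once the extra load on the surviving cores is taken into account.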
Document type Conference contribution
Language English
Published at https://doi.org/10.1109/MCSoC60832.2023.00050
Other links https://github.com/sudam41/SLICER