Continued circuit and wire miniaturization puts increasing pressure on computer designers to ensure reliable operation in the presence of faults. Virtually all previous microarchitectural studies on processor reliability and yield improvement address the problem for architectural resources.
Faults in non-architectural resources have received little attention because they do not affect correctness. However, faults in non-architectural structures can degrade processor performance and may need to be addressed to ensure acceptable performance levels, particularly for applications where performance is of paramount importance, e.g., real-time systems that cannot afford to miss deadlines.
This work first quantifies the performance implications of faults in two non-architectural structures: a line-predictor and a return-address-stack.
A simulation-based analysis of a high-end processor that experiences faults in 25% of the cells in the line-predictor and the return-address-stack revealed an average performance degradation of 5%. Next, we engineered a hardware protection scheme that combines low-cost fault detection with repair through address remapping. This scheme can recover most of the performance loss when faults are present, while it rarely degrades performance when no faults exist.
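
To make the idea of repair through address remapping concrete, the following is a minimal software sketch of a predictor-like table in which an access to a faulty entry is redirected to a fault-free one. It is an illustration under stated assumptions, not the hardware scheme evaluated in this work: the class name `RemappedTable`, the per-entry fault set, and the remapping policy (flipping the most significant index bit) are all hypothetical choices made for the example.

```python
class RemappedTable:
    """Toy direct-mapped table with a known set of faulty entries.

    Assumption (not from the paper): a faulty index is remapped by
    flipping its most significant index bit; size is a power of two.
    """

    def __init__(self, size, faulty):
        if size < 2 or size & (size - 1):
            raise ValueError("size must be a power of two >= 2")
        self.size = size
        self.faulty = set(faulty)       # indices flagged by fault detection
        self.entries = [None] * size    # None means "no prediction stored"

    def remap(self, index):
        """Return a usable slot for index, or None if no slot is usable."""
        index %= self.size
        if index not in self.faulty:
            return index                # fault-free: no remapping needed
        alternate = index ^ (self.size >> 1)
        if alternate not in self.faulty:
            return alternate            # repair: redirect to the alternate slot
        return None                     # both slots faulty: give up

    def write(self, index, value):
        slot = self.remap(index)
        if slot is not None:
            self.entries[slot] = value

    def read(self, index):
        slot = self.remap(index)
        return None if slot is None else self.entries[slot]


table = RemappedTable(8, faulty={2, 3})
table.write(2, 0x400)                   # index 2 is faulty, so slot 6 is used
prediction = table.read(2)
```

Because the structure is non-architectural, a missing or aliased entry (here, indices 2 and 6 share a slot) costs only prediction accuracy and never correctness, which is why such a simple policy can be acceptable in a sketch like this.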