13.1 Overview
Fault exceptions are exception types dedicated to error handling and are a part of the system exceptions in the Arm®v8-M architecture [1]. For example, in the last chapter I mentioned that an MPU violation could trigger either a MemManage Fault or a HardFault. In addition to those faults, Arm® Cortex®-M processors provide several other fault exceptions and hardware resources for:
- Managing fault events
- Analyzing fault exceptions (Fault Status Registers are available in Armv8-M Mainline only)
A summary of the fault exceptions available in the Arm Cortex-M23 and Cortex-M33 processors is given in Table 13.1.
Table 13.1. Fault exceptions availability in Armv8-M processors.
| Exception number | Exception name | Available in Cortex-M33 | Available in Cortex-M23 | Description |
|---|---|---|---|---|
| 3 | HardFault | Yes | Yes | For vector fetch faults and escalated faults |
| 4 | MemManage Fault | Yes | No (escalates to HardFault) | For faults related to the MPU and for faults related to the violation of access permissions in the default memory map |
| 5 | BusFault | Yes | No (escalates to HardFault) | For faults related to bus level error responses and for faults caused by the access of registers in the private peripheral bus while in unprivileged state |
| 6 | UsageFault | Yes | No (escalates to HardFault) | For faults related to instruction operations/executions |
| 7 | SecureFault | Yes | No (escalates to HardFault) | For faults related to TrustZone® security (this is new in Armv8-M) |
MemManage Fault, BusFault, UsageFault, and SecureFault are often referred to as configurable faults because they can be enabled/disabled by software, and because their exception priority levels are programmable.
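As an illustration of what "enabled by software" means here, the following is a minimal sketch of setting the enable bits for the configurable faults in the System Handler Control and State Register (SHCSR). The helper name `enable_configurable_faults` is hypothetical, and the register is passed in as a pointer so that the code can also be exercised on a host machine. On a real device the register is located at address 0xE000ED24 and is normally accessed as `SCB->SHCSR` through the CMSIS-CORE header.

```c
#include <stdint.h>

/* Enable bits in the System Handler Control and State Register (SHCSR). */
#define SHCSR_MEMFAULTENA     (UINT32_C(1) << 16)
#define SHCSR_BUSFAULTENA     (UINT32_C(1) << 17)
#define SHCSR_USGFAULTENA     (UINT32_C(1) << 18)
#define SHCSR_SECUREFAULTENA  (UINT32_C(1) << 19)

/* Hypothetical helper (not a CMSIS function): sets the requested enable
 * bits with a read-modify-write so that other SHCSR bits are preserved.
 * On target hardware, pass (volatile uint32_t *)0xE000ED24. */
static inline void enable_configurable_faults(volatile uint32_t *shcsr,
                                              uint32_t enable_mask)
{
    *shcsr |= enable_mask;
}
```

A typical call would pass `SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA`. A fault whose enable bit is left clear escalates to HardFault.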
In Chapter 4, the available fault exceptions were briefly introduced (Table 4.14). When a fault event is detected, the corresponding fault exception handler is executed. In Armv8-M Baseline (i.e., the Cortex-M23 processor), because there is only one fault exception available, all fault events trigger the HardFault handler. In Armv8-M Mainline processors (e.g., the Cortex-M33 processor), several fault exceptions are available and can, optionally, be enabled by software to deal with different types of faults.
Fault events can have many different causes:
- Hardware failure: potentially caused by transient factors such as power instability, various forms of interference, or issues with the environment that the system is operating in (e.g., the temperature range), and possibly by a bug in the hardware.
- Software issues: these can be caused by software bugs, by the system operating under undesirable conditions (e.g., the system crashing under a heavy processing load), or by software vulnerabilities.
- User error: e.g., incorrect data input.
Traditionally, many microcontrollers integrated watchdog timers to detect operational timeout. The watchdog is a hardware peripheral that contains a counter, which once enabled, cannot be disabled or stopped. However, software can regularly reset the counter value to prevent the counter from reaching a timeout value. If the counter reaches the timeout value (i.e., if the software does not reset the watchdog counter before it times out), the watchdog timer automatically resets the system.
Although the watchdog timer can restart a system after the processor has crashed, there can be a delay between the time the system stops functioning and the time the watchdog reset takes place. This delay is potentially undesirable because the system does not respond to hardware events while it lasts, and additional data corruption could take place.
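The watchdog behavior and the delay before the reset can be illustrated with a small host-side model. This is only a sketch: the type `watchdog_model_t` and the `wdt_*` functions are hypothetical names, and the model assumes an up-counting watchdog, whereas real watchdog peripherals differ in their counting direction and register interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a watchdog timer (not a real peripheral). */
typedef struct {
    uint32_t counter;      /* ticks elapsed since the last kick          */
    uint32_t timeout;      /* count at which the system reset is issued  */
    bool     reset_fired;  /* true once the watchdog has reset the system */
} watchdog_model_t;

/* Start the watchdog; the model provides no way to stop it afterwards. */
static void wdt_start(watchdog_model_t *w, uint32_t timeout)
{
    w->counter = 0;
    w->timeout = timeout;
    w->reset_fired = false;
}

/* Called regularly by healthy software to restart the count. */
static void wdt_kick(watchdog_model_t *w)
{
    w->counter = 0;
}

/* Advance the watchdog by one clock tick. */
static void wdt_tick(watchdog_model_t *w)
{
    if (!w->reset_fired && ++w->counter >= w->timeout) {
        w->reset_fired = true;
    }
}

/* Model a software hang: no more kicks arrive. Returns the number of
 * ticks that pass before the watchdog reset takes place. */
static uint32_t wdt_ticks_until_reset(watchdog_model_t *w)
{
    uint32_t ticks = 0;
    while (!w->reset_fired) {
        wdt_tick(w);
        ticks++;
    }
    return ticks;
}
```

In this model, if the software hangs 10 ticks after its last kick and the timeout is 100 ticks, the system remains unresponsive for another 90 ticks before the reset occurs.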
The fault exceptions in Cortex-M processors allow remedial action to take place as quickly as possible after an issue has been detected. Once the fault exception handler executes, there are several ways the software can deal with the error. For example, it can:
- Safely stop the system.
- Inform users or other systems that it has encountered a problem and request the user to intervene.
- Carry out a self-reset.
- In the case of multitasking systems, terminate the offending tasks and then restart them.
- Carry out other remedial action to try and fix the problem, e.g., executing a floating-point instruction with the floating-point unit (FPU) turned off can cause an error, but this issue is easily solved by turning the FPU back on.
Depending on the type of fault detected, a system could carry out several operations from the list above in order to resolve the matter.
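The FPU case in the list above can be sketched as follows for an Armv8-M Mainline processor. The function name `try_recover_nocp` is hypothetical, and the registers are passed in as parameters so that the logic can be tested on a host machine. The sketch assumes that the fault was reported through the NOCP (no coprocessor) bit of the UsageFault Status Register and that full access to CP10 and CP11 in the Coprocessor Access Control Register (CPACR, address 0xE000ED88) is all that is needed to enable the FPU. In a TrustZone system, other settings can also be involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* NOCP is bit 3 of the UsageFault Status Register, i.e., bit 19 of CFSR. */
#define CFSR_UFSR_NOCP        (UINT32_C(1) << 19)
/* CP10 and CP11 access fields in CPACR; 0b11 in each gives full access. */
#define CPACR_CP10_CP11_FULL  (UINT32_C(0xF) << 20)

/* Hypothetical helper: returns true if the fault looks like "FPU used
 * while turned off" and the FPU has now been enabled. Returns false if
 * the fault has another cause, so the caller can fall back to one of
 * the other options (stop, report, or reset). */
static bool try_recover_nocp(uint32_t cfsr, volatile uint32_t *cpacr)
{
    if ((cfsr & CFSR_UFSR_NOCP) == 0) {
        return false;  /* not a coprocessor access fault */
    }
    if ((*cpacr & CPACR_CP10_CP11_FULL) == CPACR_CP10_CP11_FULL) {
        return false;  /* FPU already enabled; the cause lies elsewhere */
    }
    *cpacr |= CPACR_CP10_CP11_FULL;
    /* On target hardware, DSB and ISB barriers are needed here, and the
     * NOCP status bit should be cleared by writing 1 to it. */
    return true;
}
```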
To determine which type of error triggered the fault handler, the Cortex-M33 processor has several Fault Status Registers (FSRs). The status bits inside these FSRs indicate the fault type that has been detected. Although they might not pinpoint exactly when or where things went wrong, locating the source of the problem is made easier when these additional pieces of information are available. Additionally, in some instances, the faulting address is also captured by Fault Address Registers (FARs). Further information on FSRs and FARs is given in Section 13.5.
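As a first taste of how the status bits are used, the sketch below splits the Configurable Fault Status Register (CFSR, address 0xE000ED28 in Armv8-M Mainline) into its three parts and checks whether the corresponding fault address registers hold valid addresses. The type `cfsr_fields_t` and the function `decode_cfsr` are hypothetical names chosen for this example. The value is passed in as a parameter, so on target hardware it would be read from `SCB->CFSR`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three status registers packed in CFSR. */
typedef struct {
    uint8_t  mmfsr;        /* MemManage Fault Status, CFSR bits [7:0]   */
    uint8_t  bfsr;         /* BusFault Status,        CFSR bits [15:8]  */
    uint16_t ufsr;         /* UsageFault Status,      CFSR bits [31:16] */
    bool     mmfar_valid;  /* MMARVALID: MMFAR holds the faulting address */
    bool     bfar_valid;   /* BFARVALID: BFAR holds the faulting address  */
} cfsr_fields_t;

/* Split a CFSR value into its fields. */
static cfsr_fields_t decode_cfsr(uint32_t cfsr)
{
    cfsr_fields_t f;
    f.mmfsr = (uint8_t)(cfsr & 0xFFu);
    f.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    f.ufsr  = (uint16_t)(cfsr >> 16);
    f.mmfar_valid = (f.mmfsr & 0x80u) != 0;  /* bit 7 of MMFSR */
    f.bfar_valid  = (f.bfsr  & 0x80u) != 0;  /* bit 7 of BFSR  */
    return f;
}
```

For example, a CFSR value of 0x00008200 decodes to a BFSR of 0x82, which indicates a precise data bus error with a valid address in BFAR.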
When software is being developed, programming errors can lead to fault exceptions. Software developers can use the information provided by the FSRs and the FARs to identify and fix those errors. To make the analysis easier, they can also utilize a feature called instruction tracing, which is provided by either the Embedded Trace Macrocell (ETM) or the Micro Trace Buffer (MTB) in the processor (Chapter 16, Sections 16.3.6 and 16.3.7). With instruction tracing support, software developers can extract the program flow that led up to the fault exception.
The fault exception mechanism also allows applications to be debugged safely. For example, when developing a motor control system, you can turn off the motor in the fault handlers before stopping the processor for debugging, instead of halting immediately and leaving the motor running.
Due to the requirement of keeping the processor very small (in terms of silicon area) and as low power as possible, the Cortex-M23 processor does not have the same level of fault diagnostic features as the Cortex-M33 processor. For example, features such as fault status registers and multiple fault exception handlers are not available in the Cortex-M23 processor, but instruction trace is supported, which greatly assists fault debugging.
Following on from my last paragraph, fault events in the Cortex-M23 processor are considered unrecoverable because there are no fault status registers to help the software determine the cause of the fault exception. Moreover, since only one fault exception is available, all fault events are handled by the HardFault handler. The only way, therefore, to handle the error is to stop the system and, optionally, report the error (e.g., via a user interface) and/or carry out a self-reset.
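The self-reset option can be sketched as a write to the Application Interrupt and Reset Control Register (AIRCR, address 0xE000ED0C). The helper name `request_system_reset` is hypothetical, and the register is passed in as a pointer so that the code can be checked on a host machine. In a CMSIS-CORE based project the same effect is normally obtained by calling `NVIC_SystemReset()`, which also adds the required memory barriers and waits for the reset to take place.

```c
#include <stdint.h>

/* Writes to AIRCR are ignored unless the upper half-word holds this key. */
#define AIRCR_VECTKEY      (UINT32_C(0x05FA) << 16)
/* Setting this bit requests a system reset. */
#define AIRCR_SYSRESETREQ  (UINT32_C(1) << 2)

/* Hypothetical helper: requests a system reset. On target hardware,
 * pass (volatile uint32_t *)0xE000ED0C. This simple form writes zero
 * to all other AIRCR fields, which is assumed to be acceptable here
 * because a reset follows. */
static inline void request_system_reset(volatile uint32_t *aircr)
{
    *aircr = AIRCR_VECTKEY | AIRCR_SYSRESETREQ;
}
```

A HardFault handler for the Cortex-M23 processor could report the error first and then call such a function as its last step.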