Fault Exception

Fault exception is defined as an event that occurs when an error, such as a divide by zero operation, is detected during program execution, triggering a specific handler to manage the error and potentially trace it back to the source of the fault.

AI generated definition based on: The Designer's Guide to the Cortex-m Processor Family, 2013

4.5.4 Fault handling

Several types of exceptions in the Cortex-M3 and Cortex®-M4 processors are fault handling exceptions. Fault exceptions are triggered when the processor detects an error such as the execution of an undefined instruction, or when the bus system returns an error response to a memory access. The fault exception mechanism allows errors to be detected quickly, and potentially allows the software to carry out remedial actions (Figure 4.27).

FIGURE 4.27. Fault exceptions usages

By default the Bus Fault, Usage Fault, and Memory Management Fault are disabled and all fault events trigger the HardFault exception. However, the configurations are programmable and you can enable the three programmable fault exceptions individually to handle different types of faults. The HardFault exception is always enabled.

Fault exceptions can also be useful for debugging software issues. For example, the fault handler can automatically collect information and report to the user or other systems that an error has occurred and provide debug information. A number of fault status registers are available in the Cortex-M3 and Cortex-M4 processors, which provide hints about the error sources. Software developers can also examine these fault status registers using the debugger during software development.

URL: https://www.sciencedirect.com/science/article/pii/B978012408082900004X

11.1 Fault Exception Overview

In ARM® processors, if a program goes wrong and the processor detects a fault, then a fault exception occurs. On the Cortex®-M0/M0+ processors, there is only one exception type that handles faults: the HardFault exception.

The HardFault exception is almost the highest priority exception type, with a priority level of −1. Only the Non-Maskable Interrupt (NMI) can preempt the HardFault exception. When the HardFault handler is triggered, we know that the microcontroller is in trouble and corrective action is needed. The HardFault handler is also useful for debugging software during the software development stage. By setting a breakpoint in the HardFault handler (or using a debug feature called vector catch to halt the processor at HardFault), the program execution stops when a fault occurs. By examining the content of the stack, we can often back trace the location of the fault and try to identify the reason for the failure.
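The back trace described above relies on the registers that the processor pushes onto the stack on exception entry. The following is a minimal host-side sketch, assuming the basic eight-word frame (R0 to R3, R12, LR, return address, xPSR) with no floating-point context. The names `stack_frame_t` and `faulting_pc` are illustrative and do not come from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the eight words pushed on exception entry,
   lowest address first (basic frame, no floating-point context). */
typedef struct {
    uint32_t r0, r1, r2, r3;
    uint32_t r12;
    uint32_t lr;    /* link register of the interrupted code */
    uint32_t pc;    /* return address, at or near the faulting instruction */
    uint32_t xpsr;
} stack_frame_t;

/* Hypothetical helper: given the stack pointer value captured at handler
   entry, return the stacked program counter. */
static uint32_t faulting_pc(const uint32_t *sp)
{
    assert(sp != NULL);
    return sp[offsetof(stack_frame_t, pc) / sizeof(uint32_t)];
}
```

On a real device the handler would first have to work out which stack pointer (MSP or PSP) was in use when the fault occurred; this sketch leaves that step out.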

This behavior is very different from most 8-bit and 16-bit microcontrollers. In these microcontrollers, often the only safety net is a watchdog timer. However, it takes time for a watchdog timer to trigger, and often there is no way to determine how the program went wrong.

URL: https://www.sciencedirect.com/science/article/pii/B9780128032770000114

9.8.4 DIV_0_TRP bit

When this bit is set, a Usage Fault exception is triggered when a divide by zero occurs in SDIV (signed divide) or UDIV (unsigned divide) instructions. Otherwise, the operation will complete with a quotient of 0.

If the Usage Fault handler is not enabled, the HardFault exception would be triggered (see chapter 12, section 12.1 and Figure 12.1).
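The two outcomes of a divide by zero can be illustrated with a small host-side model. This is only a sketch: `ccr` stands in for the value of the Configuration Control Register, the bit position of DIV_0_TRP (bit 4) is taken from the Armv7-M register layout rather than from the text above, and `model_udiv` and `udiv_result_t` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CCR_DIV_0_TRP (1u << 4)   /* assumed DIV_0_TRP bit position in the CCR */

typedef struct {
    uint32_t quotient;     /* meaningful only when usage_fault is false */
    bool     usage_fault;  /* true when the divide would raise a Usage Fault */
} udiv_result_t;

/* Host-side model of UDIV; it does not execute the real instruction. */
static udiv_result_t model_udiv(uint32_t ccr, uint32_t dividend, uint32_t divisor)
{
    udiv_result_t r = { 0u, false };
    if (divisor == 0u) {
        /* Trap set: fault is raised. Trap clear: completes with quotient 0. */
        r.usage_fault = (ccr & CCR_DIV_0_TRP) != 0u;
        return r;
    }
    r.quotient = dividend / divisor;
    assert(r.quotient * divisor <= dividend);  /* sanity check on the model */
    return r;
}
```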

URL: https://www.sciencedirect.com/science/article/pii/B9780124080829000099

13.1 Overview

Fault exceptions are exception types dedicated to error handling and are a part of the system exceptions in the Arm®v8-M architecture [1]. For example, in the last chapter I mentioned that an MPU violation could trigger either a MemManage Fault or a HardFault. In addition to those faults, Arm® Cortex®-M processors provide several other fault exceptions and hardware resources for:

Managing fault events

Analyzing fault exceptions (Fault Status Registers are available in Armv8-M Mainline only)

A summary of the fault exceptions available in Arm Cortex-M23 and Cortex-M33 processors is listed in Table 13.1.

Table 13.1. Fault exceptions availability in Armv8-M processors.

| Exception number | Exception name | Available in Cortex-M33 | Available in Cortex-M23 | Description |
|---|---|---|---|---|
| 3 | HardFault | Yes | Yes | For vector fetch fault and escalated faults |
| 4 | MemManage fault | Yes | No (escalates to HardFault) | For faults related to the MPU and for faults related to the violation of access permissions in the default memory map |
| 5 | BusFault | Yes | No (escalates to HardFault) | For faults related to bus level error responses and for faults caused by the access of registers in the private peripheral bus while in unprivileged state |
| 6 | UsageFault | Yes | No (escalates to HardFault) | Faults related to instruction operations/executions |
| 7 | SecureFault | Yes | No (escalates to HardFault) | For faults related to TrustZone® security (this is new in Armv8-M) |

MemManage Fault, BusFault, UsageFault, and SecureFault are often referred to as configurable faults because they can be enabled/disabled by software, and because their exception priority levels are programmable.

In Chapter 4, the available fault exceptions were briefly introduced (Table 4.14). When a fault event is detected, the corresponding fault exception handler is executed. In Armv8-M Baseline (i.e., the Cortex-M23 processor), because there is only one fault exception available, all fault events trigger the HardFault handler. In Armv8-M Mainline processors (e.g., the Cortex-M33 processor), several fault exceptions are available and can, optionally, be enabled by software to deal with different types of faults.

Fault events can have many different causes:

Hardware failure: potentially caused by transient factors such as power instability, various forms of interference, issues with the environment that the system is operating in (e.g., the temperature range), and possibly by a bug in the hardware.

Software issues: these can be caused by software bugs, by the system operating under undesirable conditions (e.g., the system crashing under a heavy processing load), or by software vulnerabilities.

User error: e.g., incorrect data input.

Traditionally, many microcontrollers integrated watchdog timers to detect operational timeout. The watchdog is a hardware peripheral that contains a counter, which once enabled, cannot be disabled or stopped. However, software can regularly reset the counter value to prevent the counter from reaching a timeout value. If the counter reaches the timeout value (i.e., if the software does not reset the watchdog counter before it times out), the watchdog timer automatically resets the system.
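The watchdog behavior described above can be sketched as a small software model. This is an illustration only, not a driver for any real peripheral; the type `watchdog_t` and the functions `watchdog_init`, `watchdog_kick`, and `watchdog_tick` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of a watchdog counter (no real hardware is touched). */
typedef struct {
    uint32_t counter;
    uint32_t timeout;          /* ticks allowed between two kicks */
    bool     reset_requested;  /* set once the counter reaches the timeout */
} watchdog_t;

static void watchdog_init(watchdog_t *wd, uint32_t timeout)
{
    assert(wd != NULL && timeout > 0u);
    wd->counter = 0u;
    wd->timeout = timeout;
    wd->reset_requested = false;
}

/* Called regularly by healthy software to restart the count. */
static void watchdog_kick(watchdog_t *wd)
{
    wd->counter = 0u;
}

/* Called once per clock tick; there is deliberately no "disable" function. */
static void watchdog_tick(watchdog_t *wd)
{
    if (wd->reset_requested) {
        return;
    }
    wd->counter++;
    if (wd->counter >= wd->timeout) {
        wd->reset_requested = true;
    }
}
```

The model also shows where the delay comes from: a crash is only noticed `timeout` ticks after the last kick.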

Although the watchdog timer can restart a system when the processor has crashed, there can be a delay between the time the system stops functioning and the time the watchdog reset takes place. This is potentially undesirable because the system will not respond to hardware events, and additional data corruption could take place during the delay.

The fault exceptions in Cortex-M processors allow remedial action to take place as quickly as possible after an issue has been detected. Once the fault exception handler executes, there are several ways the software can deal with the error. For example, it can:

Safely stop the system.

Inform users or other systems that it has encountered a problem and request the user to intervene.

Carry out a self-reset.

In the case of multitasking systems, terminate the offending tasks and then restart them.

Carry out other remedial action to try and fix the problem, e.g., executing a floating-point instruction with the floating-point unit (FPU) turned off can cause an error, but this issue is easily solved by turning the FPU back on.

Depending on the type of fault detected, a system could carry out several operations from the list above in order to resolve the matter.
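The FPU example from the list above can be sketched as a decision made inside a fault handler. This is a host-side model under stated assumptions: the NOCP ("no coprocessor") flag is taken to be bit 19 of the Configurable Fault Status Register and the FPU access field to be bits 20 to 23 of the CPACR, as in the Armv7-M and Armv8-M Mainline register layouts; the function name `recover_from_nocp` is hypothetical, and any TrustZone-related access controls are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CFSR_NOCP        (1u << 19)    /* assumed: UsageFault "no coprocessor" flag */
#define CPACR_CP10_CP11  (0xFu << 20)  /* assumed: full access to the FPU */

/* Hypothetical remedial step: if the fault was caused by using the FPU while
   it was turned off, return a CPACR value with FPU access enabled.
   *recovered reports whether anything was changed. */
static uint32_t recover_from_nocp(uint32_t cfsr, uint32_t cpacr, bool *recovered)
{
    assert(recovered != NULL);
    *recovered = false;
    if ((cfsr & CFSR_NOCP) != 0u &&
        (cpacr & CPACR_CP10_CP11) != CPACR_CP10_CP11) {
        cpacr |= CPACR_CP10_CP11;
        *recovered = true;
    }
    return cpacr;
}
```

A real handler would also clear the sticky status flag and would fall back to another action from the list, such as a self-reset, when `*recovered` is false.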

To detect the error type that was triggered in the fault handler, the Cortex-M33 processor has several Fault Status Registers (FSRs). The status bits inside these FSRs indicate the fault type that has been detected. Although they might not pinpoint exactly when or where things went wrong, locating the source of the problem is made easier when these additional pieces of information are available. Additionally, in some instances, the faulting address is also captured by Fault Address Registers (FARs). Further information on FSRs and FARs is covered in Section 13.5.
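As an illustration of how status bits can be read, the sketch below splits a Configurable Fault Status Register (CFSR) value into its three sub-registers. The field layout (MemManage in bits 7:0, BusFault in bits 15:8, UsageFault in bits 31:16) and the two UsageFault flag positions are assumptions taken from the Armv7-M and Armv8-M Mainline register definitions, not from the text above; `cfsr_fields_t` and `split_cfsr` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define UFSR_UNDEFINSTR (1u << 0)  /* assumed: undefined instruction executed */
#define UFSR_DIVBYZERO  (1u << 9)  /* assumed: divide by zero with trap enabled */

typedef struct {
    uint8_t  mmfsr;  /* MemManage Fault Status, CFSR bits 7:0   */
    uint8_t  bfsr;   /* BusFault Status,        CFSR bits 15:8  */
    uint16_t ufsr;   /* UsageFault Status,      CFSR bits 31:16 */
} cfsr_fields_t;

/* Split a CFSR value (read on target or taken from a debugger) into fields. */
static cfsr_fields_t split_cfsr(uint32_t cfsr)
{
    cfsr_fields_t f;
    f.mmfsr = (uint8_t)(cfsr & 0xFFu);
    f.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    f.ufsr  = (uint16_t)((cfsr >> 16) & 0xFFFFu);
    /* Round-trip check: the three fields must rebuild the original value. */
    assert((((uint32_t)f.ufsr << 16) | ((uint32_t)f.bfsr << 8) | f.mmfsr) == cfsr);
    return f;
}
```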

When software is being developed, programming errors can lead to fault exceptions. To fix those errors, the information provided by the FSRs and the FARs can be used by software developers to identify these software issues. To make the analysis of the software issues easier, software developers can utilize a feature called instruction tracing. This feature can be enabled by using either the Embedded Trace Macrocell (ETM) or the Micro Trace Buffer (MTB) in the processor (Chapter 16, Sections 16.3.6 and 16.3.7). With instruction tracing support, software developers can extract the program flow before the fault exception occurred.

The fault exception mechanism also allows applications to be debugged safely. For example, when developing a motor control system, you can turn off the motor in the fault handlers before stopping the processor for debugging, instead of halting immediately and leaving the motor running.

Due to the requirement of keeping the processor very small (in terms of silicon area) and as low power as possible, the Cortex-M23 processor does not have the same level of fault diagnostic features as the Cortex-M33 processor. For example, features such as fault status registers and multiple fault exception handlers are not available in the Cortex-M23 processor, but instruction trace is supported, which greatly assists fault debugging.

Following on from my last paragraph, fault events in the Cortex-M23 processor are considered unrecoverable as there are no fault status registers to help the software determine the cause of the fault exception. Moreover, since there are no multiple fault handlers, all fault events are handled by the HardFault handler. The only way, therefore, to handle the error is to stop the system and, optionally, report the error (e.g., via a user interface) and/or carry out a self-reset.
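A self-reset is commonly requested through the Application Interrupt and Reset Control Register (AIRCR). The sketch below is a host-side illustration that writes to a stand-in variable instead of the real register. The key value 0x05FA and the SYSRESETREQ bit position (bit 2) are assumptions taken from the Arm architecture register definitions rather than from the text above, and `request_system_reset` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AIRCR_VECTKEY      (0x05FAu << 16)  /* assumed write key in the upper halfword */
#define AIRCR_SYSRESETREQ  (1u << 2)        /* assumed system reset request bit */

/* Write a reset request to the given register location. On target, the
   pointer would refer to the AIRCR; here it can point at any variable. */
static void request_system_reset(volatile uint32_t *aircr)
{
    assert(aircr != NULL);
    *aircr = AIRCR_VECTKEY | AIRCR_SYSRESETREQ;
}
```

On real hardware, CMSIS-Core provides `NVIC_SystemReset()` for this purpose, which also adds the memory barriers and the wait loop that this sketch leaves out.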

URL: https://www.sciencedirect.com/science/article/pii/B9780128207352000135

Enabling Fault Exceptions

The hard fault handler is always enabled and can only be disabled by setting the CPU FAULTMASK register. The other fault exceptions must be enabled in the System Control Block, “System Handler Control and State” register (SCB->SHCSR). The SCB->SHCSR register also contains Pend and Active bits for each fault exception.
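Enabling the three configurable fault exceptions amounts to setting three bits in SCB->SHCSR. The sketch below operates on a pointer so that it can be tried on a host with an ordinary variable standing in for the register. The bit positions (16, 17, and 18) are assumptions taken from the Armv7-M register layout, not from the text above, and `enable_configurable_faults` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHCSR_MEMFAULTENA (1u << 16)  /* assumed: MemManage fault enable */
#define SHCSR_BUSFAULTENA (1u << 17)  /* assumed: Bus fault enable */
#define SHCSR_USGFAULTENA (1u << 18)  /* assumed: Usage fault enable */

/* Set the three enable bits while leaving the Pend and Active bits alone. */
static void enable_configurable_faults(volatile uint32_t *shcsr)
{
    assert(shcsr != NULL);
    *shcsr |= SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA;
}
```

With the CMSIS-Core device header available, the equivalent on target would pass `&SCB->SHCSR`.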

We will look at the fault exceptions and tracking faults in Chapter 5 “Advanced Architecture Features.”

URL: https://www.sciencedirect.com/science/article/pii/B9780081006290000037

Abstract

This chapter introduces the fault exceptions that are available in Arm Cortex-M23 and Cortex-M33 processors. It also describes how information can be extracted from the fault status registers and from the exception stack frame to assist in the debugging of faults. In addition, this chapter also explains “lockup” and what software improvements can be made to make the system more robust.

URL: https://www.sciencedirect.com/science/article/pii/B9780128207352000135

12.1 Overview of fault exceptions

Electronic systems can go wrong from time to time. The problems could be bugs in the software, but in many cases they can be caused by external factors such as:

Unstable power supply

Electrical noise (e.g., noise from power lines)

Electromagnetic interference (EMI)

Electrostatic discharge

Extreme operation environment (e.g., temperature, mechanical vibrations)

Wearing out of components (e.g., Flash/EEPROM devices, crystal oscillators, capacitors) caused by repetitive programming or high-low temperature cycles

Radiation (e.g., cosmic rays)

Usage issues (e.g., end users didn’t read the manual) or invalid external data input

All these issues could lead to failure in the programs running on the processors. In many simple microcontrollers, you can find features like a watchdog timer and Brown-Out Detector (BOD). The watchdog can be programmed to trigger if the counter is not cleared within a certain time, and can be used to generate a reset or Non-Maskable Interrupt (NMI). The BOD can be used to generate a reset if the supply voltage drops to a certain critical level.

You can find a watchdog timer and BOD in many ARM® microcontrollers as well. However, when a failure occurs and the processor stops responding, it might take a bit of time for the watchdog to kick in. For most applications this is not a problem, but for some safety critical applications, a 1 ms delay can be a matter of life or death.

In order to allow problems to be detected as early as possible, the Cortex®-M processors have a fault exception mechanism included. If a fault is detected, a fault exception is triggered and one of the fault exception handlers is executed.

By default, all the faults trigger the HardFault exception (exception type number 3). This fault exception is available on all Cortex-M processors including the Cortex-M0 and Cortex-M0+ processors. Cortex-M3 and Cortex-M4 processors have three additional configurable fault exception handlers:

MemManage (Memory Management) Fault (exception type 4)

Bus Fault (exception type 5)

Usage Fault (exception type 6)

These exceptions are triggered if they are enabled, and if their priority is higher than the current exception priority level, as shown in Figure 12.1. These exceptions are called configurable fault exceptions, and have programmable exception priority levels (see section 7.9.5).
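The condition in the paragraph above can be written as a small decision function. This is a simplified model, assuming that a lower numeric value means a higher priority and ignoring the exception masking registers; `fault_route_t` and `route_fault` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    TAKE_CONFIGURABLE_FAULT,  /* MemManage, Bus Fault, or Usage Fault handler runs */
    ESCALATE_TO_HARDFAULT     /* the HardFault handler runs instead */
} fault_route_t;

/* Decide which handler a fault event reaches.
   Assumption: lower numeric value = higher priority. */
static fault_route_t route_fault(bool handler_enabled,
                                 int handler_priority,
                                 int current_priority)
{
    assert(handler_priority >= 0);  /* configurable priorities are non-negative */
    if (handler_enabled && handler_priority < current_priority) {
        return TAKE_CONFIGURABLE_FAULT;
    }
    return ESCALATE_TO_HARDFAULT;
}
```

For example, a Usage Fault with priority 3 that occurs inside a handler that is itself running at priority 3 cannot preempt it, so the model reports escalation.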

FIGURE 12.1. Fault exceptions available in ARMv7-M architecture

The fault handlers can be used in a number of ways:

Shut down the system safely

Inform users or other systems that it encountered a problem

Carry out self-reset

In the case of multi-tasking systems, the offending tasks could be terminated and restarted

Other remedial actions can be carried out to try to fix the problem if possible (e.g., executing a floating point instruction with floating point unit turned off can cause an error, and can be easily solved by turning the floating point unit on)

Sometimes a system could carry out a number of different operations from the list above, depending on the type of fault detected.

To help detect what type of error was encountered in the fault handler, the Cortex-M3 and Cortex-M4 processors also have a number of Fault Status Registers (FSRs). The status bits inside these FSRs indicate the kind of fault detected. Although they might not pinpoint exactly when or where things went wrong, locating the source of the problem is made easier with these additional pieces of information. In addition, in some cases the faulting address is also captured by Fault Address Registers (FARs). More information about FSRs and FARs is given in section 12.4.

During software development, programming errors can also lead to fault exceptions. The information provided by the FARs can be very useful to software developers for identifying software issues during debugging.

The fault exception mechanism also allows applications to be debugged safely. For example, when developing a motor control system, you can shut down the motor by using the fault handlers before stopping the processor for debugging.

URL: https://www.sciencedirect.com/science/article/pii/B9780124080829000129

Hard Fault

A hard fault can be raised in two ways. First, if a bus error occurs when the vector table is being read. Secondly, the hard fault exception is also reached through fault escalation. This means that if the usage, memory manager, or bus fault exceptions are disabled, or if the exception service does not have sufficient priority level, then the fault will escalate to a hard fault.
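The two routes into the hard fault handler can be told apart from the HardFault Status Register (HFSR). The following host-side sketch assumes the Armv7-M bit layout, where VECTTBL (bit 1) flags a failed vector table read and FORCED (bit 30) flags an escalated fault; these positions, and the names `hardfault_cause_t` and `classify_hardfault`, are not taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define HFSR_VECTTBL (1u << 1)   /* assumed: bus error on a vector table read */
#define HFSR_FORCED  (1u << 30)  /* assumed: escalated configurable fault */

typedef enum {
    HARDFAULT_VECTOR_FETCH,
    HARDFAULT_ESCALATED,
    HARDFAULT_OTHER       /* e.g., a debug event; not covered by this sketch */
} hardfault_cause_t;

/* Classify an HFSR value read inside the handler or from a debugger. */
static hardfault_cause_t classify_hardfault(uint32_t hfsr)
{
    if ((hfsr & HFSR_VECTTBL) != 0u) {
        return HARDFAULT_VECTOR_FETCH;
    }
    if ((hfsr & HFSR_FORCED) != 0u) {
        return HARDFAULT_ESCALATED;
    }
    return HARDFAULT_OTHER;
}
```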

URL: https://www.sciencedirect.com/science/article/pii/B9780080982960000037

Hard Fault

A Hard fault can be raised in two ways: first, if a bus error occurs when the vector table is being read; second, the Hard Fault exception is also reached through fault escalation. This means that if the Usage, Memory Manager, or Bus fault exceptions are active but their dedicated Interrupt Service Routines are not enabled, or if the exception service does not have a sufficient priority level, then the fault will escalate to a Hard fault.

URL: https://www.sciencedirect.com/science/article/pii/B9780323854948000085

4.5 Exceptions and interrupts

4.5.1 What are exceptions?

Exceptions are events that cause changes to program flow. When one happens, the processor suspends the currently executing task and executes a part of the program called the exception handler. After the execution of the exception handler is completed, the processor then resumes normal program execution. In the ARM® architecture, interrupts are one type of exception. Interrupts are usually generated from peripheral or external inputs, and in some cases they can be triggered by software. The exception handlers for interrupts are also referred to as Interrupt Service Routines (ISR).

In Cortex®-M processors, there are a number of exception sources (Figure 4.25):

Exceptions are processed by the NVIC. The NVIC can handle a number of Interrupt Requests (IRQs) and a Non-Maskable Interrupt (NMI) request. Usually IRQs are generated by on-chip peripherals or from external interrupt inputs through I/O ports. The NMI could be used by a watchdog timer or brownout detector (a voltage monitoring unit that warns the processor when the supply voltage drops below a certain level). Inside the processor there is also a timer called SysTick, which can generate a periodic timer interrupt request, which can be used by embedded OSs for timekeeping, or for simple timing control in applications that don’t require an OS.

The processor itself is also a source of exception events. These could be fault events that indicate system error conditions, or exceptions generated by software to support embedded OS operations. The exception types are listed in Table 4.9.

Table 4.9. Exception Types

| Exception Number | CMSIS Interrupt Number | Exception Type | Priority | Function |
|---|---|---|---|---|
| 1 | N/A | Reset | −3 (Highest) | Reset |
| 2 | −14 | NMI | −2 | Non-Maskable interrupt |
| 3 | −13 | HardFault | −1 | All classes of fault, when the corresponding fault handler cannot be activated because it is currently disabled or masked by exception masking |
| 4 | −12 | MemManage | Settable | Memory Management fault; caused by MPU violation or invalid accesses (such as an instruction fetch from a non-executable region) |
| 5 | −11 | BusFault | Settable | Error response received from the bus system; caused by an instruction prefetch abort or data access error |
| 6 | −10 | Usage fault | Settable | Usage fault; typical causes are invalid instructions or invalid state transition attempts (such as trying to switch to ARM state in the Cortex-M3) |
| 7–10 | N/A | Reserved | N/A | N/A |
| 11 | −5 | SVC | Settable | Supervisor Call via SVC instruction |
| 12 | −4 | Debug monitor | Settable | Debug monitor – for software based debug (often not used) |
| 13 | N/A | Reserved | N/A | N/A |
| 14 | −2 | PendSV | Settable | Pendable request for System Service |
| 15 | −1 | SYSTICK | Settable | System Tick Timer |
| 16–255 | 0–239 | IRQ | Settable | IRQ input #0–239 |

Each exception source has an exception number. Exception numbers 1 to 15 are classified as system exceptions, and exceptions 16 and above are for interrupts. The design of the NVIC in the Cortex-M3 and Cortex-M4 processors can support up to 240 interrupt inputs. However, in practice the number of interrupt inputs implemented in the design is far less, typically in the range of 16 to 100. In this way the silicon size of the design can be reduced, which also reduces power consumption.

FIGURE 4.25. Various exception sources

The exception number is reflected in various registers, including the IPSR, and it is used to determine the exception vector addresses. Exception vectors are stored in a vector table, and the processor reads this table to determine the starting address of an exception handler during the exception entrance sequence. Note that the exception number definitions are different from interrupt numbers in the CMSIS device-driver library. In the CMSIS device-driver library, interrupt numbers start from 0, and system exception numbers have negative values.
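The relationship between the two numbering schemes in Table 4.9 is a fixed offset of 16. A minimal sketch follows; the function name `cmsis_irq_number` is hypothetical, and Reset (exception 1) is excluded because the table lists no CMSIS number for it.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a processor exception number (2..255) into the CMSIS interrupt
   number, which is negative for system exceptions and starts at 0 for IRQs. */
static int32_t cmsis_irq_number(uint32_t exception_number)
{
    assert(exception_number >= 2u && exception_number <= 255u);
    return (int32_t)exception_number - 16;
}
```

For instance, HardFault (exception 3) maps to −13 and the first IRQ input (exception 16) maps to 0, matching the table.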

As opposed to classic ARM processors such as the ARM7TDMI™, there is no FIQ (Fast Interrupt) in the Cortex-M processor. However, the interrupt latency of the Cortex-M3 and Cortex-M4 is very low, only 12 clock cycles, so this does not cause problems.

Reset is a special kind of exception. When the processor exits from a reset, it executes the reset handler in Thread mode (rather than Handler mode as in other exceptions). Also the exception number in IPSR is read as zero.

4.5.2 Nested vectored interrupt controller (NVIC)

The NVIC is a part of the Cortex®-M processor. It is programmable and its registers are located in the System Control Space (SCS) of the memory map (see Figure 4.18). The NVIC handles the exceptions and interrupt configurations, prioritization, and interrupt masking. The NVIC has the following features:

Flexible exception and interrupt management

Nested exception/interrupt support

Vectored exception/interrupt entry

Interrupt masking

Flexible exception and interrupt management

Each interrupt (apart from the NMI) can be enabled or disabled and can have its pending status set or cleared by software. The NVIC can handle various types of interrupt sources:

Pulsed interrupt request – the interrupt request is at least one clock cycle long. When the NVIC receives a pulse at its interrupt input, the pending status is set and held until the interrupt gets serviced.

Level triggered interrupt request – the interrupt source holds the request high until the interrupt is serviced.

The signal level at the NVIC input is active high. However, the actual external interrupt input on the microcontroller could be designed differently and is converted to an active high signal level by on-chip logic.

Nested exception/interrupt support

Each exception has a priority level. Some exceptions, such as interrupts, have programmable priority levels and some others (e.g., NMI) have a fixed priority level. When an exception occurs, the NVIC will compare the priority level of this exception to the current level. If the new exception has a higher priority, the current running task will be suspended. Some of the registers will be stored on the stack memory, and the processor will start executing the exception handler of the new exception. This process is called “preemption.” When the higher priority exception handler is complete, it is terminated with an exception return operation and the processor automatically restores the registers from stack and resumes the task that was running previously. This mechanism allows nesting of exception services without any software overhead.
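The preemption and return sequence can be sketched as a stack of active priority levels. This is a simplified software model, not the NVIC itself: it assumes that a lower numeric value means a higher priority, uses an arbitrary `THREAD_LEVEL` value to represent code running outside any handler, and omits pending state and register stacking. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NESTING  8
#define THREAD_LEVEL 256  /* assumed marker: lower priority than any exception */

typedef struct {
    int active[MAX_NESTING];  /* priorities of the nested active handlers */
    int depth;
} nvic_model_t;

static int current_priority(const nvic_model_t *n)
{
    assert(n != NULL);
    return (n->depth > 0) ? n->active[n->depth - 1] : THREAD_LEVEL;
}

/* Returns true if the new exception preempts the running code. A false
   result corresponds to the request staying pending in real hardware. */
static bool try_preempt(nvic_model_t *n, int priority)
{
    if (n->depth < MAX_NESTING && priority < current_priority(n)) {
        n->active[n->depth++] = priority;
        return true;
    }
    return false;
}

/* Exception return: resume whatever was running before. */
static void exception_return(nvic_model_t *n)
{
    if (n->depth > 0) {
        n->depth--;
    }
}
```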

Vectored exception/interrupt entry

When an exception occurs, the processor will need to locate the starting point of the corresponding exception handler. Traditionally, in ARM® processors such as the ARM7TDMI™, software handles this step. The Cortex-M processors automatically locate the starting point of the exception handler from a vector table in the memory. As a result, the delays from the start of the exception to the execution of the exception handlers are reduced.

Interrupt masking

The NVIC in the Cortex-M3 and Cortex-M4 processors provides several interrupt masking registers such as the PRIMASK special register. Using the PRIMASK register you can disable all exceptions, excluding HardFault and NMI. This masking is useful for operations that should not be interrupted, like time critical control tasks or real-time multimedia codecs. Alternatively you can also use the BASEPRI register to selectively mask exceptions or interrupts that are at or below a certain priority level.
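The effect of the two masking registers can be expressed as a predicate. This is a simplified model under stated assumptions: a lower numeric value means a higher priority, a BASEPRI value of 0 means the mask is off, a non-zero BASEPRI blocks exceptions whose priority value is numerically equal or greater, and priority values are compared as plain integers without the implementation-specific bit alignment. The function name `is_masked` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when an exception of the given priority is blocked by the
   current PRIMASK and BASEPRI settings (simplified model). */
static bool is_masked(int priority, bool primask, uint8_t basepri)
{
    assert(priority >= -3);  /* Reset at -3 is the highest level in Table 4.9 */
    if (primask && priority >= 0) {
        return true;   /* only negative levels (NMI, HardFault) get through */
    }
    if (basepri != 0u && priority >= (int)basepri) {
        return true;   /* same or lower urgency than the BASEPRI level */
    }
    return false;
}
```

On target, CMSIS-Core offers `__set_PRIMASK()` and `__set_BASEPRI()` to write these registers.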

The CMSIS-Core provides a set of functions to make it easy to access various interrupt control functions. The flexibility and capability of the NVIC also make the Cortex-M processors very easy to use, and provide a better system response by reducing the software overhead in interrupt processing, which also leads to smaller code size.

4.5.3 Vector table

When an exception event takes place and is accepted by the processor core, the corresponding exception handler is executed. To determine the starting address of the exception handler, a vector table mechanism is used. The vector table is an array of word data inside the system memory, each representing the starting address of one exception type (Figure 4.26). The vector table is relocatable and the relocation is controlled by a programmable register in the NVIC called the Vector Table Offset Register (VTOR). After reset, the VTOR is reset to 0; therefore, the vector table is located at address 0x0 after reset.

FIGURE 4.26. Exception types (LSB of exception vectors should be set to 1 to indicate Thumb state)

For example, since the reset is exception type 1, the address of the reset vector is 1 × 4 (each word is 4 bytes), which equals 0x00000004, and the NMI vector (type 2) is located at 2 × 4 = 0x00000008. The address 0x00000000 is used to store the starting value of the MSP.

The LSB of each exception vector indicates whether the exception is to be executed in the Thumb state. Since the Cortex®-M processors can support only Thumb instructions, the LSB of all the exception vectors should be set to 1.
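The address arithmetic and the Thumb bit check described above can be sketched as follows. The helper names are hypothetical, and the alignment check on the VTOR value (lowest seven bits zero) is an assumption about the register, not something stated in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Address of the vector table entry for a given exception number. */
static uint32_t vector_address(uint32_t vtor, uint32_t exception_number)
{
    assert((vtor & 0x7Fu) == 0u);  /* assumed alignment of the table base */
    return vtor + 4u * exception_number;
}

/* A valid vector has its least significant bit set to indicate Thumb state. */
static bool vector_is_thumb(uint32_t vector)
{
    return (vector & 1u) != 0u;
}

/* The handler's actual start address is the vector with the LSB cleared. */
static uint32_t handler_address(uint32_t vector)
{
    return vector & ~1u;
}
```

With the table at its reset location, `vector_address(0, 1)` gives 0x00000004 for the reset vector and `vector_address(0, 2)` gives 0x00000008 for the NMI vector, as in the text.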