# Nexus zkVM 3.0 Specification

Michel Abdalla, Arka Rai Choudhuri, Jens Groth, Yoichi Hirai, Ben Hoberman, Samuel Judson, Daniel Marin, Victor S. Miller, Duc Tri Nguyen, Evan Schott, and Kristian Sosnin

Nexus research@nexus.xyz

March 6, 2025

revised March 11, 2025

#### Abstract

We describe the architecture of the Nexus zkVM 3.0, the third iteration of the Nexus zero-knowledge virtual machine based on the Stwo prover. The Nexus zkVM is a general-purpose verifiable processor, designed to prove the correct execution of arbitrary computations. Specifically, the Nexus zkVM emulates virtual machine execution and produces succinct proofs of correct computation.

Like previous versions of the Nexus zkVM, the instruction set used by the Nexus Virtual Machine is based on the popular RISC-V instruction set architecture so that existing developer tooling can be used with little modification. Version 3.0, however, uses a Harvard architecture in which the program being executed resides in a read-only memory space separate from the data.

In addition to the architectural change, another major change to the zkVM frontend is that constraints are now specified using Algebraic Intermediate Representation (AIR) and can work over the Mersenne prime field M31 used by the Stwo prover.

In order to describe the new arithmetization, we split the constraints for the zkVM into several components:

- CPU: Responsible for fetching, decoding, and preparing instructions for execution
- Register memory: Manages access to the read-write registers
- Program memory: Manages access to the read-only byte-addressable program memory
- Data memory: Manages access to the read-write byte-addressable random-access memory
- Instruction execution: Deals with the actual execution of instructions

Since these components may depend on each other, we also specify the interaction between the different components used in the Nexus zkVM.

We also undertake a thorough experimental evaluation of our zkVM, and find that it sustains a proving rate of  $\sim 10$ -15 kHz given sufficient computational resources.

# Contents

| 1 | Intr | roduction 1                                                 |
|---|------|-------------------------------------------------------------|
|   | 1.1  | Design overview                                             |
|   | 1.2  | zkVM components                                             |
|   |      | 1.2.1 CPU component                                         |
|   |      | 1.2.2 Memory components                                     |
|   |      | 1.2.3 Execution component                                   |
|   |      | 1.2.4 Proof Generation                                      |
|   | 1.3  | Outline                                                     |
|   |      |                                                             |
| 2 |      | e Nexus zkVM machine architecture                           |
|   | 2.1  | Architectural overview                                      |
|   | 2.2  | Execution model                                             |
|   |      | 2.2.1 Two-pass tracing                                      |
|   |      | 2.2.2 Execution environment                                 |
|   |      | 2.2.3 Precompiles                                           |
|   | 2.3  | Memory layout                                               |
|   |      | 2.3.1 Registers                                             |
|   |      | 2.3.2 Well-known location pointers                          |
|   |      | 2.3.3 Program memory                                        |
|   |      | 2.3.4 Public input                                          |
|   |      | 2.3.5 Associated data                                       |
|   |      | 2.3.6 Public output and exit code                           |
|   |      | 2.3.7 Data memory                                           |
|   | 2.4  | Basic instruction set and binary encoding                   |
|   | 2.5  | Instruction encoding                                        |
| • | ъ    | <b>3.</b>                                                   |
| 3 |      | liminaries 14                                               |
|   | 3.1  | Stwo underlying algebraic structures                        |
|   | 3.2  | Offline Memory Checker                                      |
|   |      | 3.2.1 A simple offline memory checker                       |
|   | 0.0  | 3.2.2 A concrete proposal based on logups                   |
|   | 3.3  | Lookup tables for range checks                              |
|   | 3.4  | Lookup tables for bitwise operations                        |
|   |      | 3.4.1 Bitwise lookup trace elements                         |
|   |      | 3.4.2 Bitwise operation mapping function                    |
|   |      | 3.4.3 Bitwise operation helper functions                    |
|   |      | 3.4.4 Bitwise logup computations – 1 triple per row         |
|   |      | 3.4.5 Bitwise logup computations – multiple triples per row |
|   |      | 3.4.6 Bitwise lookup table constraints                      |
| 4 | CPI  | U component 22                                              |
| 4 | 4.1  | CPU trace elements                                          |
|   | 4.1  | CPU constraints assuming large fields                       |
|   | 4.2  | 4.2.1 Range checks                                          |
|   |      | 9                                                           |
|   |      | 4.2.2 Instruction flag constraints                          |
|   |      | 4.2.3 Arithmetic constraints                                |
|   | 4.0  | 4.2.4 Interactions with other components                    |
|   | 4.3  | CPU constraints assuming small fields                       |
|   |      | 4.3.1 Range checks                                          |
|   |      | 4.3.2 Instruction flag constraints                          |

|   |           | 4.3.3<br>4.3.4 | Arithmetic constraints                                             |   |
|---|-----------|----------------|--------------------------------------------------------------------|---|
| 5 | Reg       | rister n       | nemory component 46                                                | ß |
| • | 5.1       |                | and write operations                                               |   |
|   | 5.2       |                | er memory trace elements                                           |   |
|   | 5.2       | _              | er memory trace elements                                           |   |
|   | 5.3 - 5.4 | _              | · ·                                                                |   |
|   | -         | _              | v                                                                  |   |
|   | 5.5       | _              | er memory constraints assuming large fields                        |   |
|   |           | 5.5.1          | Range checks                                                       |   |
|   |           | 5.5.2          | Arithmetic constraints                                             |   |
|   |           | 5.5.3          | Logup computations                                                 |   |
|   | 5.6       | _              | er memory constraints assuming small fields                        |   |
|   |           | 5.6.1          | Range checks                                                       |   |
|   |           | 5.6.2          | Arithmetic constraints                                             |   |
|   |           | 5.6.3          | Logup computations                                                 |   |
|   | 5.7       | Consti         | eaints and logup computation for initial write and final read sets | 3 |
|   |           | 5.7.1          | Trace elements for initial write and final read sets               | 4 |
|   |           | 5.7.2          | Constraints for register address values                            | 4 |
|   |           | 5.7.3          | Logup computation for initial write and final read sets            | 4 |
| 6 | Pro       | gram ı         | nemory component 55                                                | 5 |
|   | 6.1       |                | am memory trace elements                                           |   |
|   | 6.2       | _              | am memory read operations                                          |   |
|   | 6.3       |                | am memory initialization                                           |   |
|   | 6.4       | _              | am memory interface                                                |   |
|   | 6.5       | _              | am memory alignment and addressing                                 |   |
|   | 6.6       |                | am memory constraints assuming large fields                        |   |
|   | 0.0       | 6.6.1          | Range checks                                                       |   |
|   |           | 6.6.2          | Arithmetic constraints                                             |   |
|   |           | 6.6.3          | Logup computations                                                 |   |
|   | 6.7       |                | am memory constraints assuming small fields                        |   |
|   | 0.7       | 6.7.1          | Range checks                                                       |   |
|   |           | 6.7.1          | Arithmetic constraints                                             |   |
|   |           | 6.7.2          |                                                                    |   |
|   | 6.8       |                |                                                                    |   |
|   | 0.8       |                | 0 1 1                                                              |   |
|   |           | 6.8.1          |                                                                    |   |
|   |           | 6.8.2          | Constraints for ensuring uniqueness of base address values         |   |
|   |           | 6.8.3          | Range checks                                                       |   |
|   |           | 6.8.4          | Logup computation for initial write and final read sets            | 2 |
| 7 | Dat       |                | ory component 65                                                   |   |
|   | 7.1       | Read a         | and write operations                                               | 3 |
|   | 7.2       | Data 1         | nemory trace elements                                              | 4 |
|   | 7.3       | Data r         | nemory initialization                                              | 5 |
|   | 7.4       | Data 1         | nemory interface                                                   | 5 |
|   | 7.5       | Data 1         | nemory constraints assuming large fields                           | 6 |
|   |           | 7.5.1          | Range checks                                                       | 6 |
|   |           | 7.5.2          | Arithmetic constraints                                             |   |
|   |           | 7.5.3          | Logup computations                                                 |   |
|   | 7.6       |                | nemory constraints assuming small fields                           |   |
|   |           | 7.6.1          | Range checks                                                       |   |

|   |            | 7.6.2 Arithmetic constraints                                            |
|---|------------|-------------------------------------------------------------------------|
|   |            | 7.6.3 Logup computations                                                |
|   | 7.7        | Constraints and logup computation for initial write and final read sets |
|   |            | 7.7.1 Trace elements for initial write and final read sets              |
|   |            | 7.7.2 Arithmetic constraints                                            |
|   |            | 7.7.3 Range checks                                                      |
|   |            | 7.7.4 Logup computation for initial write and final read sets           |
|   |            |                                                                         |
| 8 | Exe        | cution component 74                                                     |
|   | 8.1        | Instruction execution trace elements                                    |
|   | 8.2        | Instruction execution interface                                         |
|   | 8.3        | Basic Instruction Set: flags                                            |
|   | 8.4        | Basic Instruction Set: Common constraints                               |
|   |            | 8.4.1 Constraints assuming large fields                                 |
|   |            | 8.4.2 Constraints assuming small fields                                 |
|   | 8.5        | Basic Instruction Set: ALU Instructions                                 |
|   | 0.0        | 8.5.1 ADD Instruction                                                   |
|   |            | 8.5.2 SUB Instruction                                                   |
|   |            | 8.5.3 SLTU Instruction                                                  |
|   |            | 8.5.4 SLT Instruction                                                   |
|   |            | 8.5.5 SLL Instruction                                                   |
|   |            | 8.5.6 SRL Instruction                                                   |
|   |            | 8.5.7 SRA Instruction                                                   |
|   |            | 8.5.8 XOR Instruction                                                   |
|   |            | 8.5.9 AND Instruction                                                   |
|   |            | 8.5.10 OR Instruction                                                   |
|   | 8.6        | Basic Instruction Set: Branch Instructions                              |
|   | 0.0        | 8.6.1 BEQ Instruction                                                   |
|   |            | 8.6.2 BNE Instruction                                                   |
|   |            |                                                                         |
|   |            | 8.6.3 BLTU Instruction                                                  |
|   |            | 8.6.4 BLT Instruction                                                   |
|   |            | 8.6.5 BGEU Instruction                                                  |
|   | ~ <b>-</b> | 8.6.6 BGE Instruction                                                   |
|   | 8.7        | Basic Instruction Set: Load Instructions                                |
|   |            | 8.7.1 LB Instruction                                                    |
|   |            | 8.7.2 LH Instruction                                                    |
|   |            | 8.7.3 LW Instruction                                                    |
|   |            | 8.7.4 LBU Instruction                                                   |
|   |            | 8.7.5 LHU Instruction                                                   |
|   | 8.8        | Basic Instruction Set: Store Instructions                               |
|   |            | 8.8.1 SB Instruction                                                    |
|   |            | 8.8.2 SH Instruction                                                    |
|   |            | 8.8.3 SW Instruction                                                    |
|   | 8.9        | Basic Instruction Set: Jump Instructions                                |
|   |            | 8.9.1 JAL Instruction                                                   |
|   |            | 8.9.2 JALR Instruction                                                  |
|   | 8.10       | Basic Instruction Set: LUI and AUIPC Instructions                       |
|   |            | 8.10.1 LUI Instruction                                                  |
|   |            | 8.10.2 AUIPC Instruction                                                |
|   | 8.11       | Basic Instruction Set: System Instructions                              |
|   |            | 8.11.1 ECALL Instruction                                                |
|   |            | 8.11.2 EBREAK Instruction 12                                            |

| 9 | $\mathbf{E}\mathbf{x}\mathbf{p}$ | perimental evaluation   | 128 |
|---|----------------------------------|-------------------------|-----|
|   | 9.1                              | Proving microbenchmarks | 128 |
|   | 9.2                              | End-to-end benchmarks   | 129 |

### 1 Introduction

Nexus is a scientific endeavor whose main mission is to build a distributed infrastructure for verifiable computation, thus enabling proofs of correct execution of arbitrary programs, regardless of the size of the computation, choice of programming language, or computer architecture. One of the main tools for building such a distributed infrastructure is the Nexus zkVM, a zero-knowledge virtual machine that emulates virtual machine execution and produces succinct proofs of correct computation.

In [Nex24], Nexus introduced the first version of the Nexus zkVM, Nexus zkVM v1.0, which presents a general-purpose recursive zero-knowledge proof system enabling incrementally verifiable computation (IVC) [Val08], a powerful primitive that allows proving the correct execution of arbitrarily large computations. To achieve this goal, Nexus zkVM v1.0 relied on recursive proof systems, such as Nova [KST22], SuperNova [KS22], and CycleFold [KS23]. In the second iteration of the Nexus zkVM, Nexus zkVM v2.0, Nexus additionally included the HyperNova proof system [KS24] as well as an integration with Jolt [AST24], a zkVM frontend based on the Lasso lookup argument [STW24a].

In this document, we specify the Nexus zero-knowledge Virtual Machine, Nexus zkVM v3.0, which uses the Stwo prover [STW24b] in the backend.

Like previous versions of the Nexus zkVM [Nex24], the instruction set used by the Nexus Virtual Machine is based on the RISC-V RV32I instruction set [RIS19], so existing developer tooling can be used with little modification. However, several important changes are being introduced:

- The new zkVM version now uses a (modified) Harvard architecture in which the program being executed resides in a read-only memory space separate from the data.
- Similarly to Jolt [AST24], the new Nexus zkVM is designed around a "pay only for what you
  use" memory architecture, where unused memory does not need to be proven. In particular, the
  Nexus zkVM now operates on a two-pass tracing architecture, first executing the program to
  obtain statistics about the resultant memory usage, and then executing it again in a modified
  Harvard architecture with a fixed-memory organization determined from the statistics of the
  first execution.
- The backend now uses the Stwo prover by StarkWare, which is based on the Circle STARK
  protocol [HLP24].
- The constraints are now specified using Algebraic Intermediate Representation (AIR) and can work over the Mersenne prime field m31 used by the Stwo prover.

In this document, we focus on the description of the frontend for the new Nexus zkVM, which is responsible for transforming the program execution into an arithmetic representation of the execution which is suitable for the Stwo backend prover.

### 1.1 Design overview

In order to describe Nexus zkVM 3.0, we divide the specification into several components, each of which is responsible for dealing with a particular task:

- CPU: Responsible for fetching, decoding, and preparing instructions for execution
- Register memory: Manages access to the read-write registers
- Program memory: Manages access to the read-only byte-addressable program memory
- Data memory: Manages access to the read-write byte-addressable random-access memory
- Execution: Deals with the actual execution of instructions

Moreover, since the different components may depend on each other, we also specify the interaction between the different components used in the zkVM.

Component representation: In our specification, each component will represented in terms of trace matrices and constraints.

- Trace matrix: This is a matrix of wire values used by the component. In particular, each trace column is a vector of wire values that share some common attribute. For example, we may have one column that keeps track of the instruction opcode and another column that keeps track of the value of the destination register used in each cycle.
- List of Constraints: These constraints resemble the gate functionalities. Each constraint is applied to a subset of rows over a subset of columns of the trace matrix. For example, we may require that the sum of the first and second columns equal the third column for all rows. In our specification, the following types of constraints are used:
  - Finite field arithmetic constraints: Addition and multiplication in some finite field.
  - Lookup Constraints, including both unindexed and indexed lookup relations.

Finite field representation of wire values: The Nexus zkVM represents each wire value as a field element in the Mersenne prime field m31 used by the Stwo prover.

### 1.2 zkVM components

As stated above, there are 5 components in the Nexus zkVM: a *CPU* component, 3 memory components to handle accesses to the *program*, *register*, and *data* memories, and an *execution* component.

### 1.2.1 CPU component

The CPU component of the Nexus zkVM is the component that emulates the behavior of the CPU of the virtual machine, and hence, plays a central role. In particular, this component is responsible for fetching, decoding, and preparing instructions for execution.

In order to achieve these goals, the CPU component performs the following tasks in each CPU cycle:

- Interacts with the program memory component to fetch the next instruction at the address pointed by the program counter;
- Decodes the instruction and check the correctness of its format;
- Interacts with the register memory component to read the values associated with the instructions operands, when necessary;
- Interacts with the execution component to execute the instruction; and
- Interacts with the register memory component to update register contents, when necessary.

The enforcement of the correctness of tasks, such as the sign extension of immediate values or the encoding of instructions, is handled exclusively by the CPU component.

### 1.2.2 Memory components

The Nexus zKVM has different components for handling the behavior of the three types of memory used by the Nexus Virtual Machine: the *program memory*, the *register memory* and *data memory*. While the program memory is a read-only memory space that stores the program being executed, both the register and data memories are read-write random-access memories of sizes 32 and  $2^{32}$ , respectively. Both the program memory and the data memory are byte-addressable, while the register memory stores 32-bit words.

In order to maintain the consistency of the accesses to the register and data memories, the Nexus zkVM uses well-known offline memory checking techniques [BEG<sup>+</sup>94], which we recall in Section 3.2. The main advantage of this technique is that one does not need to keep track of the actual status of the running memory. Instead, the memory checking algorithm only keeps a trace digest of memory accesses, which is inexpensive and can be updated at a cost that is independent of the size of the memory.

In the case of the program memory component, a simpler memory checking technique can be used to maintain the consistency of the memory accesses. In particular, instead of keeping a timestamp for each memory cell, it suffices to associate a counter to each memory cell to keep track of the number of times that each cell has been read.

In this version of the Nexus zkVM, the computation of the digest is implemented using logarithmic derivatives aka logups [EKRN24, Hab22].

### 1.2.3 Execution component

The final component of the Nexus zkVM is the execution component, which is responsible for enforcing the correct execution of the instructions supported by the Nexus Virtual Machine.

Currently, this component provides support for the Nexus Virtual Machine instruction set described in Section 2, which is closely related to the RISC-V RV32I instruction set in the Volume I, Unprivileged Specification version 20191213 [RIS19]. As a result, existing tooling for RISC-V RV32I can usually be used without modification.

#### 1.2.4 Proof Generation

While we have described the various components and constraints separately, it is important to note that the zkVM produces a *single* proof that attests to the correct computation of *all* the components. When describing the constraints for the various components, we use many different variables – some of which appear across all components, and some that are specific to individual components. Since there is only a single proof that is generated, there is in turn only a single trace matrix, with each of these variables corresponding to one (or more) column(s) in the trace matrix. To ensure that only the relevant variables are considered for each component, the constraints utilize component-specific *flag* variables, which effectively ignore columns corresponding to irrelevant variables.

### 1.3 Outline

- Section 2 provides an overview of the new zkVM architecture, highlighting, in particular, the design features of the machine architecture, such as the new memory layout and the set of supported environment calls. This section also introduces a new two-pass tracing mechanism that is used to improve the memory usage of the Nexus zkVM. Section 2 also recalls the Nexus Virtual Machine instruction set and its encoding.
- Section 3 explains a few useful tools and concepts used in the new zkVM, such as offline memory checking and its implementation based on logups. This section also discusses lookup tables for range checks and bitwise operations, which are used throughout the new Nexus zkVM design.
- Section 4 specifies the CPU component that is responsible for fetching, decoding, and preparing
  instructions for execution.
- Section 5 details the read-write register memory component, which is responsible for managing
  access to the registers.
- Section 6 describes the program memory component, which is responsible for managing access to the read-only program memory.
- Section 7 reviews the specification of the read-write data memory component, which is responsible for managing access to the data memory.
- Section 8 specifies the instruction execution component, which is responsible for enforcing constraints that guarantee the correct execution of the instructions supported by the Nexus Virtual Machine.
- Section 9 reviews the results of an experimental evaluation of our zkVM, including both microbenchmarks focusing on prover performance and end-to-end benchmarks that capture the wall-clock time performance that would be observed by an end-user.

• On page 136, we list all the trace variables used in our constraints, along with the pages where they appear.

### 2 The Nexus zkVM machine architecture

The Nexus zkVM defines a map zkVM:  $(P, \mathsf{cfg}, x, y) \mapsto (z, \pi)$  where P is a program,  $\mathsf{cfg}$  is the machine configuration which specifies details such as memory size and maximum execution time, x is a public input, y is a private input, z is the output (or an error), and  $\pi$  is a succinct proof. This proof asserts that when running P in configuration  $\mathsf{cfg}$  on inputs x, y the output is z, or  $P[\mathsf{cfg}](x,y) = z$ . In isolation we can consider the zkVM's machine architecture zkVMMA:  $(P,\mathsf{cfg},x,y)\mapsto z$ . The primary use of this machine architecture is to support the verifiable computation of  $(z,\pi)$ . However, it is a well-defined virtual machine in its own right, and we describe it in detail here. In particular, we highlight the design features of the machine architecture that — though idiosyncratic for traditional general purpose computation — are useful for verifiable computation, such as the particular memory layout and the set of supported environment calls.

### 2.1 Architectural overview

The  $zkVM_{MA}$  is built around an instruction set, not the same as, but close enough to the RISC-V RV32I instruction set in the Volume I, Unprivileged Specification version 20191213 [RIS19] that existing tooling can be used with usually no modification. The primary difference is the lack of a few supported instructions, such as fence and ebreak. The  $zkVM_{MA}$  uses a modified Harvard architecture (similar to the one adopted by Jolt [AST24]), in which the program, data, and input-output memory segments are distinct and permissioned with respect to whether they can be read from, written to, or both. The  $zkVM_{MA}$  has 32-bit words, 32 registers  $x0, x1, \ldots, x31$ , and memory addresses range over  $[0, 2^{32} - 1]$ .

As a virtual machine, the  $\mathsf{zkVM}_{\mathsf{MA}}$  is ultimately a host that executes guest programs. Guest programs will usually be compiled with the Nexus  $\mathsf{zkVM}$  runtime. This runtime,  $\mathsf{nexus-rt}$ , is a mixed assembly/Rust program into which the guest program is linked and which provides the guest program with a suite of helpful macros and methods. Although the  $\mathsf{zkVM}_{\mathsf{MA}}$  does not strictly require use of this runtime, its use greatly eases program development by ensuring compliance with the modified Harvard architecture. The runtime does so by abstracting away the most idiosyncratic features of the  $\mathsf{zkVM}_{\mathsf{MA}}$ , such as its environment calls and its two distinct input interfaces: the public input segment containing x and the private input tapes containing y. The inputs x and y are separated to distinguish between public and private information when executing the full, proving  $\mathsf{zkVM}$ . Since x is part of the public initialization of the  $\mathsf{zkVM}_{\mathsf{MA}}$  and must be known to the verifier, we incorporate it into (its own segment of) the memory of the machine. Conversely, y is kept external in the environment of the  $\mathsf{zkVM}_{\mathsf{MA}}$ , and must be accessed by the guest program via an environment call.

A (guest) program is a sequence of RISC-V instructions. From a practical perspective, we envision P being given to  $\mathsf{zkVM}_{\mathsf{MA}}$  encoded as an ELF file and then  $\mathsf{cfg}, x,$  and y being given to P on invocation, resulting in the output z. Each instruction is specified via an opcode and takes up to three arguments, one of which can be an immediate value. The opcode and possible arguments are:

- opcode is a 7-bit string defining the instruction;
- func3 and func7 are optional bits that further specify the instruction;
- rd is a register selector specifying the destination register;
- rs1 is a register selector specifying the first operand;
- rs2 is a register selector specifying the second operand;
- imm is an immediate value (5, 12 or 20 bits depending on the opcode).

Each instruction is encoded as a 32-bit-long string, starting with 7-bit-long opcode string, followed by an encoding of the arguments, whose format varies with the instruction type. At initialization, all the

general-purpose registers are set to 0. The program counter pc is set to the entrypoint of the binary being executed as specified in its ELF representation. The first instruction to be executed will be the one stored at that position in the program memory. The program counter pc is always advanced by 4 bytes after the execution of each instruction, unless the instruction itself sets the value of pc.

### 2.2 Execution model

### 2.2.1 Two-pass tracing

Following Jolt, the Nexus zkVM is designed around a 'only prove what you use' memory architecture, where unused memory does not need to be proven. As a trade-off, and again like Jolt, the zkVM requires that the amounts of memory used by most of the program segments (all but the heap) must be known before execution — even though the sizes of the stack and output are often execution-dependent. To avoid this chicken-and-egg problem, the  $zkVM_{MA}$  operates on a two-pass tracing architecture: the program is first executed in a (mostly) traditional Harvard architecture, and statistics are kept as to the resultant memory usage. The program P is then executed again using the same x, y in a modified Harvard architecture with a fixed-memory organization determined from the statistics of the first execution, which is more conducive to proving.

Formally, we consider only the second of these passes to be the zkVM<sub>MA</sub> that defines what a 'correct' execution is. The first pass is considered a usability optimization, but we describe it as well throughout this section for completeness. The first pass can even be skipped and a fixed memory model can be used from the beginning, which may also be useful when the execution-dependent information about memory usage inherent in the 'only prove what you use' model constitutes an unacceptable privacy leakage that relegates the machine from being a zkVM into just a so-called 'succinctVM'.

#### 2.2.2 Execution environment

Through environment calls (i.e., the RISC-V ecall instruction), the executed program can interact with the execution environment. The  $zkVM_{MA}$  supports six environment calls. Two of these calls are used for debugging (logging) and optimization (cycle counting), and these are no-ops during the second, proven tracing. The other four, OverwriteStackPointer, OverwriteHeapPointer, ReadFromPrivateInput, and Exit, functionally affect the traced program execution. The first pair of these calls, OverwriteStackPointer and OverwriteHeapPointer, are used by the runtime to move the heap and stack pointers to their reserved memory segments (see Section 2.3) before control is handed over to the user-supplied guest program. The other two, ReadFromPrivateInput and Exit, are more effectual to the result (z) of the execution itself. As the name indicates, ReadFromPrivateInput reads a byte of the private input y off the private input tape into the registers. These private input bytes are otherwise unchecked — like all other register values they will form part of the private witness, and the prover does not otherwise constrain them.

Meanwhile, Exit is used to immediately halt the program's execution. Although Exit takes an input for debugging purposes during the first tracing pass, this value is ignored in proving: instead, the contents of the exit code and public output memory segments at the time the exit call is made form the output of the program. Exit code 0 is used by the runtime to indicate a successful execution, while code 1 is used to indicate a panic thrown by the program. The runtime also exposes a convenience function that allows users to have guest programs exit with a supplied exit code, analogous to the exit system calls (like Rust's std::process::exit(code: i32) -> !) provided by most programming languages. It is important to note that a non-zero exit code indicates only a failure of the guest program, and the zkVM will prove such executions the same as it proves ones that halt successfully. This allows the zkVM to be used to prove faulty, buggy, or otherwise dangerous executions of programs, enabling applications of zero-knowledge verifiable computation to, e.g., coordinated disclosure of security vulnerabilities as foreseen by the DARPA SIEVE program [SIE21].



Figure 1: The memory architectures of the two-pass model. During the first Harvard pass shown in (a), the memory is split into a variable-sized read-write RAM containing the program, heap, and stack, a read-only public input (prefixed by a word is containing the input size), a write-only public output (prefixed by a word ec to where the exit code is written), and a no-access associated data section. In the modified Harvard pass shown in (b), the memory segments are now unified into a single linear fixed-size address space, albeit with the same reading and writing permissions structure (omitted from the figure). A space is reserved for the CPU registers at the beginning of this address space, but individual provers can choose whether to identify the registers with those addresses or maintain them in a separate namespace (such as the x0-x31 naming used by the RISC-V spec).

### 2.2.3 Precompiles

In addition to the RISC-V RV32I instruction set, the  $zkVM_{MA}$  is also designed to support precompiles, which are custom instructions that implement more complex functionality such as cryptographic hash functions or elliptic curve operations. The runtime supports defining and linking in precompiles within a compiled ELF binary, as well as integrating with an extensibility hardpoint exposed by zkVM that invokes the precompile library to both (a) execute the computation during tracing; and (b) provide suitable constraints for use to prove the correctness of the instruction evaluation.

These custom instructions are generated by the runtime using an LLVM directive (.insn), and so are treated as first-class instructions within zkVM<sub>MA</sub> for any execution they are linked in for, rather than being invoked via environment call. Precompiles offer exceptional performance benefits when the cost of constraining their outputs can be expressed much more simply than as the sum of constraints for the (usually long) list of instructions that computing their output natively would involve.

### 2.3 Memory layout

Fig. 1 shows the memory layout of the  $zkVM_{MA}$  for both the first (a) and second (b) passes. Each memory has three attributes: in which address space it exists, with what permission structure (readwrite, read-only, write-only, or no access), and whether it is of fixed or variable size. For the first pass, the organization of the memory is a mostly traditional Harvard architecture, with five distinct address spaces: (i) the cpu registers, (ii) the length (is) and content of the public input, (iii) the error code (ec) and content of the public output, (iv) the associated data, and (v) the RAM containing both the program and data segments (including the stack and heap). Other than the joining of the program and data memories this forms a relatively traditional Harvard architecture. Each of these memory segments is discussed in greater detail in the remainder of the section. In terms of permissions and sizes, (i) is read-write and fixed, (ii) is read-only and fixed, (iii) is write-only and variable, (iv) is no-access and fixed, and (v) is variable and mixed read-only (for static global variables) and read-write (for the remaining global variables and the entire data memory). Further, for (v) the guest program itself has no access to its instructions: they can only be accessed by the CPU.

For the second tracing pass, the memories are combined into a fixed-size, linear memory layout with one single, unified address space ranging over  $[0, 2^{32} - 1]$ . As this is the memory layout used

in tracing, the  $zkVM_{MA}$  is best understood as a modified Harvard architecture, as the permission structures on the segments are maintained but they no longer exist in distinct memories. Also, a small additional read-only 8-byte segment containing well-known (wk) location pointers is introduced to enable the runtime to successfully access the now relocated public input and output segments.

### 2.3.1 Registers

The  $\mathsf{zkVM}_{\mathsf{MA}}$  has 32 registers that hold 32-bit word values. The registers are addressed by 5-bit register selectors  $\{\mathsf{x0},\ldots,\mathsf{x31}\}$ . We use the convention that the variables rs1 and rs2 refer to 'source' register selectors for values that are read in an instruction but left unchanged, while rd is a 'destination' register selector referring to a value that may be changed. We write  $R[\ ]$  for the array of current register values, e.g.,  $R[\mathrm{rd}]$  refers to the value of the register indicated by the 5-bit register selector rd. Register  $\mathsf{x0}$  always holds the value 0, i.e., if an instruction updates  $R[\mathrm{rd}=\mathsf{x0}]$  to a non-zero word the instruction will go through but the register will immediately be reset to  $R[\mathsf{x0}]=0$ .

When tracing the program the  $zkVM_{MA}$  stores the state of the registers separate from its record of memory operations. However, it also reserves the addresses 0x00 through 0x7F, so that prover integrations are able to identify the registers with those addresses should the prover need to consider the entire state of the machine to lie within a single address space (as, e.g., does Jolt [AST24]). For example, a prover might treat R[x1] as being stored in the word beginning at 0x04. At present however this is unused – the zkVM treats the registers as existing in an independent namespace, and the first 0x80 addresses as containing zero-initialized, non-writeable memory.

### 2.3.2 Well-known location pointers

When tracing, the  $zkVM_{MA}$  also reserves two words of memory, the first from 0x80 to 0x83 and the second from 0x84 to 0x87, which contain pointers to other memory locations for use by the runtime to manage input and output handling. These pointers existing in a well-known location for the runtime to access enables the  $zkVM_{MA}$  to dynamically situate the input and output memory segments within the unified, linear architecture while still allowing the guest program easy access to their contents, as further discussed below. The pointers contained within these two memory locations are fixed by the  $zkVM_{MA}$  during initialization and can be considered part of the configuration cfg of the zkVM.

### 2.3.3 Program memory

The zkVM<sub>MA</sub> executes a program P encoded in an ELF binary. The core of such a binary is the program consisting of a sequence of instructions and supporting read-only data (such as that contained in the .rodata segment). Before the zkVM<sub>MA</sub> executes P, this data must be loaded into a read-only segment of program memory, and after that remains unchanged during execution. All instructions are encoded as 32-bit words and addressed via a 32-bit program counter pc. The program memory is byte-addressable. However, since each opcode is 32-bits long, the program memory enforces 4-byte-memory alignment and raises an instruction-address-misaligned exception whenever this condition is not satisfied. We write  $P[\ ]$  for the array of program instructions. The zkVM<sub>MA</sub> will start executing P at the entrypoint specified in the ELF, *i.e.*, if the entrypoint is 0 then the first instruction to be executed is P[0]. The program must have at least 1 instruction and its size |P| cannot exceed  $2^{32}$ . If a program counter pc  $\geq |P|$  is used, the zkVM<sub>MA</sub> will raise an exception. The zkVM<sub>MA</sub> halts execution when either pc reaches an unimplemented instruction or when an exit environment call is made.

Additionally, the binary may contain read-write segments, such as the .bss and .data segments commonly used by programming languages to store global variables. Being writable, these segments must also be private over the course of the execution post-initialization. As such, the  $zkVM_{MA}$  functionally treats these segments as a small part of the RAM that is non-contiguous with it, but with additional constraints to guarantee they are initialized as specified in the binary. In order to simplify management of the 'dual-nature' of these segments as both part of the program but also writeable, the

 $zkVM_{MA}$  keeps the program memory and data memory in the same address space during the first-pass tracing, but with distinct permissions for the writeable vs. read-only components. Those permissions are maintained in the unified address space of the linear memory model used in the second pass.

### 2.3.4 Public input

The public input segment  $PI[\ ]$  contains an input of length n bytes, where PI[k] for  $k \in [0,n)$  stores the kth byte of the public input x. It is prefaced by a four-byte (one 32-bit word) segment  $IS[\ ]$ , that stores n so that the guest program can determine the length of the input made available to it. To read the kth byte of the input the runtime loads k' = k + offset into a register R[rs1], and then invokes a custom instruction  $\min$  rin rd rs1. During the first tracing pass,  $\inf$  segment, setting R[rd] = PI[R[rs1]] = PI[k'] = PI[k] as needed. During the second tracing pass – which is the one proven –  $\min$  is treated as a pseudoinstruction equivalent to 1b, and the offset is loaded from the first word of the well-known segment (addresses 0x80 through 0x83 in the unified address space of the linear memory model). So if, e.g., in the unified address space the contents of PI begin at address p, then the k-p will set offset k-p and the ensuing load at k' = k + offset = k + p will be from the memory location containing the k-th byte of the input.

During both tracing passes, the contents of PI and IS are treated as read-only – the  $\mathsf{zkVM}_{\mathsf{MA}}$  will halt if an attempt is made by the guest program to write to those memory segments. When proving with the  $\mathsf{zkVM}$ , in the memory checking the prover constrains the input segment at the beginning of the execution to be its claimed value (see Section 7). During verification the values those constraints refer to (that is, the public input and its length) are provided to the circuit by the verifier, and so the constraints will not be satisfied if a malicious prover uses an input x when proving but claims to the verifier to have used  $x' \neq x$  as the input instead.

#### 2.3.5 Associated data

For many zkVM use cases it can be useful to be able to bind arbitrary contextual information about an execution or the context of the proving into the proof itself. For example, from the perspective of the zkVM the program is just a compiled binary. By incorporating the hash of the program as originally written in a high-level language – such as Rust or Python – the proof can carry a reference to that code for use by the verifier, such as for auditing its functional correctness or the correctness of its compilation. In the particular case where the program forms a standalone software package, such as a Cargo crate or Python wheel, then binding in the hash of that package can even relate the proof to the broader software ecosystem.

To support binding arbitrary external information into the proof, the  $\mathsf{zkVM}_{\mathsf{MA}}$  contains an associated data memory segment that the prover can populate with an arbitrary bytestring. This segment is no-access within the Harvard architecture – it can neither be written to nor read from during an execution. But, the verifier otherwise treats it as a public input segment with checked contents that can then be used post-verification for application-focused infrastructure built on top of the proof, such as the aforementioned auditing. We place the associated data before the public output and exit code, so that the memory regions after it form a contiguous space of writeable segments.

### 2.3.6 Public output and exit code

In principle, the public output and exit code work in much the same way as the public input, except the relevant custom instruction is wou rs1 rs2 – writing the content of R[rs2] into the R[rs1]-th byte of the output and interpreted on the second pass as sb – and the offset is loaded from the second word of the well-known segment (addresses 0x84 through 0x87 in the unified address space of the linear memory model). Otherwise, the most significant difference is that the single word exit code segment  $EC[\ ]$  and the public output segment  $PO[\ ]$  are write-only, rather than read-only. During the

first tracing pass the public output segment can grow arbitrarily (up to addressing limits) to support additional written output. The length of the resultant output is then reserved ahead of time for the memory segment for the second and proven tracing pass.

### 2.3.7 Data memory

The  $\mathsf{zkVM}_{\mathsf{MA}}$  includes a RAM, denoted  $M[\ ]$ . The RAM contains bytes indexed by 32-bit addresses such that  $M[\mathsf{addr}]$  refers to the current byte value at address addr. The data memory size |M| can be at most  $2^{32}$  so that the entire memory is addressable. Addresses are reduced modulo |M| to handle overflows, i.e.,  $M[\mathsf{addr}] = M[\mathsf{addr} \bmod |M|]$ . Due to this reduction, for consistency the same |M| must be used for both evaluation passes, but as |M| is only explicitly determined at the end of the first pass based on its maximum memory usage, this is guaranteed by design. Instructions can read from the data memory, write to the data memory, or leave the data memory untouched. The primary use of the data memory is to store the stack and the heap. During the first tracing pass, its size is variable and the stack and heap grow towards each other, as is standard. During the second tracing pass the stack and heap still grow towards each other, but are given fixed-size segments within which to do so. As a consequence, the size of the data memory is limited to only what is needed, enabling quicker proving and smaller proofs in the  $\mathsf{zkVM}$  as only memory that is used is 'paid for' by needing to prove its contents.

### 2.4 Basic instruction set and binary encoding

The Nexus Virtual Machine (NVM) instruction set extends the RISC-V RV32I Instruction Set Architecture by introducing an RV32Nexus extension that captures new instructions (especially future precompiles). Note that fence has no effect since our machine only has one CPU core (a.k.a. Hart in RISC-V terminology). The ebreak instruction is unsupported by the zkVM<sub>MA</sub>, but the zkVM is nonetheless able to prove traces that utilize the instruction as its constraints are nearly identical to that of ecall. The unimp instruction is supported by neither the zkVM<sub>MA</sub> nor the zkVM as a whole, though for completeness the decoding circuits do include some handling for it. The NVM instruction set is summarized in Table 1 and the binary encoding of all instructions specified in Table 2.

The Nexus VM will enforce 1-byte alignment for the data memory and 4-byte alignments for the program memory. In particular, the program counter must be a multiple of 4. For instructions which operate over half-words or words, we will load one byte at a time and reassemble them into half-words or words. Nevertheless, as indicated in Table 1, load-half (1h) and load-word (1w) instructions assume respectively 2-byte and 4-byte alignments and raise an exception whenever this condition is not met.

If the opcode is invalid, we will raise an exception. If there exists an ISA overflow in address calculation, we will also throw an exception.

As it is standard for RISC-V programs, a program halts whenever the unimp instruction is loaded into the CPU. The output of the program can then be read from the dedicated memory location used for outputs.

### 2.5 Instruction encoding

In order to abstract away the details of the encoding of virtual machine instructions, zkVMs often adopt an abstract representation of these instructions inside the prover. For instance, in Jolt [AST24] each instruction is viewed as a 5-tuple (opcode, rs1, rs2, rd, imm) denoting an operation code, two source registers, a destination register, and an immediate value. Although they may lack complete formal specifications, the code and documentation of some other zkVM projects show similar encoding approaches, demonstrating it to be a common arithmetization pattern.<sup>1</sup>

<sup>&</sup>lt;sup>1</sup>For example, in Valida (https://lita.gitbook.io/lita-documentation/architecture/valida-zk-vm/technical-design-vm) each instruction is encoded as six field elements, denoting the instruction operation code, three possible operands, and two flags. These flags indicate respectively whether the second and third operands are

Table 1: Summary of the Nexus Virtual Machine Instruction Set, where operations are mod  $2^{32}$  and sext indicates a sign extension. In an arithmetic shift, the sign bit is copied into the vacated upper bits. In a logical shift, zero is copied into the vacated bits.

| Instruction mnemonic | Arguments             | Description of functionality                                                             |
|----------------------|-----------------------|------------------------------------------------------------------------------------------|
|                      |                       | no operation, implemented as (addi x0 x0 0)                                              |
| nop                  | $\operatorname{rd} i$ |                                                                                          |
| lui                  | $\operatorname{rd} i$ | loads $i$ into top 20 bits of $R[rd]$ , fills lower 12 bits with 0's                     |
| auipc                |                       | loads i into top 20 bits of $R[rd]$ , fills lower 12 bits with 0's, adds $pc$ to $R[rd]$ |
| jal                  | $\operatorname{rd} i$ | stores $pc + 4$ into $R[rd]$ , jumps to $pc + \text{sext}(i)$                            |
| jalr                 | rd rs1 i              | stores $pc + 4$ into $R[rd]$ , jumps to $(R[rs1] + sext(i)) \& 0xFFFFFFE$                |
| ecall                |                       | system call (see Table 3)                                                                |
| ebreak               |                       | system call (see Table 3)                                                                |
| fence                | $pr\ sc\ fm$          | mapped to nop                                                                            |
| unimp                |                       | the machine halts when encountering this instruction; $pc$ is not updated                |
| beq                  | rs1 rs2 $i$           | branches to $pc + \text{sext}(i)$ if $(R[rs1] = R[rs2])$                                 |
| bne                  | rs1 rs2 $i$           | branches to $pc + \text{sext}(i)$ if $(R[\text{rs1}] \neq R[\text{rs2}])$                |
| blt                  | rs1 rs2 $i$           | branches to $pc + \text{sext}(i)$ if $(R[rs1] < R[rs2])$ (signed comparison)             |
| bge                  | rs1 rs2 $i$           | branches to $pc + \text{sext}(i)$ if $(R[rs1] \ge R[rs2])$ (signed comparison)           |
| bltu                 | rs1 rs2 $i$           | branches to $pc + \text{sext}(i)$ if $(R[rs1] < R[rs2])$ (unsigned comparison)           |
| bgeu                 | rs1 rs2 $i$           | branches to $pc + \text{sext}(i)$ if $(R[rs1] \ge R[rs2])$ (unsigned comparison)         |
| 1b                   | rd rs1 $i$            | loads the sign extension of the byte at $M[R[rs1] + sext(i)]$ into $R[rd]$               |
| lh l                 | rd rs1 $i$            | loads the sign extension of the half-word at $M[R[rs1] + sext(i)]$ into $R[rd]$          |
| lw                   | rd rs1 $i$            | loads the word at $M[R[rs1] + sext(i)]$ into $R[rd]$                                     |
| 1bu                  | rd rs1 $i$            | loads the zero extension of the byte at $M[R[rs1] + sext(i)]$ into $R[rd]$               |
| 1hu                  | rd rs1 $i$            | loads the zero extension of the half-word at $M[R[rs1] + sext(i)]$ into $R[rd]$          |
| sb                   | rs1 rs2 $i$           | stores the first byte of $R[rs2]$ at $M[R[rs1] + sext(i)]$                               |
| sh                   | rs1 rs2 $i$           | stores the first half-word of $R[rs2]$ at $M[R[rs1] + sext(i)]$                          |
| sw                   | rs $1 \text{ rs} 2 i$ | stores $R[rs2]$ at $M[R[rs1] + sext(i)]$                                                 |
| addi                 | rd rs1 $i$            | sets $R[rd]$ to $R[rs1] + sext(i)$                                                       |
| slti                 | rd rs1 $i$            | sets $R[rd] = 1$ if $(R[rs1] < sext(i))$ and 0 otherwise (signed comparison)             |
| sltiu                | rd rs1 $i$            | sets $R[rd] = 1$ if $(R[rs1] < sext(i))$ and 0 otherwise (unsigned comparison)           |
| xori                 | rd rs1 $i$            | sets $R[rd]$ to the bitwise XOR of $R[rs1]$ and $sext(i)$                                |
| ori                  | rd rs1 $i$            | sets $R[rd]$ to the bitwise OR of $R[rs1]$ and $sext(i)$                                 |
| andi                 | rd rs1 $i$            | sets $R[rd]$ to the bitwise AND of $R[rs1]$ and $sext(i)$                                |
| slli                 | rd rs1 $i$            | sets $R[rd]$ to $R[rs1] \ll (i \& 0x0000001F)$ (logical shift)                           |
| srli                 | rd rs1 $i$            | sets $R[rd]$ to $R[rs1] \gg (i \& 0x0000001F)$ (logical shift)                           |
| srai                 | rd rs1 $i$            | sets $R[rd]$ to $R[rs1] \gg (i \& 0x0000001F)$ (arithmetic shift)                        |
| add                  | rd rs1 rs2            | sets $R[rd]$ to $R[rs1] + R[rs2]$                                                        |
| sub                  | rd rs1 rs2            | sets $R[rd]$ to $R[rs1] - R[rs2]$                                                        |
| sll                  | rd rs1 rs2            | sets $R[rd]$ to $R[rs1] \ll (R[rs2] \& 0x0000001F)$ (logical shift)                      |
| slt                  | rd rs1 rs2            | sets $R[rd] = 1$ if $(R[rs1] < R[rs2])$ and 0 otherwise (signed comparison)              |
| sltu                 | rd rs1 rs2            | sets $R[rd] = 1$ if $(R[rs1] < R[rs2])$ and 0 otherwise (unsigned comparison)            |
| xor                  | rd rs1 rs2            | sets $R[rd]$ to the bitwise XOR of $R[rs1]$ and $R[rs2]$                                 |
| srl                  | rd rs1 rs2            | sets $R[rd]$ to $R[rs1] \gg (R[rs2] \& 0x0000001F)$ (logical shift)                      |
| sra                  | rd rs1 rs2            | sets $R[rd]$ to $R[rs1] \gg (R[rs2] \& 0x0000001F)$ (arithmetic shift)                   |
| or                   | rd rs1 rs2            | sets $R[rd]$ to the bitwise OR of $R[rs1]$ and $R[rs2]$                                  |
| and                  | rd rs1 rs2            | sets $R[rd]$ to the bitwise AND of $R[rs1]$ and $R[rs2]$                                 |

Table 2: Binary Encoding of Nexus Virtual Machine Instructions, where  $\langle d \rangle$ ,  $\langle s_1 \rangle$ , and  $\langle s_2 \rangle$  denote respectively the binary representation of the 5-bit register selectors rd, rs1, rs2, and  $\langle i \rangle [a:b]$  denotes the binary representation of the bits a through b of the immediate value i.

| Instruction Arguments mnemonic |                     |     | Binary Encodings (Bits 31–0) |                                                          |                             |                       |                     |                              |         |
|--------------------------------|---------------------|-----|------------------------------|----------------------------------------------------------|-----------------------------|-----------------------|---------------------|------------------------------|---------|
| lui                            | rd                  | i   |                              |                                                          |                             |                       | $\langle d \rangle$ | 0110111                      |         |
| auipc                          | $^{\mathrm{rd}}$    | i   |                              |                                                          | $\langle i \rangle [31:12]$ |                       |                     | $\langle d \rangle$          | 0010111 |
| jal                            | $^{\mathrm{rd}}$    | i   |                              | $\langle i  angle$                                       | 20 10:1 11 1                | 9:12                  |                     | $\langle d \rangle$          | 1101111 |
| jalr                           | $^{\mathrm{rd}}$    | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0]                        | $\langle s_1 \rangle$ | 000                 | $\langle d \rangle$          | 1100111 |
| ecall                          |                     |     |                              | 0000000                                                  | 000000                      | 00000                 | 000                 | 00000                        | 1110011 |
| ebreak                         |                     |     |                              | 0000000                                                  | 00001                       | 00000                 | 000                 | 00000                        | 1110011 |
| fence                          | pr                  | sc  | fm                           | fm $sc$ $pr$                                             |                             | 00000                 | 000                 | 00000                        | 0001111 |
| unimp                          |                     |     |                              | 0000000                                                  | 000011                      | 00000                 | 001                 | 00000                        | 1110011 |
| beq                            | rs1                 | rs2 | i                            | $\langle i \rangle [12 10:5] \qquad \langle s_2 \rangle$ |                             | $\langle s_1 \rangle$ | 000                 | $\langle i \rangle [4:1 11]$ | 1100011 |
| bne                            | rs1                 | rs2 | i                            | $\langle i \rangle [12 10:5]$                            | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 001                 | $\langle i \rangle [4:1 11]$ | 1100011 |
| blt                            | rs1                 | rs2 | i                            | $\langle i \rangle [12 10:5]$                            | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 100                 | $\langle i \rangle [4:1 11]$ | 1100011 |
| bge                            | rs1                 | rs2 | i                            | $\langle i \rangle [12 10:5]$                            | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 101                 | $\langle i \rangle [4:1 11]$ | 1100011 |
| bltu                           | rs1                 | rs2 | i                            | $\langle i \rangle [12 10:5]$                            | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 110                 | $\langle i \rangle [4:1 11]$ | 1100011 |
| bgeu                           | rs1                 | rs2 | i                            | $\langle i \rangle [12 10:5]$                            | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 111                 | $\langle i \rangle [4:1 11]$ | 1100011 |
| lb                             | rd                  | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0]                        | $\langle s_1 \rangle$ | 000                 | $\langle d \rangle$          | 0000011 |
| lh                             | $_{\mathrm{rd}}$    | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0                         | $\langle s_1 \rangle$ | 001                 | $\langle d \rangle$          | 0000011 |
| lw                             | $^{\mathrm{rd}}$    | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0                         | $\langle s_1 \rangle$ | 010                 | $\langle d \rangle$          | 0000011 |
| lbu                            | $^{\mathrm{rd}}$    | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0]                        | $\langle s_1 \rangle$ | 100                 | $\langle d \rangle$          | 0000011 |
| lhu                            | $^{\mathrm{rd}}$    | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0]                        | $\langle s_1 \rangle$ | 101                 | $\langle d \rangle$          | 0000011 |
| sb                             | rs1                 | rs2 | i                            | $\langle i \rangle [11:5]$                               | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 000                 | $\langle i \rangle [4:0]$    | 0100011 |
| sh                             | rs1                 | rs2 | i                            | $\langle i \rangle [11:5]$                               | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 001                 | $\langle i \rangle [4:0]$    | 0100011 |
| sw                             | rs1                 | rs2 | i                            | $\langle i \rangle [11:5]$                               | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 010                 | $\langle i \rangle [4:0]$    | 0100011 |
| addi                           | rd                  | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0]                        | $\langle s_1 \rangle$ | 000                 | $\langle d \rangle$          | 0010011 |
| slti                           | $_{\rm rd}$         | rs1 | i                            | $\langle i \rangle [1]$                                  |                             | $\langle s_1 \rangle$ | 010                 | $\langle d \rangle$          | 0010011 |
| sltiu                          | rd                  | rs1 | i                            | $\langle i \rangle [1]$                                  |                             | $\langle s_1 \rangle$ | 011                 | $\langle d \rangle$          | 0010011 |
| xori                           | rd                  | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0                         | $\langle s_1 \rangle$ | 100                 | $\langle d \rangle$          | 0010011 |
| ori                            | $_{\mathrm{rd}}$    | rs1 | i                            | $\langle i \rangle [1]$                                  |                             | $\langle s_1 \rangle$ | 110                 | $\langle d \rangle$          | 0010011 |
| andi                           | $^{\mathrm{rd}}$    | rs1 | i                            | $\langle i \rangle [1]$                                  | 1:0]                        | $\langle s_1 \rangle$ | 111                 | $\langle d \rangle$          | 0010011 |
| slli                           | rd                  | rs1 | i                            | 0000000                                                  | $\langle i \rangle [4:0]$   | $\langle s_1 \rangle$ | 001                 | $\langle d \rangle$          | 0010011 |
| srli                           | $^{\mathrm{rd}}$    | rs1 | i                            | 0000000                                                  | $\langle i \rangle [4:0]$   | $\langle s_1 \rangle$ | 101                 | $\langle d \rangle$          | 0010011 |
| srai                           | $^{\mathrm{rd}}$    | rs1 | i                            | 0100000                                                  | $\langle i \rangle [4:0]$   | $\langle s_1 \rangle$ | 101                 | $\langle d \rangle$          | 0010011 |
| add                            | rd                  | rs1 | rs2                          | 0000000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 000                 | $\langle d \rangle$          | 0110011 |
| sub                            | rd                  | rs1 | rs2                          | 0100000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 000                 | $\langle d \rangle$          | 0110011 |
| sll                            | $_{\mathrm{rd}}$    | rs1 | rs2                          | 0000000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 001                 | $\langle d \rangle$          | 0110011 |
| slt                            | rd                  | rs1 | rs2                          | 0000000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 010                 | $\langle d \rangle$          | 0110011 |
| sltu                           | $^{\mathrm{rd}}$    | rs1 | rs2                          | 0000000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 011                 | $\langle d \rangle$          | 0110011 |
| xor                            | $\operatorname{rd}$ | rs1 | rs2                          | 0000000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 100                 | $\langle d \rangle$          | 0110011 |
| srl                            | $^{\mathrm{rd}}$    | rs1 | rs2                          | 0000000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 101                 | $\langle d \rangle$          | 0110011 |
| sra                            | $_{\mathrm{rd}}$    | rs1 | rs2                          | 0100000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 101                 | $\langle d \rangle$          | 0110011 |
| or                             | $^{\mathrm{rd}}$    | rs1 | rs2                          | $0000000$ $\langle s_2 \rangle$                          |                             | $\langle s_1 \rangle$ | 110                 | $\langle d \rangle$          | 0110011 |
| and                            | $\operatorname{rd}$ | rs1 | rs2                          | 0000000                                                  | $\langle s_2 \rangle$       | $\langle s_1 \rangle$ | 111                 | $\langle d \rangle$          | 0110011 |

Table 3: Behavior of the ecall and ebreak instructions in the Nexus Virtual Machine.

| R[x17] value | Description of the ecall / ebreak functionality           | pc update              |
|--------------|-----------------------------------------------------------|------------------------|
| 0x200        | system call to write to the memory, used for debugging    | $pc \leftarrow pc + 4$ |
| 0x201        | system call to halt the virtual machine, similar to unimp | pc is not updated      |
| 0x400        | system call to read from private input,                   | $pc \leftarrow pc + 4$ |
|              | loads a 32-bit value onto $R[x10]$                        |                        |
| 0x401        | system call to obtain the current cycle count             | $pc \leftarrow pc + 4$ |
| 0x402        | system call to overwrite the stack pointer,               | $pc \leftarrow pc + 4$ |
|              | loads a 32-bit value onto $R[x2]$                         |                        |
| 0x403        | system call to overwrite the heap pointer,                | $pc \leftarrow pc + 4$ |
|              | loads a 32-bit value onto $R[x10]$                        |                        |

Our zkVM follows the same pattern and represents each instruction by a tuple of 5 field elements, denoting the instruction opcode opcode, three possible operands (op-a, op-b, op-c), and a flag imm-c indicating whether the third operand is an immediate value or register address.

The following table shows how to map Nexus Virtual Machine instructions to the above encoding.

| immediate | values | or | offsets. |
|-----------|--------|----|----------|
|           |        |    |          |

 $\label{thm:condition} \begin{tabular}{l} Table 4: Representation of Nexus Virtual Machine Instructions in the format (opcode, op-a, op-b, op-c, imm-c). \end{tabular}$ 

| NVM Instruction |              | Instruction Encoding |              |                |         |       |  |
|-----------------|--------------|----------------------|--------------|----------------|---------|-------|--|
| mnemonic        | Arguments    | opcode               | op-a         | op-b           | op-c    | imm-c |  |
| nop             |              |                      | mappe        | ed to (addi x0 | 0 x0 0) |       |  |
| lui             | rd $i$       | LUI                  | rd           | 0              | i       | 1     |  |
| auipc           | rd $i$       | AUIPC                | rd           | 0              | i       | 1     |  |
| jal             | rd $i$       | JAL                  | rd           | 0              | i       | 1     |  |
| jalr            | rd rs1 $i$   | JALR                 | rd           | rs1            | i       | 1     |  |
| ecall           |              | ECALL                | 0   x2   x10 | x17            | 0       | 1     |  |
| ebreak          |              | EBREAK               | 0   x2   x10 | x17            | 0       | 1     |  |
| fence           | $pr\ sc\ fm$ |                      | r            | napped to no   | p       |       |  |
| unimp           |              | UNIMP                | 0            | 0              | 0       | 1     |  |
| beq             | rs1 rs2 $i$  | BEQ                  | rs1          | rs2            | i       | 1     |  |
| bne             | rs1 rs2 $i$  | BNE                  | rs1          | rs2            | i       | 1     |  |
| blt             | rs1 rs2 $i$  | BLT                  | rs1          | rs2            | i       | 1     |  |
| bge             | rs1 rs2 $i$  | BGE                  | rs1          | rs2            | i       | 1     |  |
| bltu            | rs1 rs2 $i$  | BLTU                 | rs1          | rs2            | i       | 1     |  |
| bgeu            | rs1 rs2 $i$  | BGEU                 | rs1          | rs2            | i       | 1     |  |
| lb              | rd rs1 $i$   | LB                   | rd           | rs1            | i       | 1     |  |
| lh              | rd rs1 $i$   | LH                   | rd           | rs1            | i       | 1     |  |
| lw              | rd rs1 $i$   | LW                   | rd           | rs1            | i       | 1     |  |
| lbu             | rd rs1 $i$   | LBU                  | rd           | rs1            | i       | 1     |  |
| lhu             | rd rs1 $i$   | LHU                  | rd           | rs1            | i       | 1     |  |
| sb              | rs1 rs2 $i$  | SB                   | rs1          | rs2            | i       | 1     |  |
| sh              | rs1 rs2 $i$  | SH                   | rs1          | rs2            | i       | 1     |  |
| SW              | rs1 rs2 $i$  | SW                   | rs1          | rs2            | i       | 1     |  |
| addi            | rd rs1 $i$   | ADD                  | rd           | rs1            | i       | 1     |  |
| slti            | rd rs1 $i$   | SLT                  | rd           | rs1            | i       | 1     |  |
| sltiu           | rd rs1 $i$   | SLTU                 | rd           | rs1            | i       | 1     |  |
| xori            | rd rs1 $i$   | XOR                  | rd           | rs1            | i       | 1     |  |
| ori             | rd rs1 $i$   | OR                   | rd           | rs1            | i       | 1     |  |
| andi            | rd rs1 $i$   | AND                  | rd           | rs1            | i       | 1     |  |
| slli            | rd rs1 $i$   | SLL                  | rd           | rs1            | i       | 1     |  |
| srli            | rd rs1 $i$   | SRL                  | rd           | rs1            | i       | 1     |  |
| srai            | rd rs1 $i$   | SRA                  | rd           | rs1            | i       | 1     |  |
| add             | rd rs1 rs2   | ADD                  | rd           | rs1            | rs2     | 0     |  |
| sub             | rd rs1 rs2   | SUB                  | rd           | rs1            | rs2     | 0     |  |
| sll             | rd rs1 rs2   | SLL                  | rd           | rs1            | rs2     | 0     |  |
| slt             | rd rs1 rs2   | SLT                  | rd           | rs1            | rs2     | 0     |  |
| sltu            | rd rs1 rs2   | SLTU                 | rd           | rs1            | rs2     | 0     |  |
| xor             | rd rs1 rs2   | XOR                  | rd           | rs1            | rs2     | 0     |  |
| srl             | rd rs1 rs2   | SRL                  | rd           | rs1            | rs2     | 0     |  |
| sra             | rd rs1 rs2   | SRA                  | rd           | rs1            | rs2     | 0     |  |
| or              | rd rs1 rs2   | OR                   | rd           | rs1            | rs2     | 0     |  |
| and             | rd rs1 rs2   | AND                  | rd           | rs1            | rs2     | 0     |  |
|                 |              |                      |              |                |         |       |  |

## 3 Preliminaries

In this section, we explain a few useful tools and concepts used in the context of the new zkVM.

### 3.1 Stwo underlying algebraic structures

This specification does not describe the inner details of the Stwo prover but we briefly recall here the finite fields used in Stwo, which are relevant for the arithmetization of our circuits and the specification of constraints.

Fields: There are three main fields used in the stwo and and stwo-cairo repositories for the Stwo prover:

m31: This is the base prime field  $\mathbb{F}_p$  used in the stwo and and stwo-cairo repositories. The size of the field is  $p = 2^{31} - 1$ , which is a Mersenne prime.

cm31: This is the field  $\mathbb{F}'$  which is a complex extension of  $\mathbb{F}_p$ , also denoted by  $\mathbb{F}_p(i)$ . Note that the complex extension is also equivalent to  $\mathbb{F}_p[X]/(X^2+1)$ . Further, each element in  $\mathbb{F}'$  can be written as a+ib, with addition and subtraction performed in a straightforward componentwise manner.

**Multiplication:** 
$$(a_1 + ib_1)(a_2 + ib_2) = (a_1a_2 - b_1b_2) + i(a_1b_2 + a_2b_1)$$

**Inverse:** 
$$(a+ib)^{-1} = (a-ib)(a^2+b^2)^{-1}$$
 for  $a+ib \neq 0$ 

The components themselves are multiplied using the underlying field multiplication of  $\mathbb{F}_p$ .

qm31: This is the field  $\mathbb{F}''$ , which is the quadratic extension of  $\mathbb{F}'$ . This can also be written as  $\mathbb{F}'(\phi)$ , where  $\phi$  is the root of the equation  $X^2 = 2 + i$ , with  $\mathbb{F}''$  equivalent to  $\mathbb{F}'[X]/(X^2 - 2 - i)$ . Each element of  $\mathbb{F}''$  can be written as  $(a+ib) + \phi(c+id)$ . For simplicity, this can be written as  $u + \phi v$ , where u = (a+ib) and v = (c+id).

**Multiplication:** 
$$(u_1 + \phi v_1)(u_2 + \phi v_2) = (u_1u_2 + (2+i)v_1v_2) + \phi(u_1v_2 + u_2v_1)$$

**Inverse:** 
$$(u + \phi v)^{-1} = (u - \phi v)(u^2 - (2+i)v^2)^{-1}$$
 for  $u + \phi v \neq 0$ 

Note that the above operations, such as multiplication between u and v above are done as in the multiplication in  $\mathbb{F}'$ . Further, note that this field is of size  $p^4 \approx 2^{124}$ .

In the open source code, m31 and qm31 are often referred to as the 'BaseField' and 'SecureField'. Notice that SecureField is large enough to be used in randomized checks with negligible probability of cheating.

Circle Groups: One of the most important components in STARK proofs [BBHR18] is the ability to compute FFT over some cyclic group. While a prime field  $\mathbb{F}_p$  already has a cyclic multiplicative subgroup of order p-1, for efficiency reasons, we would like to work with a subgroup with an order that can be divided by a large power of two  $2^k$ . Unfortunately, for m31 of size  $p=2^{31}-1$ , the cyclic group is not of such a form since  $p-1=2(2^{30}-1)$  giving a maximal even power  $2^k$  with k=1.

A recent work [HLP24] shows how to get around this by instead working in the 'circle group'. The main insight is that one can represent every point on this circle group as a pair of underlying field elements such that with an appropriately chosen generator, the cycle group has order of the form  $2^k$ . For instance, with elements from m31, one can generate a cyclic circle group of order  $2^{30}$ .

### 3.2 Offline Memory Checker

Offline memory checking allows us to keep track of memory and register read/write consistency. In this section, we briefly recall the offline memory checking problem and an efficient solution for it.

**Definition 1 (Offline Memory Checking)** The offline memory checking problem relates to a user of an external untrusted memory and how they can ensure consistency between accesses without keeping a copy of the entire memory. In our case the zkVM can be seen as the user that in the execution trace sees claims about register values and memory values that are hard to verify just using local information in a single row.

The user issues a list containing n memory accesses to the untrusted memory of size m with strictly monotonically increasing timestamps timer. We denote by  $M_{Init}$  the initial state of memory. Each memory access is in one of the following forms:

- Read<sub>M</sub>(addr, timer)  $\rightarrow$  val: In this case, the untrusted memory M should return the memory cell value val stored at address addr at time timer to the user.
- Write<sub>M</sub>(addr, val, timer): In this case, the untrusted memory M is instructed to rewrite the memory cell value stored at address addr and time timer with the value val.

The offline memory checker observes the communication between the end user and the untrusted memory, and after the user issues all n memory accesses, the offline checker makes a decision at the end. We require two properties from the offline memory checker:

Completeness: If for every read operation, the untrusted memory returns the tuple last written to that location, then the checker always outputs 1.

Soundness: If the untrusted memory ever returns a value val\* for a memory read such that val\* does not equal the value val last written to the address addr (or the initial value if never written to), then the checker outputs 0, with overwhelming probability.

In Section 3.2.1, we describe a simple offline memory checker, which satisfies both completeness and soundness, but not the efficiency requirement since the algorithm keeps track of the status of the running memory.

In order to make it efficient, it suffices to keep a trace digest of the memory accesses, which is cheap and can be updated at a cost that independent of the size of the entire memory. In Section 3.2.2, we describe how the Nexus zkVM does so using logarithmic derivatives aka logups [EKRN24, Hab22].

### 3.2.1 A simple offline memory checker

We will maintain a global counter ts, which gets incremented at every clock cycle, along with two sets, a "read set" (RS) and a "write set" (WS). The purpose of the counter ts is to stamp each memory access with a unique identifier. The purpose of RS and WS is to record a trace of the memory access pattern.

We will also augment the memory to include an additional timestamp for each cell. More specifically, we now view the memory as a vector consisting of triples of the form (addr, val, t) where addr is the memory address, val is the associated value, and t is a timestamp which indicates the last time the address was read or updated. The offline memory checker initializes  $RS = \emptyset$ ,  $WS = M_{Init}$  and ts = 0.

To deal with a read instruction at address addr, the offline memory checker performs the following steps:

- 1. Read the triple (addr, val, t) from the untrusted memory, where all three values are treated as non-deterministic advice (witness) from the untrusted memory.
- 2. Verify that  $t \in [0, ts 1]$
- 3. Add (addr, val, t) to RS;
- 4. Update the counter to ts = ts + 1;

- 5. Add (addr, val, ts) to WS;
- 6. Write (addr, val, ts) to the untrusted memory.

To deal with a write instruction which overwrites the value at address addr into val', the offline memory checker performs the following steps:

- 1. Read the triple (addr, val, t) from the untrusted memory;
- 2. Verify that  $t \in [0, ts 1]$
- 3. Add (addr, val, t) to RS;
- 4. Update the counter to ts = ts + 1;
- 5. Add (addr, val', ts) to WS;
- 6. Write (addr, val', ts) to the untrusted memory.

Let's denote by  $M_{\text{Init}}$  the initial state of memory with all timestamps set to 0, and  $M_{\text{Final}}$  the final state of the memory with the final timestamps. The decision predicate for the offline memory checker is:

$$\mathsf{RS} \cup M_{\mathsf{Final}} \stackrel{?}{=} \mathsf{WS} \cup M_{\mathsf{Init}}$$

In order to understand the intuition as to why this algorithm satisfy completeness and soundness, first observe that the read and write sets maintained by the offline checker preserve two important invariants: (1) Every element added to RS and WS is unique because the timestamp of that element is always set to the current global counter, which is incremented after each memory access; (2) For each key used, RS "trails" WS by exactly the last write operation. In other words, we always have RS  $\subset$  WS.

Let WS and RS denote the multi-sets maintained by the offline memory checker. If for every read operation, the untrusted memory returns the tuple last written to that location, then we have  $RS \cup M_{Final} = WS \cup M_{Init}$  and completeness follows.

Moreover, if the untrusted memory ever returns a value  $val^*$  for a memory read such that  $val^*$  does not equal the value val last written to the address addr (or the initial value if never written to), then there does not exist any set M such that  $RS \cup M = WS \cup M_{Init}$  and soundness follows.

#### 3.2.2 A concrete proposal based on logups

As mentioned above, the simple algorithm described above keeps track of the status of the running memory, which is inefficient. In order to make it efficient, it suffices to maintain a digest of the read and write sets instead of the full sets. The advantage is that the digests can be updated at a cost that is independent of the sizes of the sets.

In this version of the Nexus zkVM, the computation of the digest is implemented using logarithmic derivatives aka logups [EKRN24, Hab22], which we now describe.

Even though the Nexus zkVM has different categories of memory (e.g., program memory, register memory, and random access memory), this section focuses on the implementation of the algorithm for the case of random access memory. Let us for simplicity assume each row can do one of three possible memory operations.

- Load a value from the RAM. The row will have values t, a, v, t', where (if derived from an honest execution trace) t is the timestamp of the CPU cycle being executed, a is an address, v is a memory value read from address a, and t' represents the last time the address a was accessed.
- Store a value in the RAM. The row will have values t, a, v, v', t', where (if derived from an honest execution trace) t is the timestamp of the CPU cycle being executed, a is an address, v is a memory value written to address a, and t' represents the last time the address a was accessed and the prior value is v'.
- Not access the RAM.

Now, let us for simplicity assume these elements can be represented as single field elements in a finite field  $\mathbb{F}$  (we deal later with the case where they have to be represented as multiple field elements)

of size p. Moreover, let  $\mathbb{E}$  be an extension field of size  $p^e$ , where  $e \geq 3$ . And let f be an injective map  $f:(t,a,v)\in\mathbb{F}^3\to\mathbb{E}$ . When using the Nexus zkVM with the Stwo prover,  $\mathbb{F}$  and  $\mathbb{E}$  will correspond to m31 and qm31, respectively.

The high level strategy of the proof for memory consistency is

- Let  $WS_{init}$  be a set of tuples (t = 0, a, v) specifying initial values to some of the memory addresses, and let  $RS_{final}$  be a claimed end state of the memory specified via tuples (t, a, v), where (in the case of an honest trace) t is the last time the address was written to.
- Commit to the table representing the execution trace (or at least commit to the memory access values given above)
- Get a logup challenge  $z \in \mathbb{E}$  via Fiat-Shamir on the initial and final states and the commitments to the trace
- Going through all the rows, compute the read set and write set logup sums as follows
  - Set

$$\sigma_{RS} := 0$$
 and  $\sigma_{WS} := 0$ 

- For a row with a read (t, a, v) on an address last touched at t' (note t' < t, which can be verified via local constraints on the row) increment

$$\sigma_{\mathsf{RS}} += rac{1}{f(t', a, v) + z} \qquad ext{and} \qquad \sigma_{\mathsf{WS}} += rac{1}{f(t, a, v) + z}$$

- For a row with a write (t, a, v) on an address last touched at t' having value v' (again t' < t can be verified via local constraints on the row) increment

$$\sigma_{\mathsf{RS}} \mathrel{+}= rac{1}{f(t',a,v')+z} \qquad ext{and} \qquad \sigma_{\mathsf{WS}} \mathrel{+}= rac{1}{f(t,a,v)+z}$$

- For a row not touching memory, make no changes to the logup sums
- Return  $\sigma_{RS}, \sigma_{WS} \in \mathbb{E}$
- The proof  $\pi$  for correct execution will include evidence to convince the verifier that
  - For each row that some local constraints are satisfied, e.g., type checks on a, v, t, v', t' via range proofs, and for reads and writes that t' < t
  - The Fiat-Shamir challenge z is computed based on the initial memory and final memory and commitments
  - The logup sums are computed correctly wrt z.
- The verifier will check

$$\sum_{(t=0,a,v)\in \mathsf{WS}_{\mathsf{init}}} \frac{1}{f(a,v,t)+z} + \sigma_{\mathsf{WS}} \ = \ \sigma_{\mathsf{RS}} + \sum_{(t,a,v)\in \mathsf{RS}_{\mathsf{final}}} \frac{1}{f(t,a,v)+z}.$$

### 3.3 Lookup tables for range checks

Several parts of the Nexus zkVM design require 8-bit range checks, whose goal is to check whether a given field element lies in the range  $[0, 2^8 - 1]$ . In order to implement 8-bit range checks, we follow the MidenVM design based on logups [Mid].

More precisely, let  $\mathbf{x}$  be a trace column containing 8-bit elements which need to be range checked. In order to implement this range check, we will include two columns ( $\mathbf{v}$  and  $\mathbf{m}$ ) to the trace containing the list of values  $\mathbf{v}[i] \in \{0, \dots, 255\}$  in the range together with their multiplicities  $\mathbf{m}[i]$ , where  $i \geq 0$  specifies the row index. In our design, these values will be in the m31 field.

In addition to the columns (x, v, m), we also include an interaction column  $x_{range}$ , which will keep track of the running sums used by the logup argument, whose elements will be in the larger extension

field. We also assume that columns (v,m) might contain padding rows with the values (255,0), which will be enforced via constraints.

Let  $\alpha$  be a random value chosen by the verifier after the prover commits to the execution trace of the program. The value of  $x_{range}$  at row i will be computed as follows:

$$\mathbf{x}_{\mathtt{range}}[i] = \frac{1}{(\mathtt{x}[0] + \alpha)} - \frac{\mathtt{m}[0]}{(\mathtt{v}[0] + \alpha)} + \ldots + \frac{1}{(\mathtt{x}[i] + \alpha)} - \frac{\mathtt{m}[i]}{(\mathtt{v}[i] + \alpha)}.$$

This implies that the difference in the  $x_{range}$  column between consecutive rows will be:

$$\mathbf{x}_{\texttt{range}}[i] - \mathbf{x}_{\texttt{range}}[i-1] = \frac{1}{(\mathbf{x}[i] + \alpha)} - \frac{\mathbf{m}[i]}{(\mathbf{v}[i] + \alpha)}.$$

Since several of the 8-bit elements that we want to range check are in fact limbs of a larger 32bit value, we assume in this case that we can maintain a single  $x_{range}$  column for all the limbs for efficiency reasons. More precisely, let  ${\tt x}$  be a 32-bit element that has been split into 4 8-bit limbs  $(x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)})$ . In this case, the difference in the  $x_{range}$  column between consecutive rows will

$$\mathtt{x}_{\mathtt{range}}[i] - \mathtt{x}_{\mathtt{range}}[i-1] = \frac{1}{(\mathtt{x}^{(1)}[i] + \alpha)} + \frac{1}{(\mathtt{x}^{(2)}[i] + \alpha)} + \frac{1}{(\mathtt{x}^{(3)}[i] + \alpha)} + \frac{1}{(\mathtt{x}^{(4)}[i] + \alpha)} - \frac{\mathtt{m}[i]}{(\mathtt{v}[i] + \alpha)}.$$

Adding up multiple terms of  $1/(\text{something} + \alpha)$  in the secure field requires some additional auxiliary columns. Those additional columns are implemented by, for example, the Stwo logup library.

Though we could further optimize and reuse the same range check for multiple large elements in the trace, we assume that each 32-bit value x will have separate multiplicity and  $x_{\tt range}$  columns. The column v could however be shared across different values being range checked.

Remark 3.1 Though the more general design allows for some of the numbers in the range to be omitted, we opt here for the simpler construction in which all the values in the 8-bit range are listed in range check columns. This optimization however would make sense for 16-bit range checks.

Boundary constraints Let n be the index of the last row.

- v[0] = 0
- v[n] = 255•  $x_{range}[n] = 0$

Transition constraints  $(0 < i \le n)$ :

- $\begin{array}{l} \bullet \quad (\mathtt{v}[i] \mathtt{v}[i-1]) \cdot (\mathtt{v}[i] \mathtt{v}[i-1] 1) = 0 \\ \bullet \quad \mathtt{x}_{\mathtt{range}}[i] \mathtt{x}_{\mathtt{range}}[i-1] = \frac{1}{(\mathtt{x}^{(1)}[i] + \alpha)} + \frac{1}{(\mathtt{x}^{(2)}[i] + \alpha)} + \frac{1}{(\mathtt{x}^{(3)}[i] + \alpha)} + \frac{1}{(\mathtt{x}^{(4)}[i] + \alpha)} \frac{\mathtt{m}[i]}{(\mathtt{v}[i] + \alpha)} \\ \bullet \quad (\mathtt{v}[i] \mathtt{v}[i-1] 1) \cdot \mathtt{m}[i] = 0 \end{array}$

#### 3.4 Lookup tables for bitwise operations

The execution component of the Nexus zkVM design also makes use of 4-bit lookup tables to implement the bitwise operations XOR, AND, and OR.

Let (a, b, c) represent three trace columns for 4-bit values for which we want to verify that a bit-op b = c, where bit-op  $\in \{\oplus, \&, |\}$  denotes one of the bitwise operations (XOR, AND, OR). In this section, we describe how to implement this check using logups. In order to do so, the main idea is to create a table indexed by (a, b) containing field elements representing valid triples of the form (a, b, a bit-op b). Then, when given a triple (a, b, c), we can check whether that triple is valid by simply checking whether the field element representing the triple (a, b, c) is present in the table.

In the actual implementation below, we will work with the binary decompositions of the values a and b since this will make it easier to map triples of the form (a, b, a bit-op b) to a single field element in order to compute logups.

### 3.4.1 Bitwise lookup trace elements

In order to implement 4-bit lookup tables for bitwise operations on a triple (a,b,c), we will first add columns  $a0,\ldots,a3,b0,\ldots,b3$  to the trace, where  $(a0,\ldots,a3)$  and  $(b0,\ldots,b3)$  correspond to the binary decomposition of the values that we intend to search. These trace columns will be common to all bitwise operations.

Note that we do not need to introduce trace columns corresponding to the binary decomposition of c since all the entries in the lookup table will correspond to field elements representing triples of the form (a,b,a bit-op b) for bit-op  $\in \{\oplus,\&,|\}$ . For these triples, the value c is a known function of the values  $(a0,\ldots,a3)$  and  $(b0,\ldots,b3)$ . Likewise, we do not need to explicitly introduce a trace column for the index being searched since this index is also a known function of the  $(a0,\ldots,a3)$  and  $(b0,\ldots,b3)$ .

In addition to the binary decompositions for a and b, we will add multiplicity columns for each bitwise operation:  $m_{xor}$ ,  $m_{and}$ ,  $m_{or}$ . In our design, the values will  $(a0, \ldots, a3)$ ,  $(b0, \ldots, b3)$ ,  $m_{xor}$ ,  $m_{and}$ ,  $m_{or}$  are in the m31 field.

Finally, we will also introduce trace columns  $\mathtt{digest}_{\mathtt{xor}}$ ,  $\mathtt{digest}_{\mathtt{and}}$ ,  $\mathtt{digest}_{\mathtt{or}}$  to capture the logup computation for bitwise operations. Unlike the other columns defined above, these are elements of the secure extension field.

Hence, the following set of trace elements will be needed by the lookup component of bitwise operations:

- a, b, c: trace columns corresponding to the 4-bits values for which we want to verify c = a bit-op b
- a0,..., a3: the bit decomposition for the a component of the index value
- b0,...,b3: the bit decomposition for the b component of the index value
- $m_{xor}, m_{and}, m_{or}$ : multiplicity values for a given index value of the lookup table
- digest<sub>xor</sub>, digest<sub>and</sub>, digest<sub>or</sub>: logarithmic derivatives for the bitwise lookup tables

### 3.4.2 Bitwise operation mapping function

Let (a,b,c) represent three 4-bit values for which we want to verify that c=a bit-op b, where  $a \in [0,2^4-1]$ ,  $b \in [0,2^4-1]$ ,  $c \in [0,2^4-1]$ . In order to check whether a triple of the form (a,b,c) is part of the lookup table for a given operation, we first need to convert these triples into a field element in the secure extension field. In order to do so, we define a function triple-to-field(a,b,c) as follows:

triple-to-field(a, b, c) := 
$$a + b \cdot z + c \cdot z^2$$
,

where z is a challenge chosen by the verifier after seeing the commitments to the trace.

#### 3.4.3 Bitwise operation helper functions

In order to compute a logup contribution related to a triple (a,b,c), we define a few helper functions to capture the expected values of (a,b,c) as a function of the values  $(a0,\ldots,a3)$ ,  $(b0,\ldots,b3)$  and the bitwise operation. These intermediate functions are defined as follows:

- $a-func(a0,...,a3) = a0 + a1 \cdot 2 + ... + a3 \cdot 2^3$
- b-func(b0,...,b3) = b0 + b1  $\cdot$  2 + ... + b3  $\cdot$  2<sup>3</sup>
- $bit-xor-func(ai,bi) = ai + bi 2 \cdot ai \cdot bi$
- bit-and-func(ai, bi) = ai · bi

```
bit-or-func(ai, bi) = ai + bi - ai · bi
tuple-xor-func(a0, ..., a3, b0, ..., b3) =
bit-xor-func(a0, b0) + bit-xor-func(a1, b1) · 2 + ... + bit-xor-func(a3, b3) · 2³
tuple-and-func(a0, ..., a3, b0, ..., b3) =
bit-and-func(a0, b0) + bit-and-func(a1, b1) · 2 + ... + bit-and-func(a3, b3) · 2³
tuple-or-func(a0, ..., a3, b0, ..., b3) =
bit-or-func(a0, b0) + bit-or-func(a1, b1) · 2 + ... + bit-or-func(a3, b3) · 2³
xor-bits-to-field(a0, ..., a3, b0, ..., b3) =
triple-to-field(a-func(a0, ..., a3, b0, ..., b3))
and-bits-to-field(a-func(a0, ..., a3, b0, ..., b3) =
triple-to-field(a-func(a0, ..., a3, b0, ..., b3))
```

• or-bits-to-field(a0,...,a3,b0,...,b3) = triple-to-field(a-func(a0,...,a3),b-func(b0,...,b3),tuple-or-func(a0,...,a3,b0,...,b3))

Note that, when when ai and bi are bits, it follows that

```
• bit-xor-func(ai, bi) = ai ⊕ bi
```

- bit-and-func(ai, bi) = ai & bi
- bit-or-func(ai, bi) = ai | bi.

### 3.4.4 Bitwise logup computations – 1 triple per row

Let  $\alpha$  be a random value chosen by the verifier after the prover commits to the execution trace of the program.

Let (a[i], b[i], c[i]),  $(a0[i], \dots, a3[i])$ ,  $(b0[i], \dots, b3[i])$ ,  $(digest_{xor}[i], m_{xor}[i])$ ,  $(digest_{and}[i], m_{and}[i])$ ,  $(digest_{or}[i], m_{or}[i])$  represent respectively the values of the trace columns (a, b, c),  $(b0, \dots, b3)$ ,  $(digest_{xor}, m_{xor})$ ,  $(digest_{and}, m_{and})$ ,  $(digest_{or}, m_{or})$  at row i. Using the intermediate functions above, the value of the bitwise digests at row i must satisfy the following constraints for  $op \in \{xor, and, or\}$ :

$$\begin{split} \operatorname{digest}_{\operatorname{op}}[i] &= \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}[0],\operatorname{b}[0],\operatorname{c}[0]) + \alpha)} - \\ &= \frac{\operatorname{m_{\operatorname{op}}}[0]}{(\operatorname{op-bits-to-field}(\operatorname{a0}[0],\ldots,\operatorname{a3}[0],\operatorname{b0}[0],\ldots,\operatorname{b3}[0]) + \alpha)} + \ldots + \\ &= \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}[i],\operatorname{b}[i],\operatorname{c}[i]) + \alpha)} - \\ &= \frac{\operatorname{m_{\operatorname{op}}}[i]}{(\operatorname{op-bits-to-field}(\operatorname{a0}[i],\ldots,\operatorname{a3}[i],\operatorname{b0}[i],\ldots,\operatorname{b3}[i]) + \alpha)} \end{split}$$

This implies that the difference in the  $digest_{op}$  column between consecutive rows for  $op \in \{xor, and, or\}$  is as follows:

$$\begin{aligned} \operatorname{digest}_{\operatorname{op}}[i] - \operatorname{digest}_{\operatorname{op}}[i-1] &= & \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}[i],\operatorname{b}[i],\operatorname{c}[i]) + \alpha)} - \\ &= & \frac{\operatorname{m_{\operatorname{op}}}[i]}{(\operatorname{op-bits-to-field}(\operatorname{a0}[i],\ldots,\operatorname{a3}[i],\operatorname{b0}[i],\ldots,\operatorname{b3}[i]) + \alpha)} \end{aligned}$$

If we use a preprocessed trace, the lookup tables will contain  $(A, B, C_{xor}, C_{and}, C_{or})$  triples where  $0 \le A, B < 256, C_{op} = A$  op B for  $op \in \{xor, and, or\}$ . Using these,  $digest_{op}[i]$  at row i for

 $op \in \{xor, and, or\}$  satisfies

$$\begin{split} \operatorname{digest_{op}}[i] &= \frac{1}{(\operatorname{triple-to-field}(\mathbf{a}[0], \mathbf{b}[0], \mathbf{c}[0]) + \alpha)} - \\ &= \frac{\underset{\mathbf{m_{op}}[0]}{\max_{\mathbf{p}}[0]}}{(\operatorname{triple-to-field}(A[0], B[0], C_{\mathbf{op}}[0]) + \alpha)} + \ldots + \\ &= \frac{1}{(\operatorname{triple-to-field}(\mathbf{a}[i], \mathbf{b}[i], \mathbf{c}[i]) + \alpha)} - \\ &= \frac{\underset{\mathbf{m_{op}}[i]}{\max_{\mathbf{p}}[i]}}{(\operatorname{triple-to-field}(A[i], B[i], C_{\mathbf{op}}[i]) + \alpha)} \end{split}$$

This implies that the difference in the digest column between consecutive rows will be:

$$\begin{aligned} \operatorname{digest}_{\operatorname{op}}[i] - \operatorname{digest}_{\operatorname{op}}[i-1] &= \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}[i],\operatorname{b}[i],\operatorname{c}[i]) + \alpha)} - \\ &= \frac{\operatorname{m_{\operatorname{op}}}[i]}{(\operatorname{triple-to-field}(A[i],B[i],C_{\operatorname{op}}[i]) + \alpha)} \end{aligned}$$

### 3.4.5 Bitwise logup computations – multiple triples per row

Since we may need to check multiple triples for bitwise operations per row, it is straightforward to extend the logup computation in the previous section to handle that case. Let  $(a_1, b_1, c_1), \ldots, (a_k, b_k, c_k)$  denote k triples being checked in the same row for an operation  $op \in \{xor, and, or\}$ . Let  $m_{op}$  be the multiplicity for the entry  $(a0, \ldots, a3, b0, \ldots, b3)$ . In this case, the difference in the  $digest_{op}$  column between consecutive rows will be:

$$\begin{aligned} \operatorname{digest}_{\operatorname{op}}[i] - \operatorname{digest}_{\operatorname{op}}[i-1] &= \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}_1[i],\operatorname{b}_1[i],\operatorname{c}_1[i]) + \alpha)} + \ldots + \\ &= \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}_k[i],\operatorname{b}_k[i],\operatorname{c}_k[i]) + \alpha)} - \\ &= \frac{\operatorname{m}_{\operatorname{op}}[i]}{(\operatorname{op-bits-to-field}(\operatorname{a0}[i],\ldots,\operatorname{a3}[i],\operatorname{b0}[i],\ldots,\operatorname{b3}[i]) + \alpha)} \end{aligned}$$

Moreover, if we assume the use of a preprocessed trace, the difference in the  ${\tt digest_{op}}$  column between consecutive rows will be:

$$\begin{aligned} \operatorname{digest}_{\operatorname{op}}[i] - \operatorname{digest}_{\operatorname{op}}[i-1] &= \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}_1[i],\operatorname{b}_1[i],\operatorname{c}_1[i]) + \alpha)} + \ldots + \\ &= \frac{1}{(\operatorname{triple-to-field}(\operatorname{a}_k[i],\operatorname{b}_k[i],\operatorname{c}_k[i]) + \alpha)} - \\ &= \frac{\operatorname{m}_{\operatorname{op}}[i]}{(\operatorname{triple-to-field}(A[i],B[i],C_{\operatorname{op}}[i]) + \alpha)} \end{aligned}$$

#### 3.4.6 Bitwise lookup table constraints

**Boundary constraints**: Let n be the index of the last row, and let k represent the number of triples being looked up for  $op \in \{xor, and, or\}$ :

$$\begin{array}{ll} \operatorname{digest}_{\operatorname{op}}[0] & = & \frac{1}{(\operatorname{triple-to-field}(\mathtt{a}_1[0],\mathtt{b}_1[0],\mathtt{c}_1[0]) + \alpha)} + \ldots + \\ & \frac{1}{(\operatorname{triple-to-field}(\mathtt{a}_k[0],\mathtt{b}_k[0],\mathtt{c}_k[0]) + \alpha)} - \\ & \frac{\mathtt{m}_{\operatorname{op}}[0]}{(\operatorname{op-bits-to-field}(\mathtt{a0}[0],\ldots,\mathtt{a3}[0],\mathtt{b0}[0],\ldots,\mathtt{b3}[0]) + \alpha)} \end{array}$$

### Transition constraints:

**Remark 3.2** The constraints above do not assume that lookup table entries are unique. While it suffices for a honest prover to create a separate entry for each value being looked up together with the corresponding multiplicity and pad unused rows with values  $(a0, \ldots, a3, b0, \ldots, b3, m) = (0, \ldots, 0)$ , a dishonest prover may create repeated entries in the lookup table with different non-zero multiplicity values.

- For instance, a dishonest prover could create an entry  $(a0, ..., a3, b0, ..., b3, m_1)$  in one row and an entry  $(a0, ..., a3, b0, ..., b3, m_2)$  in a different row. This is not a problem since the proof will only go through if  $m_1 + m_2$  matches the correct multiplicity for the entry (a0, ..., a3, b0, ..., b3). If  $m_1 + m_2$  does not match the correct multiplicity for that entry, the logup check will fail.
- Though it is possible to add additional constraints to enforce that each entry can have at most 1 non-zero value for the multiplicity, this does not seem needed.

# 4 CPU component

The CPU component is responsible for fetching, decoding, and preparing instructions for execution. In particular, this involves:

- Reading the next instruction from the program memory using the program counter
- Decode the instruction and check the correctness of its format
- Read values associated with operands from the register memory
- Execute the instruction using the instruction execution component

While most of the tasks described above will be handled separately by different components of the zkVM, such as the register or program memory, the instruction decoding and format verification task will be handled exclusively by the CPU component. For the remaining tasks, the main job of the CPU component is to specify the values used to call these components.

**Remark 4.1** In order to facilitate the understanding of the current specification, we often describe relevant constraints in two steps:

- In a first step, we describe all constraints under the assumption that the underlying finite field used by the arithmetization is sufficiently large to represent 32-bit elements.
- Then, in a second step, we describe the actual constraints used by our implementation assuming small finite fields, such as the m31 field used by the Stwo prover.

During this second step, variables representing large elements will be split into limbs representing their components. When doing so, we often have to introduce additional variables and constraints to capture cases which did not have to be considered in the large field case.

**Section outline**: The remainder of this section is organized as follows:

- Section 4.1 defines the main trace elements used by the CPU component, including instructions flags.
- Sections 4.2 and 4.3 describe the constraints for the CPU component assuming, respectively, large and small fields. More precisely:
  - Sections 4.2.1 and 4.3.1 describe range checks.
  - Sections 4.2.2 and 4.3.2 specify constraints related to instruction flags. This includes constraints for defining combined instruction types, such as is-type-r and is-type-s, and for enforcing that only one instruction flag can be set to 1 in a given row.
  - Sections 4.2.3 and 4.3.3 provide arithmetic constraints related to instruction decoding and pc and clk operations. This includes constraints for ensuring that (1) pc and clk are initialized and incremented correctly, (2) the instruction word value matches the values in the instruction operands and opcodes, and (3) the instruction flag being set matches the instruction opcode.
  - Sections 4.2.4 and 4.3.4 specify the CPU interactions with other components.

#### 4.1 CPU trace elements

We will create the following set of trace elements for the CPU component:

- clk: The current execution time
- pc: The current value of the program counter register
- pc-next: The next value of the program counter register
- opcode: The opcode defining the instruction
- op-a: The address of the first operand of the instruction
- op-b: The address of the second operand of the instruction
- op-c: The address of the third operand of the instruction
- op-b-flag: A flag indicating whether operand op-b is used
- imm-c: A flag indicating whether operand op-c is an immediate value
- instr-val: A 32-bit word encoding the instruction stored at the pc address
- a-val: the value of operand op-a
- b-val: the value of operand op-b
- c-val: the value of operand op-c

In addition to the values defined above, we also define selector flags for each instruction supported by the Nexus zkVM and additional flag used for padding:

- is-lui: a selector flag which indicates an lui operation
- is-auipc: a selector flag which indicates an auipc operation
- is-jal: a selector flag which indicates an jal operation
- is-jalr: a selector flag which indicates an jalr operation
- is-ecall: a selector flag which indicates an ecall operation
- is-ebreak: a selector flag which indicates an ebreak operation
- is-fence: a selector flag which indicates an fence operation
  is-unimp: a selector flag which indicates an unimp operation
- is-beq: a selector flag which indicates an beq operation
- is-bne: a selector flag which indicates an bne operation
  - 23

- is-blt: a selector flag which indicates an blt operation
- is-bge: a selector flag which indicates an bge operation
- is-bltu: a selector flag which indicates an bltu operation
- is-bgeu: a selector flag which indicates an bgeu operation
- is-lb: a selector flag which indicates an lb operation
- is-lh: a selector flag which indicates an lh operation
- is-lw: a selector flag which indicates an lw operation
- is-lbu: a selector flag which indicates an lbu operation
- is-lhu: a selector flag which indicates an lhu operation
- is-sb: a selector flag which indicates an sb operation
- is-sh: a selector flag which indicates an sh operation
- is-sw: a selector flag which indicates an sw operation
- is-add: a selector flag which indicates an add or addi operation
- is-sub: a selector flag which indicates an sub operation
- is-sll: a selector flag which indicates an sll or slli operation
- is-slt: a selector flag which indicates an slt or slti operation
- is-sltu: a selector flag which indicates an sltu or sltiu operation
- is-xor: a selector flag which indicates an xor or xori operation
- is-srl: a selector flag which indicates an srl or srli operation
- is-sra: a selector flag which indicates an sra or srai operation
- is-or: a selector flag which indicates an or or ori operation
- is-and: a selector flag which indicates an and or andi operation
- is-pad: a selector flag which is used for padding, not a computational step

While the instruction flags are used to indicate whether the current row in the trace corresponds to a particular instruction, the <code>is-pad</code> flag is used when additional padding rows need to be added to the trace to make the total number of rows a power of two (which is required by the <code>Stwo</code> prover backend).

The value of each of the flags defined above should be either 0 or 1 and exactly one of these should be set to 1 in each row. Moreover, we also assume that padding rows should only appear at the end of the trace. Hence, once set to 1, the value of the is-pad flag should remain 1. We will be adding constraints to enforce these conditions.

Remark 4.2 • We do not define separate flags for ALU instructions with immediate values (such as addi) since other flags (such as is-add together with imm-c) can play the same role.

- We do not currently support fence instructions and assume that these are mapped to  $nop \equiv (addi \ x0 \ x0 \ 0)$  beforehand.
- unimp instructions are also not currently supported since the halting functionality is handled via system calls (see Table 3), though for completeness the decoding constraints do include some handling for it.

Convention on the specification of op-c: When imm-c = 0, op-c denotes a register address and its value is expected to be in the range  $[0, 2^5 - 1]$ . However, when imm-c = 1, the immediate value can be broken into several parts depending on the exact instruction being decoded with its total length varying between 5 and 20 bits, as described in Table 2. When this happens, the convention being adopted in this specification is that op-c will be equal to the value of this immediate after its different parts are listed in the correct order, but without taking other bits not present in the instruction into account. With this convention, op-c will always be a value in the range  $[0, 2^{20} - 1]$  and can therefore be represented by a single m31 field element. It is only when c-val is computed from op-c that the predetermined value of other bit positions are taken into account.

For instance, in the case of the jal instruction, bit 0 of the immediate value is assumed to be 0 and the remaining bits are specified via four parts (op-c1-10, op-c11, op-c12-19, op-c20) representing

respectively bits 1–10, 11, 12–19, and 20 of the immediate value. Hence, we computing op-c from (op-c1-10, op-c11, op-c12-19, op-c20), we simply check that op-c = op-c1-10+op-c11· $2^{10}$ +op-c12-19· $2^{11}$ + op-c20· $2^{19}$  and disregard the fact that bit 0 of the immediate is supposed to be 0. The fact that bit 0 is 0 and that op-c20 represents the sign bit of the immediate value is however taken into account when computing c-val from (op-c1-10, op-c11, op-c12-19, op-c20).

### 4.2 CPU constraints assuming large fields

### 4.2.1 Range checks

```
• \operatorname{clk} \in \left[0,2^{32}-1\right]
• (\operatorname{op-b-flag})(1-\operatorname{op-b-flag})=0
• (\operatorname{imm-c})(1-\operatorname{imm-c})=0

• (\operatorname{imm-c})(1-\operatorname{imm-c})=0

# \operatorname{pc} \in \left[0,2^{32}-1\right] - guaranteed via program memory checking

# \operatorname{pc-next} \in \left[0,2^{32}-1\right] - implied by arithmetic constraints

# \operatorname{instr-val} \in \left[0,2^{32}-1\right] - performed in the in the program memory component

# \operatorname{op-a} range check - guaranteed via register memory checking / arithmetic constraints

# \operatorname{op-b} range check - guaranteed via register memory checking when \operatorname{op-b-flag}=1

# \operatorname{op-c} range check - guaranteed via register memory checking when \operatorname{imm-c}=0

# \operatorname{a-val} range check - guaranteed via register memory checking when \operatorname{op-b-flag}=1

# \operatorname{op-val} range check - guaranteed via register memory checking when \operatorname{op-b-flag}=1

# \operatorname{c-val} range check - guaranteed via register memory checking when \operatorname{op-b-flag}=1
```

**Remark 4.3** • Flags for instruction types and constraints for specific instructions are specified separately below.

- Range checks when imm-c = 1 are being specified in the sections for the relevant instructions.
- Range checks for op-a and a-val are guaranteed via register memory checking except for instructions for which op-a is not defined. In such cases, we add constraints to guarantee that op-a = 0 and a-val = 0.
- fence is not supported and should be mapped to  $nop \equiv (addi x0 x0 0)$ .

### 4.2.2 Instruction flag constraints

**Remark 4.4** Since all constraints in the remaining of this section are for the same row, we do not explicitly write down the row index i when describing these constraints.

```
/\!\!/ Enforcing instruction flags are either 0 or 1
  (\mathtt{is}\mathtt{-lui}) \cdot (1 - \mathtt{is}\mathtt{-lui}) = 0
                                                 ⊳ lui instruction
   (is-auipc) \cdot (1 - is-auipc) = 0

    ▷ auipc instruction

   (is-jal) \cdot (1-is-jal) = 0

⇒ jal instruction

    (is-jalr) \cdot (1-is-jalr) = 0

⇒ jalr instruction

    (is-ecall) \cdot (1-is-ecall) = 0 \triangleright ecall instruction
    (is-ebreak) \cdot (1-is-ebreak) = 0 \triangleright ebreak instruction
    (is-fence) \cdot (1-is-fence) = 0 \triangleright fence instruction
    (is-unimp) \cdot (1 - is-unimp) = 0

    □ unimp instruction

    (is-beq) \cdot (1 - is-beq) = 0
                                            \triangleright beq instruction
    (\texttt{is-bne}) \cdot (1 - \texttt{is-bne}) = 0
                                                 > bne instruction
    (\mathtt{is\text{-}blt})\cdot(1-\mathtt{is\text{-}blt})=0

    blt instruction

   (\texttt{is-bge}) \cdot (1 - \texttt{is-bge}) = 0
                                                 ▷ bge instruction
    (\mathtt{is\text{-}bltu}) \cdot (1-\mathtt{is\text{-}bltu}) = 0

⊳ bltu instruction

    (\mathtt{is\text{-}bgeu}) \cdot (1-\mathtt{is\text{-}bgeu}) = 0

    bgeu instruction

    (\texttt{is-lb}) \cdot (1-\texttt{is-lb}) = 0
                                                   ▷ lb instruction
    (\mathtt{is\text{-}lh})\cdot(1-\mathtt{is\text{-}lh})=0
                                                   ▷ lh instruction
    (\mathtt{is-lw}) \cdot (1 - \mathtt{is-lw}) = 0
                                                   > lw instruction
    (\mathtt{is\text{-}lbu})\cdot(1-\mathtt{is\text{-}lbu})=0
                                                   > 1bu instruction
    (\mathtt{is\text{-}lhu})\cdot(1-\mathtt{is\text{-}lhu})=0
                                                    > 1hu instruction
```

```
(\texttt{is-sb}) \cdot (1 - \texttt{is-sb}) = 0

    ⇒ sb instruction

          (\mathtt{is}\mathtt{-sh})\cdot(1-\mathtt{is}\mathtt{-sh})=0

⊳ sh instruction

          (\mathtt{is\text{-}sw})\cdot(1-\mathtt{is\text{-}sw})=0
                                                                                                                             > sw instruction
          (is-add) \cdot (1-is-add) = 0
                                                                                                                             > add or addi instruction
          (\mathtt{is\text{-}sub}) \cdot (1 - \mathtt{is\text{-}sub}) = 0
                                                                                                                             > sub instruction
          (is-sll) \cdot (1-is-sll) = 0

⊳ sll or slli instruction

          (is-slt) \cdot (1 - is-slt) = 0

⊳ slt or slti instruction

          (is-sltu) \cdot (1 - is-sltu) = 0

⊳ sltu or sltiu instruction

          (is-xor) \cdot (1 - is-xor) = 0
                                                                                                                             > xor or xori instruction
          (is-srl) \cdot (1 - is-srl) = 0
                                                                                                                             > srl or srli instruction
          (is-sra) \cdot (1 - is-sra) = 0
                                                                                                                             > sra or srai instruction
          (\texttt{is-or}) \cdot (1 - \texttt{is-or}) = 0

    or or ori instruction

          (is-and) \cdot (1 - is-and) = 0
                                                                                                                              > and or andi instruction
          (\texttt{is-pad}) \cdot (1 - \texttt{is-pad}) = 0

    □ used for padding

/\!\!/ Enforcing exactly one instruction flag is set to 1
       is-lui + is-auipc + is-jal + is-jalr + is-ecall + is-ebreak + is-unimp + is-break + is-unimp + is-break + is
          {\tt is-beq+is-blt+is-bge+is-bltu+is-bgeu} + \\
          is-lb+is-lh+is-lw+is-lbu+is-lhu+is-sb+is-sh+is-sw+is-lhu+is-lhu+is-sb+is-sh+is-sw+is-lhu+is-lhu+is-sb+is-sh+is-sw+is-lhu+is-sb+is-sh+is-sw+is-sh+is-sw+is-sh+is-sw+is-sh+is-sw+is-sh+is-sw+is-sh+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-
          {\tt is-add+is-sub+is-sll+is-slt+is-sltu+}
          is-xor + is-srl + is-sra + is-or + is-and + is-pad = 1
// Matching flag with instruction opcode
      (\texttt{is-lui}) \cdot (\texttt{opcode} - \texttt{LUI}) = 0
                                                                                                                             (is-auipc) \cdot (opcode - AUIPC) = 0
                                                                                                                                 (is-jal) \cdot (opcode - JAL) = 0
                                                                                                                                 (is-jalr) \cdot (opcode - JALR) = 0
                                                                                                                                 (is-ecall) \cdot (opcode - ECALL) = 0 \triangleright ecall instruction
         (is-ebreak) \cdot (opcode - EBREAK) = 0 > ebreak instruction
          (is-fence) \cdot (opcode - FENCE) = 0 \triangleright fence instruction
          (is-unimp) \cdot (opcode - UNIMP) = 0 \triangleright unimp instruction
          (is-beq) \cdot (opcode - BEQ) = 0

    beq instruction

          (\texttt{is-bne}) \cdot (\texttt{opcode} - \texttt{BNE}) = 0

    bne instruction

    bne instruct
          (is-blt) \cdot (opcode - BLT) = 0
                                                                                                                                 \triangleright blt instruction
          (is-bge) \cdot (opcode - BGE) = 0

    bge instruction

          (is-bltu) \cdot (opcode - BLTU) = 0

    bltu instruction

          (is-bgeu) \cdot (opcode - BGEU) = 0

    bgeu instruction

          (is-lb) \cdot (opcode - LB) = 0
                                                                                                                              > 1b instruction
          (is-lh) \cdot (opcode - LH) = 0
                                                                                                                              > 1h instruction
          (\texttt{is-lw}) \cdot (\texttt{opcode} - \texttt{LW}) = 0

⊳ lw instruction

          (is-lbu) \cdot (opcode - LBU) = 0
                                                                                                                          ▷ lbu instruction
          (\texttt{is-lhu}) \cdot (\texttt{opcode} - \texttt{LHU}) = 0

    □ lhu instruction

          (\texttt{is-sb}) \cdot (\texttt{opcode} - \texttt{SB}) = 0

⊳ sb instruction

          (\texttt{is-sh}) \cdot (\texttt{opcode} - \texttt{SH}) = 0

⊳ sh instruction

          (\texttt{is-sw}) \cdot (\texttt{opcode} - \texttt{SW}) = 0
                                                                                                                              > sw instruction
          (is-add) \cdot (opcode - ADD) = 0
                                                                                                                               > add or addi instruction
          (is-sub) \cdot (opcode - SUB) = 0
                                                                                                                               > sub instruction
          (is-sll) \cdot (opcode - SLL) = 0

⊳ sll or slli instruction

          (is-slt) \cdot (opcode - SLT) = 0

⊳ slt or slti instruction

          (is-sltu) \cdot (opcode - SLTU) = 0

⊳ sltu or sltiu instruction

          (is-xor) \cdot (opcode - XOR) = 0
                                                                                                                                 (is-srl) \cdot (opcode - SRL) = 0
                                                                                                                                 \triangleright srl or srli instruction
          (is-sra) \cdot (opcode - SRA) = 0

⊳ sra or srai instruction
           (is-or) \cdot (opcode - OR) = 0
                                                                                                                                 \triangleright or or ori instruction
           (is-and) \cdot (opcode - AND) = 0
                                                                                                                                 > and or andi instruction
```

#### Remark 4.5 In the constraints defined above:

• LUI, AUIPC,..., OR, AND are assumed to be predefined constants for the opcodes.

- The constraints above actually refer to these constants as field elements.
- For instance, if XYZ = 0b000100, then XYZ = 4 in the constraint equation.

```
// Instruction types

hicksim Type U
  is-type-u = is-lui + is-auipc

    □ Type J - JAL instruction

  is-type-j = is-jal
  is-load = is-lb + is-lh + is-lw + is-lbu + is-lhu
                                                                         is-type-s = is-sb + is-sh + is-sw

    □ Type S - Store instructions

    □ Type B - Branch instructions

 is-type-b = is-beq + is-bne + is-blt + is-bge + is-bltu + is-bgeu
  is-type-sys = is-ecall + is-ebreak
                                                                         // Type R instructions
  is-type-r = (1 - imm-c) \cdot (is-add + is-sub + is-slt + is-sltu + is-xor + is-or + is-and + is-sll + is-srl + is-sra)
  is-alu = is-add + is-sub + is-slt + is-sltu + is-xor + is-or + is-and + is-sll + is-srl + is-sra
  is-alu-imm-shift = imm-c \cdot (is-sll + is-srl + is-sra)
  is-alu-imm-no-shift = imm-c \cdot (is-add + is-slt + is-sltu + is-xor + is-or + is-and)
# Type I instructions with non-shift immediate values
  is-type-i-no-shift = is-load + is-alu-imm-no-shift + is-jalr
// Type I instructions
• is-type-i = is-load + is-alu-imm-no-shift + is-alu-imm-shift + is-jalr
```

Remark 4.6 Since the constraints above guarantee that only 1 instruction flag can be set to 1 in a clock cycle, it follows that the sum of any subset of these flags will be equal to 1 if one of the flags in the subset is set to 1 and 0 otherwise. As a result, there is no need to introduce additional constraints to enforce that combined instruction types, such as is-type-u or is-type-s, are in the range  $\{0,1\}$  since these are defined as a sum of instruction types. Moreover, since imm-c  $\{0,1\}$ , there is no need to enforce the range  $\{0,1\}$  for variables such as is-type-r or is-alu-imm-shift, since these variables are equal to the product of imm-c or 1 – imm-c by a sum of instruction types.

#### 4.2.3 Arithmetic constraints

The main checks to be performed by the CPU component are that:

- pc and clk are correctly initialized and incremented;
- the instruction word value matches the values in the instruction operands and opcodes;
- the corrected instruction flag being set matches the instruction opcode; and
- only one of the instruction flags is set.

While the constraints related to the last of these checks are described in Section 4.2.2, the arithmetic constraints used to enforce the remaining checks are specified in this section.

For numbers in binary, we will need to convert them first to a field element, using a standard binary-to-integer conversion (e.g.,  $0b0000011 \equiv 2^1 + 2^0 \equiv 3$ ,  $0b1100011 \equiv 2^6 + 2^5 + 2^1 + 2^0 \equiv 99$ ).

### **Boundary constraints**

#### Multi-row constraints

```
/\!\!/ Transition constraints for the program counter pc[i] for row i>0 unless is-pad[i]=1
\bullet \quad (1-\mathtt{is-first}[i]) \cdot (1-\mathtt{is-pad}[i]) \cdot (\mathtt{pc}[i]-\mathtt{pc-next}[i-1]) = 0
/\!\!/ Transition constraints for {
m clk}[i] for row i>0
• clk[i] = clk[i-1] + 1
/\!\!/ Ensuring that is-pad remains 1 once it is set to 1 for row i>0
\bullet \quad (1-\mathtt{is}\mathtt{-first}[i]) \cdot (1-\mathtt{is}\mathtt{-pad}[i]) \cdot (\mathtt{is}\mathtt{-pad}[i-1]) = 0
```

#### Common constraints

Remark 4.7 Since all constraints in the remaining of this section are for the same row, we do not explicitly write down the row index i when describing these constraints.

```
# Ensuring that pc is a multiple of 4
\bullet \quad \operatorname{pc-aux} \cdot 4 - \operatorname{pc}
/\!\!/ Enforcing pc-aux \in \left[0,2^{30}-1\right]
• pc-aux \in [0, 2^{30} - 1]
\# Ensuring that op-b-flag = 1 for all instructions except lui, auipc, jal, unimp
• (is-sb+is-sh+is-sw+is-lb+is-lh+is-lw+is-lbu+is-lhu+is-jalr+is-lw+is-lbu+is-lhu+is-jalr+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw+is-lw
            is-add+is-sub+is-slt+is-sltu+is-xor+is-or+is-and+is-sll+is-srl+is-sra+is-and+is-sll+is-srl+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+is-sra+
           \verb|is-beq+is-bne+is-blt+is-bge+is-bltu+is-bgeu+is-ecall+is-ebreak-op-b-flag| = 0
/\!\!/ Ensuring that imm-c = 1 for non-ALU instructions
/\!\!/ Notice that imm-c can be 0 or 1 for ALU instructions
• (is-lui + is-auipc + is-jal + is-jalr + is-ecall + is-ebreak +
           \mathtt{is\text{-}sb} + \mathtt{is\text{-}sh} + \mathtt{is\text{-}sw} + \mathtt{is\text{-}lb} + \mathtt{is\text{-}lh} + \mathtt{is\text{-}lw} + \mathtt{is\text{-}lbu} + \mathtt{is\text{-}lhu} +
           is-beq + is-bne + is-blt + is-bge + is-bltu + is-bgeu)(1 - imm-c) = 0
/\!\!/ Enforcing imm-c = 0 for sub
• (is-sub) \cdot imm-c = 0
// Determining a-val-effective
// a-val-effective-flag is an auxiliary flag
\# a-val-effective-flag-aux and a-val-effective-flag-aux-inv are non-zero auxiliary variables
/\!\!/ a-val-effective-flag = 1 indicates op-a is non-zero
/\!\!/ a-val-effective-flag =0 indicates op-a is zero
ullet op-a · a-val-effective-flag-aux = a-val-effective-flag
/\!\!/ Ensuring a-val-effective-flag-aux \neq 0
ullet a-val-effective-flag-aux \cdot a-val-effective-flag-aux-inv =1
/\!\!/ Enforcing a-val-effective-flag \in \{0,1\}
• (a-val-effective-flag) \cdot (1-a-val-effective-flag) = 0
# Enforcing relation between a-val and a-val-effective
ullet a-val \cdot a-val-effective-flag = a-val-effective
Type U – LUI and AUIPC instructions
```

```
// Type U instructions - lui and auipc
# op-a is ranged checked via reg memory checking
/\!\!/ op-b is set to 0
// op-c range check follows from other range checks below
// Making sure that op-c is consistent with the immediate parts
 is-type-u \cdot (op-c12-31 - op-c) = 0
// Setting lower 12 bits to 0 in order to compute c-val from op-c
```

```
• is-type-u \cdot (op-c12-31 \cdot 2<sup>12</sup> - c-val) = 0
# Range checking the different immediate parts
• is-type-u · (op-c12-31 \in [0, 2^{20} - 1])
# Checking instruction format for type U instructions
• (is-lui) · (0b0110111 + op-a · 2^7 + op-c · 2^{12} - instr-val) = 0
• (is-auipc) · (0b0010111 + op-a · 2^7 + op-c · 2^{12} - instr-val) = 0
/\!\!/ Enforcing op-b = 0 for lui, auipc
• (is-type-u) \cdot (op-b) = 0
/\!\!/ Enforcing b-val = 0 for lui, auipc, jal
• (is-type-u) \cdot (b-val) = 0
Type J – JAL instruction
// Type J instruction - jal
# op-a is ranged checked via reg memory checking
/\!\!/ op-b is set to 0
// op-c range check follows from other range checks below
# Making sure that op-c is consistent with the immediate parts
• is-type-j \cdot (op-c1-10 + op-c11 \cdot 2^{10} + op-c12-19 \cdot 2^{11} + op-c20 \cdot 2^{19} - op-c) = 0
/\!\!/ Setting lower bit to 0 and performing sign extension to compute c-val from op-c
 \bullet \quad \text{is-type-j} \cdot (\text{op-c1-10} \cdot 2 + \text{op-c11} \cdot 2^{11} + \text{op-c12-19} \cdot 2^{12} + \text{op-c20} \cdot 2^{20} \cdot (2^{12} - 1) - \text{c-val}) = 0 
# Range checking the different immediate parts
• is-type-j \cdot (op-c1-10 \in [0, 2^{10} - 1])
• is-type-j \cdot (op-c11) \cdot (1 - op-c11) = 0
• is-type-j · (op-c12-19 \in [0, 2^8 - 1])
• is-type-j \cdot (op-c20) \cdot (1 - op-c20) = 0
// Checking instruction format for type J instructions
 \text{ (is-jal)} \cdot (\texttt{0b1101111} + \texttt{op-a} \cdot 2^7 + \texttt{op-c12-19} \cdot 2^{12} + \texttt{op-c11} \cdot 2^{20} + \texttt{op-c1-10} \cdot 2^{21} + \texttt{op-c20} \cdot 2^{31} - \texttt{instr-val}) = 0 
/\!\!/ Enforcing op-b = 0 for type J instructions
• (is-type-j) \cdot (op-b) = 0
/\!\!/ Enforcing b-val = 0 for type J instructions
• (is-type-j) \cdot (b-val) = 0
Type S - Store instructions sb, sh, sw
// Type S - Store instructions sb, sh, sw
// Making sure that op-c is consistent with the immediate parts
• is-type-s \cdot (op-c0-4 + op-c5-10 \cdot 2<sup>5</sup> + op-c11 \cdot 2<sup>11</sup> - op-c) = 0
# Performing sign extension to compute c-val from op-c
• is-type-s \cdot (op-c0-4 + op-c5-10 \cdot 2<sup>5</sup> + op-c11 \cdot 2<sup>11</sup> \cdot (2<sup>21</sup> - 1) - c-val) = 0
# Range checking the different immediate parts
• is-type-s \cdot (op-c0-4 \in [0, 2^5 - 1])
• is-type-s \cdot (op-c5-10 \in [0, 2^6 - 1])
• is-type-s \cdot (op-c11) \cdot (1 - op-c11) = 0
Load instructions + JALR + ALU instructions with non-shift immediate values
/\!\!/ Load instructions - lb, lh, lw, lbu, lhu
/\!\!/ ALU instructions with non-shift immediate values - add, slt, sltu, xor, and instructions with imm-c = 1
# op-a, op-b are ranged checked via reg memory checking
# Making sure that op-c is consistent with the immediate parts
```

```
• (is-load + is-alu-imm-no-shift + is-jalr) \cdot (op-c0-10 + op-c11 \cdot 2<sup>11</sup> - op-c) = 0
# Performing sign extension to compute c-val from op-c
\bullet \quad (\mathtt{is-load} + \mathtt{is-alu-imm-no-shift} + \mathtt{is-jalr}) \cdot (\mathtt{op-c0-10} + \mathtt{op-c11} \cdot 2^{11} \cdot (2^{21} - 1) - \mathtt{c-val}) = 0
 # Range checking the different immediate parts
      (is-load + is-alu-imm-no-shift + is-jalr) \cdot (op-c0-10 \in [0, 2^{11} - 1])
     (is-load + is-alu-imm-no-shift + is-jalr) \cdot (op-c11) \cdot (1 - op-c11) = 0
// Checking instruction format for load instructions
     (is-1b) \cdot (0b0000011 + op-a \cdot 2^7 + 0b000 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} - instr-val) = 0
      (is-lh) \cdot (0b0000011 + op-a \cdot 2^7 + 0b001 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} - instr-val) = 0
      (is-lw) \cdot (0b0000011 + op-a \cdot 2^7 + 0b010 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} - instr-val) = 0
      (is-lbu) \cdot (0b0000011 + op-a \cdot 2^7 + 0b100 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} - instr-val) = 0
     (\text{is-lhu}) \cdot (\text{0b0000011} + \text{op-a} \cdot 2^7 + \text{0b101} \cdot 2^{12} + \text{op-b} \cdot 2^{15} + \text{op-c} \cdot 2^{20} - \text{instr-val}) = 0
# Checking instruction format for the jalr instruction
• (is-jalr) \cdot (0b1100111 + op-a \cdot 2^7 + 0b000 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} - instr-val) = 0
\slash\hspace{-0.4em} Checking format for ALU Instructions with non-shift immediate values
     \begin{array}{l} (\mathtt{is\text{-add}}) \cdot (\mathtt{imm\text{-}c}) \cdot (\mathtt{0}\mathtt{b}\mathtt{0}\mathtt{0}\mathtt{1}\mathtt{0}\mathtt{0}\mathtt{1}\mathtt{1} + \mathtt{op\text{-}a} \cdot 2^7 + \mathtt{0}\mathtt{b}\mathtt{0}\mathtt{0}\mathtt{0} \cdot 2^{12} + \mathtt{op\text{-}b} \cdot 2^{15} + \mathtt{op\text{-}c} \cdot 2^{20} - \mathtt{instr\text{-}val}) = 0 \\ (\mathtt{is\text{-}slt}) \cdot (\mathtt{imm\text{-}c}) \cdot (\mathtt{0}\mathtt{b}\mathtt{0}\mathtt{0}\mathtt{1}\mathtt{0}\mathtt{0}\mathtt{1}\mathtt{1} + \mathtt{op\text{-}a} \cdot 2^7 + \mathtt{0}\mathtt{b}\mathtt{0}\mathtt{1}\mathtt{0} \cdot 2^{12} + \mathtt{op\text{-}b} \cdot 2^{15} + \mathtt{op\text{-}c} \cdot 2^{20} - \mathtt{instr\text{-}val}) = 0 \end{array}
     (is-sltu) \cdot (imm-c) \cdot (0b0010011 + op-a \cdot 2^7 + 0b011 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} + -instr-val) = 0
 \bullet \quad (\text{is-xor}) \cdot (\text{imm-c}) \cdot (\text{0b0010011} + \text{op-a} \cdot 2^7 + \text{0b100} \cdot 2^{12} + \text{op-b} \cdot 2^{15} + \text{op-c} \cdot 2^{20} - \text{instr-val}) = 0 
• (is-or) \cdot (imm-c) \cdot (0b0010011 + op-a \cdot 2<sup>7</sup> + 0b110 \cdot 2<sup>12</sup> + op-b \cdot 2<sup>15</sup> + op-c \cdot 2<sup>20</sup> - instr-val) = 0
     (is-and) \cdot (imm-c) \cdot (0b0010011 + op-a \cdot 2^7 + 0b111 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} - instr-val) = 0
ALU instructions with shift immediate values - sll, srl, sra instructions with imm-c = 1
/\!\!/ ALU instructions with shift immediate values - sll, srl, sra instructions with imm-c = 1
// Making sure that op-c is consistent with the immediate parts
• (is-alu-imm-shift) \cdot (op-c0-4 - op-c) = 0
# Setting c-val to op-c0-4
• (is-alu-imm-shift) \cdot (op-c0-4 - c-val) = 0
# Range checking the different immediate parts
   (\text{is-alu-imm-shift}) \cdot (\text{op-c0-4} \in \left[0, 2^5 - 1\right])
\slash\hspace{-0.4em}/\hspace{-0.4em}/ Checking format for ALU Instructions with shift immediate values
 \begin{array}{l} \bullet \quad \text{(is-sl1)} \cdot (\texttt{imm-c}) \cdot (\texttt{0b0010011} + \texttt{op-a} \cdot 2^7 + \texttt{0b001} \cdot 2^{12} + \texttt{op-b} \cdot 2^{15} + \texttt{op-c} \cdot 2^{20} + \texttt{0b0000000} \cdot 2^{25} - \texttt{instr-val}) = 0 \\ \bullet \quad \text{(is-sr1)} \cdot (\texttt{imm-c}) \cdot (\texttt{0b0010011} + \texttt{op-a} \cdot 2^7 + \texttt{0b101} \cdot 2^{12} + \texttt{op-b} \cdot 2^{15} + \texttt{op-c} \cdot 2^{20} + \texttt{0b0000000} \cdot 2^{25} - \texttt{instr-val}) = 0 \\ \bullet \quad \text{(is-sra)} \cdot (\texttt{imm-c}) \cdot (\texttt{0b0010011} + \texttt{op-a} \cdot 2^7 + \texttt{0b101} \cdot 2^{12} + \texttt{op-b} \cdot 2^{15} + \texttt{op-c} \cdot 2^{20} + \texttt{0b0100000} \cdot 2^{25} - \texttt{instr-val}) = 0 \\ \end{array} 
Type R - ALU instructions without immediate values – ALU instructions with imm-c = 0
/\!\!/ Type R instructions - add, sub, slt, sltu, xor, or, and, sll, srl, sra instructions with imm-c =0
# op-a, op-b, op-c are ranged checked via reg memory checking
# Checking format for type R instructions
• (is-add) \cdot (1 - imm-c) \cdot (0b0010011 + op-a \cdot 2<sup>7</sup> + 0b000 \cdot 2<sup>12</sup> + op-b \cdot 2<sup>15</sup> + op-c \cdot 2<sup>20</sup> + 0b0000000 \cdot 2<sup>25</sup> - instr-val) = 0
      (is-sub) \cdot (1-imm-c) \cdot (0b0010011+op-a \cdot 2^7+0b000 \cdot 2^{12}+op-b \cdot 2^{15}+op-c \cdot 2^{20}+0b0100000 \cdot 2^{25}-instr-val) = 0
     (\mathsf{is\text{-}sl1}) \cdot (\mathsf{1} - \mathsf{imm\text{-}c}) \cdot (\mathsf{0b0010011} + \mathsf{op\text{-}a} \cdot 2^7 + \mathsf{0b001} \cdot 2^{12} + \mathsf{op\text{-}b} \cdot 2^{15} + \mathsf{op\text{-}c} \cdot 2^{20} + \mathsf{0b0000000} \cdot 2^{25} - \mathsf{instr\text{-}val}) = 0
     (\text{is-slt}) \cdot (1 - \text{imm-c}) \cdot (\text{0b0010011} + \text{op-a} \cdot 2^7 + \text{0b010} \cdot 2^{12} + \text{op-b} \cdot 2^{15} + \text{op-c} \cdot 2^{20} + \text{0b0000000} \cdot 2^{25} - \text{instr-val}) = 0
   (\text{is-sltu}) \cdot (1 - \text{imm-c}) \cdot (0 \text{b0010011} + \text{op-a} \cdot 2^7 + 0 \text{b011} \cdot 2^{12} + \text{op-b} \cdot 2^{15} + \text{op-c} \cdot 2^{20} + 0 \text{b0000000} \cdot 2^{25} - \text{instr-val}) = 0
     (\texttt{is-xor}) \cdot (\texttt{1}-\texttt{imm-c}) \cdot (\texttt{0b0010011} + \texttt{op-a} \cdot 2^7 + \texttt{0b100} \cdot 2^{12} + \texttt{op-b} \cdot 2^{15} + \texttt{op-c} \cdot 2^{20} + \texttt{0b0000000} \cdot 2^{25} - \texttt{instr-val}) = 0
     (is-sr1) \cdot (1-imm-c) \cdot (0b0010011+op-a \cdot 2^7+0b101 \cdot 2^{12}+op-b \cdot 2^{15}+op-c \cdot 2^{20}+0b0000000 \cdot 2^{25}-instr-val) = 0
     (\mathsf{is\text{-}sra}) \cdot (\mathsf{1} - \mathsf{imm\text{-}c}) \cdot (\mathsf{0b0010011} + \mathsf{op\text{-}a} \cdot 2^7 + \mathsf{0b101} \cdot 2^{12} + \mathsf{op\text{-}b} \cdot 2^{15} + \mathsf{op\text{-}c} \cdot 2^{20} + \mathsf{0b0100000} \cdot 2^{25} - \mathsf{instr\text{-}val}) = 0
     (is-or) \cdot (1 - imm-c) \cdot (0b0010011 + op-a \cdot 2^7 + 0b110 \cdot 2^{12} + op-b \cdot 2^{15} + op-c \cdot 2^{20} + 0b0000000 \cdot 2^{25} - instr-val) = 0
• (is-and) \cdot (1 - imm-c) \cdot (0b0010011 + op-a \cdot 2<sup>7</sup> + 0b111 \cdot 2<sup>12</sup> + op-b \cdot 2<sup>15</sup> + op-c \cdot 2<sup>20</sup> + 0b0000000 \cdot 2<sup>25</sup> - instr-val) = 0
```

Type B - Branch instructions beq, bne, blt, bge, bltu, bgeu

```
// Type B - Branch instructions beq, bne, blt, bge, bltu, bgeu
/\!\!/ Making sure that op-c is consistent with the immediate parts
• is-type-b·(op-c1-4+op-c5-10·2^4+op-c11·2^{10}+op-c12·2^{11}-op-c) = 0
/\!\!/ Setting lower bit to 0 and performing sign extension to compute c-val from op-c
• is-type-b·(op-c1-4·2+op-c5-10·2<sup>5</sup>+op-c11·2<sup>11</sup>+op-c12·2<sup>12</sup>·(2<sup>20</sup>-1)-c-val) = 0
# Range checking the different immediate parts
     is-type-b · (op-c1-4 \in [0, 2^4 - 1])
    is-type-b · (op-c5-10 \in [0, 2^5 - 1])
   [s-type-b \cdot (op-c11) \cdot (1-op-c11)] = 0
   is-type-b \cdot (op-c12) \cdot (1 - op-c12) = 0
// Checking instruction format for branch instructions
     (\text{is-beq}) \cdot (\text{0b1100011} + \text{op-c11} \cdot 2^7 + \text{op-c1-4} \cdot 2^8 + \text{0b000} \cdot 2^{12} + \text{op-b} \cdot 2^{15}
                            +\text{op-a} \cdot 2^{20} + \text{op-c5-10} \cdot 2^{25} + \text{op-c12} \cdot 2^{31} - \text{instr-val}) = 0
      \begin{array}{l} (\mathtt{is\text{-}bne}) \cdot (\mathtt{0b1100011} + \mathtt{op\text{-}c11} \cdot 2^7 + \mathtt{op\text{-}c1\text{-}4} \cdot 2^8 + \mathtt{0b001} \cdot 2^{12} + \mathtt{op\text{-}b} \cdot 2^{15} \\ + \mathtt{op\text{-}a} \cdot 2^{20} + \mathtt{op\text{-}c5\text{-}10} \cdot 2^{25} + \mathtt{op\text{-}c12} \cdot 2^{31} - \mathtt{instr\text{-}val}) = 0 \end{array} 
      \begin{array}{c} (\mathtt{is\text{-}blt}) \cdot (\mathtt{0b1100011} + \mathtt{op\text{-}c11} \cdot 2^7 + \mathtt{op\text{-}c1\text{-}4} \cdot 2^8 + \mathtt{0b100} \cdot 2^{12} + \mathtt{op\text{-}b} \cdot 2^{15} \\ + \mathtt{op\text{-}a} \cdot 2^{20} + \mathtt{op\text{-}c5\text{-}10} \cdot 2^{25} + \mathtt{op\text{-}c12} \cdot 2^{31} - \mathtt{instr\text{-}val}) = 0 \end{array}
      (\texttt{is-bge}) \cdot (\texttt{0b1100011} + \texttt{op-c11} \cdot 2^7 + \texttt{op-c1-4} \cdot 2^8 + \texttt{0b101} \cdot 2^{12} + \texttt{op-b} \cdot 2^{15}
                            +\text{op-a} \cdot 2^{20} + \text{op-c5-10} \cdot 2^{25} + \text{op-c12} \cdot 2^{31} - \text{instr-val}) = 0
      (\mathtt{is\text{-}bltu}) \cdot (\mathtt{0b1100011} + \mathtt{op\text{-}c11} \cdot 2^7 + \mathtt{op\text{-}c1\text{-}4} \cdot 2^8 + \mathtt{0b110} \cdot 2^{12} + \mathtt{op\text{-}b} \cdot 2^{15}
                            +\text{op-a} \cdot 2^{20} + \text{op-c5-10} \cdot 2^{25} + \text{op-c12} \cdot 2^{31} - \text{instr-val} = 0
     (\texttt{is-bgeu}) \cdot (\texttt{0b1100011} + \texttt{op-c11} \cdot 2^7 + \texttt{op-c1-4} \cdot 2^8 + \texttt{0b111} \cdot 2^{12} + \texttt{op-b} \cdot 2^{15}
                            +op-a \cdot 2^{20} + op-c5-10 \cdot 2^{25} + op-c12 \cdot 2^{31} - instr-val) = 0
```

### System instructions - ecall, ebreak

### UNIMP instruction - unimp

```
 \begin{tabular}{ll} $\#$ UNIMP instruction - unimp \\ $\#$ op-a, op-c, op-c are all set to 0 \\ \hline \bullet & (is-unimp) \cdot (0b1110011 + 0b00000 \cdot 2^7 + 0b001 \cdot 2^{12} + 0b00000 \cdot 2^{15} + 0b00000000011 \cdot 2^{20} - instr-val) = 0 \\ $\#$ Enforcing op-a = op-b = op-c = 0 for unimp \\ \hline \bullet & (is-unimp) \cdot (op-a) = 0 \\ \hline \bullet & (is-unimp) \cdot (op-b) = 0 \\ \hline \bullet & (is-unimp) \cdot (op-c) = 0 \\ $\#$ Enforcing a-val = b-val = c-val = 0 for unimp \\ \hline \bullet & (is-unimp) \cdot (a-val) = 0 \\ \hline \bullet & (is-unimp) \cdot (b-val) = 0 \\ \hline \bullet & (is-unimp) \cdot (c-val) = 0 \\ \hline \hline \end{tabular}
```

#### 4.2.4 Interactions with other components

As mentioned above, the CPU component needs to interact with a few other components:

- Interaction with program memory: To read the next instruction using the program counter;
- Interaction with register memory: to read values associated with operands;
- Interaction with instruction execution: to enforce correction execution of instructions.

#### Program memory interaction

•  $instr-val \leftarrow Read_{Prog}(pc, clk)$ 

#### Register memory interaction

```
// Type S + B instructions do not have a destination register
// Type S + B instructions instead read from the op-a register
// Hence, we obtain a-val by reading from the op-a register
(is-type-s + is-type-b) · (Read_Reg(op-a, clk, 3) - a-val) = 0

// Type R + I + U + J instructions have a destination register
// Hence, we need to write a-val to the op-a register
(is-type-r + is-type-i + is-type-u + is-type-j) · (Write_Reg(op-a, a-val-effective, clk, 3) - a-val-effective) = 0

// Type SYS instructions will interact with the register memory directly
// a-val is an input provided by the environment for Type SYS instructions

// We only read from register op-b to obtain b-val when op-b-flag = 1
(op-b-flag) · (Read_Reg(op-b, clk, 1) - b-val) = 0

// We only need to read from register op-c to obtain c-val for Type R instructions
(is-type-r) · (Read_Reg(op-c, clk, 2) - c-val) = 0
```

#### Execution component interaction

 $\bullet \quad \texttt{pc-next} \leftarrow \texttt{exec}(\texttt{pc}, \texttt{opcode}, \texttt{a-val}, \texttt{b-val}, \texttt{c-val})$ 

## 4.3 CPU constraints assuming small fields

### 4.3.1 Range checks

```
\# op-c range check - guaranteed via register memory checking when imm-c = 0 \# a-val^{(i)} range check for i=1,2,3,4 - performed in the register memory component \# b-val^{(i)} range check for i=1,2,3,4 - performed in the register memory component when op-b-flag = 1 \# c-val^{(i)} range check for i=1,2,3,4 - performed in the via register memory checking when imm-c = 0
```

#### 4.3.2 Instruction flag constraints

**Remark 4.8** Since all constraints in the remaining of this section are for the same row, we do not explicitly write down the row index i when describing these constraints.

```
/\!\!/ Enforcing instruction flags are either 0 or 1
• (is-lui) \cdot (1 - is-lui) = 0
                                                                                                                      (is-auipc) \cdot (1-is-auipc) = 0
                                                                                                              > auipc instruction
     (\texttt{is-jal}) \cdot (1 - \texttt{is-jal}) = 0

⇒ jal instruction

• (is-jalr) \cdot (1-is-jalr) = 0

⇒ jalr instruction

• (is-ecall) \cdot (1 - is-ecall) = 0
                                                                                                                \bullet \quad (\mathtt{is-ebreak}) \cdot (1-\mathtt{is-ebreak}) = 0 \qquad \rhd \, \mathtt{ebreak} \,\, \mathtt{instruction}
\bullet \quad ({\tt is-fence}) \cdot (1-{\tt is-fence}) = 0 \qquad \qquad \rhd \; {\tt fence} \; \; {\tt instruction}
\bullet \quad (\mathtt{is\text{-}unimp}) \cdot (1-\mathtt{is\text{-}unimp}) = 0 \qquad \qquad \rhd \, \mathtt{unimp} \, \, \mathtt{instruction}
• (is-beq) \cdot (1 - is-beq) = 0

    beq instruction

\bullet \quad (\mathtt{is}\mathtt{-bne}) \cdot (1-\mathtt{is}\mathtt{-bne}) = 0

    bne instruction

• (is-blt) \cdot (1-is-blt) = 0

    blt instruction

     (\mathtt{is-bge}) \cdot (1 - \mathtt{is-bge}) = 0

    bge instruction

  • (is-bltu) \cdot (1 - is-bltu) = 0

    bltu instruction

        (is-bgeu) \cdot (1 - is-bgeu) = 0

    bgeu instruction

        (\mathtt{is-lb}) \cdot (1 - \mathtt{is-lb}) = 0

    ▷ lb instruction

        (\mathtt{is-lh}) \cdot (1-\mathtt{is-lh}) = 0

⊳ lh instruction

        (\mathtt{is-lw}) \cdot (1 - \mathtt{is-lw}) = 0

⊳ lw instruction

         (\mathtt{is-lbu}) \cdot (1 - \mathtt{is-lbu}) = 0

    □ lbu instruction

         (is-lhu) \cdot (1 - is-lhu) = 0
                                                                                                                   > lhu instruction
         (\mathtt{is\text{-}sb})\cdot(1-\mathtt{is\text{-}sb})=0

⊳ sb instruction

         (\texttt{is-sh}) \cdot (1 - \texttt{is-sh}) = 0
                                                                                                                   > sh instruction
         (\texttt{is-sw}) \cdot (1 - \texttt{is-sw}) = 0
                                                                                                                   > sw instruction
         (is-add) \cdot (1 - is-add) = 0
                                                                                                                   > add or addi instruction
         (\texttt{is-sub}) \cdot (1 - \texttt{is-sub}) = 0
                                                                                                                   (\mathsf{is}\mathsf{-sll})\cdot(1-\mathsf{is}\mathsf{-sll})=0

⇒ sll or slli instruction

         (is-slt) \cdot (1-is-slt) = 0
                                                                                                                   > slt or slti instruction
         (\texttt{is-sltu}) \cdot (1 - \texttt{is-sltu}) = 0

⊳ sltu or sltiu instruction

         (\texttt{is-xor}) \cdot (1 - \texttt{is-xor}) = 0
                                                                                                                    > xor or xori instruction
         (\mathsf{is}\mathsf{-srl})\cdot(1-\mathsf{is}\mathsf{-srl})=0
                                                                                                                   > srl or srli instruction
         (\texttt{is-sra}) \cdot (1 - \texttt{is-sra}) = 0
                                                                                                                   > sra or srai instruction
         (\texttt{is-or}) \cdot (1 - \texttt{is-or}) = 0

    or or ori instruction

         (is-and) \cdot (1-is-and) = 0
                                                                                                                  > and or andi instruction
         (\mathtt{is}\mathtt{-pad})\cdot(1-\mathtt{is}\mathtt{-pad})=0
                                                                                                                   > used for padding
/\!\!/ Enforcing exactly one instruction flag is set to 1
     is-lui+is-auipc+is-jal+is-jalr+is-ecall+is-ebreak+is-unimp+is-is-lui+is-auipc+is-jal+is-jalr+is-ecall+is-ebreak+is-unimp+is-lui+is-ebreak+is-unimp+is-lui+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-unimp+is-ecall+is-ebreak+is-ecall+is-ebreak+is-unimp+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ecall+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-ebreak+is-eb
         {\tt is-beq+is-bne+is-blt+is-bge+is-bltu+is-bgeu+}
          is-lb+is-lh+is-lw+is-lbu+is-lhu+is-sb+is-sh+is-sw+is-lbu+is-lhu+is-sb+is-sh+is-sw+is-lbu+is-lbu+is-lhu+is-sb+is-sh+is-sw+is-lbu+is-lbu+is-sb+is-sh+is-sw+is-lbu+is-sb+is-sh+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+is-sw+
         \mathtt{is}\text{-}\mathtt{add} + \mathtt{is}\text{-}\mathtt{sub} + \mathtt{is}\text{-}\mathtt{sll} + \mathtt{is}\text{-}\mathtt{sltu} + \mathtt{is}\text{-}\mathtt{sltu} + \mathtt{is}\text{-}\mathtt{sltu}
         is-xor + is-srl + is-sra + is-or + is-and + is-pad = 1
# Matching flag with instruction opcode
• (is-lui) \cdot (opcode - LUI) = 0 \triangleright lui instruction
\bullet \quad (\texttt{is-auipc}) \cdot (\texttt{opcode} - \texttt{AUIPC}) = 0 \qquad \rhd \, \texttt{auipc instruction}

⇒ jal instruction

• (is-jal) \cdot (opcode - JAL) = 0
• (is-jalr) \cdot (opcode - JALR) = 0
```

```
(is-ecall) \cdot (opcode - ECALL) = 0
                                              \vartriangleright ecall instruction
 (\texttt{is-ebreak}) \cdot (\texttt{opcode} - \texttt{EBREAK}) = 0 \quad \rhd \; \texttt{ebreak} \; \texttt{instruction}
 (\texttt{is-fence}) \cdot (\texttt{opcode} - \texttt{FENCE}) = 0 \qquad \rhd \texttt{fence instruction}
 (\mathtt{is\text{-}unimp}) \cdot (\mathtt{opcode} - \mathtt{UNIMP}) = 0 \qquad \vartriangleright \mathtt{unimp} \ \mathtt{instruction}
 (is-beq) \cdot (opcode - BEQ) = 0

    beg instruction

 (is-bne) \cdot (opcode - BNE) = 0
                                             ▷ bne instruction
 (is-blt) \cdot (opcode - BLT) = 0

    blt instruction

 (is-bge) \cdot (opcode - BGE) = 0
                                             ▷ bge instruction
 (is-bltu) \cdot (opcode - BLTU) = 0

    bltu instruction

 (is-bgeu) \cdot (opcode - BGEU) = 0
                                             ⊳ bgeu instruction
 (is-lb) \cdot (opcode - LB) = 0
                                             ▷ lb instruction
 (is-lh) \cdot (opcode - LH) = 0

    □ Ih instruction

 (is-lw) \cdot (opcode - LW) = 0
                                            > lw instruction
 (is-lbu) \cdot (opcode - LBU) = 0
                                            > lbu instruction
 (is-lhu) \cdot (opcode - LHU) = 0

    □ lhu instruction

 (is-sb) \cdot (opcode - SB) = 0

⊳ sb instruction

 (is-sh) \cdot (opcode - SH) = 0

⇒ sh instruction

 (is-sw) \cdot (opcode - SW) = 0
                                            (is-add) \cdot (opcode - ADD) = 0

    ▷ add or addi instruction

 (\texttt{is-sub}) \cdot (\texttt{opcode} - \texttt{SUB}) = 0
                                           > sub instruction
 (\texttt{is-sll}) \cdot (\texttt{opcode} - \texttt{SLL}) = 0

⇒ sll or slli instruction

 (is-slt) \cdot (opcode - SLT) = 0
                                           > slt or slti instruction
 (is-sltu) \cdot (opcode - SLTU) = 0 \triangleright sltu or sltiu instruction
 (\texttt{is-xor}) \cdot (\texttt{opcode} - \texttt{XOR}) = 0
                                           > xor or xori instruction
 (\texttt{is-srl}) \cdot (\texttt{opcode} - \texttt{SRL}) = 0
                                           > srl or srli instruction
(is-sra) \cdot (opcode - SRA) = 0
                                           > sra or srai instruction
(is-or) \cdot (opcode - OR) = 0

    or or ori instruction

(is-and) \cdot (opcode - AND) = 0

    □ and or andi instruction
```

#### Remark 4.9 In the constraints defined above:

- LUI, AUIPC, ..., OR, AND are assumed to be predefined constants for the opcodes.
- The constraints above actually refer to these constants as field elements.
- For instance, if XYZ = 0b000100, then XYZ = 4 in the constraint equation.

```
// Instruction types
  is-type-u = is-lui + is-auipc

    □ Type U

    □ Type J - JAL instruction

   is-type-j = is-jal
   is-load = is-lb + is-lh + is-lw + is-lbu + is-lhu
                                                                                is-type-s = is-sb + is-sh + is-sw

    □ Type S - Store instructions

   is-type-b = is-beq + is-bne + is-blt + is-bge + is-bltu + is-bgeu
                                                                                \triangleright Type B - Branch instructions
   is-type-sys = is-ecall + is-ebreak
                                                                                // Type R instructions
• is-type-r = (1 - imm-c) \cdot (is-add + is-sub + is-slt + is-sltu + is-xor + is-or + is-and + is-sll + is-srl + is-sra)
// ALU instructions
  is-alu = is-add + is-sub + is-slt + is-sltu + is-xor + is-or + is-and + is-sll + is-srl + is-sra
   is-alu-imm-shift = imm-c \cdot (is-sll + is-srl + is-sra)
   is-alu-imm-no-shift = imm-c \cdot (is-add + is-slt + is-sltu + is-xor + is-or + is-and)
# Type I instructions with non-shift immediate values
\bullet \quad \text{is-type-i-no-shift} = \text{is-load} + \text{is-alu-imm-no-shift} + \text{is-jalr}
// Type I instructions
\bullet \quad \text{is-type-i} = \text{is-load} + \text{is-alu-imm-no-shift} + \text{is-alu-imm-shift} + \text{is-jalr}
```

Remark 4.10 As already stated in Remark 4.6 for the large field case, since the constraints above guarantee that only 1 instruction flag can be set to 1 in a clock cycle, it follows that the sum of any subset of these flags will be equal to 1 if one of the flags in the subset is set to 1 and 0 otherwise.

As a result, there is no need to introduce additional constraints to enforce that combined instruction types, such as is-type-u or is-type-s, are in the range  $\{0,1\}$  since these are defined as a sum of instruction types. Moreover, since imm-c  $\in \{0,1\}$ , there is no need to enforce the range  $\{0,1\}$  for variables such as is-type-r or is-alu-imm-shift, since these variables are equal to the product of imm-c or 1- imm-c by a sum of instruction types.

#### 4.3.3 Arithmetic constraints

#### **Boundary constraints**

```
\label{eq:setting pc} \begin{array}{l} \mbox{$/$/$} & \mbox{Setting pc}[0] = \mbox{PC-INIT} \mbox{- this is a constant set during the two-pass phase} \\ \mbox{$\bullet$} & \mbox{$\rm pc}^{(1)}[0] = \mbox{PC-INIT}^{(1)} \\ \mbox{$\bullet$} & \mbox{$\rm pc}^{(2)}[0] = \mbox{PC-INIT}^{(3)} \\ \mbox{$\bullet$} & \mbox{$\rm pc}^{(4)}[0] = \mbox{PC-INIT}^{(4)} \\ \mbox{$/$/$} & \mbox{Setting clk}[0] = 1 \\ \mbox{$\bullet$} & \mbox{clk}^{(1)}[0] = 1 \\ \mbox{$\bullet$} & \mbox{clk}^{(2)}[0] = 0 \\ \mbox{$\bullet$} & \mbox{clk}^{(3)}[0] = 0 \\ \mbox{$\bullet$} & \mbox{clk}^{(4)}[0] = 0 \\ \mbox{$\bullet$} & \mbox{clk}^{(4)}[0] = 0 \\ \mbox{$\bullet$} & \mbox{clk}^{(4)}[0] = 0 \end{array}
```

#### Multi-row constraints

**Remark 4.11** We should raise a clock overflow error if  $clk-carry[i]^{(2)} = 1$ . In that case, we will need to increase the number of limbs for clk and other timestamps.

**Remark 4.12** Since all constraints in the remaining of this section are for the same row, we do not explicitly write down the row index i when describing these constraints.

#### Common constraints

**Remark 4.13** a-val-effective is equal to a-val (assuming op-a  $\neq 0$ ) or 0 (otherwise).

```
\label{eq:continuous_equation} \begin{split} & \# \text{ Ensuring that pc is a multiple of 4} \\ & \bullet \quad \text{pc-aux}^{(1)} \cdot 4 - \text{pc}^{(1)} = 0 \\ & \# \text{ Enforcing pc-aux}^{(1)} \in \left[0, 2^6 - 1\right] \\ & \bullet \quad \text{pc-aux}^{(1)} \in \left[0, 2^6 - 1\right] \end{split}
```

```
/\!\!/ Ensuring that op-b-flag =1 for all instructions except lui, auipc, jal, unimp
\bullet \quad (\mathtt{is}\mathtt{-sb} + \mathtt{is}\mathtt{-sh} + \mathtt{is}\mathtt{-sw} + \mathtt{is}\mathtt{-lb} + \mathtt{is}\mathtt{-lh} + \mathtt{is}\mathtt{-lw} + \mathtt{is}\mathtt{-lbu} + \mathtt{is}\mathtt{-lhu} + \mathtt{is}\mathtt{-jalr} +
               \mathtt{is\text{-}add} + \mathtt{is\text{-}sub} + \mathtt{is\text{-}slt} + \mathtt{is\text{-}sltu} + \mathtt{is\text{-}xor} + \mathtt{is\text{-}or} + \mathtt{is\text{-}and} + \mathtt{is\text{-}sll} + \mathtt{is\text{-}srl} + \mathtt{is\text{-}sra} + \mathtt{is
               \verb|is-beq+is-bne+is-blt+is-bge+is-bltu+is-bgeu+is-ecall+is-ebreak-op-b-flag| = 0
/\!\!/ Ensuring that imm-c = 1 for non-ALU instructions
/\!\!/ Notice that imm-c can be 0 or 1 for ALU instructions
\bullet \quad (\mathtt{is\text{-}lui} + \mathtt{is\text{-}auipc} + \mathtt{is\text{-}jal} + \mathtt{is\text{-}jalr} + \mathtt{is\text{-}ecall} + \mathtt{is\text{-}ebreak} +
               is-sb+is-sh+is-sw+is-lb+is-lh+is-lw+is-lbu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu+is-lhu
               is-beq+is-bne+is-blt+is-bge+is-bltu+is-bgeu)(1-imm-c)=0
/\!\!/ Enforcing imm-c = 0 for sub
• (is-sub) \cdot imm-c = 0
// Determining a-val-effective
# a-val-effective-flag is an auxiliary flag
/\!\!/ a-val-effective-flag-aux and a-val-effective-flag-aux-inv are non-zero auxiliary variables
/\!\!/ a-val-effective-flag = 1 indicates op-a is non-zero
\# a-val-effective-flag =0 indicates op-a is zero
• op-a · a-val-effective-flag-aux = a-val-effective-flag
/\!\!/ Ensuring a-val-effective-flag-aux \neq 0
 \bullet \quad \text{a-val-effective-flag-aux} \cdot \text{a-val-effective-flag-aux-inv} = 1 \\
/\!\!/ Enforcing a-val-effective-flag \in \{0,1\}
         (a-val-effective-flag) \cdot (1-a-val-effective-flag) = 0
# Enforcing relation between a-val and a-val-effective
          \mathtt{a}	ext{-}\mathtt{val}^{(1)}\cdot\mathtt{a}	ext{-}\mathtt{val}	ext{-}\mathtt{effective}	ext{-}\mathtt{flag}=\mathtt{a}	ext{-}\mathtt{val}	ext{-}\mathtt{effective}^{(1)}
             a-val^{(2)} \cdot a-val-effective-flag = a-val-effective<sup>(2)</sup>
\bullet \quad \operatorname{a-val}^{(3)} \cdot \operatorname{a-val-effective-flag} = \operatorname{a-val-effective}^{(3)}
 \bullet \quad \operatorname{a-val}^{(4)} \cdot \operatorname{a-val-effective-flag} = \operatorname{a-val-effective}^{(4)}
```

Remark 4.14 imm-c can be 0 or 1 for ALU instructions, except for sub instructions for which imm-c must be 0.

### Type U – LUI and AUIPC instructions

For Type U instructions (LUI, AUIPC):

- op-a is a destination register selector
- a-val is obtained from the instruction execution component
- op-b = 0 and b-val<sup>(j)</sup> = 0 for j = 1, 2, 3, 4
- op-c is a 20-bit U-immediate value (see Section 2.4)

```
# Type U instructions - lui and auipc # op-a is ranged checked via reg memory checking # op-b is set to 0 # op-c range check follows from other range checks below # Making sure that op-c is consistent with the immediate parts • is-type-u · (op-c12-15 + op-c16-23 · 2^4 + op-c24-31 · 2^{12} - op-c) = 0 # Setting lower 12 bits to 0 in order to compute c-val from op-c • is-type-u · (c-val<sup>(1)</sup>) = 0 • is-type-u · (op-c12-15 · 2^4 - c-val<sup>(2)</sup>) = 0 • is-type-u · (op-c16-23 - c-val<sup>(3)</sup>) = 0 • is-type-u · (op-c24-31 - c-val<sup>(4)</sup>) = 0 # Range checking the different op-c immediate parts
```

```
• is-type-u · (op-c12-15 \in [0, 2^4 - 1])
    is-type-u · (op-c16-23 \in [0, 2^8 - 1])
   is-type-u · (op-c24-31 \in [0, 2^8 - 1])
// Making sure that op-a is consistent with the immediate parts
    \texttt{is-type-u} \cdot (\texttt{op-a0} + \texttt{op-a1-4} \cdot 2 - \texttt{op-a}) = 0
// Making sure that op-a immediate parts
    \texttt{is-type-u} \cdot (\texttt{op-a1}) \cdot (1 - \texttt{op-a1}) = 0
    is-type-u · (op-a1-4 \in [0, 2^4 - 1])
/\!\!/ Enforcing op-b = 0 for lui, auipc
    is-type-u \cdot (op-b) = 0
   Enforcing b-val = 0 for lui, auipc
    (\texttt{is-type-u}) \cdot (\texttt{b-val}^{(1)}) = 0
    (\texttt{is-type-u}) \cdot (\texttt{b-val}^{(2)}) = 0
    (\text{is-type-u}) \cdot (\text{b-val}^{(3)}) = 0
    (\texttt{is-type-u}) \cdot (\texttt{b-val}^{(4)}) = 0
  Checking instruction format for type U instructions
    (is-lui) \cdot (0b0110111 + op-a0 \cdot 2^7 - instr-val^{(1)}) = 0
                                                                                                \triangleright limb 1 for lui
    (is-auipc) \cdot (0b0010111 + op-a0 \cdot 2^7 - instr-val^{(1)}) = 0
                                                                                                \triangleright limb 1 for auipc
    (is-type-u) \cdot (op-a1-4 + op-c12-15 \cdot 2^4 - instr-val^{(2)}) = 0

hicksim limb 2
    (\texttt{is-type-u}) \cdot (\texttt{op-c16-23} - \texttt{instr-val}^{(3)}) = 0
                                                                                                 \triangleright limb 3
    (\texttt{is-type-u}) \cdot (\texttt{op-c24-31} - \texttt{instr-val}^{(4)}) = 0
                                                                                                 \triangleright limb 4
```

#### Motivational explanations:

- The goal of the above constraints is to use instr-val (which is uniquely determined by the program memory checking) to uniquely determine the columns relevant for type U instruction execution.
- Other constraints elsewhere make sure that exactly one of the instruction flags such as is-lui, is-auipc is set to one and the rest is zero. The above constraints come into play when the prover sets either is-lui or is-auipc. In those cases, is-type-u is also set to one.
- The constraints involving instr-val<sup>(1)</sup> make sure that is-lui or is-auipc can be set only when instr-val contains the corresponding opcode.
- The above constraints never force the prover to set is-lui or is-auipc. The prover sets one of these flags just because the prover needs to set one instruction flag, and they cannot set any other instruction flags (because of the opcode-checking constraints about the other instruction flags).
- The prover can alternatively set is-pad to 1, but once the prover does so, the prover needs to keep is-pad set to 1 until the end of the trace.
- The above constraints involving instr-val (together with the range check) uniquely determine op-a0, op-a1-4, op-c12-15, op-c16-23, op-c24-31.
- Other constraints above determine op-a , op-c and c-val uniquely as a linear combination of above. op-b and b-val are constrained to be zero.
- The register memory checking uses op-a to choose the destination register.

### Type J – JAL instruction

For Type J instructions (JAL):

- op-a is a destination register selector
- a-val is obtained from the instruction execution component
- op-b = 0 and b-val<sup>(j)</sup> = 0 for j = 1, 2, 3, 4
- op-c is a 20-bit J-immediate value (see Section 2.4)
- c-val is obtained from op-c by setting bit 0 to 0 and, bits 1-20 to op-c, and sign extending it

#### Constraints:

```
// Type J instructions - jal
# op-a is ranged checked via reg memory checking
/\!\!/ op-b is set to 0
// op-c range check follows from other range checks below
// Making sure that op-c is consistent with the immediate parts
\bullet \quad \text{is-type-j} \cdot (\text{op-c1-3} + \text{op-c4-7} \cdot 2^3 + \text{op-c8-10} \cdot 2^7 + \text{op-c11} \cdot 2^{10} + \text{op-c12-15} \cdot 2^{11} + \text{op-c16-19} \cdot 2^{15} + \text{op-c20} \cdot 2^{19} - \text{op-c}) = 0
// Computing c-val limbs from op-c and performing sign extension
  is-type-j \cdot (op-c1-3 \cdot 2 + op-c4-7 \cdot 2<sup>4</sup> - c-val<sup>(1)</sup>) = 0
  is-type-j \cdot (op-c8-10 + op-c11 \cdot 2<sup>3</sup> + op-c12-15 \cdot 2<sup>4</sup> - c-val<sup>(2)</sup>) = 0
\bullet \quad \text{is-type-j} \cdot (\text{op-c16-19} + \text{op-c20} \cdot (2^4-1) \cdot 2^4 - \text{c-val}^{(3)}) = 0
• is-type-j \cdot (op-c20 \cdot (2<sup>8</sup> - 1) - c-val<sup>(4)</sup>) = 0
# Range checking the different op-c immediate parts
  is-type-j · (op-c1-3 \in [0, 2^3 - 1])
  is-type-j\cdot(op-c4-7\in \left[0,2^4-1
ight])
   is-type-j \cdot (op-c8-10 \in [0, 2^3 - 1])
   \begin{array}{l} \text{is-type-j} \cdot (\text{op-c11}) \cdot (1 - \text{op-c11}) = 0 \\ \text{is-type-j} \cdot (\text{op-c12-15} \in \left[0, 2^4 - 1\right]) \end{array}
• is-type-j · (op-c12-15 \in [0, 2^4 - 1])
• is-type-j · (op-c16-19 \in [0, 2^4 - 1])
• is-type-j \cdot (op-c20) \cdot (1 - op-c20) = 0
// Making sure that op-a is consistent with the immediate parts
• is-type-j \cdot (op-a0 + op-a1-4 \cdot 2 - op-a) = 0
// Making sure that op-a immediate parts
  \texttt{is-type-j} \cdot (\texttt{op-a1}) \cdot (1 - \texttt{op-a1}) = 0
  \texttt{is-type-j} \cdot (\texttt{op-a1-4} \in \left\lceil 0, 2^4 - 1 \right\rceil)
/\!\!/ Enforcing op-b = 0 for jal
• is-type-j \cdot (op-b) = 0
/\!\!/ Enforcing b-val = 0 for jal
    (\texttt{is-type-j}) \cdot (\texttt{b-val}^{(1)}) = 0
    (\texttt{is-type-j}) \cdot (\texttt{b-val}^{(2)}) = 0
    (is-type-j) \cdot (b-val^{(3)}) = 0
    (\texttt{is-type-j}) \cdot (\texttt{b-val}^{(4)}) = 0
\slash\hspace{-0.4em}/\hspace{-0.4em}/\hspace{-0.4em} Checking instruction format for type J instructions
   (is-lui) \cdot (0b1101111 + op-a0 \cdot 2^7 - instr-val^{(1)}) = 0

hicksim limb 1
    (\texttt{is-type-j}) \cdot (\texttt{op-a1-4} + \texttt{op-c12-15} \cdot 2^4 - \texttt{instr-val}^{(2)}) = 0
                                                                                                                             \triangleright limb 2
  (is-type-j) \cdot (op-c16-19 + op-c11 \cdot 2^4 + op-c1-3 \cdot 2^5 - instr-val^{(3)}) = 0

hicksim limb 3
• (is-type-j) \cdot (op-c4-7 + op-c8-10 \cdot 2<sup>4</sup> + op-c20 \cdot 2<sup>7</sup> - instr-val<sup>(4)</sup>) = 0

hicksim limb 4
```

#### Type S – Store instructions SB, SH, SW

For Type S instructions (SB, SH, SW):

- op-a is a source register selector
- a-val is obtained by reading from the register memory component
- op-b is another source register selector
- b-val is obtained by reading from the register memory component
- op-c is a 12-bit S-immediate value (see Section 2.4)
- c-val is obtained by sign extending the value of op-c

```
// Type J instructions - jal
// op-a is ranged checked via reg memory checking
```

```
/\!\!/ op-b is ranged checked via reg memory checking
# op-c range check follows from other range checks below
// Making sure that op-c is consistent with the immediate parts
• is-type-s \cdot (op-c0 + op-c1-4 \cdot 2 + op-c5-7 \cdot 2<sup>5</sup> + op-c8-10 \cdot 2<sup>8</sup> + op-c11 \cdot 2<sup>11</sup> - op-c) = 0
// Computing c-val limbs from op-c and performing sign extension
• is-type-s \cdot (op-c0 + op-c1-4 \cdot 2 + op-c5-7 \cdot 2<sup>5</sup> - c-val<sup>(1)</sup>) = 0
  is-type-s \cdot (op-c8-10 + op-c11 \cdot (2<sup>5</sup> - 1) \cdot 2<sup>3</sup> - c-val<sup>(2)</sup>) = 0
   is-type-s \cdot (\text{op-c11} \cdot (2^8 - 1) - \text{c-val}^{(3)}) = 0
   is-type-s \cdot (\text{op-c11} \cdot (2^8 - 1) - \text{c-val}^{(4)}) = 0
# Range checking the different op-c immediate parts
• is-type-s \cdot (op-c0) \cdot (1 - op-c0) = 0
  \texttt{is-type-s} \cdot (\texttt{op-c1-4} \in \left\lceil 0, 2^4 - 1 \right\rceil)
  is-type-s · (op-c5-7 \in [0, 2^3 - 1])
  is-type-s \cdot (op-c8-10 \in [0,2^3-1])
• is-type-s \cdot (op-c11) \cdot (1 - op-c11) = 0
// Making sure that op-a is consistent with the immediate parts
• is-type-s \cdot (op-a0 + op-a1-4 \cdot 2 - op-a) = 0
# Range checking the different op-a immediate parts
• is-type-s \cdot (op-a1) \cdot (1 - op-a1) = 0
• is-type-s \cdot (op-a1-4 \in [0, 2^4 - 1])
// Making sure that op-b is consistent with the immediate parts
• is-type-s \cdot (op-b0-3 + op-b4 \cdot 2<sup>4</sup> - op-b) = 0
# Range checking the different op-b immediate parts
• is-type-s \cdot (op-b0-3 \in [0, 2^4 - 1])
   is-type-s \cdot (op-b4) \cdot (1 - op-b1) = 0
# Checking instruction format for type S instructions
  (is-type-s) \cdot (0b0100011 + op-c0 \cdot 2^7 - instr-val^{(1)}) = 0
                                                                                                           \triangleright limb 1
   (is-sb) \cdot (op-c1-4 + 0b000 \cdot 2^4 + op-a0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                          \triangleright limb 2 for sb
  (is-sh) \cdot (op-c1-4 + 0b001 \cdot 2^4 + op-a0 \cdot 2^7 - instr-val^{(2)}) = 0

hicksim limb 2 for sh
• (is-sw) · (op-c1-4 + 0b010 · 2^4 + op-a0 · 2^7 - instr-val<sup>(2)</sup>) = 0
                                                                                                           \triangleright limb 2 for sw
• (is-type-s) \cdot (op-a1-4 + op-b0-3 \cdot \cdot \cdot \cdot \cdot 1 instr-val \cdot \cdot \cdot 0 = 0
                                                                                                           \triangleright limb 3
• (is-type-s) · (op-b4 + op-c5-7 · 2 + op-c8-10 · 2^4 + op-c11 · 2^7 - instr-val^{(4)}) = 0 \Rightarrow limb 4
```

### Type I without shifts - Load + JALR + ALU instructions with non-shift immediates For Type I instructions without shifts (Load + JALR + ALU with non-shift immediate values):

- op-a is a destination register selector
- a-val is obtained from the instruction execution component
- a-val-effective will be written to the op-a register
- op-b is a source register selector
- b-val is obtained by reading from the register memory component
- op-c is a 12-bit I-immediate value (see Section 2.4)
- c-val is obtained by sign extending the value of op-c

```
# Type I instructions with no shifts - Load + JALR + ALU with non-shift immediate values # Load instructions - lb, lh, lw, lbu, lhu # ALU instructions with non-shift immediate values - add, slt, sltu, xor, and instructions with imm-c = 1 # op-a is ranged checked via reg memory checking # op-b is ranged checked via reg memory checking # op-c range check follows from other range checks below # Making sure that op-c is consistent with the immediate parts • (is-type-i-no-shift) \cdot (op-c0-3 + op-c4-7 \cdot \cdot \cdot \cdot \cdot 24 + op-c8-10 \cdot 28 + op-c11 \cdot 211 - op-c) = 0
```

```
# Performing sign extension to compute c-val from op-c
    (\texttt{is-type-i-no-shift}) \cdot (\texttt{op-c0-3} + \texttt{op-c4-7} \cdot 2^4 - \texttt{c-val}^{(1)}) = 0
    (\mathtt{is-type-i-no-shift}) \cdot (\mathtt{op-c8-10} + \mathtt{op-c11} \cdot (2^5-1) \cdot 2^3 - \mathtt{c-val}^{(2)}) = 0
    (\texttt{is-type-i-no-shift}) \cdot (\texttt{op-c11} \cdot (2^8 - 1) - \texttt{c-val}^{(3)}) = 0
    (is-type-i-no-shift) \cdot (op-c11 \cdot (2^8 - 1) - c-val^{(4)}) = 0
# Range checking the different op-c immediate parts
    is-type-i-no-shift \cdot (op-c0-3 \in [0, 2^4 - 1])
    is-type-i-no-shift \cdot (op-c4-7 \in \left\lceil 0, 2^4 - 1 \right\rceil)
    is-type-i-no-shift \cdot (op-c8-10 \in [0, 2^3 - 1])
    is-type-i-no-shift \cdot (op-c11) \cdot (1 - op-c11) = 0
// Making sure that op-a is consistent with the immediate parts
  is-type-i-no-shift \cdot (op-a0 + op-a1-4 \cdot 2 - op-a) = 0
# Range checking the different op-a immediate parts
    is-type-i-no-shift \cdot (op-a0) \cdot (1 - op-a0) = 0
    is-type-i-no-shift \cdot (op-a1-4 \in [0, 2^4-1])
// Making sure that op-b is consistent with the immediate parts
    is-type-i-no-shift \cdot (op-b0 + op-b1-4 \cdot 2 - op-b) = 0
# Range checking the different op-b immediate parts
  \texttt{is-type-i-no-shift} \cdot (\texttt{op-b0}) \cdot (1-\texttt{op-b1}) = 0
  is-type-i-no-shift \cdot (op-b1-4 \in [0, 2^4 - 1])
// Checking instruction format for limb 1
• (is-load) \cdot (0b0000011 + op-a0 \cdot 2<sup>7</sup> - instr-val<sup>(1)</sup>) = 0
                                                                                                              (\mathtt{is-alu-imm-no-shift}) \cdot (\mathtt{0b0010011} + \mathtt{op-a0} \cdot 2^7 - \mathtt{instr-val}^{(1)}) = 0
                                                                                                              > limb 1 for ALU with non-shift imm
    (is-jalr) \cdot (0b1100111 + op-a0 \cdot 2^7 - instr-val^{(1)}) = 0
                                                                                                              \triangleright limb 1 for jalr
// Checking instruction format for limb 2
    (is-1b) \cdot (op-a1-4 + 0b000 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                            \triangleright limb 2 for 1b
    (is-lh) \cdot (op-a1-4 + 0b001 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                            \triangleright limb 2 for lh
    (is-lw) \cdot (op-a1-4 + 0b010 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                            \triangleright limb 2 for lw
    (is-lbu) \cdot (op-a1-4 + 0b100 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                            \triangleright limb 2 for lbu
   (is-lhu) \cdot (op-a1-4 + 0b101 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0

hitharpoonup limb 2 for lhu
  (\texttt{is-add}) \cdot (\texttt{imm-c}) \cdot (\texttt{op-a1-4} + \texttt{0b000} \cdot 2^4 + \texttt{op-b0} \cdot 2^7 - \texttt{instr-val}^{(2)}) = 0
                                                                                                            ⊳ limb 2 for addi
 (\mathsf{is\text{-}slt}) \cdot (\mathsf{imm\text{-}c}) \cdot (\mathsf{op\text{-}a1\text{-}4} + \mathsf{0b010} \cdot 2^4 + \mathsf{op\text{-}b0} \cdot 2^7 - \mathsf{instr\text{-}val}^{(2)}) = 0

hittirp limb 2 for slti
    (is-sltu) \cdot (imm-c) \cdot (op-a1-4 + 0b011 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                             \triangleright limb 2 for sltiu
    (is-xor) \cdot (imm-c) \cdot (op-a1-4 + 0b100 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                             \triangleright limb 2 for xori
    (is-or) \cdot (imm-c) \cdot (op-a1-4 + 0b110 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                             \triangleright limb 2 for ori
    (is-and) \cdot (imm-c) \cdot (op-a1-4 + 0b111 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0

hicksim limb 2 for andi
    (is-jalr)(op-a1-4 + 0b000 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                             \triangleright limb 2 for jalr
// Checking instruction format for limb 3
• (is-type-i-no-shift) \cdot (op-b1-4 + op-c0-3 \cdot 2<sup>4</sup> - instr-val<sup>(3)</sup>) = 0
/\!\!/ Checking instruction format for limb 4
\bullet \quad (\text{is-type-i-no-shift}) \cdot (\text{op-c4-7} + \text{op-c8-10} \cdot 2^4 + \text{op-c11} \cdot 2^7 - \text{instr-val}^{(4)}) = 0 \ \rhd \ \text{limb} \ 4
```

#### Type I with shifts – ALU instructions with shift immediate values

For Type I - shifts instructions (ALU instructions with shift immediate values):

- op-a is a destination register selector
- a-val is obtained from the instruction execution component
- a-val-effective will be written to the op-a register
- op-b is a source register selector
- b-val is obtained by reading from the register memory component
- op-c is a 5-bit I-immediate value (see Section 2.4)
- c-val is obtained by setting the 5 lower bits to op-c and the higher bits to 0

```
/\!\!/ ALU instructions with shift immediate values - sll, srl, sra instructions with imm-c = 1
# op-a is ranged checked via reg memory checking
# op-b is ranged checked via reg memory checking
// op-c range check follows from other range checks below
// Making sure that op-c is consistent with the immediate parts
• (is-alu-imm-shift) \cdot (op-c0-3 + op-c4 \cdot 2<sup>4</sup> - op-c) = 0
// Computing c-val from op-c
  (\texttt{is-alu-imm-shift}) \cdot (\texttt{op-c0-3} + \texttt{op-c4} \cdot 2^4 - \texttt{c-val}^{(1)}) = 0
    (\texttt{is-alu-imm-shift}) \cdot (\texttt{c-val}^{(2)}) = 0
    (is-alu-imm-shift) \cdot (c-val^{(3)}) = 0
    (is-alu-imm-shift) \cdot (c-val^{(4)}) = 0
# Range checking the different op-c immediate parts
   \texttt{is-alu-imm-shift} \cdot (\texttt{op-c0-3} \in \left\lceil 0, 2^4 - 1 \right\rceil)
    is-alu-imm-shift \cdot (\text{op-c4}) \cdot (1 - \text{op-c4}) = 0
// Making sure that op-a is consistent with the immediate parts
  is-alu-imm-shift \cdot (op-a0 + op-a1-4 \cdot 2 - op-a) = 0
\ensuremath{/\!/} Range checking the different op-a immediate parts
   \texttt{is-alu-imm-shift} \cdot (\texttt{op-a1}) \cdot (1-\texttt{op-a1}) = 0
    is-alu-imm-shift \cdot (op-a1-4 \in [0, 2^4 - 1])
// Making sure that op-b is consistent with the immediate parts
• is-alu-imm-shift \cdot (op-b0 + op-b1-4 \cdot 2 - op-b) = 0
# Range checking the different op-b immediate parts
  \texttt{is-alu-imm-shift} \cdot (\texttt{op-b0}) \cdot (1 - \texttt{op-b1}) = 0
  is-alu-imm-shift \cdot (op-b1-4 \in [0, 2^4 - 1])
// Checking instruction format for limb 1
\bullet \quad (\mathtt{is-alu-imm-shift}) \cdot (\mathtt{0b0010011} + \mathtt{op-a0} \cdot 2^7 - \mathtt{instr-val}^{(1)}) = 0

hicksim limb 1
// Checking instruction format for limb 2
   (is-sl1) \cdot (imm-c) \cdot (op-a1-4 + 0b001 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                          \triangleright limb 2 for slli
    (is-srl) \cdot (imm-c) \cdot (op-a1-4 + 0b101 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                          ⊳ limb 2 for srli
   (is-sra) \cdot (imm-c) \cdot (op-a1-4 + 0b101 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                          \triangleright limb 2 for srai
// Checking instruction format for limb 3
\bullet \quad (\text{is-alu-imm-shift}) \cdot (\text{op-b1-4} + \text{op-c0-3} \cdot 2^4 - \text{instr-val}^{(3)}) = 0
                                                                                                           \triangleright limb 3
# Checking instruction format for limb 4
   (is-sl1) \cdot (imm-c) \cdot (op-c4 + 0b0000000 \cdot 2 - instr-val^{(4)}) = 0

hicksim limb 4 for slli
    (\texttt{is-srl}) \cdot (\texttt{imm-c}) \cdot (\texttt{op-c4} + \texttt{0b0000000} \cdot 2 - \texttt{instr-val}^{(4)}) = 0
                                                                                                           \triangleright limb 4 for srli
    (\texttt{is-sra}) \cdot (\texttt{imm-c}) \cdot (\texttt{op-c4} + \texttt{0b0100000} \cdot 2 - \texttt{instr-val}^{(4)}) = 0

hickspace > \mathtt{limb} \ 4 \ \mathtt{for} \ \mathtt{srai}
```

#### Type R: ALU instructions without immediate values

For Type R instructions (ALU instructions without immediate values):

- op-a is a destination register selector
- a-val is obtained from the instruction execution component
- a-val-effective will be written to the op-a register
- op-b is a source register selector
- b-val is obtained by reading from the register memory component
- op-c is another source register selector (see Section 2.4)
- c-val is obtained by reading from the register memory component

```
/\!\!/ ALU instructions where op-c is a register address - ALU instructions with imm-c =0 /\!\!/ op-a is ranged checked via reg memory checking /\!\!/ op-b is ranged checked via reg memory checking
```

```
# op-c is ranged checked via reg memory checking
// Making sure that op-c is consistent with the immediate parts
• (is-type-r) \cdot (op-c0-3 + op-c4 \cdot 2<sup>4</sup> - op-c) = 0
# Range checking the different op-c immediate parts
  is-type-r · (op-c0-3 \in [0, 2^4 - 1])
   is-type-r \cdot (op-c4) \cdot (1 - op-c4) = 0
// Making sure that op-a is consistent with the immediate parts
    is-type-r \cdot (op-a0 + op-a1-4 \cdot 2 - op-a) = 0
# Range checking the different op-a immediate parts
    \texttt{is-type-r} \cdot (\texttt{op-a1}) \cdot (1 - \texttt{op-a1}) = 0
   \texttt{is-type-r} \cdot (\texttt{op-a1-4} \in \left[0, 2^4 - 1\right])
// Making sure that op-b is consistent with the immediate parts
  is-type-r \cdot (op-b0 + op-b1-4 \cdot 2 - op-b) = 0
# Range checking the different op-b immediate parts
    \texttt{is-type-r} \cdot (\texttt{op-b0}) \cdot (1 - \texttt{op-b1}) = 0
    is-type-r · (op-b1-4 \in [0, 2^4 - 1])
/\!\!/ Checking instruction format for limb 1
    (\mathtt{is\text{-}type\text{-}r})\cdot(\mathtt{0b0110011}+\mathtt{op\text{-}a0}\cdot2^7-\mathtt{instr\text{-}val}^{(1)})=0
                                                                                                                         \triangleright limb 1
// Checking instruction format for limb 2
    (\mathtt{is-add}) \cdot (1 - \mathtt{imm-c}) \cdot (\mathtt{op-a1-4} + \mathtt{0b000} \cdot 2^4 + \mathtt{op-b0} \cdot 2^7 - \mathtt{instr-val}^{(2)}) = 0
                                                                                                                         \triangleright limb 2 for add
    (is-sub) \cdot (1 - imm-c) \cdot (op-a1-4 + 0b000 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                                         \vartriangleright limb 2 for sub
    (is-sll) \cdot (1 - imm-c) \cdot (op-al-4 + 0b001 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                                        \triangleright limb 2 for sll
    (is-slt) \cdot (1 - imm-c) \cdot (op-a1-4 + 0b010 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                                        \triangleright limb 2 for slt
    (\texttt{is-sltu}) \cdot (1 - \texttt{imm-c}) \cdot (\texttt{op-a1-4} + \texttt{0b011} \cdot 2^4 + \texttt{op-b0} \cdot 2^7 - \texttt{instr-val}^{(2)}) = 0
                                                                                                                       ⊳ limb 2 for sltu
    (\texttt{is-xor}) \cdot (1 - \texttt{imm-c}) \cdot (\texttt{op-a1-4} + \texttt{0b100} \cdot 2^4 + \texttt{op-b0} \cdot 2^7 - \texttt{instr-val}^{(2)}) = 0
                                                                                                                        \triangleright limb 2 for xor
    (is-srl) \cdot (1 - imm-c) \cdot (op-al-4 + 0b101 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                                        \triangleright limb 2 for srl
    (is-sra) \cdot (1 - imm-c) \cdot (op-a1-4 + 0b101 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                                        \triangleright limb 2 for sra
    (is-or) \cdot (1 - imm-c) \cdot (op-a1-4 + 0b110 \cdot 2^4 + op-b0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                                        \vartriangleright limb 2 for or
    (\texttt{is-and}) \cdot (1 - \texttt{imm-c}) \cdot (\texttt{op-a1-4} + \texttt{0b111} \cdot 2^4 + \texttt{op-b0} \cdot 2^7 - \texttt{instr-val}^{(2)}) = 0
                                                                                                                        \vartriangleright limb 2 for and
// Checking instruction format for limb 3
  (\texttt{is-type-r}) \cdot (\texttt{op-b1-4} + \texttt{op-c0-3} \cdot 2^4 - \texttt{instr-val}^{(3)}) = 0
                                                                                                                         \rhd \; {\tt limb} \;\; 3
# Checking instruction format for limb 4
    (is-add) \cdot (1 - imm-c) \cdot (op-c4 + 0b00000000 \cdot 2 - instr-val^{(4)}) = 0
                                                                                                                        \triangleright limb 4 for add
    (is-sub) \cdot (1 - imm-c) \cdot (op-c4 + 0b0100000 \cdot 2 - instr-val^{(4)}) = 0
                                                                                                                        \triangleright limb 4 for sub
    (\texttt{is-sll}) \cdot (1 - \texttt{imm-c}) \cdot (\texttt{op-c4} + \texttt{0b00000000} \cdot 2 - \texttt{instr-val}^{(4)}) = 0
                                                                                                                       \triangleright limb 4 for sll
    (is-slt) \cdot (1 - imm-c) \cdot (op-c4 + 0b0000000 \cdot 2 - instr-val^{(4)}) = 0

hicksim limb 4 for slt
    (\mathtt{is\text{-}sltu}) \cdot (1 - \mathtt{imm\text{-}c}) \cdot (\mathtt{op\text{-}c4} + \mathtt{0b00000000} \cdot 2 - \mathtt{instr\text{-}val}^{(4)}) = 0
                                                                                                                       \triangleright limb 4 for sltu
    (is-xor) \cdot (1 - imm-c) \cdot (op-c4 + 0b00000000 \cdot 2 - instr-val^{(4)}) = 0
                                                                                                                       ⊳ limb 4 for xor
    (is-srl) \cdot (1 - imm-c) \cdot (op-c4 + 0b00000000 \cdot 2 - instr-val^{(4)}) = 0
                                                                                                                       \triangleright limb 4 for srl
   (\texttt{is-sra}) \cdot (1 - \texttt{imm-c}) \cdot (\texttt{op-c4} + \texttt{0b0100000} \cdot 2 - \texttt{instr-val}^{(4)}) = 0
                                                                                                                      \triangleright limb 4 for sra
    (\texttt{is-or}) \cdot (1 - \texttt{imm-c}) \cdot (\texttt{op-c4} + \texttt{0b0000000} \cdot 2 - \texttt{instr-val}^{(4)}) = 0

hicksim limb 4 for or
    (\mathtt{is}\mathtt{-and}) \cdot (1 - \mathtt{imm-c}) \cdot (\mathtt{op-c4} + \mathtt{0b00000000} \cdot 2 - \mathtt{instr-val}^{(4)}) = 0

hicksim limb 4 for and
```

#### Type B: Branch instructions

For Type B instructions (Branch instructions):

- op-a is a destination register selector
- a-val is obtained from the instruction execution component
- a-val-effective will be written to the op-a register
- op-b is a source register selector
- b-val is obtained by reading from the register memory component
- op-c is a 12-bit B-immediate value (see Section 2.4)
- c-val is obtained by setting bit 0 to 0 and, bits 1-12 to op-c, and sign extending it

#### CONSTRAINTS:

```
// Type B instructions - beq, bne, blt, bge, bltu, bgeu
# op-a is ranged checked via reg memory checking
# op-b is ranged checked via reg memory checking
// op-c range check follows from other range checks below
// Making sure that op-c is consistent with the immediate parts
• (is-type-b) \cdot (op-c1-4 + op-c5-7 \cdot 2<sup>4</sup> + op-c8-10 \cdot 2<sup>7</sup> + op-c11 \cdot 2<sup>10</sup> + op-c12 \cdot 2<sup>11</sup> - op-c) = 0
// Computing c-val from op-c and performing sign extension
  (is-type-b) \cdot (op-c1-4 \cdot 2 + op-c5-7 \cdot 2^5 - c-val^{(1)}) = 0
    (is-type-b) \cdot (op-c8-10 + op-c11 \cdot 2^3 + op-c12 \cdot (2^4 - 1) \cdot 2^4 - c-val^{(2)}) = 0
     \begin{array}{l} (\text{is-type-b}) \cdot (\text{op-c12} \cdot (2^8-1) - \text{c-val}^{(3)}) = 0 \\ (\text{is-type-b}) \cdot (\text{op-c12} \cdot (2^8-1) - \text{c-val}^{(4)}) = 0 \end{array} 
# Range checking the different op-c immediate parts
• is-type-b · (op-c1-4 \in [0, 2^4 - 1])
  is-type-b · (op-c5-7 \in [0, 2^3 - 1])
  is-type-b \cdot (op-c8-10 \in [0, 2^3 - 1])
  \texttt{is-type-b} \cdot (\texttt{op-c11}) \cdot (1 - \texttt{op-c11}) = 0
    \texttt{is-type-b} \cdot (\texttt{op-c12}) \cdot (1 - \texttt{op-c12}) = 0
# Making sure that op-a is consistent with the immediate parts
• is-type-b \cdot (op-a0 + op-a1-4 \cdot 2 - op-a) = 0
# Range checking the different op-a immediate parts
  is-type-b \cdot (op-a1) \cdot (1 - op-a1) = 0
• is-type-b · (op-a1-4 \in [0, 2^4 - 1])
/\!\!/ Making sure that op-b is consistent with the immediate parts
\bullet \quad \text{is-type-b} \cdot (\text{op-b0-3} + \text{op-b4} \cdot 2^4 - \text{op-b}) = 0
# Range checking the different op-b immediate parts
• is-type-b · (op-b0-3 \in [0, 2^4 - 1])
• is-type-b \cdot (op-b4) \cdot (1 - op-b4) = 0
// Checking instruction format for limb 1
• (is-type-b) \cdot (0b1100011 + op-c11 \cdot 2<sup>7</sup> - instr-val<sup>(1)</sup>) = 0

hicksim limb 1
// Checking instruction format for limb 2
  (is-beq) \cdot (op-c1-4 + 0b000 \cdot 2^4 + op-a0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                             \triangleright limb 2 for beq
  (is-bne) \cdot (op-c1-4 + 0b001 \cdot 2^4 + op-a0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                             \vartriangleright limb 2 for bne
  (is-blt) \cdot (op-c1-4 + 0b100 \cdot 2^4 + op-a0 \cdot 2^7 - instr-val^{(2)}) = 0
                                                                                                             \triangleright limb 2 for blt
 (\text{is-bge}) \cdot (\text{op-c1-4} + \text{0b101} \cdot 2^4 + \text{op-a0} \cdot 2^7 - \text{instr-val}^{(2)}) = 0
                                                                                                             \triangleright limb 2 for bge
  (\text{is-bltu}) \cdot (\text{op-c1-4} + \text{0b110} \cdot 2^4 + \text{op-a0} \cdot 2^7 - \text{instr-val}^{(2)}) = 0

hicksim limb 2 for bltu
• (is-bgeu) \cdot (op-c1-4 + 0b111 \cdot 2<sup>4</sup> + op-a0 \cdot 2<sup>7</sup> - instr-val<sup>(2)</sup>) = 0
                                                                                                             \triangleright limb 2 for bgeu
// Checking instruction format for limb 3
• (is-type-b) \cdot (op-a1-4 + op-b0-3 \cdot 2<sup>4</sup> - instr-val<sup>(3)</sup>) = 0

hicksim limb 3
# Checking instruction format for limb 4
\bullet \quad (\texttt{is-type-b}) \cdot (\texttt{op-b4} + \texttt{op-c5-7} \cdot 2 + \texttt{op-c8-10} \cdot 2^4 + \texttt{op-c12} \cdot 2^7 - \texttt{instr-val}^{(4)}) = 0 \rhd \, \texttt{limb} \, \, 4 = 0 = 0
```

#### Type SYS: System call instructions

For Type SYS instructions (ECALL, EBREAK):

- op-a is computed within the instruction execution component depending on R[x17]
- a-val is either a private input provided by the prover directly or not set
- op-a range check guaranteed by the register memory component when set
- a-val-effective will be written to the op-a register when set
- op-b is set to x17
- b-val is obtained by reading from the register memory component
- op-c is set to 0

#### • c-val is set to 0

#### Constraints:

```
// System instructions - ecall, ebreak
/\!\!/ op-a value is set by the execution component depending on R[x17]
\# op-b is set to x17
/\!\!/ op-c is set to 0
/\!\!/ Enforcing op-b = x17 for system instructions
• is-type-sys \cdot (17 - \text{op-b}) = 0
/\!\!/ \; {\tt Enforcing op-c} = 0 \ {\tt for unimp}
\bullet \quad \text{is-type-sys} \cdot (\text{op-c}) = 0
/\!\!/ Enforcing c-val =0 for unimp
  (\texttt{is-type-sys}) \cdot (\texttt{c-val}^{(1)}) = 0
    (\text{is-type-sys}) \cdot (\text{c-val}^{(2)}) = 0
  (\texttt{is-type-sys}) \cdot (\texttt{c-val}^{(3)}) = 0
• (is-type-sys) \cdot (c-val<sup>(4)</sup>) = 0
// Checking instruction format for system instructions
\bullet \quad (\texttt{is-type-sys}) \cdot (\texttt{0b01110011} - \texttt{instr-val}^{(1)}) = 0

hicksim limb 1
    (is-type-sys) \cdot (0b00000000 - instr-val^{(2)}) = 0
                                                                                                                     \triangleright limb 2
    (is-ecall) \cdot (0b0000 + 0b0000 \cdot 2^4 - instr-val^{(3)}) = 0
                                                                                                                     \triangleright limb 3 for ecall
  (is-ebreak · (0b0000 + 0b1000 · 2^4 - instr-val<sup>(3)</sup>) = 0
                                                                                                                     \triangleright limb 3 for ebreak
\bullet \quad (\mathtt{is-type-sys}) \cdot (\mathtt{0b00000000} - \mathtt{instr-val}^{(4)}) = 0
                                                                                                                     \triangleright limb 4
```

### UNIMP instruction - unimp

For the UNIMP instruction:

- op-a = 0 a-val<sup>(j)</sup> = 0 for j = 1, 2, 3, 4
- op-b = 0 b-val<sup>(j)</sup> = 0 for j = 1, 2, 3, 4
- op-b = 0 c-val<sup>(j)</sup> = 0 for j = 1, 2, 3, 4

```
// UNIMP instruction - unimp
/\!\!/ op-a, op-c, op-c are all set to 0
/\!\!/ \; {\tt Enforcing op-c} = 0 \ {\tt for unimp}
• is-unimp \cdot (op-c) = 0
/\!\!/ Enforcing c-val = 0 for unimp
• (is-unimp) \cdot (c-val^{(1)}) = 0
    (is-unimp) \cdot (c-val^{(2)}) = 0
• (is-unimp) \cdot (c-val<sup>(3)</sup>) = 0
• (is-unimp) \cdot (c-val^{(4)}) = 0
/\!\!/ Enforcing op-b = 0 for unimp
• is-unimp \cdot (op-b) = 0
/\!\!/ Enforcing b-val = 0 for unimp
\bullet \quad (\mathtt{is\text{-}unimp}) \cdot (\mathtt{b\text{-}val}^{(1)}) = 0
  (\texttt{is-unimp}) \cdot (\texttt{b-val}^{(2)}) = 0
• (is-unimp) \cdot (b-val<sup>(3)</sup>) = 0
• (is-unimp) \cdot (b-val<sup>(4)</sup>) = 0
\# Enforcing op-a = 0 for unimp
• is-unimp \cdot (op-b) = 0
/\!\!/ Enforcing a-val = 0 for unimp
 \bullet \quad (\mathtt{is\text{-}unimp}) \cdot (\mathtt{a\text{-}val}^{(1)}) = 0 
   (\text{is-unimp}) \cdot (\text{a-val}^{(2)}) = 0
  (\texttt{is-unimp}) \cdot (\texttt{a-val}^{(3)}) = 0
```

#### 4.3.4 Interactions with other components

As mentioned before, the CPU component needs to interact with a few other components:

- Interaction with program memory: To read the next instruction using the program counter;
- Interaction with register memory: to read values associated with operands;
- Interaction with instruction execution: to enforce correction execution of instructions.

**Remark 4.15** Since all constraints in the remaining of this section are for the same row, we do not explicitly write down the row index i when describing these constraints.

### Program memory interaction

•  $instr-val \leftarrow Read_{Prog}(pc, clk)$ 

#### Register memory interaction

```
# Type S + B instructions do not have a destination register

# Type S + B instructions instead read from the op-a register

# Hence, we obtain a-val by reading from the op-a register

• (is-type-s + is-type-b) \cdot \operatorname{Read}_{\operatorname{Reg}}(\operatorname{op-a},\operatorname{clk},3) - \operatorname{a-val}) = 0

# Type R + I + U + J instructions have a destination register

# Hence, we need to write a-val-effective to the op-a register

• (is-type-r + is-type-i + is-type-u + is-type-j) \cdot \operatorname{Write}_{\operatorname{Reg}}(\operatorname{a-val-effective},\operatorname{a-val},\operatorname{clk},3) - \operatorname{a-val-effective}) = 0

# Type SYS instructions will interact with the register memory directly

# From the execution component so we ignore SYS instructions above

# a-val is an input provided by the environment for Type SYS instructions

# We only read from register op-b to obtain b-val when op-b-flag = 1

• (op-b-flag) \cdot \operatorname{Read}_{\operatorname{Reg}}(\operatorname{op-b},\operatorname{clk},1) - \operatorname{b-val}) = 0

# We only need to read from register op-c to obtain c-val for Type R instructions

• (is-type-r) \cdot \operatorname{Read}_{\operatorname{Reg}}(\operatorname{op-c},\operatorname{clk},2) - \operatorname{c-val}) = 0
```

Remark 4.16 Remark on handling the case where the destination register rd = x0: Since the contents of register x0 must remain equal to 0, we need to make sure the value of the destination register does not get updated when rd = x0. For this reason, we added an additional column a-val-effective whose value is a-val (if  $rd \neq x0$ ) or 0 (if rd = x0). The register memory checking uses a-val-effective instead of a-val. That is,

```
a-val-effective = a-val when rd ≠ x0
a-val-effective = 0 when rd = x0
```

#### Execution component interaction

•  $pc-next \leftarrow exec(pc, opcode, a-val, b-val, c-val)$ 

# 5 Register memory component

As described above, the read-write register memory component is responsible for managing access to the registers. Since this is a read-write memory, each register will have a timestamp associated with it indicating the last time that that register has been written.

In order to check that the register memory has been updated correctly, we will make use of logups to check the consistency between the read and write sets, where each element of the set has the form (reg-addr, reg-val, reg-ts) and indicates that the value reg-val was written to address reg-addr at time reg-ts.

Section outline: The remainder of this section is organized as follows:

- Section 5.1 provides a quick overview of the operations supported by the register memory and how to enforce the consistency of these operations using logups.
- Section 5.2 defines the main trace elements used by the register memory component.
- Section 5.3 describes how the contents of all registers should be initialized.
- Section 5.4 defines the interface that other components can use to read and write to the register memory.
- Sections 5.5 and 5.6 describe the constraints for the register memory component assuming, respectively, large and small fields. This includes in particular range checks (Sections 5.5.1 and 5.6.1), arithmetic constraints (Sections 5.5.2 and 5.6.2), and logup computations (Sections 5.5.3 and 5.6.3).
- Section 5.7 specifies additional constraints used to compute the logup contributions related to the initial and final state of the register memory.

### 5.1 Read and write operations

Read operation Suppose we want to read the contents of a register reg-addr at time reg-ts. Let reg-val-prev be the value stored in it and let reg-ts-prev be the time in which this value was written. In order to simplify the register memory checking, we make sure that every written value is read only once. Of course a CPU might read the same value again, so each read operation is followed by a write operation where the same value is written to memory. More precisely, we will update the read and write sets as follows:

```
    reg-read-set = reg-read-set ∪ {(reg-addr, reg-val-prev, reg-ts-prev)}
    reg-write-set = reg-write-set ∪ {(reg-addr, reg-val-prev, reg-ts-cur)}
```

Note that the union is disjoint because of the unique timestamps.

Write operation Suppose we want to update the contents of a register reg-addr at time reg-ts-cur with the reg-val-cur. Let reg-val-prev be the value stored in this register and let reg-ts-prev be the time in which this value was written. In order to simplify the register memory checking, we make sure that every written value is read once. For this reason, although a write operation overwrites the content of the register, each write operation is preceded by a read operation. More precisely, we will update the read and write sets as follows:

```
    reg-read-set = reg-read-set ∪ {(reg-addr, reg-val-prev, reg-ts-prev)}
    reg-write-set = reg-write-set ∪ {(reg-addr, reg-val-cur, reg-ts-cur)}
```

Let  $\alpha$  and  $\beta$  be two random values chosen by the verifier after the prover commits to the execution trace of the program. To capture these operations using logups, we will first convert each triple (reg-addr, reg-val, reg-ts) in the read and write sets to a field element in the secure extension field, which can be seen as a fingerprint of the triple.

This can be done by picking a random linear combination of the elements in the triple using consecutive powers  $\beta^0$ ,  $\beta^1$ ,  $\beta^2$  as the coefficients. In other words, the fingerprint for (reg-addr, reg-val,

reg-ts) will be reg-addr· $\beta^0$ +reg-val· $\beta^1$ +reg-ts· $\beta^2$ . If we denote by fp(reg-addr, reg-val, reg-ts) this fingerprint function, the logup contribution for the entry reg-addr, reg-val, reg-ts) will be  $1/(\text{fp(reg-addr, reg-val, reg-ts)} + \alpha)$ .

### 5.2 Register memory trace elements

In the case of the register memory, up to three register addresses can be accessed during an execution cycle. Since each access to the register memory requires us to maintain a set (reg-addr, reg-val-cur, reg-val-prev, reg-ts-prev, reg-ts-cur) to properly handle memory updates related to a particular register, we define 3 such sets of values.

As a result, we will have the following set of trace elements:

- clk: the current execution time
- reg1-addr, reg2-addr, reg3-addr: the address of the register
- reg1-val-cur, reg2-val-cur, reg3-val-cur: 32-bit values used to update register contents
- reg1-ts-cur, reg2-ts-cur, reg3-ts-cur: current timestamps for the registers
- reg1-val-prev, reg2-val-prev, reg3-val-prev: previous 32-bit values stored at the registers
- reg1-ts-prev, reg2-ts-prev, reg3-ts-prev: previous timestamps for the registers
- reg-read-digest (in the interaction trace, not in the execution trace): a digest of the read set, used for logups.
- reg-write-digest (in the interaction trace, not in the execution trace): a digest of the write set, used for logups.
- reg1-accessed, reg2-accessed, reg3-accessed: flags indicating whether the set of trace elements (regj-addr, regj-val-cur, regj-ts-cur, regj-val-prev, regj-ts-prev) for j=1,2,3 are being used

### 5.3 Register memory initialization

Initially, at time 0, we assume that the contents of all registers are initialized to 0. In particular, this means that the initial write set will contain an entry (reg-addr, 0,0) for each register selector reg-addr, where the second and third components correspond to their initial value and timestamp. The corresponding digest for this initial write set will be

$$\texttt{reg-write-init-digest} = \sum_{\texttt{reg-addr} \in \{0,\dots,31\}} \frac{1}{\mathsf{fp}(\texttt{reg-addr},0,0) + \alpha}.$$

### 5.4 Register memory interface

In order to clarify the interaction between the register memory and other components, we define here the interface used for reading from and writing to the register memory. Since the register memory allows for up to 3 register read and write operations in a clock cycle and since each of them uses a different time tick, the interface also includes a selector reg-sel  $\in \{1, 2, 3\}$  to specify which time tick should be used.

- Read<sub>Reg</sub>(reg-addr, clk, reg-sel) → val: the register memory returns the value val stored at register location reg-addr, updating timestamps according to the timestamp clk and the register selector reg-sel.
- Write<sub>Reg</sub>(reg-addr, val, clk, reg-sel) → val: the register memory updates the value stored at register location reg-addr with the value val, updating timestamps according to the timestamp clk and register selector reg-sel. The register memory returns the updated value val.

#### Remarks

- The interface above considers each input and output as a single element, However, when working over small fields, the clk and val entries will be specified by a set of limbs.
  - clk will be specified by 4 limbs (clk<sup>(1)</sup>, clk<sup>(2)</sup>, clk<sup>(3)</sup>, clk<sup>(4)</sup>)
    val will be specified by 4 limbs (val<sup>(1)</sup>, val<sup>(2)</sup>, val<sup>(3)</sup>, val<sup>(4)</sup>)
- The interface also assumes that registers reg1-addr, reg2-addr, reg3-addr should be accessed in this order, with reg3-addr being the last one to be updated.
- When a register is not accessed during a clock cycle, the corresponding flag regj-accessed for j = 1, 2, 3 should be set to 0 so that the corresponding entry is not taken into account during the logup computation.
- The value regj-accessed for j = 1, 2, 3 will be set according to the flags used by the other components when they call the register memory interface. More precisely, they are set as follows:

- When a row is just a padding (after the program execution), the flags regj-accessed for j = 1, 2, 3 should all be set to 0. This is enforced by the above constraints since all instructions flags must be set to 0 when is-pad = 1.
- Since only the value of reg3 can be updated in a time clock, we also enforce reg1-val-cur = reg1-val-prev and reg2-val-cur = reg2-val-prev for instructions that access these registers. For read operations related to reg3, we additionally enforce reg3-val-cur = reg3-val-prev. More precisely, the following constraints are added in the large field case:

```
 \begin{array}{ll} \bullet & \texttt{reg1-accessed} \cdot (\texttt{reg1-val-prev} - \texttt{reg1-val-cur}) = 0 \\ \bullet & \texttt{reg2-accessed} \cdot (\texttt{reg2-val-prev} - \texttt{reg2-val-cur}) = 0 \\ \bullet & (\texttt{is-type-s} + \texttt{is-type-b}) \cdot (\texttt{reg3-val-prev} - \texttt{reg3-val-cur}) = 0 \\ \end{array}
```

In the small field case, the constraints have to take into account the limbs associated with these variables.

In the actual implementation, the Read<sub>RAM</sub> and Write<sub>RAM</sub> interfaces are not explicitly implemented. Instead, the trace columns for reg1-val-cur, reg2-val-cur, reg3-val-cur are replaced, respectively, with b-val, c-val, and a-val-effective, which are shared with the other components.

### 5.5 Register memory constraints assuming large fields

#### 5.5.1 Range checks

#### Remarks

- For  $T < 2^{32}$ , we can check if an element  $a \in \{0, ..., T-1\}$  by performing two range checks:  $a \in [0, 2^{32} 1]$  and  $T 1 a \in [0, 2^{32} 1]$ , as per [GPR21].
- We also assume that  $T < 2^{30}$  so that the result of the multiplication by 3 remains a 32-bit value. We can add more limbs for timestamps and the clock if necessary.
- Since the current timestamps for reg j-ts-cur for j = 1, 2, 3 are specific functions of the clk value, we added specific arithmetic constraints below for these trace elements.
- Since we are working with a trace that is ordered according to the clock cycle, the range check for clk is not needed. Instead, we require a boundary constraint (clk[0] = 1) and a transition constraint (clk[i+1] = clk[i] + 1).

#### 5.5.2 Arithmetic constraints

```
/\!\!/ Computing regj-ts-cur as a function of clk for j=1,2,3  
 • reg1-ts-cur = 3\cdot {\rm clk}-2  
 • reg2-ts-cur = 3\cdot {\rm clk}-1  
 • reg3-ts-cur = 3\cdot {\rm clk}
```

Remark 5.1 The constraints above assume that registers reg1, reg2, reg3 are accessed and updated in this order, with reg3 being the last one to be updated.

#### 5.5.3 Logup computations

As we said above, one may access up to 3 registers in a clock cycle depending on the instruction that is being executed.

Let regj-accessed[i] for j = 1, 2, 3 be the three flags that indicate whether the set of trace elements (regj-addr[i], regj-val-cur[i], regj-ts-cur[i], regj-val-prev[i], regj-ts-prev[i]) are being accessed during the i-th clock cycle. In order to compute the difference between the read set and write set digests between rows i - 1 and i, we can use these flags as follows:

```
• reg-read-digest[i] - reg-read-digest[i-1] = reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-prev[i], reg1-ts-prev[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-prev[i], reg2-ts-prev[i]) + \alpha) + reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-prev[i], reg3-ts-prev[i]) + \alpha) • reg-write-digest[i] - reg-write-digest[i] - reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-cur[i], reg1-ts-cur[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-cur[i], reg2-ts-cur[i]) + \alpha) + reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-cur[i], reg3-ts-cur[i]) + \alpha).
```

Initial write set digest: Let reg-write-init-digest be the logup sum for the initial state of the register memory (see Section 5.3). That is,

$$\texttt{reg-write-init-digest} = \sum_{\substack{\texttt{reg-addr} \in \{0, \dots, 31\}}} \frac{1}{\texttt{fp}(\texttt{reg-addr}, 0, 0) + \alpha}.$$

Final read set digest: Let reg-val-final(reg-addr) denote the last value written to register reg-addr  $\in \{0, ..., 31\}$  and let reg-ts-final(reg-addr) denote the corresponding timestamp. Let reg-read-final-digest denote the logup sum for the final state of memory. That is, reg-read-final-digest is equal to

$$\sum_{\texttt{reg-addr} \in \{0,\dots,31\}} \frac{1}{\texttt{fp}(\texttt{reg-addr},\texttt{reg-val-final}(\texttt{reg-addr}),\texttt{reg-ts-final}(\texttt{reg-addr})) + \alpha}$$

Boundary constraints: Let reg-write-init-digest and reg-read-final-digest be as defined above. The boundary constraints can then be specified as follows:

```
• reg-read-digest[0] = reg1-accessed[0]/(fp(reg1-addr[0], reg1-val-prev[0], reg1-ts-prev[0]) + \alpha) + reg2-accessed[0]/(fp(reg2-addr[0], reg2-val-prev[0], reg2-ts-prev[0]) + \alpha) + reg3-accessed[0]/(fp(reg3-addr[0], reg3-val-prev[0], reg3-ts-prev[0]) + \alpha) • reg-write-digest[0] = reg-write-init-digest + reg1-accessed[0]/(fp(reg1-addr[0], reg1-val-cur[0], reg1-ts-cur[0]) + \alpha) + reg2-accessed[0]/(fp(reg2-addr[0], reg2-val-cur[0], reg2-ts-cur[0]) + \alpha) + reg3-accessed[0]/(fp(reg3-addr[0], reg3-val-cur[0], reg3-ts-cur[0]) + \alpha) • reg-read-digest[n] = reg-read-final-digest + reg-write-digest[n]
```

**Transition constraints**  $(0 < i \le n)$ : The transition constraints can be specified as follows:

```
• reg-read-digest[i] - reg-read-digest[i-1] = reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-prev[i], reg1-ts-prev[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-prev[i], reg2-ts-prev[i]) + \alpha) + reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-prev[i], reg3-ts-prev[i]) + \alpha) + reg-write-digest[i] - reg-write-digest[i] - reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-cur[i], reg1-ts-cur[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-cur[i], reg2-ts-cur[i]) + \alpha) + reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-cur[i], reg3-ts-cur[i]) + \alpha)
```

**Remark 5.2** • Instead of initializing all register values, it suffices to initialize only those registers which will be accessed during the execution of a program.

• Instead of maintaining separate running sums, the implementation can choose to keep track of the difference reg-read-write-digest[i] = reg-write-digest[i] - reg-read-digest[i] between the two logup sums so that its final value is equal to 0. To make this work, the initialization would need to be changed slightly to also take into account reg-read-final-digest.

### 5.6 Register memory constraints assuming small fields

#### 5.6.1 Range checks

```
/\!\!/ reg1-addr \in \{0,\ldots,31\} - guaranteed via memory checking
/\!\!/ reg2-addr \in \{0,\ldots,31\} - guaranteed via memory checking
/\!\!/ reg3-addr \in \{0,\ldots,31\} - guaranteed via memory checking
• reg1-val-cur^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 \triangleright Common to instruction execution
  reg2-val-cur^{(j)} \in [0, 2^8-1] for j=1, 2, 3, 4 \triangleright Common to instruction execution
  reg3-val-cur^{(j)} \in \left[0,2^8-1\right] for j=1,2,3,4
                                                            reg1-val-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• reg2-val-prev^{(j)} \in \left[0, 2^8 - 1\right] for j = 1, 2, 3, 4
• reg3-val-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
/\!\!/ regi-ts-prev \in \{0,\ldots, 	ext{reg}i-ts-cur -1\} for i=1,2,3
/\!\!/ \Longrightarrow {\tt reg} i{\tt -ts-prev} \in \left[0, 2^{32} - 1\right] \text{ for } i = 1, 2, 3
/\!\!/\Longrightarrow \mathtt{reg} i\mathtt{-ts-prev-aux} \in \left[0,2^{32}-1\right] for i=1,2,3
// where regi-ts-prev-aux = regi-ts-cur -1 - regi-ts-prev for i=1,2,3
• reg1-ts-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
  reg2-ts-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• reg3-ts-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• reg1-ts-prev-aux^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• reg2-ts-prev-aux<sup>(j)</sup> \in [0, 2^8 - 1] for j = 1, 2, 3, 4
  reg3-ts-prev-aux^{(j)} \in \left[0,2^8-1\right] for j=1,2,3,4
```

#### 5.6.2 Arithmetic constraints

As indicated in the large field case, we need to enforce constraints on the limbs associated with reg1-ts-cur, reg2-ts-cur, and reg3-ts-cur so that they satisfy the following conditions:

```
• reg1-ts-cur = 3 \cdot \text{clk} - 2
• reg2-ts-cur = 3 \cdot \text{clk} - 1
• reg3-ts-cur = 3 \cdot \text{clk}
```

values used to handle borrows

In order to do so, we need to introduce helper values to handle carries and borrows because, when performing the above computations over the limbs, both positive or negative can occur.

```
Let clk := (clk<sup>(1)</sup>, clk<sup>(2)</sup>, clk<sup>(3)</sup>, clk<sup>(4)</sup>)
Let reg1-ts-cur := (reg1-ts-cur<sup>(1)</sup>, reg1-ts-cur<sup>(2)</sup>, reg1-ts-cur<sup>(3)</sup>, reg1-ts-cur<sup>(4)</sup>)
Let reg2-ts-cur := (reg2-ts-cur<sup>(1)</sup>, reg2-ts-cur<sup>(2)</sup>, reg2-ts-cur<sup>(3)</sup>, reg2-ts-cur<sup>(4)</sup>)
Let reg3-ts-cur := (reg3-ts-cur<sup>(1)</sup>, reg3-ts-cur<sup>(2)</sup>, reg3-ts-cur<sup>(3)</sup>, reg3-ts-cur<sup>(4)</sup>)
Let hi-carry := (hi-carry<sup>(1)</sup>, hi-carry<sup>(2)</sup>, hi-carry<sup>(3)</sup>, hi-carry<sup>(4)</sup>) for i = 1, 2, 3 be helper values used to handle carries
Let hi-borrow := (hi-borrow<sup>(1)</sup>, hi-borrow<sup>(2)</sup>, hi-borrow<sup>(3)</sup>, hi-borrow<sup>(4)</sup>) for i = 1, 2, 3 be helper
```

Arithmetic constraints for reg1-ts-cur, reg2-ts-cur, reg3-ts-cur

```
/\!\!/ Carry and borrow handling for reg1-ts-cur = 3 \cdot {
m clk} - 2
                                                                                                                                        + \mathtt{h1\text{-}carry}^{(1)} \cdot 2^8 = 3 \cdot \mathtt{clk}^{(1)} + \mathtt{h1\text{-}borrow}^{(1)} \cdot 2^8
• reg1-ts-cur^{(1)} + 2
             \texttt{reg1-ts-cur}^{(2)} + \texttt{h1-borrow}^{(1)} + \texttt{h1-carry}^{(2)} \cdot 2^8 = 3 \cdot \texttt{clk}^{(2)} + \texttt{h1-borrow}^{(2)} \cdot 2^8 + \texttt{h1-carry}^{(1)} + \texttt{h1-borrow}^{(2)} \cdot 2^8 + \texttt{h1-carry}^{(2)} + \texttt{h1-borrow}^{(2)} + \texttt{h1-borrow}
             \texttt{reg1-ts-cur}^{(3)} + \texttt{h1-borrow}^{(2)} + \texttt{h1-carry}^{(3)} \cdot 2^8 = 3 \cdot \texttt{clk}^{(3)} + \texttt{h1-borrow}^{(3)} \cdot 2^8 + \texttt{h1-carry}^{(2)} + \texttt{h1-borrow}^{(3)} \cdot 2^8 + \texttt{h1-carry}^{(3)} \cdot 2^8 + \texttt{h
             \texttt{reg1-ts-cur}^{(4)} + \texttt{h1-borrow}^{(3)} + \texttt{h1-carry}^{(4)} \cdot 2^8 = 3 \cdot \texttt{clk}^{(4)} + \texttt{h1-borrow}^{(4)} \cdot 2^8 + \texttt{h1-carry}^{(3)}
// Carry and borrow handling for reg2-ts-cur = 3 \cdot \mathrm{clk} - 1
                                                                                                                                        + \mathtt{h2\text{-}carry}^{(1)} \cdot 2^8 = 3 \cdot \mathtt{clk}^{(1)} + \mathtt{h2\text{-}borrow}^{(1)} \cdot 2^8
             reg2-ts-cur^{(1)} + 1
             \texttt{reg2-ts-cur}^{(2)} + \texttt{h2-borrow}^{(1)} + \texttt{h2-carry}^{(2)} \cdot 2^8 = 3 \cdot \texttt{clk}^{(2)} + \texttt{h2-borrow}^{(2)} \cdot 2^8 + \texttt{h2-carry}^{(1)}
             \texttt{reg2-ts-cur}^{(4)} + \texttt{h2-borrow}^{(3)} + \texttt{h2-carry}^{(4)} \cdot 2^8 = 3 \cdot \texttt{clk}^{(4)} + \texttt{h2-borrow}^{(4)} \cdot 2^8 + \texttt{h2-carry}^{(3)}
/\!\!/ Carry and borrow handling for reg3-ts-cur = 3 \cdot \mathrm{clk}
                                                                                                                                          +h3-carry^{(1)} \cdot 2^8 = 3 \cdot clk^{(1)} + h3-borrow^{(1)} \cdot 2^8
             reg3-ts-cur<sup>(2)</sup> + h3-borrow<sup>(1)</sup> +h3-carry<sup>(2)</sup> \cdot 2<sup>8</sup> = 3 \cdot clk<sup>(2)</sup> + h3-borrow<sup>(2)</sup> \cdot 2<sup>8</sup> + h3-carry<sup>(1)</sup>
              \texttt{reg3-ts-cur}^{(3)} + \texttt{h3-borrow}^{(2)} + \texttt{h3-carry}^{(3)} \cdot 2^8 = 3 \cdot \texttt{clk}^{(3)} + \texttt{h3-borrow}^{(3)} \cdot 2^8 + \texttt{h3-carry}^{(2)}
             \texttt{reg3-ts-cur}^{(4)} + \texttt{h3-borrow}^{(3)} + \texttt{h3-carry}^{(4)} \cdot 2^8 = 3 \cdot \texttt{clk}^{(4)} + \texttt{h3-borrow}^{(4)} \cdot 2^8 + \texttt{h3-carry}^{(3)}
/\!\!/ Enforcing ranges for the borrows \in \{0,1\} - last borrow must be 0
           (h1-borrow^{(j)}) \cdot (1-h1-borrow^{(j)}) = 0 \text{ for } j = 1, 2, 3
             (h2-borrow^{(j)}) \cdot (1-h2-borrow^{(j)}) = 0 \text{ for } j = 1, 2, 3
             (h3-borrow^{(j)}) \cdot (1-h3-borrow^{(j)}) = 0 \text{ for } j = 1, 2, 3
        (\mathtt{h1-borrow}^{(4)}) = 0
             (h2-borrow^{(4)})=0
             (h3\text{-borrow}^{(4)}) = 0
/\!\!/ Enforcing ranges for the carries \in \{0,1,2\}
             (\mathtt{h1-carry}^{(j)}) \cdot (1 - \mathtt{h1-carry}^{(j)}) \cdot (2 - \mathtt{h1-carry}^{(j)}) = 0 \text{ for } j = 1, 2, 3, 4
              (\mathtt{h2-carry}^{(j)}) \cdot (1-\mathtt{h2-carry}^{(j)}) \cdot (2-\mathtt{h2-carry}^{(j)}) = 0 \text{ for } j=1,2,3,4
             (\mathtt{h3-carry}^{(j)}) \cdot (1 - \mathtt{h3-carry}^{(j)}) \cdot (2 - \mathtt{h3-carry}^{(j)}) = 0 \text{ for } j = 1, 2, 3, 4
// Making sure that borrows and carries cannot both be non-zero
        (h1-carry^{(j)}) \cdot (h1-borrow^{(j)}) = 0 \text{ for } j = 1, 2, 3
              (h2-carry^{(j)}) \cdot (h2-borrow^{(j)}) = 0 \text{ for } j = 1, 2, 3
             (h3-carry^{(j)}) \cdot (h3-borrow^{(j)}) = 0 \text{ for } j = 1, 2, 3
```

In addition, we also need to enforce the constraints on the limbs associated with regi-ts-prev and regi-ts-prev-aux for i = 1, 2, 3 so that they satisfy the following conditions:

```
• reg1-ts-prev-aux = reg1-ts-cur -1 - reg1-ts-prev
• reg2-ts-prev-aux = reg2-ts-cur -1 - reg2-ts-prev
• reg3-ts-prev-aux = reg3-ts-cur -1 - reg3-ts-prev
```

Arithmetic constraints for reg1-ts-prev-aux, reg2-ts-prev-aux, reg3-ts-prev-aux

```
\# Enforcing reg1-ts-prev-aux = reg1-ts-cur -1 - reg1-ts-prev
/\!\!/ h1-aux-borrow^{(j)} for j=1,2,3,4 used for borrow handling
    reg1-ts-prev-aux^{(1)} + reg1-ts-prev^{(1)} + 1
                                                                                        = \mathtt{reg1\text{-}ts\text{-}cur}^{(1)} + \mathtt{h1\text{-}aux\text{-}borrow}^{(1)} \cdot 2^8
    \texttt{reg1-ts-prev-aux}^{(2)} + \texttt{reg1-ts-prev}^{(2)} + \texttt{h1-aux-borrow}^{(1)} = \texttt{reg1-ts-cur}^{(2)} + \texttt{h1-aux-borrow}^{(2)} \cdot 2^8
    \texttt{reg1-ts-prev-aux}^{(2)} + \texttt{reg1-ts-prev}^{(3)} + \texttt{h1-aux-borrow}^{(2)} = \texttt{reg1-ts-cur}^{(3)} + \texttt{h1-aux-borrow}^{(3)} \cdot 2^8
    \texttt{reg1-ts-prev-aux}^{(2)} + \texttt{reg1-ts-prev}^{(4)} + \texttt{h1-aux-borrow}^{(3)} = \texttt{reg1-ts-cur}^{(4)} + \texttt{h1-aux-borrow}^{(4)} \cdot 2^8
\# Enforcing reg2-ts-prev-aux = reg2-ts-cur -1 - reg2-ts-prev
  {\tt h2-aux-borrow}^{(j)} \ {\tt for} \ j=1,2,3,4 \ {\tt used} \ {\tt for} \ {\tt borrow} \ {\tt handling}
    {\tt reg2\text{-}ts\text{-}prev\text{-}aux}^{(1)} + {\tt reg2\text{-}ts\text{-}prev}^{(1)} + 1
                                                                                        = \mathtt{reg2\text{-}ts\text{-}cur}^{(1)} + \mathtt{h2\text{-}aux\text{-}borrow}^{(1)} \cdot 2^8
    \texttt{reg2-ts-prev-aux}^{(2)} + \texttt{reg2-ts-prev}^{(2)} + \texttt{h2-aux-borrow}^{(1)} = \texttt{reg2-ts-cur}^{(2)} + \texttt{h2-aux-borrow}^{(2)} \cdot 2^8
    \texttt{reg2-ts-prev-aux}^{(2)} + \texttt{reg2-ts-prev}^{(3)} + \texttt{h2-aux-borrow}^{(2)} = \texttt{reg2-ts-cur}^{(3)} + \texttt{h2-aux-borrow}^{(3)} \cdot 2^8
    \texttt{reg2-ts-prev-aux}^{(2)} + \texttt{reg2-ts-prev}^{(4)} + \texttt{h2-aux-borrow}^{(3)} = \texttt{reg2-ts-cur}^{(4)} + \texttt{h2-aux-borrow}^{(4)} \cdot 2^8
/\!\!/ Enforcing reg3-ts-prev-aux = reg3-ts-cur -1 - reg3-ts-prev
// h3-aux-borrow<sup>(j)</sup> for j=1,2,3,4 used for borrow handling
    reg3-ts-prev-aux^{(1)} + reg3-ts-prev^{(1)} + 1
                                                                                     = reg3-ts-cur^{(1)} + h3-aux-borrow^{(1)} \cdot 2^8
    reg3-ts-prev-aux^{(2)} + reg3-ts-prev^{(2)} + h3-aux-borrow^{(1)} = reg3-ts-cur^{(2)} + h3-aux-borrow^{(2)} \cdot \cdot \cdot \cdot \cdot \cdot 28
    \texttt{reg3-ts-prev-aux}^{(2)} + \texttt{reg3-ts-prev}^{(3)} + \texttt{h3-aux-borrow}^{(2)} = \texttt{reg3-ts-cur}^{(3)} + \texttt{h3-aux-borrow}^{(3)} \cdot 2^8
    \texttt{reg3-ts-prev-aux}^{(2)} + \texttt{reg3-ts-prev}^{(4)} + \texttt{h3-aux-borrow}^{(3)} = \texttt{reg3-ts-cur}^{(4)} + \texttt{h3-aux-borrow}^{(4)} \cdot 2^8
// Enforcing ranges for the borrows \in \{0,1\} - last borrow must be 0
    (h1-aux-borrow^{(j)}) \cdot (1-h1-aux-borrow^{(j)}) = 0 for j = 1, 2, 3
    (\text{h2-aux-borrow}^{(j)}) \cdot (1 - \text{h2-aux-borrow}^{(j)}) = 0 \text{ for } j = 1, 2, 3
    (\mathtt{h3-aux-borrow}^{(j)}) \cdot (1-\mathtt{h3-aux-borrow}^{(j)}) = 0 \text{ for } j=1,2,3
    (h1-aux-borrow^{(4)})=0
    (h2-aux-borrow^{(4)})=0
    (\texttt{h3-aux-borrow}^{(4)}) = 0
```

#### 5.6.3 Logup computations

When adapting the logup argument to small fields, the components reg-val and reg-ts are each split into four 8-bit limb values reg-val<sup>(j)</sup> and reg-ts<sup>(j)</sup> for j = 1, 2, 3, 4.

Let fp be a fingerprint function which takes as input a tuple  $(reg-addr, reg-val, reg-ts) := (reg-addr, reg-val^{(1)}, ..., reg-val^{(4)}, reg-ts^{(1)}, ..., reg-ts^{(4)})$  and returns a field element in the secure extension field qm31 using a value  $\beta$  chosen by the verifier.

Like in the large field case, one way to achieve this goal is to compute a random linear combination of the elements in the tuple using consecutive powers  $\beta^0, \beta^1, \ldots, \beta^8$  as the coefficients. That is,  $fp(reg-addr, reg-val^{(1)}, \ldots, reg-val^{(4)}, reg-ts^{(1)}, \ldots, reg-ts^{(4)})$  simply returns  $reg-addr \cdot \beta^0 + reg-val^{(1)} \cdot \beta^1 + \ldots + reg-ts^{(4)} \cdot \beta^8$ .

As before, let reg1-accessed[i], reg2-accessed[i], reg3-accessed[i] be flags that indicate, respectively, whether the set of elements (regj-addr[i], regj-val-cur[i], regj-ts-cur[i], regj-val-prev[i], regj-ts-prev[i]) for j=1,2,3 are being accessed during the i-th clock cycle. Using fp and these flags, we can compute the difference between the read set and write set digests between rows i-1 and i as follows:

```
• reg-read-digest[i] - reg-read-digest[i-1] = reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-prev[i], reg1-ts-prev[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-prev[i], reg2-ts-prev[i]) + \alpha) + reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-prev[i], reg3-ts-prev[i]) + \alpha)
```

```
• reg-write-digest[i] - reg-write-digest[i-1] = reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-cur[i], reg1-ts-cur[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-cur[i], reg2-ts-cur[i]) + \alpha) + reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-cur[i], reg3-ts-cur[i]) + \alpha).
```

Initial write set digest: Let reg-write-init-digest be the logup sum for the initial state of the register memory (see Section 5.3). That is,

$$\texttt{reg-write-init-digest} = \sum_{\texttt{reg-addr} \in \{0,\dots,31\}} \frac{1}{\texttt{fp}(\texttt{reg-addr}, \mathbf{0}, \mathbf{0}) + \alpha},$$

where  $\mathbf{0} = (0, 0, 0, 0)$ .

Final read set digest: Let reg-val-final(reg-addr) denote the last value written to register reg-addr  $\in \{0, ..., 31\}$  and let reg-ts-final(reg-addr) denote the corresponding timestamp. Let reg-read-final-digest denote the logup sum for the final state of memory. That is, reg-read-final-digest is equal to

$$\sum_{\substack{\mathtt{reg-addr} \in \{0,\dots,31\}}} \frac{1}{\mathsf{fp}(\mathtt{reg-addr},\mathtt{reg-val-final}(\mathtt{reg-addr}),\mathtt{reg-ts-final}(\mathtt{reg-addr})) + \alpha}.$$

Boundary constraints: Let reg-write-init-digest and reg-read-final-digest be be as defined above. The boundary constraints can then be specified as follows:

```
• reg-read-digest[0] = reg1-accessed[0]/(fp(reg1-addr[0], reg1-val-prev[0], reg1-ts-prev[0]) + \alpha) + reg2-accessed[0]/(fp(reg2-addr[0], reg2-val-prev[0], reg2-ts-prev[0]) + \alpha) + reg3-accessed[0]/(fp(reg3-addr[0], reg3-val-prev[0], reg3-ts-prev[0]) + \alpha). • reg-write-digest[0] = reg-write-init-digest + reg1-accessed[0]/(fp(reg1-addr[0], reg1-val-cur[0], reg1-ts-cur[0]) + \alpha) + reg2-accessed[0]/(fp(reg2-addr[0], reg2-val-cur[0], reg2-ts-cur[0]) + \alpha) + reg3-accessed[0]/(fp(reg3-addr[0], reg3-val-cur[0], reg3-ts-cur[0]) + \alpha) • reg-read-digest[n] = reg-read-final-digest + reg-write-digest[n]
```

**Transition constraints**  $(0 < i \le n)$ : The transition constraints can be specified as follows:

```
• reg-read-digest[i] - reg-read-digest[i-1] = reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-prev[i], reg1-ts-prev[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-prev[i], reg2-ts-prev[i]) + \alpha) + reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-prev[i], reg3-ts-prev[i]) + \alpha). • reg-write-digest[i] - reg-write-digest[i] - reg1-accessed[i]/(fp(reg1-addr[i], reg1-val-cur[i], reg1-ts-cur[i]) + \alpha) + reg2-accessed[i]/(fp(reg2-addr[i], reg2-val-cur[i], reg2-ts-cur[i]) + \alpha) +
```

reg3-accessed[i]/(fp(reg3-addr[i], reg3-val-cur[i], reg3-ts-cur[i]) +  $\alpha$ ),

where regj-accessed[i] for j=1,2,3 are the three flags that indicate whether the set of trace elements (regj-addr, regj-val-cur, regj-ts-cur, regj-val-prev, regj-ts-prev) are being accessed during the i-th clock cycle.

### 5.7 Constraints and logup computation for initial write and final read sets

In order to help compute the logup contributions related to the initial write and final read sets mentioned in Section 5.6.3, we define additional trace columns and constraints for the register memory component.

#### 5.7.1 Trace elements for initial write and final read sets

In addition to the trace elements defined in Section 5.2, this section defines additional elements to help the verifier in the calculation of the initial write set and the final read set.

- reg-init-final-addr: register addresses used during the execution of the program
- reg-val-final: the final value stored at register reg-init-final-addr at the end of the execution
- reg-ts-final: the timestamp associated with the last access to register reg-init-final-addr.
- is-reg-addr: a flag that indicates whether row  $i \in \{0, ..., 31\}$ .
- row-index: a column that contains the index of the row.

#### Remarks

- reg-init-final-addr includes all the register addresses accessed during the execution.
- All registers addresses  $\{0, \dots, 31\}$  are initialized with  $\mathbf{0} = (0, 0, 0, 0)$ .
- Both is-reg-addr and row-index contain values that are known to the verifier, so we do not need to constrain them.

### 5.7.2 Constraints for register address values

Since the number of reg-init-final-addr addresses is small, the simplest way to account for all the addresses needed for the logup computations is to set the value of reg-init-final-addr to be equal to row-index and to ignore its value after row 31.

```
\label{eq:constraints} \begin{array}{l} \mbox{$\#$ Enforcing reg-init-final-addr = row-index} \\ \bullet & (\text{reg-init-final-addr} - \text{row-index}) = 0 \\ \mbox{$\#$ reg-val-final}^{(j)} \in \left[0, 2^8 - 1\right] \text{ for } j = 1, 2, 3, 4 \text{ - Implied by range checks during memory checking } \\ \mbox{$\#$ reg-ts-final}^{(j)} \in \left[0, 2^8 - 1\right] \text{ for } j = 1, 2, 3, 4 \text{ - Does not need to be reinforced} \\ \mbox{$\#$ reg-init-final-addr} \in \{0, \dots, 31\} \text{ - Implied by constraints} \end{array}
```

#### 5.7.3 Logup computation for initial write and final read sets

In order to compute the logup sums for the initial and final states of the register memory, as described in Section 5.6.3, the prover creates two additional columns:

- reg-write-init-digest: a digest column used to compute the logup sum for the initial state of the register memory;
- reg-read-final-digest: a digest column used to compute the logup sum for the final state of the register memory.

**Boundary constraints**: Let  $\mathbf{0} = (0, 0, 0, 0)$ . The boundary constraints are as follows:

```
• reg-write-init-digest[0] = is-reg-addr[0]/(fp(reg-init-final-addr[0], \mathbf{0}, \mathbf{0}) + \alpha) • reg-read-final-digest[0] = is-reg-addr[0]/(fp(reg-init-final-addr[0], reg-val-final[0], reg-ts-final[0]) + \alpha)
```

**Transition constraints**  $(0 < i \le n)$ : Let  $\mathbf{0} = (0, 0, 0, 0)$ . The transition constraints are as follows:

```
• reg-write-init-digest[i] - reg-write-init-digest[i-1] = is-reg-addr[i]/(fp(reg-init-final-addr[i], \mathbf{0}, \mathbf{0}) + \alpha)
• reg-write-digest[i] - reg-write-digest[i-1] = is-reg-addr[i]/(fp(reg-init-final-addr[i], reg-val-final[i], reg-ts-final[i]) + \alpha)
```

# 6 Program memory component

The program memory component is responsible for managing access to the read-only program memory. Since this is a read-only memory, this component only requires a simplified version of the offline memory checking technique described in Section 3.2 to guarantee the consistency of read operations. More precisely, the algorithm for performing a read described in Section 3.2 can be simplified so that the timestamp update step is replaced by a counter update step which simply increments the counter at every read.

As a result, instead of keeping a timestamp for each memory cell, each address in the memory will have a counter associated with it indicating the number of times that this memory location has been read. Notice that write sets are still needed to keep track of the number of times a program memory location has been read.

As for other memory types, we make use of logups to check the consistency between the read and write sets, where each element of the set has the form (pc, instr-val, prog-ctr). This indicates that the value instr-val at address pc has been read prog-ctr times.

Like the data memory component in Section 7, the program memory component is byte addressable. However, since the instructions are 32-bits long, 4 consecutive memory locations are read in each clock cycle. As a result, the program memory is also assumed to be word aligned, with the base address for the first memory location being read always being a multiple of 4. For more details, see Section 6.5.

**Convention**: The <u>base address</u> of an instruction refers to the address of the first byte of the 32-bit program instruction. Base addresses are word-aligned (that is, a multiple of 4). The contents of the pc register always point to the base address of an instruction.

**Section outline**: The remainder of this section is organized as follows:

- Section 6.1 defines the main trace elements used by the program memory component.
- Section 6.2 provides a quick overview of the operations supported by the program memory and how to enforce their consistency using logups.
- Section 6.3 describes how the contents of the program memory are initialized.
- Section 6.4 defines the interface that other components can use to read from the program memory.
- Section 6.5 briefly discusses issues related to program memory alignment and addressing.
- Sections 6.6 and 6.7 describe the constraints for the program memory component assuming, respectively, large and small fields. This includes in particular range checks (Sections 6.6.1 and 6.7.1), arithmetic constraints (Sections 6.6.2 and 6.7.2), and logup computations (Sections 6.6.3 and 6.7.3).
- Section 6.8 specifies additional constraints used to compute the logup contributions related to the initial and final state of the program memory.

### 6.1 Program memory trace elements

As a result, we will have the following base set of trace elements:

- pc: the word-aligned base address associated with a program instruction
- instr-val<sup>(1)</sup>: bits 0-7 of the instruction word instr-val stored at address pc
- instr-val<sup>(2)</sup>: bits 8-15 of the instruction word instr-val stored at address pc + 1
- $instr-val^{(3)}$ : bits 16-23 of the instruction word instr-val stored at address pc + 2
- instr-val $^{(4)}$ : bits 24-31 of the instruction word instr-val stored at address pc + 3
- prog-ctr-prev: the previous counter value associated with base address pc
- prog-ctr-cur: the current counter value associated with base address pc
- prog-read-digest: a digest of the read set, used for logups

• prog-write-digest: a digest of the write set, used for logup

Remark 6.1 • instr-val = (instr-val<sup>(1)</sup>, instr-val<sup>(2)</sup>, instr-val<sup>(3)</sup>, instr-val<sup>(4)</sup>) refers to the actual 32-bit word of the instruction stored at address pc

- prog-ctr-prev and prog-ctr-cur are shared across the limbs instr-val<sup>(j)</sup> of the instruction instr-val, where j = 1, 2, 3, 4, because these 4 byte values are always read simultaneously.
- Since the 4 byte values are always read simultaneously, we will use the 32-bit word instr-val when computing the read set digest during the logup computation. That is, each element of the read set is of the form (pc, instr-val, prog-ctr) where instr-val = (instr-val<sup>(1)</sup>, instr-val<sup>(2)</sup>, instr-val<sup>(3)</sup>, instr-val<sup>(4)</sup>). The prover commits to pairs (pc, instr-val) for the whole program in a separate trace, and sends a deterministic commitment of those to the verifier.
- Somewhere in the execution trace, the prover commits to prog-ctr-final for every program instruction instr-val accessed during the execution, where prog-ctr-final denotes the total number of times the instruction instr-val has been executed. The value prog-ctr-final should be committed to the same row as (pc, instr-val) on the program memory trace.
- The initial counters can be omitted because they are known to be zero.

#### Counter update:

- Initially, the counters associated with the base address of each instruction will be 0 and the contents of these addresses will be initialized with the program that is being executed.
- The counter associated with a given base address will be incremented by one every time that
  base address is accessed.
- The value of the current program counter prog-ctr-cur should always be equal to prog-ctr-prev + 1 for any given base address.

### 6.2 Program memory read operations

Let  $instr-val = (instr-val^{(1)}, instr-val^{(2)}, instr-val^{(3)}, instr-val^{(4)})$  be the instruction word stored at an base address pc and let prog-ctr-prev be the latest value of the counter associated with this base address (that is, the number of times the instruction instr-val at the base address pc has been accessed). Whenever a read operation takes place at an address, the current counter prog-ctr-cur associated with the base address pc must be set to prog-ctr-prev+1.

**Read operation** Let pc be the word-aligned base address used to access the instruction. In order to read the contents of the 4 bytes (instr-val<sup>(1)</sup>,...,instr-val<sup>(4)</sup>) of the instruction instr-val stored at addresses [pc, pc + 3], we update the read and write sets as follows:

prog-read-set = prog-read-set ∪ {(pc, instr-val, prog-ctr-prev)}
 prog-write-set = prog-write-set ∪ {(pc, instr-val, prog-ctr-cur)}

### 6.3 Program memory initialization

Initially, the counters associated with each word-aligned base address will be 0 and the memory locations will be initialized with the contents of the program being executed.

Let fp be a fingerprint function which takes as input the tuple (pc, instr-val, prog-ctr) and returns a field element in the secure extension field qm31 using a value  $\beta$  chosen by the verifier.

Let PROG-SET denote the set of word-aligned base addresses used by the input program and let  $instr-val := (instr-val^{(1)}, ..., instr-val^{(4)})$  denote the instruction byte values stored respectively at memory locations [pc, pc + 3]. The corresponding digest for the initial write set will be

$$\texttt{prog-write-init-digest} = \sum_{\texttt{pc} \in \mathsf{PROG-SET}} \frac{1}{(\mathsf{fp}(\mathsf{pc}, \mathsf{instr-val}, 0) + \alpha)}.$$

In other words, even though the memory is byte-addressable, the computation of the fingerprint function is performed as if the memory was word-addressable since only addresses at 4-byte boundaries are used. This is possible because the base address pc is expected to be a multiple of 4 and each access to the program memory will read the 4 consecutive bytes stored at pc.

The case where pc and prog-ctr-prev are represented by 8-bit limbs is treated internally by the fingerprint function fp, as described in more detail below.

### 6.4 Program memory interface

In order to clarify the interaction between the read-only program memory and other components, we now define the interface used for reading from the program memory.

• Read<sub>Prog</sub>(pc) → instr-val: the program memory returns the 32-bit value instr-val = (instr-val<sup>(1)</sup>,...,instr-val<sup>(4)</sup>) stored at base address pc, updating the counter associated with pc as needed.

#### Remarks

- As in the register memory case, the interface above considers each input as a single element. However, each of these values will be specified by a set of limbs.
  - pc will be specified by 4 limbs  $(pc^{(1)}, \dots, pc^{(4)})$
  - instr-val will be specified by 4 limbs (instr-val $^{(1)}, \dots,$  instr-val $^{(4)}$ )
- The base address pc used to access the program memory is assumed to be word-aligned (that is, a multiple of 4), as discussed in the next subsection. This requirement is enforced by the CPU component.
- Exactly 1 instruction word instr-val := (instr-val<sup>(1)</sup>,...,instr-val<sup>(4)</sup>) is accessed in a clock cycle. Hence, each non-padding row will contain an access to the program memory.
- When performing logup computations for program memory accesses, prog-accessed := (1 is-pad) will be used as a flag when accounting for logup contributions.

### 6.5 Program memory alignment and addressing

The program memory of the Nexus virtual machine follows the RV32I instruction set and is byte addressable. However, since instructions are 4 bytes in length, the base address used to access the program memory has to be a multiple of 4 or else a memory misalignment exception will be raised.

Remark 6.2 Remark about misalignments: According to the RISC-V specification [RIS19, Page 15], "in the base ISA, there are four core instruction formats (R/I/S/U), as shown in Figure 2.2. All are a fixed 32 bits in length and must be aligned on a four-byte boundary in memory. An instruction address misaligned exception is generated on a taken branch or unconditional jump if the target address is not four-byte aligned. No instruction fetch misaligned exception is generated for a conditional branch that is not taken."

Given the four-byte alignment requirement, this component presupposes that the base address provided through the interface satisfies this condition. This condition is enforced by the CPU component.

### 6.6 Program memory constraints assuming large fields

#### 6.6.1 Range checks

```
/\!\!/ pc \in \left[0,2^{32}-1\right] - guaranteed via memory checking /\!\!/ Enforcing instr-val ^{(j)} \in \left[0,2^8-1\right] for j=1,2,3,4
```

```
• instr-val^{(1)} \in [0, 2^8 - 1]

• instr-val^{(2)} \in [0, 2^8 - 1]

• instr-val^{(3)} \in [0, 2^8 - 1]

• instr-val^{(4)} \in [0, 2^8 - 1]

# Enforcing range check for prog-ctr-prev

• prog-ctr-prev \in [0, 2^{32} - 1]

# Enforcing range check for prog-ctr-cur

• prog-ctr-cur \in [0, 2^{32} - 1]
```

#### 6.6.2 Arithmetic constraints

#### 6.6.3 Logup computations

Since we assume that only one instruction gets executed in a clock cycle, there will be exactly 4 program memory accesses in each cycle, in order to read the 4 bytes of the program instruction word.

Let fp be a fingerprint function which takes as input the tuple (pc,instr-val,prog-ctr) and returns a field element in the secure extension field qm31 using a value  $\beta$  chosen by the verifier, where instr-val:=(instr-val<sup>(1)</sup>,...,instr-val<sup>(4)</sup>).

Moreover, let PROG-SET denote the set of word-aligned base addresses used by the input program.

Initial write set digest: Let prog-write-init-digest be the logup sum for the initial state of the program memory (see Section 6.3). That is,

$$\texttt{prog-write-init-digest} = \sum_{\texttt{pc} \in \mathsf{PROG-SET}} \frac{1}{(\mathsf{fp}(\texttt{pc}, \texttt{instr-val}, 0) + \alpha)},$$

where  $instr-val = (instr-val^{(1)}, ..., instr-val^{(4)})$  denotes, respectively, the byte values of the instruction stored at memory locations [pc, pc + 3].

Final read set digest: Let prog-read-final-digest be the logup sum for the final state of the program memory. That is,

$$\texttt{prog-read-final-digest} = \sum_{\texttt{pc} \in \mathsf{PROG-SET}} \frac{1}{(\mathsf{fp}(\texttt{pc}, \texttt{instr-val}, \texttt{prog-ctr-final}) + \alpha)},$$

where prog-ctr-final denotes the final value of the counter associated with the base address pc when the latter is last accessed and  $instr-val = (instr-val^{(1)}, ..., instr-val^{(4)})$  denotes the instruction byte values stored, respectively, at memory locations [pc, pc + 3].

Difference between read and write set digests: Let pc[i] be the word-aligned base address for the instruction (instr-val<sup>(1)</sup>[i], instr-val<sup>(2)</sup>[i], instr-val<sup>(3)</sup>[i], instr-val<sup>(4)</sup>[i]) accessed in row i. Let prog-ctr-prev[i] and prog-ctr-prev[i] be the corresponding previous and current counters associated with the base address pc[i]. The difference between the digests of the read set and the write set between rows i-1 and i can be computed as follows:

$$\begin{array}{l} \bullet \ \ \operatorname{prog-read-digest}[i] - \operatorname{prog-read-digest}[i-1] = \\ \hline \frac{1}{(\operatorname{fp}(\operatorname{pc}[i],\operatorname{instr-val}[i],\operatorname{prog-ctr-prev}[i]) + \alpha)} \end{array}$$

```
 \bullet \ \operatorname{prog-write-digest}[i] - \operatorname{prog-write-digest}[i-1] =
      \frac{1}{(\mathsf{fp}(\mathsf{pc}[i],\mathsf{instr-val}[i],\mathsf{prog-ctr-cur}[i]) + \alpha)}
```

Boundary constraints: Let prog-write-init-digest and prog-read-final-digest be as defined above. The boundary constraints can then be specified as follows:

- $\bullet \ \ \mathsf{prog\text{-}read\text{-}digest}[0] = \tfrac{1}{(\mathsf{fp}(\mathsf{pc}[0],\mathsf{instr\text{-}val}[0],\mathsf{prog\text{-}ctr\text{-}prev}[0]) + \alpha)}$  $\bullet \ \ \mathsf{prog\text{-}write-digest}[0] = \mathsf{prog\text{-}write-init-digest} + \tfrac{1}{(\mathsf{fp}(\mathsf{pc}[0],\mathsf{instr\text{-}val}[0],\mathsf{prog\text{-}ctr\text{-}cur}[0]) + \alpha)}$
- prog-read-digest[n] = prog-read-final-digest + prog-write-digest[n].

**Transition constraints**  $(0 < i \le n)$ : The transition constraints can be specified as follows:

- $\bullet \ \operatorname{prog-read-digest}[i] \operatorname{prog-read-digest}[i-1] = \tfrac{1}{(\operatorname{fp}(\operatorname{pc}[i],\operatorname{instr-val}[i],\operatorname{prog-ctr-prev}[i]) + \alpha)}$
- $\bullet \ \ \mathsf{prog-write-digest}[i] \mathsf{prog-write-digest}[i-1] = \tfrac{1}{(\mathsf{fp}(\mathsf{pc}[i],\mathsf{instr-val}[i],\mathsf{prog-ctr-cur}[i]) + \alpha)}.$

### Program memory constraints assuming small fields

### 6.7.1 Range checks

```
/\!\!/\operatorname{pc}^{(1)} \in \left[0, 2^8 - 1
ight] - guaranteed via memory checking
/\!\!/ pc^{(2)}\in\left[0,2^{8}-1
ight] - guaranteed via memory checking
/\!\!/\operatorname{pc}^{(3)} \in \left[0, 2^8 - 1
ight] - guaranteed via memory checking
/\!\!/\operatorname{pc}^{(4)} \in \left[0, 2^8 - 1
ight] - guaranteed via memory checking
/\!\!/ Enforcing instr-val^{(j)} \in \left[0,2^8-1\right] for j=1,2,3,4
• \operatorname{instr-val}^{(1)} \in \left[0, 2^8 - 1\right]
• instr-val^{(2)} \in [0, 2^8 - 1]
• instr-val^{(3)} \in [0, 2^8 - 1]
```

# Enforcing range check for prog-ctr-prev

 $prog-ctr-prev^{(1)} \in \left[0, 2^8 - 1\right]$ 

• instr-val $^{(4)} \in [0, 2^8 - 1]$ 

- $prog-ctr-prev^{(2)} \in [0, 2^8 1]$
- prog-ctr-prev<sup>(3)</sup>  $\in [0, 2^8 1]$
- $exttt{prog-ctr-prev}^{(4)} \in \left[0, 2^8 1\right]$

 $\slash\hspace{-0.4em}/\hspace{-0.4em}/\hspace{-0.4em}$  Enforcing range check for prog-ctr-cur

- $\bullet \quad \operatorname{prog-ctr-cur}^{(1)} \in \left\lceil 0, 2^8 1 \right\rceil$
- prog-ctr-cur $^{(2)} \in [0, 2^8 1]$
- prog-ctr-cur<sup>(3)</sup>  $\in [0, 2^8 1]$
- $prog-ctr-cur^{(4)} \in \left[0, 2^8 1\right]$

#### 6.7.2 Arithmetic constraints

```
/\!\!/ Enforcing prog-ctr-cur = prog-ctr-prev + 1
// prog-ctr-carry used for carry handling
• prog-ctr-prev^{(1)} + 1 - prog-ctr-carry^{(1)} \cdot 2^8 - prog-ctr-cur^{(1)}
 \bullet \quad \texttt{prog-ctr-prev}^{(2)} + \texttt{prog-ctr-carry}^{(1)} - \texttt{prog-ctr-carry}^{(2)} \cdot 2^8 - \texttt{prog-ctr-cur}^{(2)} \\
  \texttt{prog-ctr-prev}^{(3)} + \texttt{prog-ctr-carry}^{(2)} - \texttt{prog-ctr-carry}^{(3)} \cdot 2^8 - \texttt{prog-ctr-cur}^{(3)}
• prog-ctr-prev^{(4)} + prog-ctr-carry^{(3)} - prog-ctr-carry^{(4)} \cdot 2^8 - prog-ctr-cur^{(4)}
/\!\!/ Enforcing prog-ctr-carry^{(j)} \in \{0,1\} for j=1,2,3,4
\bullet \quad (\texttt{prog-ctr-carry}^{(1)}) \cdot (1 - \texttt{prog-ctr-carry}^{(1)}) = 0
• (prog-ctr-carry<sup>(2)</sup>) · (1 – prog-ctr-carry<sup>(2)</sup>) = 0
• (prog-ctr-carry<sup>(3)</sup>) · (1 – prog-ctr-carry<sup>(3)</sup>) = 0
• (prog-ctr-carry^{(4)}) \cdot (1 - prog-ctr-carry^{(4)}) = 0
```

#### 6.7.3Logup computations

In the small field case, the components pc and prog-ctr-prev are specified by four 8-bit limb values  $pc^{(j)}$  and prog-ctr-prev<sup>(j)</sup> for j = 1, 2, 3, 4.

Let fp be a fingerprint function which takes as input the tuple  $(pc^{(1)}, \dots, pc^{(4)}, instr-val^{(1)}, \dots,$ instr-val<sup>(4)</sup>, prog-ctr<sup>(1)</sup>,..., prog-ctr<sup>(4)</sup>) and returns a field element in the secure extension field qm31 using a value  $\beta$  chosen by the verifier. To simplify the notation and avoid explicitly describing each 8-bit value, we use

to denote

$$\mathsf{fp}(\mathsf{pc}^{(1)},\ldots,\mathsf{pc}^{(4)},\mathsf{instr-val}^{(1)},\ldots,\mathsf{instr-val}^{(4)},\mathsf{prog-ctr}^{(1)},\ldots,\mathsf{prog-ctr}^{(4)}).$$

Using fp, we can compute the difference between the read set and write set digests between rows i-1 and i as follows:

- prog-read-digest[i] prog-read-digest[i-1] =  $\frac{1}{(\mathsf{fp}(\mathsf{pc}[i],\mathsf{instr-val}[i],\mathsf{prog-ctr-prev}[i]) + \alpha)}$  prog-write-digest[i] prog-write-digest[i-1] =  $\frac{1}{(\mathsf{fp}(\mathsf{pc}[i],\mathsf{instr-val}[i],\mathsf{prog-ctr-cur}[i]) + \alpha)}$

Initial write set digest: Let prog-write-init-digest be the logup sum for the initial state of the program memory. That is,

$$\texttt{prog-write-init-digest} = \sum_{\texttt{pc} \in \mathsf{PROG-SET}} \frac{1}{(\mathsf{fp}(\texttt{pc}, \texttt{instr-val}, \mathbf{0}) + \alpha)},$$

where

- $pc = (pc^{(1)}, pc^{(2)}, pc^{(3)}, pc^{(4)})$
- instr-val = (instr-val<sup>(1)</sup>, instr-val<sup>(2)</sup>, instr-val<sup>(3)</sup>, instr-val<sup>(4)</sup>) denotes the instruction byte values stored respectively at memory locations  $[\mathtt{pc},\mathtt{pc}+3]$
- $\mathbf{0} = (0, 0, 0, 0)$

Final read set digest: Let prog-read-final-digest be the logup sum for the final state of the program memory. That is,

$$\texttt{prog-read-final-digest} = \sum_{\texttt{pc} \in \mathsf{PROG-SET}} \frac{1}{(\mathsf{fp}(\texttt{pc}, \texttt{instr-val}, \texttt{prog-ctr-final}) + \alpha)},$$

where

- prog-ctr-final denotes the 4 limbs of the final value of the counter associated with the base address pc; and
- instr-val = (instr-val<sup>(1)</sup>, instr-val<sup>(2)</sup>, instr-val<sup>(3)</sup>, instr-val<sup>(4)</sup>) denotes the four-byte instruction value stored at address pc.

Difference between read and write set digests: Let pc[i] be the word-aligned base address for the instruction (instr-val<sup>(1)</sup>[i], instr-val<sup>(2)</sup>[i], instr-val<sup>(3)</sup>[i], instr-val<sup>(4)</sup>[i]) being accessed at row i. Let prog-ctr-prev[i] and prog-ctr-cur[i] be the corresponding previous and current counters associated with the base address pc[i]. The difference between the read set and write set digests between rows i-1 and i can be computed as follows:

• prog-read-digest[
$$i$$
] - prog-read-digest[ $i-1$ ] =  $1/(fp(pc[i], instr-val[i], prog-ctr-prev[i]) +  $\alpha)$$ 

```
• prog-write-digest[i] - prog-write-digest[i-1] = 1/(\operatorname{fp}(\operatorname{pc}[i],\operatorname{instr-val}[i],\operatorname{prog-ctr-cur}[i]) + \alpha)
```

Boundary constraints: Let prog-write-init-digest and prog-read-final-digest be as defined above. The boundary constraints can then be specified as follows:

- $\begin{array}{l} \bullet \ \ \operatorname{prog-read-digest}[0] = \frac{1}{(\operatorname{fp}(\operatorname{pc}[0],\operatorname{instr-val}[0],\operatorname{prog-ctr-prev}[0]) + \alpha)} \\ \bullet \ \ \operatorname{prog-write-digest}[0] = \operatorname{prog-write-init-digest} + \frac{1}{(\operatorname{fp}(\operatorname{pc}[0],\operatorname{instr-val}[0],\operatorname{prog-ctr-cur}[0]) + \alpha)} \\ \end{array}$
- $\bullet \ \mathtt{prog-read-digest}[n] = \mathtt{prog-read-final-digest} + \mathtt{prog-write-digest}[n]. \\$

**Transition constraints**  $(0 < i \le n)$ : The transition constraints can be specified as follows:

 $\begin{array}{l} \bullet \ \ \mathsf{prog\text{-}read\text{-}digest}[i] - \mathsf{prog\text{-}read\text{-}digest}[i-1] = \frac{1}{(\mathsf{fp}(\mathsf{pc}[i],\mathsf{instr\text{-}val}[i],\mathsf{prog\text{-}ctr\text{-}prev}[i]) + \alpha)} \\ \bullet \ \ \mathsf{prog\text{-}write\text{-}digest}[i] - \mathsf{prog\text{-}write\text{-}digest}[i-1] = \frac{1}{(\mathsf{fp}(\mathsf{pc}[i],\mathsf{instr\text{-}val}[i],\mathsf{prog\text{-}ctr\text{-}cur}[i]) + \alpha)}. \end{array}$ 

### 6.8 Constraints and logup computation for initial write and final read sets

In order to help compute the logup contributions related to the initial write set and the final read set mentioned in Section 6.7.3, we define additional trace columns and constraints for the program memory component.

#### 6.8.1 Trace elements for initial write and final read sets

In addition to the trace elements defined in Section 6.1, this section defines additional elements to help with the computation of the initial write set and the final read set.

- prog-init-base-addr: the memory address given for each 4-byte instruction in the program memory ever touched.
- prog-val-init: the 4-byte instruction word stored at address prog-init-base-addr.
- prog-ctr-final: the counter associated with the last access to address prog-init-base-addr.
- prog-init-flag: a flag indicating whether prog-val-init, prog-ctr-final columns on the current row are being used.

#### Remarks

- prog-init-base-addr should include all the program memory base addresses accessed during execution.
- prog-init-base-addr must be a multiple of 4 due to memory alignment requirements.

#### 6.8.2 Constraints for ensuring uniqueness of base address values

To ensure that there are no duplicates whenever prog-init-flag = 1, we constrain the relevant values of prog-init-base-addr to increase strictly monotonically and remain a multiple of 4, although gaps in the address range are allowed. When prog-init-flag = 0, we add constraints to enforce that the value of prog-init-base-addr remains unchanged.

Let i be the index of the current row. The precise guaranteed condition is the following:

- For all rows i, prog-init-base-addr[i] must be a multiple of 4.
- If row i is the last row in the trace (that is, is-last[i] = 1), no additional constraint is enforced.
- If row i is not the last row (that is, is-last[i] = 0), then the following must also be enforced:
  - If the value of prog-init-base-addr[i+1] is being used (that is, prog-init-flag[i+1]=1), then prog-init-base-addr[i+1] should be strictly larger than prog-init-base-addr[i] (that is, prog-init-base-addr[i+1]) prog-init-base-addr[i])

```
- If the value of prog-init-base-addr[i+1] is not being used (that is, prog-init-flag[i+1] = 0), then prog-init-base-addr[i+1] = prog-init-base-addr[i]
```

In order to enforce the above conditions:

- let prog-addr-cur denote the value of prog-init-base-addr in row i;
- let prog-addr-next denote the value of prog-init-base-addr in row i + 1;
- let is-last be a flag that indicates whether row i is the last row in the trace;
- let prog-addr-diff be a helper variable used to compute the difference (prog-addr-cur prog-addr-next);
- let prog-addr-borrow be a helper borrow variable; and
- let prog-init-flag-next denote the value of prog-init-flag in row i+1.

```
/\!\!/ Setting prog-addr-borrow to the borrow value for prog-addr-cur - prog-addr-next
     (\texttt{prog-addr-cur}^{(1)} + \texttt{prog-addr-borrow}^{(1)} \cdot 2^8 - \texttt{prog-addr-next}^{(1)} - \texttt{prog-addr-diff}^{(1)}) = 0
     (\mathsf{prog}\,\mathsf{addr}\,\mathsf{cur}^{(2)} + \mathsf{prog}\,\mathsf{addr}\,\mathsf{borrow}^{(2)} \cdot 2^8 - \mathsf{prog}\,\mathsf{addr}\,\mathsf{next}^{(2)} - \mathsf{prog}\,\mathsf{addr}\,\mathsf{diff}^{(2)} - \mathsf{prog}\,\mathsf{addr}\,\mathsf{borrow}^{(1)}) = 0 \\ (\mathsf{prog}\,\mathsf{addr}\,\mathsf{cur}^{(3)} + \mathsf{prog}\,\mathsf{addr}\,\mathsf{borrow}^{(3)} \cdot 2^8 - \mathsf{prog}\,\mathsf{addr}\,\mathsf{next}^{(3)} - \mathsf{prog}\,\mathsf{addr}\,\mathsf{diff}^{(3)} - \mathsf{prog}\,\mathsf{addr}\,\mathsf{borrow}^{(2)}) = 0 \\ (\mathsf{prog}\,\mathsf{addr}\,\mathsf{cur}^{(4)} + \mathsf{prog}\,\mathsf{addr}\,\mathsf{borrow}^{(4)} \cdot 2^8 - \mathsf{prog}\,\mathsf{addr}\,\mathsf{next}^{(4)} - \mathsf{prog}\,\mathsf{addr}\,\mathsf{diff}^{(4)} - \mathsf{prog}\,\mathsf{addr}\,\mathsf{borrow}^{(3)}) = 0 \\ \end{aligned}
/\!\!/ Enforcing increment if the value prog-init-base-addr in the next row is being used
   (1 - \text{is-last})(\text{prog-init-flag-next}) \cdot (1 - \text{prog-addr-borrow}^{(4)}) = 0
# Enforcing no change if the value prog-init-base-addr in the next row is not being used
   (1-is-last)(1-prog-init-flag-next) \cdot (prog-addr-next^{(1)}-prog-addr-cur^{(1)})=0
     (1-\mathtt{is-last})(1-\mathtt{prog-init-flag-next}) \cdot (\mathtt{prog-addr-next}^{(2)} - \mathtt{prog-addr-cur}^{(2)}) = 0
     (1-\mathtt{is\text{-}last})(1-\mathtt{prog\text{-}init\text{-}flag\text{-}next}) \cdot (\mathtt{prog\text{-}addr\text{-}next}^{(3)} - \mathtt{prog\text{-}addr\text{-}cur}^{(3)}) = 0
     (1-\texttt{is-last})(1-\texttt{prog-init-flag-next}) \cdot (\texttt{prog-addr-next}^{(4)} - \texttt{prog-addr-cur}^{(4)}) = 0
# Enforcing prog-addr-borrow<sup>(j)</sup> \in \{0,1\} for j=1,2,3,4
     (prog-addr-borrow^{(1)}) \cdot (1 - prog-addr-borrow^{(1)}) = 0
      (\operatorname{prog-addr-borrow}^{(2)}) \cdot (1 - \operatorname{prog-addr-borrow}^{(2)}) = 0
     (\text{prog-addr-borrow}^{(3)}) \cdot (1 - \text{prog-addr-borrow}^{(3)}) = 0
     (\operatorname{prog-addr-borrow}^{(4)}) \cdot (1 - \operatorname{prog-addr-borrow}^{(4)}) = 0
# Ensuring that prog-addr-cur is a multiple of 4
• prog-addr-aux \cdot 4 - prog-addr-cur<sup>(1)</sup> = 0
```

#### 6.8.3 Range checks

```
\label{eq:program} \begin{tabular}{ll} \# \ Enforcing \ ranges \ for \ additional \ program \ memory \ variables \\ \# \ prog-val-init^{(j)} \in \left[0,2^8-1\right] \ for \ j=1,2,3,4 \ - \ \ Does \ not \ need \ to \ be \ reinforced \\ \# \ prog-init-base-addr^{(j)} \in \left[0,2^8-1\right] \ for \ j=1,2,3,4 \ - \ \ Does \ not \ need \ to \ be \ reinforced \\ \P \ prog-addr-diff^{(j)} \in \left[0,2^8-1\right] \ for \ j=1,2,3,4 \\ \P \ prog-addr-aux \in \left[0,2^8-1\right] \ for \ j=1,2,3,4 \\ \P \ prog-addr-aux \in \left[0,2^6-1\right] \\ \P \ (prog-init-flag) \cdot (1-prog-init-flag) = 0 \\ \end{tabular}
```

### 6.8.4 Logup computation for initial write and final read sets

In order to compute the logup sums for the initial and final states of the program memory, as described in Section 6.7.3, the prover creates two additional columns:

- prog-write-init-digest: a digest column used to compute the logup sum for the initial state of the program memory;
- prog-read-final-digest: a digest column used to compute the logup sum for the final state of the program memory.

**Boundary constraints**: Let  $\mathbf{0} = (0, 0, 0, 0)$ . The boundary constraints are as follows:

```
• prog-write-init-digest[0] = prog-init-flag[0]/(fp(prog-init-base-addr[0], prog-val-init[0], \mathbf{0}) + \alpha)
• prog-read-final-digest[0] = prog-init-flag[0]/(fp(prog-init-base-addr[0], prog-val-init[0], prog-ctr-final[0]) + \alpha)
```

**Transition constraints**  $(0 < i \le n)$ : Let  $\mathbf{0} = (0, 0, 0, 0)$ . The transition constraints are as follows:

```
• prog-write-init-digest[i] - prog-write-init-digest[i-1] = prog-init-flag[i]/(fp(prog-init-base-addr[i], prog-val-init[i], \mathbf{0}) + \alpha) • prog-write-digest[i] - prog-write-digest[i-1] = prog-init-flag[i]/(fp(prog-init-base-addr[i], prog-val-init[i], prog-ctr-final[i]) + \alpha)
```

# 7 Data memory component

The read-write data memory is responsible for managing access to the RAM. Since this is a read-write memory, each address will have a timestamp associated with it indicating the last time that an address location has been accessed.

As in the case of the register memory, we will treat each memory access as first a read of the previous value and then a write of the current value. When the operation does not change the value, e.g., a load from memory, we write back the same value that was read, but update its timestamp. We will make use of logups to check the consistency between the read and write sets, where each element of the set has the form (ram-addr,ram-val,ram-ts). This indicates that the value ram-val was written to address ram-addr at time ram-ts.

**Section outline**: The remainder of this section is organized as follows:

- Section 7.1 provides a quick overview of the operations supported by the data memory and how to enforce the consistency of these operations using logups.
- Section 7.2 defines the main trace elements used by the data memory component.
- Section 7.3 describes how the contents of the data memory are initialized.
- Section 7.4 defines the interface that other components can use to read and write to the data memory.
- Sections 7.5 and 7.6 describe the constraints for the data memory component assuming, respectively, large and small fields. This includes in particular range checks (Sections 7.5.1 and 7.6.1), arithmetic constraints (Sections 7.5.2 and 7.6.2), and logup computations (Sections 7.5.3 and 7.6.3).
- Section 7.7 specifies additional constraints used to compute the logup contributions related to the initial and final state of the data memory.

#### 7.1 Read and write operations

Let ram-val-prev be the value stored at an address ram-addr and let ram-ts-prev be the timestamp when the value was written to it.

**Read operation** In order to read the contents of a memory address ram-addr at the current time ram-ts-cur, we will update the read and write sets as follows:

```
• ram-read-set = ram-read-set \cup {(ram-addr, ram-val-prev, ram-ts-prev)} • ram-write-set = ram-write-set \cup {(ram-addr, ram-val-prev, ram-ts-cur)}
```

Write operation In order to update the contents of a memory address ram-addr at the current time ram-ts-cur with the value ram-val-cur, we will update the read and write sets as follows:

```
• ram-read-set = ram-read-set \cup \{(ram-addr, ram-val-prev, ram-ts-prev)\}
```

• ram-write-set = ram-write-set  $\cup \{(ram-addr, ram-val-cur, ram-ts-cur)\}$ 

Let  $\alpha$  be a random value chosen by the verifier after the prover commits to the execution trace of the program. In order to convert each triple (ram-addr, ram-val, ram-ts) to a field element so that we can use it in the logup computation, we will use the fingerprint function fp(ram-addr, ram-val, ram-ts). As a result, the logup contribution for the entry (ram-addr, ram-val, ram-ts) will be

$$1/(fp(ram-addr, ram-val, ram-ts) + \alpha).$$

### 7.2 Data memory trace elements

In the case of the data memory, up to four consecutive addresses can be accessed during an execution cycle (for instance, when executing lw or sw). Since each access to the data memory requires us to maintain a set (ram-addr, ram-val-cur, ram-val-prev, ram-ts-prev) to properly handle memory updates related to a particular address, we will have 4 such sets of values.

As a result, we will have the following base set of trace elements:

- clk: the current execution time
- ram-base-addr: memory base address
- ram1-val-cur,...,ram4-val-cur: 8-bit values used to update memory locations
- ram1-val-prev,...,ram4-val-prev: previous 8-bit values stored at each location
- ram1-ts-prev,...,ram4-ts-prev: previous timestamps for each location
- ram1-accessed,...,ram4-accessed: flags indicating whether each address in [ram-base-addr, ram-base-addr + 3] is accessed.
- ram-read-digest: a digest of the read set, used for logups.
- ram-write-digest: a digest of the write set, used for logups.

#### Remarks

- Since the data memory addresses being accessed in a row are consecutive, only the value associated with the first of the four memory locations (the base address ram-base-addr) needs to be maintained. The remaining addresses can be computed from it by adding 1, 2, or 3 to ram-base-addr.
- Due to memory alignment restrictions, adding 1, 2, or 3 to the base address will not cause an overflow so we do not need to worry about carries.
  - For instance, when the function 1h calls the data memory component interface to read two bytes, the base address ram-base-addr will be a multiple of 2 and therefore at most equal to 254. Hence, the address for the second memory location (i.e., ram-base-addr + 1) would not overflow.
  - Although the values of ram-base-addr +i for i=2,3 could overflow (meaning could be larger than 255) when used within the 1h instruction, their logup contributions would be ignored since ram3-accessed and ram4-accessed would be 0 in this case.
  - The same reasoning used for 1h applies to 1hu and sh instructions.
  - For 1b, 1bu, sb instructions, ram-base-addr+i for i=1,2,3 could overflow but their logup contributions would be ignored since ram2-accessed, ram3-accessed, and ram4-accessed would be 0.
  - For lw and sw instructions, ram-base-addr is always 4-aligned, hence ram-base-addr + i for i = 1, 2, 3 would not overflow.
- When updating the write set for the data memory, it suffices to use the current execution time clk as the current timestamp for the memory addresses being accessed. Hence, we do need to explicitly define ram-ts-cur and we can use clk in its place.

# 7.3 Data memory initialization

We assume that the contents of memory locations accessed during the execution of the program are initialized to 0 at time 0, except for public input locations which may contain other values. In particular, this means that the initial write set will contain an entry (ram-addr, 0, 0) for each memory address ram-addr accessed during the execution of the program that is not a public input address. Similarly, for each public input address pub-in-addr being initialized to a value pub-in-val, there will be an entry (pub-in-addr, pub-in-val, 0) in the initial write set.

Let MEM—SET denote the set of locations accessed during the execution of the program, and let PUB—IN—SET  $\subseteq$  MEM—SET denote the subset of public input locations. Moreover, let init-pub-val(pub-in-addr) denote the initial value stored at the public input address pub-in-addr. The corresponding digest for this initial write set will be

$$\begin{split} \text{ram-write-init-digest} &= \sum_{\substack{\text{ram-addr} \in \mathsf{MEM-SET} \backslash \mathsf{PUB-IN-SET}}} \frac{1}{\mathsf{fp}(\mathsf{ram-addr},0,0) + \alpha} \\ &+ \sum_{\substack{\text{pub-in-addr} \in \mathsf{PUB-IN-SET}}} \frac{1}{\mathsf{fp}(\mathsf{pub-in-addr},\mathsf{init-pub-val}(\mathsf{pub-in-addr}),0) + \alpha}. \end{split}$$

### 7.4 Data memory interface

In order to clarify the interaction between the data memory and other components, we define here the interface used to read and write to the data memory.

Currently, we assume that there might be at most 4 memory locations being accessed in a clock cycle. For this reason, the interface also includes an input ram-shift  $\in \{0, 1, 2, 3\}$  that specifies the memory location being accessed.

- Read<sub>RAM</sub>(clk, ram-base-addr, ram-shift) → val: the data memory returns the value val stored at memory location ram-base-addr + ram-shift, updating timestamps according to timestamp clk and shift amount ram-shift.
- Write<sub>RAM</sub>(clk, val, ram-base-addr, ram-shift) → val: the data memory updates the value stored
  at memory location base-addr + ram-shift with the value val, updating timestamps according to
  timestamp clk and shift amount ram-shift. The data memory additionally returns the updated
  value val.

#### Remarks

As in the register memory case, the interface above considers each input as a single element.
 However, when working over small fields, the values clk and ram-base-addr will be specified by a set of limbs.

```
\begin{array}{l} - \ \mathtt{clk} \coloneqq (\mathtt{clk}^{(1)},\mathtt{clk}^{(2)},\mathtt{clk}^{(3)},\mathtt{clk}^{(4)}) \\ - \ \mathtt{ram\text{-}base\text{-}addr} \coloneqq (\mathtt{ram\text{-}base\text{-}addr}^{(1)},\ldots,\mathtt{ram\text{-}base\text{-}addr}^{(4)}) \end{array}
```

- When a memory location is not accessed in a clock cycle, the corresponding flag ram j-accessed for j = 1, 2, 3, 4 should be set to 0 so that the corresponding entry is not taken into account during the logup computation.
- The value ram j-accessed for j = 1, 2, 3, 4 will be set according to the flags used by the other components when they call the data memory interface. More precisely, they are set as follows:
  - ram1-accessed = is-load + is-type-s
  - $\bullet \quad \texttt{ram2-accessed} = \texttt{is-lh} + \texttt{is-lw} + \texttt{is-lhu} + \texttt{is-sh} + \texttt{is-sw}$
  - $\bullet \quad {\tt ram3-accessed} = {\tt is-lw+is-sw} \\$
  - $\bullet \quad \mathtt{ram4-accessed} = \mathtt{is-lw} + \mathtt{is-sw}$

### 7.5 Data memory constraints assuming large fields

In order to properly implement memory checking for the memory locations that are accessed at clock cycle clk, it is important to ensure that the timestamps ramj-ts-prev associated with the previous values ramj-val-prev stored at addresses base-addr+j for j = 1, 2, 3, 4 fall in the range  $\{0, \ldots, clk-1\}$ .

To achieve this goal using 32-bit range checks, we split each range check of the form ram j-ts-prev  $\in \{0, \ldots, clk - 1\}$  for j = 1, 2, 3, 4 into two range checks ram j-ts-prev  $\in [0, 2^{32} - 1]$  and clk - 1 - ram j-ts-prev  $\in [0, 2^{32} - 1]$ . To implement this idea, we first define the auxiliary variables ram j-ts-prev-aux for j = 1, 2, 3, 4 and then create the following set of constraints:

```
• \operatorname{ram} j-ts-prev \in \left[0, 2^{32} - 1\right]
• \operatorname{ram} j-ts-prev-aux \in \left[0, 2^{32} - 1\right]
• \operatorname{ram} j-ts-prev-aux = \operatorname{clk} - 1 - \operatorname{ram} j-ts-prev
```

### 7.5.1 Range checks

```
# ram-base-addr \in [0, 2^{32} - 1] - guaranteed via memory checking
   ram-val1 \in [0, 2^8 - 1]
                                    > Common to instruction execution
   ram-val2 \in [0, 2^8 - 1]
                                    \texttt{ram-val3} \in \left\lceil 0, 2^8 - 1 \right\rceil

ightharpoonup Common to instruction execution
   ram-val4 \in [0, 2^8 - 1]
                                    ram1-val-prev \in \left[0, 2^8 - 1\right]
   ram2-val-prev \in [0, 2^8 - 1]
   ram3-val-prev \in [0, 2^8 - 1]
  ram4-val-prev \in [0, 2^8 - 1]
/\!\!/ ramj-ts-prev \in \{0,\ldots,\mathtt{clk}-1\} for j=1,2,3,4
  \mathtt{ram1-ts-prev} \in \left[0, 2^{32} - 1\right]
   ram2-ts-prev \in [0, 2^{32} - 1]
   ram3-ts-prev \in [0, 2^{32} - 1]
   \texttt{ram4-ts-prev} \in \left\lceil 0, 2^{32} - 1 \right\rceil
   \texttt{ram1-ts-prev-aux} \in \left[0, 2^{32} - 1\right]
   \texttt{ram2-ts-prev-aux} \in \left[0, 2^{32} - 1\right]
   ram3-ts-prev-aux \in [0, 2^{32} - 1]
   \texttt{ram4-ts-prev-aux} \in \left[0, 2^{32} - 1\right]
```

#### 7.5.2 Arithmetic constraints

#### 7.5.3 Logup computations

As stated above, one may access up to 4 consecutive memory locations in a clock cycle depending on the instruction that is being executed. For this reason, we use the four flags ramj-accessed for j = 1, 2, 3, 4 to indicate whether the set of trace elements (ramj-val-cur, ramj-val-prev, ramj-ts-prev) are accessed during the current clock cycle.

In order to compute the difference between the read set and write set digests between rows i-1 and i, we can use these flags as follows:

```
• ram-read-digest[i] - ram-read-digest[i-1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-prev[i], ram1-ts-prev[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-prev[i], ram2-ts-prev[i]) + \alpha) + ram3-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-prev[i], ram3-ts-prev[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-prev[i], ram4-ts-prev[i]) + \alpha) • ram-write-digest[i] - ram-write-digest[i] - 1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-cur[i], clk[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-cur[i], clk[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-cur[i], clk[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-cur[i], clk[i]) + \alpha)
```

Let fp be a fingerprint function which takes as input the tuple (ram-addr, ram-val, ram-ts) and returns a field element in the secure extension field qm31 using a random value  $\beta$  chosen by the verifier. Moreover, let MEM-SET denote the set of data memory addresses used by the program over the course of the execution and let PUB-IN-SET  $\subseteq$  MEM-SET denote its subset of public input locations. Finally, let init-pub-val(pub-in-addr) denote the initial value stored at the public input address pub-in-addr.

Initial write set digest: Let ram-write-init-digest be the logup sum for the initial state of the data memory (see Section 7.3). That is,

$$\begin{split} \text{ram-write-init-digest} &= \sum_{\substack{\text{ram-addr} \in \mathsf{MEM-SET} \backslash \mathsf{PUB-IN-SET}}} \frac{1}{\mathsf{fp}(\mathsf{ram-addr},0,0) + \alpha} \\ &+ \sum_{\substack{\text{pub-in-addr} \in \mathsf{PUB-IN-SET}}} \frac{1}{\mathsf{fp}(\mathsf{pub-in-addr},\mathsf{init-pub-val}(\mathsf{pub-in-addr}),0) + \alpha}. \end{split}$$

Final read set digest: Let ram-read-final-digest be the logup sum for the final state of the data memory. That is,

$$\texttt{ram-read-final-digest} \quad = \quad \sum_{\texttt{ram-addr} \in \mathsf{MEM-SET}} \frac{1}{\mathsf{fp}(\texttt{ram-addr}, \texttt{ram-val}, \texttt{ram-ts-final}) + \alpha},$$

where ram-ts-final denotes the timestamp associated with the last access to address ram-addr and ram-val denotes the corresponding byte value stored at this location.

Difference between read and write set digests: Let ram-base-addr[i] be the base address for the memory locations being accessed at row i. Let  $\mathtt{clk}[i]$  denote the value of  $\mathtt{clk}$  at row i and let  $\mathtt{ram} j$ -val-prev[i] and  $\mathtt{ram} j$ -ts-prev[i] for j=1,2,3,4 denote respectively the previous values and the previous timestamps associated with addresses  $\mathtt{ram-base-addr}[i],\ldots,\mathtt{ram-base-addr}[i]+3$ . The difference between the read set and write set digests between row i-1 and row i can be computed as follows:

```
• ram-read-digest [i] - ram-read-digest [i-1] = ram1-accessed [i] /(fp(ram-base-addr[i], ram1-val-prev[i], ram1-ts-prev[i]) + \alpha) + ram2-accessed [i] /(fp(ram-base-addr[i] + 1, ram2-val-prev[i], ram2-ts-prev[i]) + \alpha) + ram3-accessed [i] /(fp(ram-base-addr[i] + 2, ram3-val-prev[i], ram3-ts-prev[i]) + \alpha) + ram4-accessed [i] /(fp(ram-base-addr[i] + 3, ram4-val-prev[i], ram4-ts-prev[i]) + \alpha) • ram-write-digest [i] - ram-write-digest [i-1] = ram1-accessed [i] /(fp(ram-base-addr[i], ram1-val-cur[i], clk[i]) + \alpha) + ram2-accessed [i] /(fp(ram-base-addr[i] + 1, ram2-val-cur[i], clk[i]) + \alpha) + ram4-accessed [i] /(fp(ram-base-addr[i] + 2, ram3-val-cur[i], clk[i]) + \alpha),
```

where  $ram1-accessed[i], \ldots, ram4-accessed[i]$  are the flags that indicate whether these addresses are accessed in row i.

Boundary constraints: Let ram-write-init-digest and ram-read-final-digest be as defined above. The boundary constraints can then be specified as follows:

**Transition constraints**  $(0 < i \le n)$ : The transition constraints can be specified as follows:

```
• ram-read-digest[i] - ram-read-digest[i - 1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-prev[i], ram1-ts-prev[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-prev[i], ram2-ts-prev[i]) + \alpha) + ram3-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-prev[i], ram3-ts-prev[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-prev[i], ram4-ts-prev[i]) + \alpha) • ram-write-digest[i] - ram-write-digest[i - 1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-cur[i], clk[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-cur[i], clk[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-cur[i], clk[i]) + \alpha),
```

where  $ram1-accessed[i], \ldots, ram4-accessed[i]$  are the flags that indicate whether these addresses are accessed in row i.

# 7.6 Data memory constraints assuming small fields

# 7.6.1 Range checks

```
/\!\!/ ram-base-addr^{(1)} \in \left[0, 2^8 - 1
ight] - guaranteed via memory checking
// ram-base-addr^{(2)} \in [0, 2^8 - 1] - guaranteed via memory checking
/\!\!/ \, \mathrm{ram-base-addr}^{(3)} \in \left| \bar{[0, 2^8 - 1]} \right| - guaranteed via memory checking
// ram-base-addr^{(4)} \in [0, 2^8 - 1] - guaranteed via memory checking
  \mathtt{ram\text{-}val1} \in \left[0, 2^8 - 1\right] \qquad 	riangle 	ext{Common to instruction execution}
  ram-val2 \in [0, 2^8 - 1] > Common to instruction execution
• ram-val3 \in [0, 2^8 - 1] \triangleright Common to instruction execution
• ram-val4 \in [0, 2^8 - 1] \triangleright Common to instruction execution
• ram1-val-prev \in [0, 2^8 - 1]
• ram2-val-prev \in [0, 2^8 - 1]
• ram3-val-prev \in [0, 2^8 - 1]
• ram4-val-prev \in \left[0, 2^8 - 1\right]
// rami-ts-prev \in \{0,\ldots,\mathtt{clk}-1\} for i=1,2,3,4
• ram1-ts-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
  ram2-ts-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
  ram3-ts-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• ram4-ts-prev^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• ram1-ts-prev-aux^{(j)} \in \left[0, 2^{8} - 1\right] for j = 1, 2, 3, 4
```

```
• ram2-ts-prev-aux<sup>(j)</sup> \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• ram3-ts-prev-aux<sup>(j)</sup> \in [0, 2^8 - 1] for j = 1, 2, 3, 4
• ram4-ts-prev-aux<sup>(j)</sup> \in [0, 2^8 - 1] for j = 1, 2, 3, 4
```

#### 7.6.2 Arithmetic constraints

In order to compute the range checks for  $rami-ts-prev^{(j)}$  for i=1,2,3,4 and j=1,2,3,4 using 8-bit range checks, we need to define auxiliary variables  $rami-ts-prev-aux^{(j)}$  to help with the computation of the expressions in the auxiliary range check. More precisely, in the large field case, we had to define the following constraints for these auxiliary variables:

```
• ram1-ts-prev-aux = clk - 1 - ram1-ts-prev
• ram2-ts-prev-aux = clk - 1 - ram2-ts-prev
• ram3-ts-prev-aux = clk - 1 - ram3-ts-prev
• ram4-ts-prev-aux = clk - 1 - ram4-ts-prev
```

To enforce these constraints in the small field case, we also need to take borrows into account when enforcing these constraints one limb at a time. As a result, we also define helper values ram1-ts-prev-borrow<sup>(j)</sup>, ram2-ts-prev-borrow<sup>(j)</sup>, ram3-ts-prev-borrow<sup>(j)</sup>, ram4-ts-prev-borrow<sup>(j)</sup> for j=1,2,3,4 to handle these borrows.

```
\# Computing ram1-ts-prev-aux = clk - 1 - ram1-ts-prev
/\!\!/ ram1-ts-prev-borrow<sup>(j)</sup> for j=1,2,3,4 used for borrow handling
          ram1-ts-prev-aux<sup>(1)</sup> + 1 + ram1-ts-prev<sup>(1)</sup> = clk^{(1)} + ram1-ts-prev-borrow<sup>(1)</sup> \cdot 2^8
          \texttt{ram1-ts-prev-aux}^{(2)} + \texttt{ram1-ts-prev-borrow}^{(1)} + \texttt{ram1-ts-prev}^{(2)} = \texttt{clk}^{(2)} + \texttt{ram1-ts-prev-borrow}^{(2)} \cdot 2^8
          \texttt{ram1-ts-prev-aux}^{(3)} + \texttt{ram1-ts-prev-borrow}^{(2)} + \texttt{ram1-ts-prev}^{(3)} = \texttt{clk}^{(3)} + \texttt{ram1-ts-prev-borrow}^{(3)} \cdot 2^8
          \texttt{ram1-ts-prev-aux}^{(4)} + \texttt{ram1-ts-prev-borrow}^{(3)} + \texttt{ram1-ts-prev}^{(4)} = \texttt{clk}^{(4)} + \texttt{ram1-ts-prev-borrow}^{(4)} \cdot 2^8
\# Computing ram2-ts-prev-aux = clk - 1 - ram2-ts-prev
/\!/ ram2-ts-prev-borrow<sup>(j)</sup> for j=1,2,3,4 used for borrow handling
          \texttt{ram2-ts-prev-aux}^{(1)} + 1 + \texttt{ram2-ts-prev}^{(1)} = \texttt{clk}^{(1)} + \texttt{ram2-ts-prev-borrow}^{(1)} \cdot 2^8
         \mathtt{ram2\text{-}ts\text{-}prev\text{-}aux}^{(2)} + \mathtt{ram2\text{-}ts\text{-}prev\text{-}borrow}^{(1)} + \mathtt{ram2\text{-}ts\text{-}prev}^{(2)} = \mathtt{clk}^{(2)} + \mathtt{ram2\text{-}ts\text{-}prev\text{-}borrow}^{(2)} \cdot 2^8
          \texttt{ram2-ts-prev-aux}^{(3)} + \texttt{ram2-ts-prev-borrow}^{(2)} + \texttt{ram2-ts-prev}^{(3)} = \texttt{clk}^{(3)} + \texttt{ram2-ts-prev-borrow}^{(3)} \cdot 2^{\texttt{Normal}} + \texttt{ram2-ts-prev-borrow}^{(3)} + \texttt{r
          \texttt{ram2-ts-prev-aux}^{(4)} + \texttt{ram2-ts-prev-borrow}^{(3)} + \texttt{ram2-ts-prev}^{(4)} = \texttt{clk}^{(4)} + \texttt{ram2-ts-prev-borrow}^{(4)} \cdot 2^8
      Computing ram3-ts-prev-aux = clk - 1 - ram3-ts-prev
      ram3-ts-prev-borrow<sup>(j)</sup> for j = 1, 2, 3, 4 used for borrow handling
          \texttt{ram3-ts-prev-aux}^{(1)} + 1 + \texttt{ram3-ts-prev}^{(1)} = \texttt{clk}^{(1)} + \texttt{ram3-ts-prev-borrow}^{(1)} \cdot 2^8
          \texttt{ram3-ts-prev-aux}^{(2)} + \texttt{ram3-ts-prev-borrow}^{(1)} + \texttt{ram3-ts-prev}^{(2)} = \texttt{clk}^{(2)} + \texttt{ram3-ts-prev-borrow}^{(2)} \cdot 2^8
          \texttt{ram3-ts-prev-aux}^{(3)} + \texttt{ram3-ts-prev-borrow}^{(2)} + \texttt{ram3-ts-prev}^{(3)} = \texttt{clk}^{(3)} + \texttt{ram3-ts-prev-borrow}^{(3)} \cdot 2^8
          \texttt{ram3-ts-prev-aux}^{(4)} + \texttt{ram3-ts-prev-borrow}^{(3)} + \texttt{ram3-ts-prev}^{(4)} = \texttt{clk}^{(4)} + \texttt{ram3-ts-prev-borrow}^{(4)} \cdot 2^{\texttt{Normal-super-borrow}} + \texttt{clk}^{(4)} + \texttt{ram3-ts-prev-borrow}^{(4)} + \texttt{clk}^{(4)} + \texttt{clk}^{(4
\# Computing ram4-ts-prev-aux = clk -1 - ram4-ts-prev
/\!\!/ ram4-ts-prev-borrow^{(j)} for j=1,2,3,4 used for borrow handling
       \mathtt{ram4-ts-prev-aux}^{(1)} + 1 + \mathtt{ram4-ts-prev}^{(1)} = \mathtt{clk}^{(1)} + \mathtt{ram4-ts-prev-borrow}^{(1)} \cdot 2^8
          \texttt{ram4-ts-prev-aux}^{(2)} + \texttt{ram4-ts-prev-borrow}^{(1)} + \texttt{ram4-ts-prev}^{(2)} = \texttt{clk}^{(2)} + \texttt{ram4-ts-prev-borrow}^{(2)} \cdot 2^8
          \mathtt{ram4-ts-prev-aux}^{(3)} + \mathtt{ram4-ts-prev-borrow}^{(2)} + \mathtt{ram4-ts-prev}^{(3)} = \mathtt{clk}^{(3)} + \mathtt{ram4-ts-prev-borrow}^{(3)} \cdot 2^{8}
          \texttt{ram4-ts-prev-aux}^{(4)} + \texttt{ram4-ts-prev-borrow}^{(3)} + \texttt{ram4-ts-prev}^{(4)} = \texttt{clk}^{(4)} + \texttt{ram4-ts-prev-borrow}^{(4)} \cdot 2^8
/\!\!/ Enforcing rami-ts-prev-borrow^{(j)} \in \{0,1\} for i=1,2,3,4 and j=1,2,3
      ram1-ts-prev-borrow^{(j)} \cdot (1 - \texttt{ram1-ts-prev-borrow}^{(j)}) = 0 \text{ for } j = 1, 2, 3
          \mathtt{ram2-ts-prev-borrow}^{(j)} \cdot (1 - \mathtt{ram2-ts-prev-borrow}^{(j)}) = 0 \text{ for } j = 1, 2, 3
          ram3-ts-prev-borrow<sup>(j)</sup> \cdot (1 - ram3-ts-prev-borrow<sup>(j)</sup>) = 0 for j = 1, 2, 3
          ram4-ts-prev-borrow^{(j)} \cdot (1-ram4-ts-prev-borrow^{(j)}) = 0 for j=1,2,3
/\!\!/ Enforcing rami-ts-prev-borrow^{(4)}=0 for i=1,2,3,4
        {\tt ram1-ts-prev-borrow}^{(4)}=0
        \verb"ram2-ts-prev-borrow"^{(4)}=0
ullet ram3-ts-prev-borrow^{(4)}=0
      \mathtt{ram}4\mathtt{-ts-prev-borrow}^{(4)}=0
```

### 7.6.3 Logup computations

In the small field case, the components ram-base-addr and rami-ts-prev for i = 1, 2, 3, 4 are each specified by four field elements representing the 8-bit limb values of the base address and time stamps.

Let fp be a fingerprint function which takes as input the tuple  $(ram-addr^{(1)}, ..., ram-addr^{(4)}, ram-val, ram-ts^{(1)}, ..., ram-ts^{(4)})$  and returns a field element in the secure extension field qm31 using a value  $\beta$  chosen by the verifier. To simplify the notation and avoid explicitly describing each limb for the address and timestamp values, we use

to denote

$$\mathsf{fp}(\mathsf{ram}\text{-}\mathsf{addr}^{(1)},\ldots,\mathsf{ram}\text{-}\mathsf{addr}^{(4)},\mathsf{ram}\text{-}\mathsf{val},\mathsf{ram}\text{-}\mathsf{ts}^{(1)},\ldots,\mathsf{ram}\text{-}\mathsf{ts}^{(4)})$$

Using fp, we can compute the difference between the read set and write set digests between rows i and i-1 as follows:

```
• ram-read-digest[i] - ram-read-digest[i - 1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-prev[i], ram1-ts-prev[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-prev[i], ram2-ts-prev[i]) + \alpha) + ram3-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-prev[i], ram3-ts-prev[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-prev[i], ram4-ts-prev[i]) + \alpha)
```

• ram-write-digest[i] - ram-write-digest[i-1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-cur[i], clk[i]) +  $\alpha$ ) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-cur[i], clk[i]) +  $\alpha$ ) + ram3-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-cur[i], clk[i]) +  $\alpha$ ) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-cur[i], clk[i]) +  $\alpha$ ),

where  $\operatorname{ram-base-addr}[i] + j := (\operatorname{ram-base-addr}^{(1)}[i] + j, \operatorname{ram-base-addr}^{(2)}[i], \operatorname{ram-base-addr}^{(3)}[i], \operatorname{ram-base-addr}^{(4)}[i])$  for j = 1, 2, 3. As before,  $\operatorname{ram} j$ -accessed[i] for j = 1, 2, 3, 4 are four flags that indicate whether the set of trace elements  $(\operatorname{ram} j$ -val-cur[i],  $\operatorname{ram} j$ -val-prev[i],  $\operatorname{ram} j$ -ts-prev[i]) are accessed during the i-th clock cycle.

Initial write set digest: Let ram-write-init-digest be the logup sum for the initial state of the data memory (see Section 7.3). That is,

$$\begin{split} \text{ram-write-init-digest} &= \sum_{\substack{\text{ram-addr} \in \mathsf{MEM-SET} \setminus \mathsf{PUB-IN-SET}}} \frac{1}{\mathsf{fp}(\mathsf{ram-addr}, 0, \mathbf{0}) + \alpha} \\ &+ \sum_{\substack{\text{pub-in-addr} \in \mathsf{PUB-IN-SET}}} \frac{1}{\mathsf{fp}(\mathsf{pub-in-addr}, \mathsf{init-pub-val}(\mathsf{pub-in-addr}), \mathbf{0}) + \alpha}, \end{split}$$

where

- $ram-addr = (ram-addr^{(1)}, ram-addr^{(2)}, ram-addr^{(3)}, ram-addr^{(4)}),$
- $pub-in-addr = (pub-in-addr^{(1)}, pub-in-addr^{(2)}, pub-in-addr^{(3)}, pub-in-addr^{(4)}), and$
- $\mathbf{0} = (0, 0, 0, 0)$ .

Final read set digest: Let ram-read-final-digest be the logup sum for the final state of the data memory. That is,

$$\texttt{ram-read-final-digest} \quad = \quad \sum_{\substack{\texttt{ram-addr} \in \mathsf{MEM-SET}}} \frac{1}{\mathsf{fp}(\texttt{ram-addr}, \texttt{ram-val}, \texttt{ram-ts-final}) + \alpha},$$

where  $\mathtt{ram-ts-final} = (\mathtt{ram-ts-final}^{(1)}, \dots, \mathtt{ram-ts-final}^{(4)})$  denotes the timestamp associated with the last access to address  $\mathtt{ram-addr} = (\mathtt{ram-addr}^{(1)}, \dots, \mathtt{ram-addr}^{(4)})$  and  $\mathtt{ram-val}$  corresponds to the byte value stored at this location.

Remark 7.1 The values ram-addr, ram-val, and ram-ts-final get committed to the trace. Especially, when there is public output (including exit status code), the values in the public output can be placed in a separate trace or in the same trace as the program memory.

### Difference between read and write set digests: Let

- ram-base-addr $[i] = (ram-base-addr^{(1)}[i], \dots, ram-base-addr^{(4)}[i])$  be the base address for the memory locations being accessed at row i.
- $-\operatorname{clk}[i] = (\operatorname{clk}^{(1)}[i], \ldots, \operatorname{clk}^{(4)}[i])$  denote the 4 limbs of the value of clk at row i.
- ramj-val-prev[i] and ramj-ts-prev[i] for j=1,2,3,4 denote respectively the previous values and the previous timestamps associated with addresses ram-base-addr[i],...,ram-base-addr[i]+3, where ram-base-addr $[i]+j \coloneqq$  (ram-base-addr[i]+j, ram-base-addr[i]+j, ram-base-addr[i]+j) for j=1,2,3.

The difference between the read set and write set digests between row i-1 and row i can be computed as follows:

```
• ram-read-digest[i] - ram-read-digest[i - 1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-prev[i], ram1-ts-prev[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-prev[i], ram2-ts-prev[i]) + \alpha) + ram3-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-prev[i], ram3-ts-prev[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-prev[i], ram4-ts-prev[i]) + \alpha) • ram-write-digest[i] - ram-write-digest[i] - ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-cur[i], clk[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-cur[i], clk[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-cur[i], clk[i]) + \alpha),
```

where  $ram1-accessed[i], \ldots, ram4-accessed[i]$  are the flags that indicate whether these addresses are accessed in row i.

Boundary constraints: Let ram-write-init-digest and ram-read-final-digest be as defined above. The boundary constraints can then be specified as follows:

where  $ram1-accessed[i], \ldots, ram4-accessed[i]$  are the flags that indicate whether these addresses are accessed in row i.

**Transition constraints**  $(0 < i \le n)$ : The transition constraints can be specified as follows:

```
• ram-read-digest[i] - ram-read-digest[i - 1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-prev[i], ram1-ts-prev[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-prev[i], ram2-ts-prev[i]) + \alpha) + ram3-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-prev[i], ram3-ts-prev[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-prev[i], ram4-ts-prev[i]) + \alpha)
```

```
• ram-write-digest[i] - ram-write-digest[i-1] = ram1-accessed[i]/(fp(ram-base-addr[i], ram1-val-cur[i], clk[i]) + \alpha) + ram2-accessed[i]/(fp(ram-base-addr[i] + 1, ram2-val-cur[i], clk[i]) + \alpha) + ram3-accessed[i]/(fp(ram-base-addr[i] + 2, ram3-val-cur[i], clk[i]) + \alpha) + ram4-accessed[i]/(fp(ram-base-addr[i] + 3, ram4-val-cur[i], clk[i]) + \alpha),
```

where  $ram1-accessed[i], \ldots, ram4-accessed[i]$  are the flags that indicate whether these addresses are accessed in row i.

# 7.7 Constraints and logup computation for initial write and final read sets

In order to help compute the logup contributions related to the initial write set and the final read set mentioned in Section 7.3, we define additional trace columns and constraints for the data memory component. When doing so, we will make a distinction between public and private parts of the execution trace, where the public part will contain elements which are known to the verifier and whose commitments the verifier can efficiently check.

### 7.7.1 Trace elements for initial write and final read sets

In addition to the trace elements defined in Section 7.2, this section defines additional elements to help with the computation of the initial write set (which depends on the public I/O values) and the final read set.

- ram-init-final-addr: the memory address given for each byte in the RAM ever touched or relevant for public I/O (located in private part of the execution trace and *not* known to the verifier)
- ram-val-final: 8-bit values of the final RAM, given byte-wise (located in a private part of the execution trace and *not* known to verifier)
- ram-val-init: a helper column containing 8-bit values of the initial RAM (located in a private part of the execution trace and *not* known to verifier)
- ram-ts-final: the timestamp associated with the last access to address ram-init-final-addr (located in a private part of the execution trace and *not* known to verifier)
- ram-init-final-flag: a flag indicating whether ram-final, ram-init columns on the current row are being used (located in a private part of the execution trace and *not* known to verifier).
- pub-in-flag: a flag indicating whether (pub-io-addr, pub-in-val) on the current row is considered as public input (located in a public part of the execution trace). If pub-in-flag is set, ram-init-final-flag also needs to be set.
- pub-io-addr: the same value as in ram-init-final-addr but only for public input and output (located in a public part of the execution trace). Needs to be equal to ram-init-final-addr if either pub-in-flag or pub-out-flag is set.
- pub-in-val: 8-bit values of the public input, given byte-wise (located in a public part of the execution trace).
- pub-out-val: 8-bit values of the public output, given byte-wise (located in a public part of the execution trace). Needs to be equal to ram-val-final on the same row when pub-out-flag is set
- pub-out-flag: flag indicating whether (pub-io-addr, pub-out-val) on the current row is considered as public output (located in a public part of the execution trace). If pub-out-flag is set, ram-init-final-flag needs to be also set.

### Remarks

• ram-init-final-addr should include all the read-write memory addresses ever accessed during the execution and all the addresses in the public input.

- The RAM is initialized with zero if pub-in-flag is false.
- The initial digest computation determines the initial memory content as follows:
  - For every ram-init-final-addr with ram-init-final-flag set, the initial value of the RAM ram-val-init is equal to pub-in-flag · pub-in-val. In particular, this latter value is zero when the address is not part of the public input.

### 7.7.2 Arithmetic constraints

### Constraints for data memory public I/O consistency

### Constraints for ensuring uniqueness of ram-init-final-addr values

Since ram-init-final-addr values are private witnesses, we need to ensure that there are no duplicates. For this reason, we constrain the relevant ram-init-final-addr values to be strictly monotonically increasing, though gaps in the address range are allowed.

Let i be the index of the current row. The precise guaranteed condition is the following:

- If row i is the last row in the trace (i.e., is-last[i] = 1), nothing is enforced.
- If row i is not the last row (i.e., is-last[i] = 0), then the following must be enforced:
  - If the value of ram-init-final-addr[i+1] is being used (i.e., ram-init-final-flag[i+1] = 1), then ram-init-final-addr[i+1] should be strictly larger than ram-init-final-addr[i] (i.e., ram-init-final-addr[i+1] > ram-init-final-addr[i])
  - If the value of ram-init-final-addr[i+1] is not being used (i.e., ram-init-final-flag[i+1]=0), then ram-init-final-addr[i+1]=ram-init-final-addr[i]

In order to enforce the above conditions:

- let ram-addr-cur denote the value of ram-init-final-addr in row i;
- let ram-addr-next denote the value of ram-init-final-addr in row i + 1;
- let is-last be a flag that indicates whether row i is the last row in the trace;
- let ram-addr-diff be a helper variable used to compute the difference between ram-addr-cur and ram-addr-next:
- let ram-addr-borrow be a helper borrow variable; and
- let ram-init-final-flag-next denote the value of ram-init-final-flag in row i+1.

# 7.7.3 Range checks

```
  \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \begin{tabular}{ll}   \be
```

### 7.7.4 Logup computation for initial write and final read sets

In order to compute the logup sums for the initial and final states of the data memory, as described in Section 7.6.3, the prover creates two additional columns:

- ram-write-init-digest: a digest column used to compute the logup sum for the initial state of the data memory;
- ram-read-final-digest: a digest column used to compute the logup sum for the final state of the data memory.

**Boundary constraints**: Let  $\mathbf{0} = (0,0,0,0)$ . The boundary constraints are as follows:

```
• ram-write-init-digest[0] = ram-init-final-flag[0]/(fp(ram-init-final-addr[0], ram-val-init[0], \mathbf{0}) + \alpha) • ram-read-final-digest[0] = ram-init-final-flag[0]/(fp(ram-init-final-addr[0], ram-val-final[0], ram-ts-final[0]) + \alpha)
```

**Transition constraints**  $(0 < i \le n)$ : Let  $\mathbf{0} = (0, 0, 0, 0)$ . The transition constraints are as follows:

```
• ram-write-init-digest[i] - ram-write-init-digest[i-1] = ram-init-final-flag[i]/(fp(ram-init-final-addr[i], ram-val-init[i], 0) + \alpha) • ram-write-digest[i] - ram-write-digest[i-1] = ram-init-final-flag[i]/(fp(ram-init-final-addr[i], ram-val-final[i], ram-ts-final[i]) + \alpha)
```

# 8 Execution component

The instruction execution component deals with the actual execution of instructions by enforcing constraints that guarantee the correct execution of these instructions over their operands. In some

cases, such as for load and store instructions, this may require interactions with other components. In this section, we describe constraints for all instructions supported by the Nexus Virtual Machine.

**Section outline**: The remainder of this section is organized as follows:

- Section 8.1 defines the main trace elements used by the execution component, including in particular the set of operands used by virtual machine instructions.
- Section 8.2 defines the interface that other components can use to interact with the execution component.
- Section 8.3 recalls some of the constraints related to the instruction flags previously defined in Sections 4.2.2 and 4.3.2 and used in the remainder of this section.
- Section 8.4 describes constraints that are common to all instructions in the basic instruction set
  of the Nexus Virtual Machine.
- Section 8.5 describes constraints for the ALU instructions of the basic instruction set. This includes ADD (Section 8.5.1), SUB (Section 8.5.2), SLTU (Section 8.5.3), SLT (Section 8.5.4), SLL (Section 8.5.5), SRL (Section 8.5.6), SRA (Section 8.5.7), XOR (Section 8.5.8), AND (Section 8.5.9), and OR (Section 8.5.10).
- Section 8.6 describes constraints for the branch instructions of the basic instruction set. This includes BEQ (Section 8.6.1), BNE (Section 8.6.2), BLTU (Section 8.6.3), BLT (Section 8.6.4), BGEU (Section 8.6.5), and BGE (Section 8.6.6).
- Section 8.7 describes constraints for the load instructions of the basic instruction set. This includes LB (Section 8.7.1), LH (Section 8.7.2), LW (Section 8.7.3), LBU (Section 8.7.4), and LHU (Section 8.7.5).
- Section 8.8 describes constraints for the store instructions of the basic instruction set. This
  includes SB (Section 8.8.1), SH (Section 8.8.2), and SW (Section 8.8.3).
- Section 8.9 describes constraints for the jump instructions of the basic instruction set. This includes JAL (Section 8.9.1) and JALR (Section 8.9.2).
- Section 8.10 describes constraints for pc-related instructions of the basic instruction set. This
  includes LUI (Section 8.10.1) and AUIPC (Section 8.10.2).
- Section 8.11 describes constraints for system instructions of the basic instruction set. This includes ECALL (Section 8.11.1) and EBREAK (Section 8.11.2).
- Section 8.12 specifies the execution component interaction with the register memory component.

### 8.1 Instruction execution trace elements

In order to enforce instruction execution constraints, we keep track of the set of operands used by virtual machine instructions as well as some helper values. Moreover, each instruction has a flag associated with it, described in the CPU component, in order to help with the update of the machine state. Since some of the instructions may require interaction with the data memory component, we also require the trace column clk.

The following set of trace elements will be needed for the instruction executor component:

- clk: the current execution time
- a-val: a 32-bit word specifying the value of operand op-a
- b-val: a 32-bit word specifying the value of operand op-b
- c-val: a 32-bit word specifying the value of operand op-c
- pc: the current value of the program counter register
- pc-next: the next value of the program counter register after the execution
- is-lui,...,is-and: boolean flags for supported instructions (see Table 5)
- h1,..., hn: helper elements (defined as needed)

Table 5: List of instructions flags for the Nexus Virtual Machine Instruction Set.

| Instruction flag | Description                                |
|------------------|--------------------------------------------|
| is-lui           | indicates an lui operation                 |
| is-auipc         | indicates an auipc operation               |
| is-jal           | indicates an jal operation                 |
| is-jalr          | indicates an jalr operation                |
| is-ecall         | indicates an ecall operation               |
| is-ebreak        | indicates an ebreak operation              |
| is-fence         | indicates an fence operation               |
| is-unimp         | indicates an unimp operation               |
| is-beq           | indicates an beq operation                 |
| is-bne           | indicates an bne operation                 |
| is-blt           | indicates an blt operation                 |
| is-bge           | indicates an bge operation                 |
| is-bltu          | indicates an bltu operation                |
| is-bgeu          | indicates an bgeu operation                |
| is-lb            | indicates an 1b operation                  |
| is-lh            | indicates an 1h operation                  |
| is-lw            | indicates an lw operation                  |
| is-lbu           | indicates an lbu operation                 |
| is-lhu           | indicates an lhu operation                 |
| is-sb            | indicates an sb operation                  |
| is-sh            | indicates an sh operation                  |
| is-sw            | indicates an sw operation                  |
| is-add           | indicates an add or addi operation         |
| is-sub           | indicates an sub operation                 |
| is-sll           | indicates an sll or slli operation         |
| is-slt           | indicates an slt or slti operation         |
| is-sltu          | indicates an sltu or sltiu operation       |
| is-xor           | indicates an xor or xori operation         |
| is-srl           | indicates an srl or srli operation         |
| is-sra           | indicates an sra or srai operation         |
| is-or            | indicates an or or ori operation           |
| is-and           | indicates an and or andi operation         |
| is-pad           | used for padding, not a computational step |

### 8.2 Instruction execution interface

In order to clarify the interaction with other components, we now define the interface that these other components can use to call the instruction execution component. For this, we assume that the instruction opcode as defined in the instruction encoding will be passed as a parameter together with the three operands, denoted a-val, b-val, and c-val, and the value of the current program counter pc.

• exec(pc, opcode, a-val, b-val, c-val) → pc-next: the instruction execution component performs the operation over the values (a-val, b-val, c-val) according to the opcode value opcode. It also sets pc-next to the updated value of the program counter value pc.

**Remark 8.1** The interface above considers each value as a single element, However, in certain cases, some of these entries will be specified by a set of 8-bit limbs.

**Remark 8.2** To make use of the common structure of certain instructions such as ADD and ADDI, operations such as sign extension of immediate values are performed before calling the interface to the instruction execution component.

**Notation**: When introducing limbs for a variable val, we use the notation "val<sup>(j)</sup>" to indicate the j-th limb for a given variable val.

# 8.3 Basic Instruction Set: flags

This section recalls some of the constraints related to basic instruction flags defined in Sections 4.2.2 and 4.3.2 and used in the remainder of this section. Since the constraints in Sections 4.2.2 and 4.3.2 guarantee that only 1 instruction flag can be set to 1 in a clock cycle, it follows that the sum of any subset of these flags will be equal to 1 if one of the flags in the subset is set to 1 and 0 otherwise.

### 8.4 Basic Instruction Set: Common constraints

# 8.4.1 Constraints assuming large fields

```
\label{eq:local_problem} \begin{tabular}{ll} $\#$ Instructions for which PC always increments by 4, \\ $\#$ Except when the next row is the first row \\ & (is-alu+is-load+is-type-s+is-type-u+is-type-sys\cdot(1-is-sys-halt)-is-pc-inc-std) = 0 \\ $\#$ Incrementing PC by 4 \\ & (is-pc-inc-std)\cdot(pc-next+pc-carry\cdot2^{32}-pc-4) = 0 \\ $\#$ Enforcing pc-carry $\in \{0,1\}$ \\ \end{tabular}
```

• (is-pc-inc-std)  $\cdot$  (pc-carry)  $\cdot$  (1 - pc-carry) = 0

Remark 8.3 As stated at the beginning of Section 8.3 and in Remarks 4.6 and 4.10, the sum of any subset of the instruction flags will be exactly 1 if any of the instruction flags is set to 1 and 0, otherwise. As a result, there is no need to introduce an additional constraint to enforce that is-pc-inc-std  $\in \{0,1\}$  since only one of the variables in {is-alu, is-load, is-type-s, is-type-u} can be 1.

### 8.4.2 Constraints assuming small fields

# 8.5 Basic Instruction Set: ALU Instructions

### 8.5.1 ADD Instruction

The parameters and functionality for the ADD instruction are as follows:

- opcode: ADD
- Parameters: (a-val, b-val, c-val)
- Instruction selector: is-add = 1
- Functionality:  $a\text{-val} \leftarrow b\text{-val} + c\text{-val} \mod 2^{32}$

The mapping from the add and addi instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------|------|------|-------|-------|--------|---------|
| add        | rd   | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2]  |
| addi       | rd   | rs1  | rs2  | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function and i is a 12-bit immediate value.

### Constraints assuming large fields

```
\label{eq:carry} \begin{array}{l} \mbox{$/$} \mbox{Carry handling} \\ \bullet \mbox{$($is-add)$} \cdot (\mbox{$a-val+h-carry$} \cdot 2^{32} - \mbox{$b-val-c-val$}) = 0 \\ \mbox{$/$} \mbox{$/$} \mbox{Enforcing $h-carry$} \in \{0,1\} \\ \bullet \mbox{$($is-add)$} \cdot (\mbox{$h-carry$}) \cdot (1 - \mbox{$h-carry$}) = 0 \\ \mbox{$/$} \mbox{$/$} \mbox{$Range$ check $a-val$} \in \left[0,2^{32}-1\right] - \mbox{$Performed$ in the CPU component} \\ \mbox{$/$} \mbox{$/$} \mbox{$Range$ check $c-val$} \in \left[0,2^{32}-1\right] - \mbox{$Performed$ in the CPU component} \\ \mbox{$/$} \mbox{$/$} \mbox{$Range$ check $c-val$} \in \left[0,2^{32}-1\right] - \mbox{$Performed$ in the CPU component} \\ \end{array}
```

```
/\!\!/ Range check is-add \in \{0,1\} - Performed in the CPU component
```

### Constraints assuming small fields

### 8.5.2 SUB Instruction

The parameters and functionality for the SUB instruction are as follows:

opcode: SUB
Parameters: (a-val, b-val, c-val)
Instruction selector: is-sub = 1
Functionality: a-val ← b-val − c-val mod 2<sup>32</sup>

The mapping from the **sub** instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a                | op-b | op-c | imm-c | a-val | b-val  | c-val  |
|------------|---------------------|------|------|-------|-------|--------|--------|
| sub        | $\operatorname{rd}$ | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2] |

### Constraints assuming large fields

```
\label{eq:carry_problem} \begin{array}{l} \mbox{$\#$ (is-sub) \cdot (b-val+h\text{-borrow} \cdot 2^{32}-a\text{-}val-c\text{-}val) = 0$} \\ \mbox{$\#$ Enforcing $h$-borrow $\in \{0,1\}$} \\ \mbox{$\bullet$ (is-sub) \cdot (h\text{-borrow}) \cdot (1-h\text{-borrow}) = 0$} \\ \mbox{$\#$ Range check $a$-val $\in [0,2^{32}-1]$ - Performed in the CPU component $$ \mbox{$\#$ Range check $b$-val $\in [0,2^{32}-1]$ - Performed in the CPU component $$ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component $$ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component } \\ \mbox{$\#$ Range check is-sub $\in \{0,1\}$ - Performed in the CPU component
```

#### 8.5.3 SLTU Instruction

The parameters and functionality for the SLTU instruction are as follows:

```
    opcode: SLTU
    Parameters: (a-val, b-val, c-val)
    Instruction selector: is-sltu = 1
    Functionality:

            a-val ← 1 if b-val < c-val (unsigned comparison)</li>
            a-val ← 0 if b-val ≥ c-val (unsigned comparison)
```

The mapping from the sltu and sltiu instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a             | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------------------|------|------|-------|-------|--------|---------|
| sltu       | $^{\mathrm{rd}}$ | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2]  |
| sltiu      | rd               | rs1  | rs2  | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function and i is a 12-bit immediate value.

### Constraints assuming large fields

```
 \begin{tabular}{ll}  \begin
```

The set of constraints above implements the unsigned comparison between b-val and c-val using the borrow bit of the subtraction b-val-c-val. More precisely, if the result of subtracting c-val from b-val results in a borrow (h-borrow = 1 and h-rem  $\equiv$  b-val - c-val mod  $2^{32}$ ), then we know that c-val > b-val and a-val must be equal to 1. If there is no borrow (h-borrow = 0 and h-rem = b-val - c-val), then b-val  $\geq$  c-val and a-val must be equal to 1. Hence, it suffices to set a-val = h-borrow.

### Constraints assuming small fields

```
# Borrow handling
    (is-sltu) \cdot (b-val^{(1)} + h-borrow^{(1)} \cdot 2^8 - c-val^{(1)} - h-rem^{(1)}) = 0
    (is-sltu) \cdot (b-val^{(2)} + h-borrow^{(2)} \cdot 2^8 - c-val^{(2)} - h-rem^{(2)} - h-borrow^{(1)}) = 0
    (is-sltu) \cdot (b-val^{(3)} + h-borrow^{(3)} \cdot 2^8 - c-val^{(3)} - h-rem^{(3)} - h-borrow^{(2)}) = 0
    (\text{is-sltu}) \cdot (\text{b-val}^{(4)} + \text{h-borrow}^{(4)} \cdot 2^8 - \text{c-val}^{(4)} - \text{h-rem}^{(4)} - \text{h-borrow}^{(3)}) = 0
# Enforcing h-borrow<sup>(j)</sup> \in \{0,1\} for j=1,2,3,4
    (\texttt{is-sltu}) \cdot (\texttt{h-borrow}^{(1)}) \cdot (1 - \texttt{h-borrow}^{(1)}) = 0
    (is\text{-sltu}) \cdot (h\text{-borrow}^{(2)}) \cdot (1 - h\text{-borrow}^{(2)}) = 0
    (is-sltu) \cdot (h-borrow^{(3)}) \cdot (1-h-borrow^{(3)}) = 0
    (\text{is-sltu}) \cdot (\text{h-borrow}^{(4)}) \cdot (1 - \text{h-borrow}^{(4)}) = 0
  Enforcing h-rem^{(j)} \in \left[0,2^8-1\right] for j=1,2,3,4
    (is-sltu) \cdot (h-rem^{(1)} \in [0, 2^8 - 1]) = 0
    (\text{is-sltu}) \cdot (\text{h-rem}^{(2)} \in [0, 2^8 - 1]) = 0
    (\text{is-sltu}) \cdot (\text{h-rem}^{(3)} \in [0, 2^8 - 1]) = 0
    (is-sltu) \cdot (h-rem^{(4)} \in [0, 2^8 - 1]) = 0
/\!\!/ \; \mathtt{Setting} \  \, \mathtt{a-val}^{(1)} = \mathtt{h-borrow}^{(4)}
    (\mathtt{is\text{-}sltu}) \cdot (\mathtt{h\text{-}borrow}^{(4)} - \mathtt{a\text{-}val}^{(1)}) = 0
# Setting a-val^{(j)} = 0 for j = 2, 3, 4
    (\mathsf{is}\text{-}\mathsf{sltu})\cdot(\mathsf{a}\text{-}\mathsf{val}^{(2)})=0
    (is\text{-sltu}) \cdot (a\text{-val}^{(3)}) = 0
    (\mathsf{is}\text{-}\mathsf{sltu})\cdot(\mathsf{a}\text{-}\mathsf{val}^{(4)})=0
/\!\!/ Range check a-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
// Range check c-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-sltu \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to the large field case, implementing the unsigned comparison between b-val and c-val using the borrow bit of the subtraction b-val - c-val and setting the value of a-val accordingly. The main difference is that we only need to ensure that the first limb of a-val is equal to the last limb of the borrow bit h-borrow and that all the remaining limbs of a-val are 0.

More precisely, if the result of subtracting c-val from b-val limb by limb results in a borrow (h-borrow<sup>(4)</sup> = 1), then we know that c-val > b-val and a-val must be equal to 1. If there is no borrow (h-borrow<sup>(4)</sup> = 0), then b-val  $\geq$  c-val and a-val must be equal to 1. Therefore, it suffices to set a-val<sup>(1)</sup> = h-borrow<sup>(4)</sup> and a-val<sup>(2)</sup> = a-val<sup>(3)</sup> = a-val<sup>(4)</sup> = 0.

### 8.5.4 SLT Instruction

The parameters and functionality for the SLT instruction are as follows:

```
opcode: SLTParameters: (a-val, b-val, c-val)
```

- Instruction selector: is-slt = 1
- Functionality:

```
- a-val ← 1 if b-val < c-val (\underline{signed} comparison)
```

$$-$$
 a-val  $\leftarrow 0$  if b-val  $\geq$  c-val (signed comparison)

The mapping from the slt and slti instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------|------|------|-------|-------|--------|---------|
| slt        | rd   | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2]  |
| slti       | rd   | rs1  | rs2  | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function and i is a 12-bit immediate value.

### Constraints assuming large fields

```
// Performing unsigned comparison between b-val and c-val using SUB borrow bit
• (is-slt) \cdot (b-val - c-val + h-borrow \cdot 2^{32} - h-rem) = 0
// Enforcing h-rem \in [0,2^{32}-1]
  (\mathtt{is\text{-slt}}) \cdot (\mathtt{h\text{-rem}} \in [0, 2^{32} - 1]) = 0
/\!\!/ Enforcing h-borrow \in \{0,1\}
    (is-slt) \cdot (h-borrow) \cdot (1 - h-borrow) = 0
// Setting h-ltu-flag = h-borrow
    (is-slt) \cdot (h-borrow - h-ltu-flag) = 0
# Extracting sign bits from b-val and c-val
    (\mathtt{is\text{-slt}}) \cdot (\mathtt{h\text{-rem-b}} + \mathtt{h\text{-sgn-b}} \cdot 2^{31} - \mathtt{b\text{-val}}) = 0
    (is-slt) \cdot (h-rem-c + h-sgn-c \cdot 2^{31} - c-val) = 0
    (\texttt{is-slt}) \cdot (\texttt{h-rem-b} \in \left[0, 2^{31} - 1\right]) = 0
    (is-slt) \cdot (h-rem-c \in [0, 2^{31} - 1]) = 0
    (is-slt) \cdot (h-sgn-b) \cdot (1-h-sgn-b) = 0
    (is-slt) \cdot (h-sgn-c) \cdot (1-h-sgn-c) = 0
// Computing a-val from h-ltu-flag and sign bits h-sgn-b and h-sgn-c
    (\mathsf{is\text{-}slt}) \cdot ((\mathsf{h\text{-}sgn\text{-}b})(1-\mathsf{h\text{-}sgn\text{-}c}) + \mathsf{h\text{-}ltu\text{-}flag}((\mathsf{h\text{-}sgn\text{-}b})(\mathsf{h\text{-}sgn\text{-}c}) + (1-\mathsf{h\text{-}sgn\text{-}b})(1-\mathsf{h\text{-}sgn\text{-}c})) - \mathsf{a\text{-}val}) = 0
/\!\!/ Range check a-val \in [0,2^{32}-1] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in [0,2^{32}-1] - Performed in the CPU component
/\!\!/ Range check is-slt \in \{0,1\} - Performed in the CPU component
```

The set of constraints above implements the signed comparison between b-val and c-val in three main steps. First, it computes the unsigned comparison between b-val and c-val and stores the result in h-ltu-flag. Second, it extracts the sign bits of b-val and c-val and stores these values into h-sgn-b and h-sgn-c. Finally, it sets the value of a-val based on the value of h-ltu-flag and the sign bits h-sgn-b and h-sgn-c. In particular, a-val = 1 whenever

- 1. (h-sgn-b, h-sgn-c) = (1,0) since b-val < c-val;
- 2. (h-sgn-b, h-sgn-c) = (0,0) and h-ltu-flag = 1 since b-val and c-val are positive values in this case and a-val should match the value of h-ltu-flag
- 3. (h-sgn-b, h-sgn-c) = (1,1) and h-ltu-flag = 1 since b-val and c-val are negative values in this case and a-val should also match the value of h-ltu-flag.

```
// Borrow handling
    (is-slt) \cdot (b-val^{(1)} + h-borrow^{(1)} \cdot 2^8 - c-val^{(1)} - h-rem^{(1)}) = 0
    (is-slt) \cdot (b-val^{(2)} + h-borrow^{(2)} \cdot 2^8 - c-val^{(2)} - h-rem^{(2)} - h-borrow^{(1)}) = 0
    (is-slt) \cdot (b-val^{(3)} + h-borrow^{(3)} \cdot 2^8 - c-val^{(3)} - h-rem^{(3)} - h-borrow^{(2)}) = 0
    (\text{is-slt}) \cdot (\text{b-val}^{(4)} + \text{h-borrow}^{(4)} \cdot 2^8 - \text{c-val}^{(4)} - \text{h-rem}^{(4)} - \text{h-borrow}^{(3)}) = 0
# Enforcing h-borrow<sup>(j)</sup> \in \{0,1\} for j=1,2,3,4
    (\texttt{is-slt}) \cdot (\texttt{h-borrow}^{(1)}) \cdot (1 - \texttt{h-borrow}^{(1)}) = 0
    (\mathtt{is\text{-}slt}) \cdot (\mathtt{h\text{-}borrow}^{(2)}) \cdot (1 - \mathtt{h\text{-}borrow}^{(2)}) = 0
    (\texttt{is-slt}) \cdot (\texttt{h-borrow}^{(3)}) \cdot (1 - \texttt{h-borrow}^{(3)}) = 0
    (is-slt) \cdot (h-borrow^{(4)}) \cdot (1-h-borrow^{(4)}) = 0
# Enforcing h-rem^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4
    (\texttt{is-slt}) \cdot (\texttt{h-rem}^{(1)} \in [0, 2^8 - 1]) = 0
    (is-slt) \cdot (h-rem^{(2)} \in [0, 2^8 - 1]) = 0
    (\mathtt{is\text{-slt}}) \cdot (\mathtt{h\text{-rem}}^{(3)} \in \left[0, 2^8 - 1\right]) = 0
    (\mathtt{is\text{-}slt}) \cdot (\mathtt{h\text{-}rem}^{(4)} \in \lceil 0, 2^8 - 1 \rceil) = 0
/\!\!/ Setting h-ltu-flag = h-borrow<sup>(4)</sup>
   (\mathtt{is\text{-}slt}) \cdot (\mathtt{h\text{-}borrow}^{(4)} - \mathtt{h\text{-}ltu\text{-}flag}) = 0
# Extracting sign bits from b-val and c-val
   (\texttt{is-slt}) \cdot (\texttt{h-rem-b} + \texttt{h-sgn-b} \cdot 2^7 - \texttt{b-val}^{(4)}) = 0
   (\texttt{is-slt}) \cdot (\texttt{h-rem-c} + \texttt{h-sgn-c} \cdot 2^7 - \texttt{c-val}^{(4)}) = 0
   (\texttt{is-slt}) \cdot (\texttt{h-rem-b} \in \left[0, 2^7 - 1\right]) = 0
   (is-slt) \cdot (h-rem-c \in [0, 2^7 - 1]) = 0
   (\texttt{is-slt}) \cdot (\texttt{h-sgn-b}) \cdot (1 - \texttt{h-sgn-b}) = 0
   (\texttt{is-slt}) \cdot (\texttt{h-sgn-c}) \cdot (1 - \texttt{h-sgn-c}) = 0
/\!\!/ Computing a-val^{(1)} from h-ltu-flag and sign bits h-sgn-b and h-sgn-c
   (\texttt{is-slt}) \cdot ((\texttt{h-sgn-b})(1-\texttt{h-sgn-c}) + \texttt{h-ltu-flag}((\texttt{h-sgn-b})(\texttt{h-sgn-c}) + (1-\texttt{h-sgn-b})(1-\texttt{h-sgn-c})) - \texttt{a-val}^{(1)}) = 0
# Setting a-val(j) = 0 for j = 2, 3, 4
   (\texttt{is-slt}) \cdot (\texttt{a-val}^{(2)}) = 0
  (\texttt{is-slt}) \cdot (\texttt{a-val}^{(3)}) = 0
   (\mathtt{is\text{-}slt}) \cdot (\mathtt{a\text{-}val}^{(4)}) = 0
/\!\!/ Range check a-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
// Range check b-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
# Range check c-val^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 - Performed in the CPU component
/\!\!/ Range check is-slt \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to the large field case, first computing the unsigned comparison flag h-ltu-flag between b-val and c-val, then computing the sign bits h-sgn-b and h-sgn-c of b-val and c-val, and finally setting the value of a-val based on the value of h-ltu-flag and the sign bits h-sgn-b and h-sgn-c.

# 8.5.5 SLL Instruction

The parameters and functionality for the SLL instruction are as follows:

- opcode: SLL
- Parameters: (a-val, b-val, c-val)
- Instruction selector: is-sll = 1
- Functionality:  $a\text{-val} \leftarrow b\text{-val} \ll (c\text{-val} \& 0x0000001F)$

The mapping from the sll and slli instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val  |
|------------|------|------|------|-------|-------|--------|--------|
| sll        | rd   | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2] |
| slli       | rd   | rs1  | rs2  | 1     | R[rd] | R[rs1] | i      |

where i is a 5-bit immediate value.

### Constraints assuming large fields

```
# Extracting shift bits from c-val
     (\mathtt{is-sll}) \cdot (\mathtt{sh1} + \mathtt{sh2} \cdot 2 + \mathtt{sh3} \cdot 2^2 + \mathtt{sh4} \cdot 2^3 + \mathtt{sh5} \cdot 2^4 + \mathtt{h-rem} \cdot 2^5 - \mathtt{c-val}) = 0
     (\texttt{is-sll}) \cdot (\texttt{h-rem} \in \left\lceil 0, 2^{27} - 1 \right\rceil) = 0
     (\mathtt{is\text{-}sll}) \cdot (\mathtt{sh1}) \cdot (1-\mathtt{sh1}) = \vec{0}
     (\texttt{is-sl1}) \cdot (\texttt{sh2}) \cdot (1 - \texttt{sh2}) = 0
     (\texttt{is-sl1}) \cdot (\texttt{sh3}) \cdot (1 - \texttt{sh3}) = 0
     (\texttt{is-sl1}) \cdot (\texttt{sh4}) \cdot (1 - \texttt{sh4}) = 0
     (\texttt{is-sll}) \cdot (\texttt{sh5}) \cdot (1 - \texttt{sh5}) = 0
# Computing auxiliary amount exp5 from shift bits to help with the left shift operation
    (\mathsf{is\text{-}sl1}) \cdot ((\mathsf{sh1}+1) \cdot ((2^2-1)\mathsf{sh2}+1) \cdot ((2^4-1)\mathsf{sh3}+1) \cdot ((2^8-1)\mathsf{sh4}+1) \cdot ((2^{16}-1)\mathsf{sh5}+1) - \mathsf{exp5}) = 0
# Performing the left shift and storing the result in rem1
     (is-sll) \cdot (rem1 + qt1 \cdot 2^{32} - b-val \cdot exp5) = 0
/\!\!/ Range check qt1 \in \left[0, 2^{32} - 1\right]
   (\texttt{is-sll}) \cdot (\texttt{qt1} \in \left[0, 2^{32} - 1\right]) = 0
/\!\!/ Range check rem1 \in [0, \exp 5 - 1]
     (\texttt{is-sll}) \cdot (\texttt{exp5} - 1 - \texttt{rem1} - \texttt{rem1-aux}) = 0
     (is-sll) \cdot (rem1 \in [0, 2^{32} - 1]) = 0
    (\texttt{is-sll}) \cdot (\texttt{rem1-aux} \in \left\lceil 0, 2^{32} - 1 \right\rceil) = 0
// Setting rem1 = a-val
   (\texttt{is-sll}) \cdot (\texttt{rem1} - \texttt{a-val}) = 0
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-sll \in \{0,1\} - Performed in the CPU component
```

The set of constraints above implements the shift left operation in three steps. First, it extracts the shift bits  $\mathtt{sh1},\ldots,\mathtt{sh5}$  from c-val. Next, it computes an auxiliary variable  $\mathtt{exp5}$  from the shift bits so that  $\mathtt{exp5} = 2^{\mathtt{shamt}}$  whenever the value b-val needs to be shifted by shamt bits. In particular, one can observe that  $\mathtt{shamt} = \mathtt{sh1} + 2 \cdot \mathtt{sh2} + 4 \cdot \mathtt{sh3} + 8 \cdot \mathtt{sh4} + 16 \cdot \mathtt{sh5}$ . Finally, the last constraint performs the shift operation by setting a-val to be the remainder of the division of b-val · exp5 by  $2^{32}$ , which is equivalent to shifting left the contents of the 32-bit value representing b-val by shamt positions and filling the lower shamt bits with zeros.

To see why the last step computes the left shift operation correctly, let  $b\text{-val} = \sum_{i=0}^{31} b_i \cdot 2^i$  where  $(b_{31}, \ldots, b_0)$  corresponds to the bit representation of the field element b-val. Since  $exp5 = 2^{\text{shamt}}$ , it

follows that

$$\begin{array}{lll} \text{b-val} \cdot \exp & = & \sum_{i=0}^{31} b_i \cdot 2^i \cdot 2^{\text{shamt}} = \sum_{i=0}^{31} b_i \cdot 2^{i+\text{shamt}} \\ & = & \sum_{i=0}^{31-\text{shamt}} b_i \cdot 2^{i+\text{shamt}} + \sum_{i=31-\text{shamt}+1}^{31} b_i \cdot 2^{i+\text{shamt}} \\ & = & \sum_{i=\text{shamt}}^{31} b_{i-\text{shamt}} \cdot 2^i + \sum_{i=32}^{31+\text{shamt}} b_{i-\text{shamt}} \cdot 2^i \\ & = & \sum_{i=\text{shamt}}^{31} b_{i-\text{shamt}} \cdot 2^i + \sum_{i=0}^{\text{shamt}-1} b_{i+32-\text{shamt}} \cdot 2^{i+32} \\ & = & \sum_{i=\text{shamt}}^{31} b_{i-\text{shamt}} \cdot 2^i + 2^{32} \cdot \sum_{i=0}^{\text{shamt}-1} b_{i+32-\text{shamt}} \cdot 2^i \end{array}$$

As a result, the remainder of the division of b-val·exp5 by  $2^{32}$  is equal to  $\sum_{i=\text{shamt}}^{31} b_{i-\text{shamt}} \cdot 2^i$ , which is equivalent to shifting left the contents of the 32-bit value representing b-val by shamt positions and filling the lower shamt bits with zeros, as desired.

Also note that the quotient of the division of b-val  $\cdot$  exp5 by  $2^{32}$  is  $\sum_{j=0}^{\text{shamt}-1} b_{i+32-\text{shamt}} \cdot 2^i$  and corresponds to the bits b-val which will get discarded after the left shift operation.

```
/\!\!/ Extracting shift bits from c-val<sup>(1)</sup>
     (\mathtt{is-sll}) \cdot (\mathtt{sh1} + \mathtt{sh2} \cdot 2 + \mathtt{sh3} \cdot 2^2 + \mathtt{sh4} \cdot 2^3 + \mathtt{sh5} \cdot 2^4 + \mathtt{h-rem} \cdot 2^5 - \mathtt{c-val}^{(1)}) = 0
     (is-sll) \cdot (h-rem \in [0, 2^3 - 1]) = 0
     (is-sl1) \cdot (sh1) \cdot (1-sh1) = 0
     (\mathsf{is}\mathsf{-sll}) \cdot (\mathsf{sh2}) \cdot (1-\mathsf{sh2}) = 0
     (\texttt{is-sll}) \cdot (\texttt{sh3}) \cdot (1 - \texttt{sh3}) = 0
     (\texttt{is-sll}) \cdot (\texttt{sh4}) \cdot (1 - \texttt{sh4}) = 0
     (is-sll) \cdot (sh5) \cdot (1 - sh5) = 0
# Computing auxiliary amount exp3 from shift bits sh1, sh2, sh3 for a partial left shift operation
     (is-sll) \cdot ((sh1+1) \cdot ((2^2-1)sh2+1) \cdot ((2^4-1)sh3+1) - exp3) = 0
# Performing a partial left shift operation using shift bits sh1, sh2, sh3
    \begin{array}{l} \text{(is-sl1)} \cdot (\text{rem1} + \text{qt1} \cdot 2^8 - \text{b-val}^{(1)} \cdot \text{exp3}) = 0 \\ \text{(is-sl1)} \cdot (\text{rem2} + \text{qt2} \cdot 2^8 - \text{qt1} - \text{b-val}^{(2)} \cdot \text{exp3}) = 0 \\ \text{(is-sl1)} \cdot (\text{rem3} + \text{qt3} \cdot 2^8 - \text{qt2} - \text{b-val}^{(3)} \cdot \text{exp3}) = 0 \\ \text{(is-sl1)} \cdot (\text{rem4} + \text{qt4} \cdot 2^8 - \text{qt3} - \text{b-val}^{(4)} \cdot \text{exp3}) = 0 \end{array}
// Range check \mathtt{qtj} \in \left[0, 2^8 - 1\right] for j = 1, 2, 3, 4
   (\texttt{is-sll}) \cdot (\texttt{qt1} \in \left[0, 2^8 - 1\right]) = 0
   (is-sl1) \cdot (qt2 \in [0, 2^8 - 1]) = 0
   (is-s11) \cdot (qt3 \in [0, 2^8 - 1]) = 0
   (is-sl1) \cdot (qt4 \in [0, 2^8 - 1]) = 0
/\!\!/ Range check remj \in \left[0,2^8-1
ight] for j=1,2,3,4
   (	exttt{is-sll}) \cdot (	exttt{rem1} \in \left[0, 2^8 - 1\right]) = 0
   (\texttt{is-sll}) \cdot (\texttt{rem2} \in \left[0, 2^8 - 1\right]) = 0
   (\texttt{is-sll}) \cdot (\texttt{rem3} \in \left\lceil 0, 2^8 - 1 \right\rceil) = 0
   (is-sll) \cdot (rem4 \in [0, 2^8 - 1]) = 0
// Computing final left shift using remaining bits of the shift amount
\# sh4 = 1 implies an additional 1-byte left shift
\# sh5 = 1 implies an additional 2-byte left shift
```

```
• (is-s11) \cdot (a-va1^{(1)} - rem1 \cdot (1-sh4) \cdot (1-sh5)) = 0
• (is-s11) \cdot (a-va1^{(2)} - rem2 \cdot (1-sh4) \cdot (1-sh5) - rem1 \cdot (sh4) \cdot (1-sh5)) = 0
• (is-s11) \cdot (a-va1^{(3)} - rem3 \cdot (1-sh4) \cdot (1-sh5) - rem2 \cdot (sh4) \cdot (1-sh5) - rem1 \cdot (1-sh4) \cdot (sh5)) = 0
• (is-s11) \cdot (a-va1^{(4)} - rem4 \cdot (1-sh4) \cdot (1-sh5) - rem3 \cdot (sh4) \cdot (1-sh5) - rem2 \cdot (1-sh4) \cdot (sh5) - rem1 \cdot (sh4) \cdot (sh5)) = 0

# Range check a-va1^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component

# Range check b-va1^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component

# Range check b-va1^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
```

The set of constraints above implements the shift left operation in three main stages.

In a first stage, we define an initial set of constraints to extract the shift bits  $\mathtt{sh1},\ldots,\mathtt{sh5}$  from caval

In a second stage, we define an additional set of constraints to perform a temporary left shift of b-val based on the values of the shift bits  $\mathtt{sh1}$ ,  $\mathtt{sh2}$ ,  $\mathtt{sh3}$ . In this stage, the variable  $\mathtt{b-val} \coloneqq (\mathtt{b-val}^{(1)}, \ldots, \mathtt{b-val}^{(4)})$  is left shifted by shamt bits and the result is stored in the variables ( $\mathtt{rem1}, \ldots, \mathtt{rem4}$ ), where  $\mathtt{shamt} = \mathtt{sh1} + 2 \cdot \mathtt{sh2} + 4 \cdot \mathtt{sh3}$ . This is done as follows:

- First, we define an auxiliary variable exp3 which enforces that exp3 = 2<sup>shamt</sup>, thus avoiding the need to explicitly define the variable shamt.
- Second, we compute the left shift of b-val<sup>(1)</sup> by shamt bits by setting rem1 to be the remainder of the division of b-val<sup>(1)</sup> · exp3 by 2<sup>8</sup>. The quotient qt1 of this division is used to store the bits from b-val<sup>(1)</sup> that carry over into the second byte rem2.
- Third, we compute the left shift of b-val<sup>(2)</sup> by shamt bits by setting rem2 to be the remainder of the division of qt1 + b-val<sup>(2)</sup> · exp3 by 2<sup>8</sup>. The quotient qt2 of this division is used to store the bits from b-val<sup>(2)</sup> that carry over into the third byte rem3.
- Fourth, we compute the left shift of b-val<sup>(3)</sup> by shamt bits by setting rem3 to be the remainder of the division of qt2 + b-val<sup>(3)</sup> · exp3 by 2<sup>8</sup>. The quotient qt3 of this division is used to store the bits from b-val<sup>(3)</sup> that carry over into the fourth byte rem4.
- Fifth, we compute the left shift of b-val<sup>(4)</sup> by shamt bits by setting rem4 to be the remainder of the division of qt3+b-val<sup>(4)</sup> exp3 by 2<sup>8</sup>. The quotient qt4 of this division contains the bits from b-val<sup>(4)</sup> that gets discarded after this shamt-bit left shift.
- Finally, in order to guarantee that the divisions above are computed correctly, we implement range checks for qtj and remj for j=1,2,3,4 that guarantee that these values lie in the proper range (i.e., remj  $\in [0, \exp 3 1]$  and qtj  $\in [0, 2^8 1]$ ). In particular, in order to guarantee that remj  $\in [0, \exp 3 1]$ , we also introduce auxiliary variables remj-aux for j=1,2,3,4 and range check these as well.

in a final stage, we use the values of the shift bits  $\mathtt{sh4}, \mathtt{sh5}$  to complete the left shift operation and compute the final value of  $\mathtt{a-val} = (\mathtt{a-val}^{(1)}, \ldots, \mathtt{a-val}^{(4)})$  from the temporary values  $(\mathtt{rem1}, \ldots, \mathtt{rem4})$ . This is done as follows:

- if (sh4, sh5) = (0,0), then no additional left shift is needed and we simply set a-val<sup>(j)</sup> = remj for j = 1, 2, 3, 4;
- if  $(\mathtt{sh4},\mathtt{sh5}) = (1,0)$ , then we need to left shift the temporary values  $(\mathtt{rem1},\ldots,\mathtt{rem4})$  by one byte. In this case,  $\mathtt{a-val}^{(1)} = 0$  and  $\mathtt{a-val}^{(j+1)} = \mathtt{remj}$  for j=1,2,3;
- if (sh4, sh5) = (0, 1), then we need to left shift the temporary values (rem1, ..., rem4) by two bytes. In this case,  $a\text{-val}^{(1)} = a\text{-val}^{(2)} = 0$  and  $a\text{-val}^{(j+2)} = remj$  for j = 1, 2;
- if  $(\mathtt{sh4},\mathtt{sh5}) = (1,1)$ , then we need to left shift the temporary values  $(\mathtt{rem1},\ldots,\mathtt{rem4})$  by three bytes. In this case,  $\mathtt{a-val}^{(1)} = \mathtt{a-val}^{(2)} = \mathtt{a-val}^{(3)} = 0$  and  $\mathtt{a-val}^{(4)} = \mathtt{rem1}$ .

#### 8.5.6 SRL Instruction

The parameters and functionality for the SRL instruction are as follows:

- opcode: SRL
- Parameters: (a-val, b-val, c-val)
- Instruction selector: is-srl = 1
- Functionality:  $a\text{-val} \leftarrow b\text{-val} \gg (c\text{-val} \& 0x0000001F)$

The mapping from the srl and srli instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a             | op-b | op-c | imm-c | a-val | b-val  | c-val  |
|------------|------------------|------|------|-------|-------|--------|--------|
| srl        | $^{\mathrm{rd}}$ | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2] |
| srli       | $^{\mathrm{rd}}$ | rs1  | rs2  | 1     | R[rd] | R[rs1] | i      |

where i is a 5-bit immediate value.

### Constraints assuming large fields

```
# Extracting shift bits from c-val
   (\mathtt{is\text{-}srl}) \cdot (\mathtt{sh1} + \mathtt{sh2} \cdot 2 + \mathtt{sh3} \cdot 2^2 + \mathtt{sh4} \cdot 2^3 + \mathtt{sh5} \cdot 2^4 + \mathtt{h\text{-}rem} \cdot 2^5 - \mathtt{c\text{-}val}) = 0
    (\text{is-srl}) \cdot (\text{h-rem} \in [0, 2^{27} - 1]) = 0
    (\mathtt{is}\mathtt{-srl})\cdot(\mathtt{sh1})\cdot(1-\mathtt{sh1})=\vec{0}
     (\texttt{is-srl}) \cdot (\texttt{sh2}) \cdot (1 - \texttt{sh2}) = 0
    (\mathsf{is}\mathsf{-srl}) \cdot (\mathsf{sh3}) \cdot (1-\mathsf{sh3}) = 0
    (\texttt{is-srl}) \cdot (\texttt{sh4}) \cdot (1 - \texttt{sh4}) = 0
    (\texttt{is-srl}) \cdot (\texttt{sh5}) \cdot (1 - \texttt{sh5}) = 0
# Computing auxiliary amount exp5 from shift bits to help with the right shift operation
\bullet \quad (\text{is-srl}) \cdot ((\text{sh1}+1) \cdot ((2^2-1) \text{sh2}+1) \cdot ((2^4-1) \text{sh3}+1) \cdot ((2^8-1) \text{sh4}+1) \cdot ((2^{16}-1) \text{sh5}+1) - \text{exp5}) = 0
# Performing the right shift and storing the result in qt1
• (is-srl) \cdot (b-val - rem1 - qt1 \cdot exp5) = 0
/\!\!/ Range check qt1 \in \left[0, 2^{32} - 1\right]
• (is-sr1) \cdot (qt1 \in [0, 2^{32} - 1]) = 0
/\!\!/ \, \mathtt{Range check rem1} \in [0, \mathtt{exp5} - 1]
• (is-srl) \cdot (exp5 - 1 - rem1 - rem1-aux) = 0
   (\mathtt{is\text{-}srl})\cdot(\mathtt{rem1}\in\left\lceil0,2^{32}-1\right\rceil)=0
\bullet \quad (\text{is-srl}) \cdot (\text{rem1-aux} \in \left\lceil 0, 2^{32} - 1 \right\rceil) = 0
/\!\!/ Setting qt1 = a-val
• (is-srl) \cdot (qt1 - a-val) = 0
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-srl \in \{0,1\} - Performed in the CPU component
```

The set of constraints above implements the shift right operation in three steps. First, it extracts the shift bits  $\mathtt{sh1},\ldots,\mathtt{sh5}$  from c-val. Next, it computes an auxiliary variable  $\mathtt{exp5}$  from the shift bits so that  $\mathtt{exp5} = 2^{\mathtt{shamt}}$  whenever the value b-val needs to be shifted by shamt bits. Finally, the last constraint performs the shift right operation by setting a-val to be the quotient of the division of b-val by  $\mathtt{exp5}$ .

```
(\texttt{is-srl}) \cdot (\texttt{sh4}) \cdot (1 - \texttt{sh4}) = 0
     (\texttt{is-srl}) \cdot (\texttt{sh5}) \cdot (1 - \texttt{sh5}) = 0
# Computing auxiliary amount exp3 from shift bits sh1, sh2, sh3 for a partial right shift operation
    (is-srl) \cdot ((sh1+1) \cdot ((2^2-1)sh2+1) \cdot ((2^4-1)sh3+1) - exp3) = 0
# Performing a partial right shift operation using shift bits sh1, sh2, sh3
    (\texttt{is-srl}) \cdot (\texttt{b-val}^{(4)} - \texttt{rem4} - \texttt{qt4} \cdot \texttt{exp3}) = 0
     (\text{is-srl}) \cdot (\text{b-val}^{(3)} + \text{rem4} \cdot 2^{8} - \text{rem3} - \text{qt3} \cdot \text{exp3}) = 0
    (\texttt{is-srl}) \cdot (\texttt{b-val}^{(2)} + \texttt{rem3} \cdot 2^8 - \texttt{rem2} - \texttt{qt2} \cdot \texttt{exp3}) = 0
    (\texttt{is-srl}) \cdot (\texttt{b-val}^{(1)} + \texttt{rem2} \cdot 2^8 - \texttt{rem1} - \texttt{qt1} \cdot \texttt{exp3}) = 0
/\!\!/ Range check \mathtt{qtj} \in \left[0, 2^8 - 1
ight] for j = 1, 2, 3, 4
    (\texttt{is-srl}) \cdot (\texttt{qt1} \in \left[0, 2^8 - 1\right]) = 0
     (is-srl) \cdot (qt2 \in [0, 2^8 - 1]) = 0
     (is-sr1) \cdot (qt3 \in [0, 2^8 - 1]) = 0
    (is-srl) \cdot (qt4 \in [0, 2^8 - 1]) = 0
# Range check remj \in [0, \exp 3 - 1] for j = 1, 2, 3, 4
   (\texttt{is-srl}) \cdot (\texttt{exp3} - 1 - \texttt{rem1} - \texttt{rem1-aux}) = 0
     (\texttt{is-srl}) \cdot (\texttt{exp3} - 1 - \texttt{rem2} - \texttt{rem2-aux}) = 0
     (\texttt{is-srl}) \cdot (\texttt{exp3} - 1 - \texttt{rem3} - \texttt{rem3-aux}) = 0
     (\texttt{is-sr1}) \cdot (\texttt{exp3} - 1 - \texttt{rem4} - \texttt{rem4-aux}) = 0
    (is-srl) \cdot (rem1 \in [0, 2^8 - 1]) = 0
    (is-sr1) \cdot (rem2 \in [0, 2^8 - 1]) = 0
     (is-srl) \cdot (rem3 \in [0, 2^8 - 1]) = 0
     (is-srl) \cdot (rem4 \in [0, 2^8 - 1]) = 0
     (is-srl) \cdot (rem1-aux \in [0, 2^8-1]) = 0
     (is-srl) \cdot (rem2-aux \in [0, 2^8 - 1]) = 0
     (is-sr1) \cdot (rem3-aux \in [0, 2^8 - 1]) = 0
    (is-srl) \cdot (rem4-aux \in [0, 2^8 - 1]) = 0
// Computing final right shift using remaining bits of the shift amount
/\!\!/ sh4 = 1 implies an additional 1-byte right shift
/\!\!/ sh5 = 1 implies an additional 2-byte right shift
   (\mathtt{is\text{-}srl}) \cdot (\mathtt{a\text{-}val}^{(4)} - \mathtt{qt4} \cdot (1 - \mathtt{sh4}) \cdot (1 - \mathtt{sh5})) = 0
     (\texttt{is-srl}) \cdot (\texttt{a-val}^{(3)} - \texttt{qt3} \cdot (1 - \texttt{sh4}) \cdot (1 - \texttt{sh5}) - \texttt{qt4} \cdot (\texttt{sh4}) \cdot (1 - \texttt{sh5})) = 0
    (\mathtt{is-srl}) \cdot (\mathtt{a-val}^{(2)} - \mathtt{qt2} \cdot (1 - \mathtt{sh4}) \cdot (1 - \mathtt{sh5}) - \mathtt{qt3} \cdot (\mathtt{sh4}) \cdot (1 - \mathtt{sh5}) - \mathtt{qt4} \cdot (1 - \mathtt{sh4}) \cdot (\mathtt{sh5})) = 0
    (\text{is-srl}) \cdot (\text{a-val}^{(1)} - \text{qt1} \cdot (1 - \text{sh4}) \cdot (1 - \text{sh5}) - \text{qt2} \cdot (\text{sh4}) \cdot (1 - \text{sh5}) - \text{qt3} \cdot (1 - \text{sh4}) \cdot (\text{sh5}) - \text{qt4} \cdot (\text{sh4}) \cdot (\text{sh5})) = 0
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
// Range check b-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check c-val^{(j)}\in igl[0,2^8-1igr] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-srl \in \{0,1\} - Performed in the CPU component
```

The set of constraints above implements the shift right operation in three main stages.

In a first stage, we define an initial set of constraints to extract the shift bits sh1,...,sh5 from c-val.

In a second stage, we define an additional set of constraints to perform a temporary right shift of b-val based on the values of the shift bits  $\mathtt{sh1},\mathtt{sh2},\mathtt{sh3}$ . In this stage, the variable b-val :=  $(\mathtt{b-val}^{(1)},\ldots,\mathtt{b-val}^{(4)})$  is shifted to the right by shamt bits and the result is stored in the variables  $(\mathtt{qt1},\ldots,\mathtt{qt4})$ , where shamt =  $\mathtt{sh1} + 2 \cdot \mathtt{sh2} + 4 \cdot \mathtt{sh3}$ . This is done as follows:

- First, we define an auxiliary variable exp3 which enforces that exp3 = 2<sup>shamt</sup>, thus avoiding the need to explicitly define the variable shamt.
- Second, we compute the right shift of b-val<sup>(4)</sup> by shamt bits by setting qt4 to be the quotient of the division of b-val<sup>(4)</sup> by exp3. The remainder rem4 of this division is used to store the bits from b-val<sup>(4)</sup> that carry over into the third byte qt3.

- Third, we compute the right shift of b-val<sup>(3)</sup> by shamt bits by setting qt3 to be the quotient of the division of rem4·2<sup>8</sup> + b-val<sup>(3)</sup> by exp3. The remainder rem3 of this division is used to store the bits from b-val<sup>(3)</sup> that carry over into the second byte qt2.
- Fourth, we compute the right shift of b-val<sup>(2)</sup> by shamt bits by setting qt2 to be the quotient of the division of rem3 · 2<sup>8</sup> + b-val<sup>(2)</sup> by exp3. The remainder rem2 of this division is used to store the bits from b-val<sup>(2)</sup> that carry over into the first byte qt1.
- Fifth, we compute the right shift of b-val<sup>(1)</sup> by shamt bits by setting qt1 to be the quotient of the division of rem2 · 2<sup>8</sup> + b-val<sup>(1)</sup> by exp3. The remainder rem1 of this division contains the bits from b-val<sup>(1)</sup> that gets discarded after this shamt-bit right shift.
- Finally, in order to guarantee that the divisions above are computed correctly, we implement range checks for qtj and remj for j=1,2,3,4 that guarantee that these values lie in the proper range (i.e., remj  $\in [0, \exp 3 1]$  and qtj  $\in [0, 2^8 1]$ ). In particular, in order to guarantee that remj  $\in [0, \exp 3 1]$ , we also introduce auxiliary variables remj-aux for j=1,2,3,4 and range check these as well.

In a final stage, we use the values of the shift bits  $\mathtt{sh4}, \mathtt{sh5}$  to complete the right shift operation and compute the final value of  $\mathtt{a-val} = (\mathtt{a-val}^{(1)}, \ldots, \mathtt{a-val}^{(4)})$  from the temporary values  $(\mathtt{qt1}, \ldots, \mathtt{qt4})$ . This is done as follows:

- if (sh4, sh5) = (0, 0), then no additional right shift is needed and we simply set a-val<sup>(j)</sup> = qtj for j = 1, 2, 3, 4;
- if  $(\mathtt{sh4},\mathtt{sh5}) = (1,0)$ , then we need to right shift the temporary values  $(\mathtt{qt1},\ldots,\mathtt{qt4})$  by one byte. In this case,  $\mathtt{a-val}^{(4)} = 0$  and  $\mathtt{a-val}^{(j-1)} = \mathtt{qtj}$  for j = 2,3,4;
- if  $(\mathtt{sh4},\mathtt{sh5}) = (0,1)$ , then we need to right shift the temporary values  $(\mathtt{qt1},\ldots,\mathtt{qt4})$  by two bytes. In this case,  $\mathtt{a-val}^{(4)} = \mathtt{a-val}^{(3)} = 0$  and  $\mathtt{a-val}^{(j-2)} = \mathtt{qtj}$  for j=3,4;
- if  $(\mathtt{sh4},\mathtt{sh5}) = (1,1)$ , then we need to right shift the temporary values  $(\mathtt{qt1},\ldots,\mathtt{qt4})$  by three bytes. In this case,  $\mathtt{a-val}^{(4)} = \mathtt{a-val}^{(3)} = \mathtt{a-val}^{(3)} = 0$  and  $\mathtt{a-val}^{(1)} = \mathtt{qt4}$ .

#### 8.5.7 SRA Instruction

The parameters and functionality for the SRA instruction are as follows:

- opcode: SRA
- Parameters: (a-val, b-val, c-val)
- Instruction selector: is-sra = 1
- Functionality:  $a\text{-val} \leftarrow b\text{-val} \gg (c\text{-val} \& 0x0000001F) (sign preserving)$
- Observation: Sign preserving means that vacated positions are filled with the sign bit and not necessarily with 0s as it was done in the SRL instruction Section 8.5.6.

The mapping from the **sra** and **srai** instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val  |
|------------|------|------|------|-------|-------|--------|--------|
| sra        | rd   | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2] |
| srai       | rd   | rs1  | rs2  | 1     | R[rd] | R[rs1] | i      |

where i is a 5-bit immediate value.

### Constraints assuming large fields

// Extracting shift bits from c-val

- $\bullet \quad (\mathtt{is-sra}) \cdot (\mathtt{sh1} + \mathtt{sh2} \cdot 2 + \mathtt{sh3} \cdot 2^2 + \mathtt{sh4} \cdot 2^3 + \mathtt{sh5} \cdot 2^4 + \mathtt{h-rem} \cdot 2^5 \mathtt{c-val}) = 0$
- (is-sra)  $\cdot$  (h-rem  $\in [0, 2^{27} 1]) = 0$
- $(is-sra) \cdot (sh1) \cdot (1-sh1) = \vec{0}$
- $\bullet \quad (\texttt{is-sra}) \cdot (\texttt{sh2}) \cdot (1-\texttt{sh2}) = 0$
- $\bullet \quad (\mathtt{is}\mathtt{-sra})\cdot (\mathtt{sh3})\cdot (1-\mathtt{sh3})=0$

```
(\texttt{is-sra}) \cdot (\texttt{sh4}) \cdot (1 - \texttt{sh4}) = 0
     (\texttt{is-sra}) \cdot (\texttt{sh5}) \cdot (1 - \texttt{sh5}) = 0
   Extracting sign bit from b-val for arithmetic right shift
    (\texttt{is-sra}) \cdot (\texttt{h-rem-b} + \texttt{h-sgn-b} \cdot 2^{31} - \texttt{b-val}) = 0
     (is-sra) \cdot (h-rem-b \in [0, 2^{31} - 1]) = 0
    (is-sra) \cdot (h-sgn-b) \cdot (1 - h-sgn-b) = 0
// Computing auxiliary amounts exp5 and exp5-aux from shift bits
// exp5 and exp5-aux are used for logical and arithmetic right shift operations
// Note that \exp 5 \cdot \exp 5-aux = 2^{32}
     \begin{array}{l} (\text{is-sra}) \cdot ((\text{sh1}+1) \cdot ((2^2-1)\text{sh2}+1) \cdot ((2^4-1)\text{sh3}+1) \cdot ((2^8-1)\text{sh4}+1) \cdot ((2^{16}-1)\text{sh5}+1) - \text{exp5}) = 0 \\ (\text{is-sra}) \cdot (2 \cdot (2-\text{sh1}) \cdot (2^2-(2^2-1)\text{sh2}) \cdot (2^4-(2^4-1)\text{sh3}) \cdot (2^8-(2^8-1)\text{sh4}) \cdot (2^{16}-(2^{16}-1)\text{sh5}) - \text{exp5-aux}) = 0 \end{array} 
# Performing the logical right shift and storing the result in qt1
   (\texttt{is-sra}) \cdot (\texttt{b-val} - \texttt{rem1} - \texttt{qt1} \cdot \texttt{exp5}) = 0
/\!\!/ Range check qt1 \in \left[0, 2^{32} - 1\right]
   (\mathtt{is	ext{-}sra})\cdot(\mathtt{qt1}\in\left[0,2^{32}-1
ight])=0
/\!\!/ Range check rem1 \in [0, \exp 5 - 1]
    (\texttt{is-sra}) \cdot (\texttt{exp5} - 1 - \texttt{rem1} - \texttt{rem1-aux}) = 0
    (\mathtt{is\text{-}sra})\cdot(\mathtt{rem1}\in\left\lceil 0,2^{32}-1\right\rceil)=0
   (is-sra) \cdot (rem1-aux \in [0, 2^{32}-1]) = 0
# Computing arithmetic right shift from logical right shift
   (is-sra) \cdot (qt1 + h-sgn-b \cdot (exp5 - 1) \cdot exp5-aux - a-val) = 0
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-sra \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to those for the SRL instruction in Section 8.5.6. The main difference is that we need to additionally account for the sign bit h-sgn-b of b-val when performing the right shift operation. To achieve this goal, the set of constraints first stores the result of a logical right shift into qt1 and then replicates the sign bit h-sgn-b into the vacated positions by adding h-sgn-b  $\cdot (\exp 5 - 1) \cdot \exp 5$ -aux to qt1.

To see why the last step works as desired, please note that, by construction,  $\exp 5 = 2^{\operatorname{shamt}}$  and  $\exp 5 \cdot \exp 5 - \operatorname{aux} = 2^{32}$ , where  $\operatorname{shamt} = \operatorname{sh} 1 + 2 \cdot \operatorname{sh} 2 + 4 \cdot \operatorname{sh} 3 + 8 \cdot \operatorname{sh} 4 + 16 \cdot \operatorname{sh} 5$  indicates the number of bits being right shifted. Hence, the field element  $(\exp 5 - 1) \cdot \exp 5$ -aux corresponds to the 32-bit string containing shamt one bits at the high-order bit positions and 32 – shamt zero bits at the low-order bit positions. Thus, by multiplying the latter quantity by h-sgn-b and adding the result to qt1, we achieve the desired functionality of replicating the sign bit into the vacated positions.

```
# Extracting shift bits from c-val

• (is-sra) · (sh1 + sh2 · 2 + sh3 · 2^2 + sh4 · 2^3 + sh5 · 2^4 + h-rem · 2^5 - c-val) = 0

• (is-sra) · (h-rem \in [0, 2^{27} - 1]) = 0

• (is-sra) · (sh1) · (1 - sh1) = 0

• (is-sra) · (sh2) · (1 - sh2) = 0

• (is-sra) · (sh3) · (1 - sh3) = 0

• (is-sra) · (sh4) · (1 - sh4) = 0

• (is-sra) · (sh5) · (1 - sh5) = 0

# Extracting sign bit from b-val for arithmetic right shift

• (is-sra) · (h-rem-b + h-sgn-b · 2^7 - b-val^{(4)}) = 0

• (is-sra) · (h-rem-b) ∈ [0, 2^7 - 1]) = 0

• (is-sra) · (h-sgn-b) · (1 - h-sgn-b) = 0

# Computing auxiliary amounts exp3 and exp3-aux from sh1, sh2, sh3
```

```
/\!\!/ exp3 and exp3-aux are used for partial logical and arithmetic right shift operations
// Note that \mbox{exp3}\cdot\mbox{exp3-aux}=2^8
   (\mathtt{is\text{-}sra}) \cdot ((\mathtt{sh1}+1) \cdot ((2^2-1)\mathtt{sh2}+1) \cdot ((2^4-1)\mathtt{sh3}+1) - \mathtt{exp3}) = 0
    (is-sra) \cdot (2 \cdot (2-sh1) \cdot (2^2-(2^2-1)sh2) \cdot (2^4-(2^4-1)sh3) - exp3-aux) = 0
# Performing a partial logical right shift operation using shift bits sh1, sh2, sh3
   (\texttt{is-sra}) \cdot (\texttt{b-val}^{(4)} - \texttt{rem4} - \texttt{qt4} \cdot \texttt{exp3}) = 0
     (is-sra) \cdot (b-val^{(3)} + rem4 \cdot 2^8 - rem3 - qt3 \cdot exp3) = 0
     (is-sra) \cdot (b-val<sup>(2)</sup> + rem3 \cdot 2<sup>8</sup> - rem2 - qt2 \cdot exp3) = 0
    (is-sra) \cdot (b-val^{(1)} + rem2 \cdot 2^8 - rem1 - qt1 \cdot exp3) = 0
# Range check qtj \in [0, 2^8 - 1] for j = 1, 2, 3, 4
  (is-sra) \cdot (qt1 \in [0, 2^8 - 1]) = 0
   (is-sra) \cdot (qt2 \in [0, 2^8 - 1]) = 0
  (\texttt{is-sra}) \cdot (\texttt{qt3} \in \left\lceil 0, 2^8 - 1 \right\rceil) = 0
    (is-sra) \cdot (qt4 \in [0, 2^8 - 1]) = 0
// Range check remj \in [0, \exp 3 - 1] for j = 1, 2, 3, 4
   (\texttt{is-sra}) \cdot (\texttt{exp3} - 1 - \texttt{rem1} - \texttt{rem1-aux}) = 0
    (is-sra) \cdot (exp3 - 1 - rem2 - rem2-aux) = 0
    (\texttt{is-sra}) \cdot (\texttt{exp3} - 1 - \texttt{rem3} - \texttt{rem3-aux}) = 0
    (\texttt{is-sra}) \cdot (\texttt{exp3} - 1 - \texttt{rem4} - \texttt{rem4-aux}) = 0
    (is-sra) \cdot (rem1 \in [0, 2^8 - 1]) = 0
    (is-sra) \cdot (rem2 \in [0, 2^8 - 1]) = 0
    (is-sra) \cdot (rem3 \in [0, 2^8 - 1]) = 0
    (is-sra) \cdot (rem4 \in [0, 2^8 - 1]) = 0
    (\mathtt{is\text{-}sra}) \cdot (\mathtt{rem1\text{-}aux} \in \left\lceil 0, 2^{8} - 1 \right\rceil) = 0
    (is-sra) \cdot (rem2-aux \in [0, 2^8 - 1]) = 0
    (\texttt{is-sra}) \cdot (\texttt{rem3-aux} \in \left[0, 2^8 - 1\right]) = 0
    (is-sra) \cdot (rem4-aux \in [0, 2^8 - 1]) = 0
// Computing final logical right shift using remaining bits of the shift amount
/\!\!/ sh4 = 1 implies an additional 1-byte logical right shift
\# sh5 = 1 implies an additional 2-byte logical right shift
  (is-sra) \cdot (srl4 - gt4 \cdot (1 - sh4) \cdot (1 - sh5)) = 0
    (is-sra) \cdot (srl3 - qt3 \cdot (1 - sh4) \cdot (1 - sh5) - qt4 \cdot (sh4) \cdot (1 - sh5)) = 0
    (\mathsf{is}\mathsf{-sra}) \cdot (\mathsf{srl2} - \mathsf{qt2} \cdot (1 - \mathsf{sh4}) \cdot (1 - \mathsf{sh5}) - \mathsf{qt3} \cdot (\mathsf{sh4}) \cdot (1 - \mathsf{sh5}) - \mathsf{qt4} \cdot (1 - \mathsf{sh4}) \cdot (\mathsf{sh5})) = 0
    (\mathtt{is\text{-}sra}) \cdot (\mathtt{srl1} - \mathtt{qt1} \cdot (1 - \mathtt{sh4}) \cdot (1 - \mathtt{sh5}) - \mathtt{qt2} \cdot (\mathtt{sh4}) \cdot (1 - \mathtt{sh5}) - \mathtt{qt3} \cdot (1 - \mathtt{sh4}) \cdot (\mathtt{sh5}) - \mathtt{qt4} \cdot (\mathtt{sh4}) \cdot (\mathtt{sh5})) = 0
// Computing auxiliary mask for the replication of the sign bit
\bullet \quad (\mathtt{is\text{-}sra}) \cdot (\mathtt{h\text{-}sgn\text{-}b} \cdot (\mathtt{exp3} - 1) \cdot \mathtt{exp3\text{-}aux} - \mathtt{sra\text{-}mask}) = 0
# Replicating sign bit into vacated positions during logical right shift
\# a-val^{(4)}= srl4 + h-sgn-b \cdot (exp3 - 1) \cdot exp3-aux) when additionally shifting 0 bytes (sh4 = sh5 = 0)
\# a-val^{(4)} = \text{h-sgn-b} \cdot (2^8 - 1) when additionally shifting 1, 2 or 3 bytes (sh4 + sh5 - sh4 · sh5 = 1)
• (is-sra) \cdot (a-val<sup>(4)</sup> - (1 - sh4) \cdot (1 - sh5) \cdot (sr14 + sra-mask) - (sh4 + sh5 - sh4 \cdot sh5) \cdot h-sgn-b \cdot (2<sup>8</sup> - 1)) = 0
\# a-val^{(3)}= srl3 when additionally shifting 0 bytes (sh4 = sh5 = 0)
\# a-val^{(3)} = \text{srl3} + \text{h-sgn-b} \cdot (\text{exp3} - 1) \cdot \text{exp3-aux}) when additionally shifting 1 byte (sh4 = 1, sh5 = 0)
/\!\!/ a-val<sup>(3)</sup> = h-sgn-b·(2<sup>8</sup>-1) when additionally shifting 2 or 3 bytes (sh5 = 1)
• (is-sra) \cdot (a-val^{(3)} - (1-sh4) \cdot (1-sh5) \cdot (sr13) - (sh4) \cdot (1-sh5) \cdot (sr13 + sra-mask) - (sh5) \cdot h-sgn-b \cdot (2^8-1)) = 0
/\!\!/ a-val^{(2)}= srl2 when additionally shifting 0 or 1 bytes (sh5 = 0)
\# a-val^{(2)}= srl2 + h-sgn-b·(exp3 - 1)·exp3-aux) when additionally shifting 2 bytes (sh4 = 0, sh5 = 1)
/\!\!/ a-val^{(2)}= h-sgn-b \cdot (2^8-1) when additionally shifting 3 bytes (sh4 = sh5 = 1)
\bullet \quad (\text{is-sra}) \cdot (\text{a-val}^{(2)} - (1 - \text{sh5}) \cdot (\text{sr12}) - (1 - \text{sh4}) \cdot (\text{sh5}) \cdot (\text{sr12} + \text{sra-mask}) - (\text{sh4}) \cdot (\text{sh5}) \cdot \text{h-sgn-b} \cdot (2^8 - 1)) = 0
/\!\!/ a-val^{(1)}= srl1 when additionally shifting 0, 1, or 2 bytes (sh4 \cdot sh5 =0)
\# a-val^{(1)}= srl1 + h-sgn-b \cdot (exp3 - 1) \cdot exp3-aux) when additionally shifting 3 bytes (sh4 = sh5 = 1)
\bullet \quad (\mathtt{is\text{-}sra}) \cdot (\mathtt{a\text{-}val}^{(1)} - (1 - \mathtt{sh4} \cdot \mathtt{sh5}) \cdot (\mathtt{srl1}) - (\mathtt{sh4}) \cdot (\mathtt{sh5}) \cdot (\mathtt{srl1} + \mathtt{sra\text{-}mask})) = 0
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
# Range check b-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check c-val^{(j)}\in igl[0,2^8-1igr] for j=1,2,3,4 - Performed in the CPU component
```

The set of constraints above is similar to those for the SRL instruction in Section 8.5.6. As in the large field case described above, the main difference with respect to the SRL constraints is that we need to additionally account for the sign bit h-sgn-b of b-val when performing the right shift operation. To achieve this goal, the set of constraints first stores the result of a logical right shift into (qt1, qt2, qt3, qt4) and then replicates the sign bit h-sgn-b into the vacated positions.

To see why the last step works as desired, first note that, by construction,  $\exp 3 = 2^{\operatorname{shamt}}$  and  $\exp 3 \cdot \exp 3 - \operatorname{aux} = 2^8$ , where shamt  $= \operatorname{sh} 1 + 2 \cdot \operatorname{sh} 2 + 4 \cdot \operatorname{sh} 3$  indicates the number of bits being temporarily right shifted. Hence, the field element  $(\exp 3 - 1) \cdot \exp 3 - \operatorname{aux}$  corresponds to the 8-bit string containing shamt one bits at the high-order bit positions and  $8 - \operatorname{shamt}$  zero bits at the low-order bit positions.

The final value of the arithmetic right shift operation is then obtained by adding h-sgn-b·(exp3 – 1)·exp3-aux to the appropriate limb of a-val by considering the values of the shift bits sh4, sh5. More precisely,

- if (sh4, sh5) = (0,0), then the amount h-sgn-b · (exp3 1) · exp3-aux is added to sr14 when computing the value of the 4-th limb a-val<sup>(4)</sup>. No additional changes are needed for the other limbs and we simply set a-val<sup>(j)</sup> = qtj for j = 1,2,3.
  if (sh4, sh5) = (1,0), then a-val<sup>(4)</sup> is set to 255 (corresponding to the all-1s byte) and the
- if  $(\mathtt{sh4},\mathtt{sh5}) = (1,0)$ , then  $\mathtt{a-val}^{(4)}$  is set to 255 (corresponding to the all-1s byte) and the amount  $\mathtt{h-sgn-b} \cdot (\mathtt{exp3}-1) \cdot \mathtt{exp3-aux}$  is added to  $\mathtt{sr13}$  when computing the value of the 3-rd limb  $\mathtt{a-val}^{(3)}$ . No additional changes are needed for the other limbs and we simply set  $\mathtt{a-val}^{(j)} = \mathtt{qtj}$  for j=1,2.
- if (sh4, sh5) = (0,1), then a-val<sup>(3)</sup> and a-val<sup>(4)</sup> are set to 255 and the amount h-sgn-b·(exp3-1)·exp3-aux is added to sr12 when computing the value of the 2-nd limb a-val<sup>(2)</sup>. No additional changes are needed for the first limb and we simply set a-val<sup>(1)</sup> = qt1.
- if  $(\mathtt{sh4},\mathtt{sh5}) = (1,1)$ , then  $\mathtt{a-val}^{(2)}$ ,  $\mathtt{a-val}^{(3)}$ , and  $\mathtt{a-val}^{(4)}$  are set to 255 and the amount  $\mathtt{h-sgn-b} \cdot (\mathtt{exp3}-1) \cdot \mathtt{exp3-aux}$  is added to  $\mathtt{srl1}$  when computing the value of the 1-st limb  $\mathtt{a-val}^{(1)}$ .

# 8.5.8 XOR Instruction

The parameters and functionality for the XOR instruction are as follows:

- opcode: XOR
- Parameters: (a-val, b-val, c-val)
- Instruction selector: is-xor = 1
- Functionality: a-val  $\leftarrow$  b-val  $\oplus$  c-val

The mapping from the xor and xori instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a             | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------------------|------|------|-------|-------|--------|---------|
| xor        | $^{\mathrm{rd}}$ | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2]  |
| xori       | $_{ m rd}$       | rs1  | rs2  | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function and i is a 12-bit immediate value.

### Constraints assuming large fields

```
# Performing xor lookup queries
    (\text{is-xor}) \cdot ((\text{a-val-low}^{(1)}, \text{b-val-low}^{(1)}, \text{c-val-low}^{(1)}) \in \text{lookup}_{\text{vor}}) = 0
    (is-xor) \cdot ((a-val-high^{(1)}, b-val-high^{(1)}, c-val-high^{(1)}) \in lookup_{vor}) = 0
    (\texttt{is-xor}) \cdot ((\texttt{a-val-low}^{(2)}, \texttt{b-val-low}^{(2)}, \texttt{c-val-low}^{(2)}) \in \texttt{lookup}_{\texttt{xor}}) = 0
    (\texttt{is-xor}) \cdot ((\texttt{a-val-high}^{(2)}, \texttt{b-val-high}^{(2)}, \texttt{c-val-high}^{(2)}) \in \texttt{lookup}_{\texttt{xor}}) = 0
    (\text{is-xor}) \cdot ((\text{a-val-low}^{(3)}, \text{b-val-low}^{(3)}, \text{c-val-low}^{(3)}) \in \text{lookup}_{\text{xor}}) = 0
    (\mathsf{is}\mathsf{-xor}) \cdot ((\mathsf{a}\mathsf{-val}\mathsf{-high}^{(3)}, \mathsf{b}\mathsf{-val}\mathsf{-high}^{(3)}, \mathsf{c}\mathsf{-val}\mathsf{-high}^{(3)}) \in \mathsf{lookup}_{\mathsf{xor}}) = 0
    (\text{is-xor}) \cdot ((\text{a-val-low}^{(4)}, \text{b-val-low}^{(4)}, \text{c-val-low}^{(4)}) \in \text{lookup}_{\text{xor}}) = 0
    (\mathsf{is}\mathsf{-xor}) \cdot ((\mathsf{a}\mathsf{-val}\mathsf{-high}^{(4)}, \mathsf{b}\mathsf{-val}\mathsf{-high}^{(4)}, \mathsf{c}\mathsf{-val}\mathsf{-high}^{(4)}) \in \mathsf{lookup}_{\mathsf{xor}}) = 0
/\!\!/ Range check a-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check a-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check b-val-low^{(j)}\in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check b-val-high^{(j)}\in [0,2^4-1] for j=1,2,3,4 - Implied by xor lookup query
// Range check c-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check c-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-xor \in \{0,1\} - Performed in the CPU component
```

```
# Extracting 4-bit limbs from a-val, b-val, and c-val
    (is-xor) \cdot (a-val-low^{(1)} + a-val-high^{(1)} \cdot 2^4 - a-val^{(1)}) = 0
     (is-xor) \cdot (a-val-low^{(2)} + a-val-high^{(2)} \cdot 2^4 - a-val^{(2)}) = 0
    (is-xor) \cdot (a-val-low^{(3)} + a-val-high^{(3)} \cdot 2^4 - a-val^{(3)}) = 0
     (\texttt{is-xor}) \cdot (\texttt{a-val-low}^{(4)} + \texttt{a-val-high}^{(4)} \cdot 2^4 - \texttt{a-val}^{(4)}) = 0
    (\text{is-xor}) \cdot (\text{b-val-low}^{(1)} + \text{b-val-high}^{(1)} \cdot 2^4 - \text{b-val}^{(1)}) = 0
    (\mathsf{is}\mathsf{-xor}) \cdot (\mathsf{b}\mathsf{-val}\mathsf{-low}^{(2)} + \mathsf{b}\mathsf{-val}\mathsf{-high}^{(2)} \cdot 2^4 - \mathsf{b}\mathsf{-val}^{(2)}) = 0
     (is-xor) \cdot (b-val-low^{(3)} + b-val-high^{(3)} \cdot 2^4 - b-val^{(3)}) = 0
     (is-xor) · (b-val-low<sup>(4)</sup> + b-val-high<sup>(4)</sup> · 2^4 - b-val<sup>(4)</sup>) = 0
     (\texttt{is-xor}) \cdot (\texttt{c-val-low}^{(1)} + \texttt{c-val-high}^{(1)} \cdot 2^4 - \texttt{c-val}^{(1)}) = 0
     (\text{is-xor}) \cdot (\text{c-val-low}^{(2)} + \text{c-val-high}^{(2)} \cdot 2^4 - \text{c-val}^{(2)}) = 0
     (is-xor) \cdot (c-val-low^{(3)} + c-val-high^{(3)} \cdot 2^4 - c-val^{(3)}) = 0
     (is-xor) \cdot (c-val-low^{(4)} + c-val-high^{(4)} \cdot 2^4 - c-val^{(4)}) = 0
# Performing xor lookup queries
     (is-xor) \cdot ((a-val-low^{(1)}, b-val-low^{(1)}, c-val-low^{(1)}) \in lookup_{vor}) = 0
     (is-xor) \cdot ((a-val-high^{(1)}, b-val-high^{(1)}, c-val-high^{(1)}) \in lookup_{xor}) = 0
     (\text{is-xor}) \cdot ((\text{a-val-low}^{(2)}, \text{b-val-low}^{(2)}, \text{c-val-low}^{(2)}) \in \text{lookup}_{\text{xor}}) = 0
     (is-xor) \cdot ((a-val-high^{(2)}, b-val-high^{(2)}, c-val-high^{(2)}) \in lookup_{xor}) = 0
     (\texttt{is-xor}) \cdot ((\texttt{a-val-low}^{(3)}, \texttt{b-val-low}^{\bar{(3)}}, \texttt{c-val-low}^{(3)}) \in \texttt{lookup}_{\texttt{xor}}) = 0
     (\mathsf{is}\mathsf{-xor}) \cdot ((\mathsf{a}\mathsf{-val}\mathsf{-high}^{(3)}, \mathsf{b}\mathsf{-val}\mathsf{-high}^{(3)}, \mathsf{c}\mathsf{-val}\mathsf{-high}^{(3)}) \in \mathsf{lookup}_{\mathsf{xor}}) = 0
     (\texttt{is-xor}) \cdot ((\texttt{a-val-low}^{(4)}, \texttt{b-val-low}^{(4)}, \texttt{c-val-low}^{(4)}) \in \texttt{lookup}_{\texttt{xor}}) = 0
     (\mathsf{is}\mathsf{-xor}) \cdot ((\mathsf{a}\mathsf{-val}\mathsf{-high}^{(4)}, \mathsf{b}\mathsf{-val}\mathsf{-high}^{(4)}, \mathsf{c}\mathsf{-val}\mathsf{-high}^{(4)}) \in \mathsf{lookup}_{\mathsf{xor}}) = 0
/\!\!/ Range check a-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check a-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check b-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check b-val-high^{(j)} \in \left[0, 2^4 - 1
ight] for j = 1, 2, 3, 4 - Implied by xor lookup query
/\!\!/ Range check c-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
/\!\!/ Range check c-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by xor lookup query
```

```
\label{eq:Range_check_a-val} \begin{subarray}{l} $\#$ Range_check_a-val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_b-val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c-val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check_check
```

#### 8.5.9 AND Instruction

The parameters and functionality for the AND instruction are as follows:

opcode: AND
Parameters: (a-val, b-val, c-val)
Instruction selector: is-and = 1
Functionality: a-val ← b-val & c-val

The mapping from the and and instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------|------|------|-------|-------|--------|---------|
| and        | rd   | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2]  |
| andi       | rd   | rs1  | rs2  | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function and i is a 12-bit immediate value.

### Constraints assuming large fields

```
# Extracting 4-bit limbs from a-val, b-val, and c-val
                 (\text{is-and}) \cdot (\text{a-val-low}^{(1)} \qquad +\text{a-val-high}^{(1)} \cdot 2^4 \quad +\text{a-val-low}^{(2)} \cdot 2^8 \quad +\text{a-val-high}^{(2)} \cdot 2^{12} + \cdots + (\text{a-val-low}^{(1)} \cdot 2^4) \cdot 2^{12} + \cdots + (\text{a-val-low}^{(1)} \cdot 2^4
                                                                                  a\text{-val-low}^{(3)} \cdot 2^{16} + a\text{-val-high}^{(3)} \cdot 2^{20} + a\text{-val-low}^{(4)} \cdot 2^{24} + a\text{-val-high}^{(4)} \cdot 2^{28} - a\text{-val} = 0
                   (\texttt{is-and}) \cdot (\texttt{b-val-low}^{(1)} \qquad +\texttt{b-val-high}^{(1)} \cdot 2^4 \quad +\texttt{b-val-low}^{(2)} \cdot 2^8 \quad +\texttt{b-val-high}^{(2)} \cdot 2^{12} + \texttt{b-val-high}^{(2)} \cdot 2^{12}
                                                                                  b-val-low<sup>(3)</sup> \cdot 2<sup>16</sup> +b-val-high<sup>(3)</sup> \cdot 2<sup>20</sup> +b-val-low<sup>(4)</sup> \cdot 2<sup>24</sup> +b-val-high<sup>(4)</sup> \cdot 2<sup>28</sup> - b-val) = 0
                  (\text{is-and}) \cdot (\text{c-val-low}^{(1)} \\ + \text{c-val-high}^{(1)} \cdot 2^4 \\ + \text{c-val-low}^{(2)} \cdot 2^8 \\ + \text{c-val-high}^{(2)} \cdot 2^{12} + \cdots \\ + \text{c-val-high}^{(2)} \cdot 
                                                                                  // Performing and lookup queries
                 (\mathsf{is}\text{-}\mathsf{and})\cdot((\mathsf{a}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(1)},\mathsf{b}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(1)},\mathsf{c}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(1)})\in\mathsf{lookup}_{\mathsf{and}})=0
                    (\texttt{is-and}) \cdot ((\texttt{a-val-high}^{(1)}, \texttt{b-val-high}^{(1)}, \texttt{c-val-high}^{(1)}) \in \texttt{lookup}_{\texttt{and}}) = 0
                    (\texttt{is-and}) \cdot ((\texttt{a-val-low}^{(2)}, \texttt{b-val-low}^{(2)}, \texttt{c-val-low}^{(2)}) \in \texttt{lookup}_{\texttt{and}}) = 0
                    (\mathsf{is}\text{-and}) \cdot ((\mathsf{a}\text{-val-high}^{(2)}, \mathsf{b}\text{-val-high}^{(2)}, \mathsf{c}\text{-val-high}^{(2)}) \in \mathsf{lookup}_{\mathsf{and}}) = 0
                   (\texttt{is-and}) \cdot ((\texttt{a-val-low}^{(3)}, \texttt{b-val-low}^{(3)}, \texttt{c-val-low}^{(3)}) \in \texttt{lookup}_{\texttt{and}}) = 0
                 \begin{aligned} &(\text{is-and}) \cdot ((\text{a-val-low}^{(3)}, \text{b-val-high}^{(3)}, \text{c-val-high}^{(3)}) \in \text{lookup}_{\text{and}}) = 0 \\ &(\text{is-and}) \cdot ((\text{a-val-low}^{(4)}, \text{b-val-low}^{(4)}, \text{c-val-low}^{(4)}) \in \text{lookup}_{\text{and}}) = 0 \\ &(\text{is-and}) \cdot ((\text{a-val-high}^{(4)}, \text{b-val-high}^{(4)}, \text{c-val-high}^{(4)}) \in \text{lookup}_{\text{and}}) = 0 \\ &(\text{is-and}) \cdot ((\text{a-val-high}^{(4)}, \text{b-val-high}^{(4)}, \text{c-val-high}^{(4)}) \in \text{lookup}_{\text{and}}) = 0 \end{aligned}
/\!\!/ Range check a-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check a-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check b-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check b-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check c-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check c-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-and \in \{0,1\} - Performed in the CPU component
```

```
# Extracting 4-bit limbs from a-val, b-val, and c-val
     (\texttt{is-and}) \cdot (\texttt{a-val-low}^{(1)} + \texttt{a-val-high}^{(1)} \cdot 2^4 - \texttt{a-val}^{(1)}) = 0
      (is-and) \cdot (a-val-low<sup>(2)</sup> + a-val-high<sup>(2)</sup> \cdot 2<sup>4</sup> - a-val<sup>(2)</sup>) = 0
      (is-and) \cdot (a-val-low<sup>(3)</sup> + a-val-high<sup>(3)</sup> \cdot 2<sup>4</sup> - a-val<sup>(3)</sup>) = 0
      (is-and) \cdot (a-val-low<sup>(4)</sup> + a-val-high<sup>(4)</sup> \cdot 2<sup>4</sup> - a-val<sup>(4)</sup>) = 0
      (is-and) \cdot (b-val-low<sup>(1)</sup> + b-val-high<sup>(1)</sup> \cdot 2<sup>4</sup> - b-val<sup>(1)</sup>) = 0
      (is-and) \cdot (b-val-low<sup>(2)</sup> + b-val-high<sup>(2)</sup> \cdot 2<sup>4</sup> - b-val<sup>(2)</sup>) = 0
      (is-and) \cdot (b-val-low<sup>(3)</sup> + b-val-high<sup>(3)</sup> \cdot 2<sup>4</sup> - b-val<sup>(3)</sup>) = 0
      (\texttt{is-and}) \cdot (\texttt{b-val-low}^{(4)} + \texttt{b-val-high}^{(4)} \cdot 2^4 - \texttt{b-val}^{(4)}) = 0
      (\texttt{is-and}) \cdot (\texttt{c-val-low}^{(1)} + \texttt{c-val-high}^{(1)} \cdot 2^4 - \texttt{c-val}^{(1)}) = 0
      (is-and) \cdot (c-val-low<sup>(2)</sup> + c-val-high<sup>(2)</sup> \cdot 2<sup>4</sup> - c-val<sup>(2)</sup>) = 0
      (\text{is-and}) \cdot (\text{c-val-low}^{(3)} + \text{c-val-high}^{(3)} \cdot 2^4 - \text{c-val}^{(3)}) = 0
      (is-and) \cdot (c-val-low<sup>(4)</sup> + c-val-high<sup>(4)</sup> \cdot 2<sup>4</sup> - c-val<sup>(4)</sup>) = 0
# Performing and lookup queries
     (\texttt{is-and}) \cdot ((\texttt{a-val-low}^{(1)}, \texttt{b-val-low}^{(1)}, \texttt{c-val-low}^{(1)}) \in \texttt{lookup}_{\texttt{and}}) = 0
      (\text{is-and}) \cdot ((\text{a-val-high}^{(1)}, \text{b-val-high}^{(1)}, \text{c-val-high}^{(1)}) \in \text{lookup}_{\text{and}}) = 0
      (\texttt{is-and}) \cdot ((\texttt{a-val-low}^{(2)}, \texttt{b-val-low}^{(2)}, \texttt{c-val-low}^{(2)}) \in \texttt{lookup}_{\texttt{and}}) = 0
      (\texttt{is-and}) \cdot ((\texttt{a-val-high}^{(2)}, \texttt{b-val-high}^{(2)}, \texttt{c-val-high}^{(2)}) \in \texttt{lookup}_{\texttt{and}}) = 0
      (\mathsf{is}\text{-}\mathsf{and}) \cdot ((\mathsf{a}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(3)}, \mathsf{b}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(\overline{3})}, \mathsf{c}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(\overline{3})}) \in \mathsf{lookup}_{\mathtt{and}}) = 0
     \begin{aligned} &(\text{is-and}) \cdot ((\text{a-val-high}^{(3)}, \text{b-val-high}^{(3)}, \text{c-val-high}^{(3)}) \in \text{lookup}_{\text{and}}) = 0 \\ &(\text{is-and}) \cdot ((\text{a-val-low}^{(4)}, \text{b-val-low}^{(4)}, \text{c-val-low}^{(4)}) \in \text{lookup}_{\text{and}}) = 0 \\ &(\text{is-and}) \cdot ((\text{a-val-high}^{(4)}, \text{b-val-high}^{(4)}, \text{c-val-high}^{(4)}) \in \text{lookup}_{\text{and}}) = 0 \end{aligned}
/\!\!/ Range check \operatorname{a-val-low}^{(j)} \in \left[0, 2^4 - 1
ight] for j = 1, 2, 3, 4 - Implied by and lookup query
/\!\!/ Range check a-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check b-val-low^{(j)} \in \left[0,2^4-1
ight]^{\!\!-} for j=1,2,3,4 - Implied by and lookup query
// Range check b-val-high^{(j)} \in \left[0,2^4-1\right] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check c-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by and lookup query
// Range check c-val-high^{(j)} \in \left[0,2^4-1\right] for j=1,2,3,4 - Implied by and lookup query
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
# Range check b-val^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 - Performed in the CPU component
// Range check c-val^{(j)}\in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-and \in \{0,1\} - Performed in the CPU component
```

### 8.5.10 OR Instruction

The parameters and functionality for the OR instruction are as follows:

opcode: OR
Parameters: (a-val, b-val, c-val)
Instruction selector: is-or = 1
Functionality: a-val = b-val | c-val

The mapping from the or and ori instructions in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------|------|------|-------|-------|--------|---------|
| or         | rd   | rs1  | rs2  | 0     | R[rd] | R[rs1] | R[rs2]  |
| ori        | rd   | rs1  | rs2  | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function and i is a 12-bit immediate value.

## Constraints assuming large fields

```
\# Extracting 4-bit limbs from a-val, b-val, and c-val 
• (is-or) \cdot (a-val-low<sup>(1)</sup> +a-val-high<sup>(1)</sup> \cdot 2<sup>4</sup> +a-val-low<sup>(2)</sup> \cdot 2<sup>8</sup> +a-val-high<sup>(2)</sup> \cdot 2<sup>12</sup>+
```

```
 \text{a-val-low}^{(3)} \cdot 2^{16} + \text{a-val-high}^{(3)} \cdot 2^{20} + \text{a-val-low}^{(4)} \cdot 2^{24} + \text{a-val-high}^{(4)} \cdot 2^{28} - \text{a-val}) = 0 
    (\texttt{is-or}) \cdot (\texttt{b-val-low}^{(1)}
                                       +b-val-high^{(1)} \cdot 2^4 + b-val-low^{(2)} \cdot 2^8 + b-val-high^{(2)} \cdot 2^{12} + b
                 \text{b-val-low}^{(3)} \cdot 2^{16} + \text{b-val-high}^{(3)} \cdot 2^{20} + \text{b-val-low}^{(4)} \cdot 2^{24} + \text{b-val-high}^{(4)} \cdot 2^{28} - \text{b-val}) = 0
    (\texttt{is-or}) \cdot (\texttt{c-val-low}^{(1)}
                                       +c-val-high^{(1)} \cdot 2^4 + c-val-low^{(2)} \cdot 2^8 + c-val-high^{(2)} \cdot 2^{12} + c
                 // Performing or lookup queries
    (\mathtt{is\text{-}or}) \cdot ((\mathtt{a\text{-}val\text{-}low}^{(1)}, \mathtt{b\text{-}val\text{-}low}^{(1)}, \mathtt{c\text{-}val\text{-}low}^{(1)}) \in \mathtt{lookup}_{\mathtt{or}}) = 0
    (is-or) \cdot ((a-val-high^{(1)}, b-val-high^{(1)}, c-val-high^{(1)}) \in lookup_{or}) = 0
    (is-or) \cdot ((a-val-low^{(2)}, b-val-low^{(2)}, c-val-low^{(2)}) \in lookup_{or}) = 0
    (\mathtt{is-or}) \cdot ((\mathtt{a-val-high}^{(2)}, \mathtt{b-val-high}^{(2)}, \mathtt{c-val-high}^{(2)}) \in \mathtt{lookup}_{\mathtt{or}}) = 0
    (is-or) \cdot ((a-val-low^{(3)}, b-val-low^{(3)}, c-val-low^{(3)}) \in lookup_{or}) = 0
    (\texttt{is-or}) \cdot ((\texttt{a-val-high}^{(3)}, \texttt{b-val-high}^{(3)}, \texttt{c-val-high}^{(3)}) \in \texttt{lookup}_{\texttt{or}}) = 0
    (is-or) \cdot ((a-val-low^{(4)}, b-val-low^{(4)}, c-val-low^{(4)}) \in lookup_{or}) = 0
    (is-or) \cdot ((a-val-high^{(4)}, b-val-high^{(4)}, c-val-high^{(4)}) \in lookup_{or}) = 0
/\!\!/ Range check a-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by or lookup query
/\!\!/ Range check a-val-high^{(j)} \in \left[0,2^4-1\right] for j=1,2,3,4 - Implied by or lookup query
// Range check b-val-low^{(j)} \in \left[0,2^4-1\right] for j=1,2,3,4 - Implied by or lookup query
/\!\!/ Range check b-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by or lookup query
// Range check c-val-low (j) \in \left[0,2^4-1\right] for j=1,2,3,4 - Implied by or lookup query
/\!\!/ Range check c-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by or lookup query
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-or \in \{0,1\} - Performed in the CPU component
```

```
# Extracting 4-bit limbs from a-val, b-val, and c-val
     (\texttt{is-or}) \cdot (\texttt{a-val-low}^{(1)} + \texttt{a-val-high}^{(1)} \cdot 2^4 - \texttt{a-val}^{(1)}) = 0
      (\text{is-or}) \cdot (\text{a-val-low}^{(2)} + \text{a-val-high}^{(2)} \cdot 2^4 - \text{a-val}^{(2)}) = 0
      (is-or) \cdot (a-val-low^{(3)} + a-val-high^{(3)} \cdot 2^4 - a-val^{(3)}) = 0
      (is-or) \cdot (a-val-low^{(4)} + a-val-high^{(4)} \cdot 2^4 - a-val^{(4)}) = 0
      (\text{is-or}) \cdot (\text{b-val-low}^{(1)} + \text{b-val-high}^{(1)} \cdot 2^4 - \text{b-val}^{(1)}) = 0
     (\mathsf{is\text{-}or}) \cdot (\mathsf{b\text{-}val\text{-}low}^{(2)} + \mathsf{b\text{-}val\text{-}high}^{(2)} \cdot 2^4 - \mathsf{b\text{-}val}^{(2)}) = 0
      (\text{is-or}) \cdot (\text{b-val-low}^{(3)} + \text{b-val-high}^{(3)} \cdot 2^4 - \text{b-val}^{(3)}) = 0
      (\text{is-or}) \cdot (\text{b-val-low}^{(4)} + \text{b-val-high}^{(4)} \cdot 2^4 - \text{b-val}^{(4)}) = 0
      (\mathsf{is}\mathsf{-or})\cdot(\mathsf{c}\mathsf{-val}\mathsf{-low}^{(1)}+\mathsf{c}\mathsf{-val}\mathsf{-high}^{(1)}\cdot 2^4-\mathsf{c}\mathsf{-val}^{(1)})=0
      (is-or) \cdot (c-val-low^{(2)} + c-val-high^{(2)} \cdot 2^4 - c-val^{(2)}) = 0
      (\text{is-or}) \cdot (\text{c-val-low}^{(3)} + \text{c-val-high}^{(3)} \cdot 2^4 - \text{c-val}^{(3)}) = 0
      (is-or) \cdot (c-val-low^{(4)} + c-val-high^{(4)} \cdot 2^4 - c-val^{(4)}) = 0
   Performing or lookup queries
      (\mathtt{is\text{-}or}) \cdot ((\mathtt{a\text{-}val\text{-}low}^{(1)}, \mathtt{b\text{-}val\text{-}low}^{(1)}, \mathtt{c\text{-}val\text{-}low}^{(1)}) \in \mathtt{lookup}_{\mathtt{or}}) = 0
      (\mathsf{is\text{-}or}) \cdot ((\mathsf{a\text{-}val\text{-}high}^{(1)}, \mathsf{b\text{-}val\text{-}high}^{(1)}, \mathsf{c\text{-}val\text{-}high}^{(1)}) \in \mathsf{lookup}_{\mathsf{or}}) = 0
      (\mathsf{is}\text{-}\mathsf{or}) \cdot ((\mathsf{a}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(2)}, \mathsf{b}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(2)}, \mathsf{c}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(2)}) \in \mathsf{lookup}_{\mathsf{or}}) = 0
      (\mathsf{is}\text{-or}) \cdot ((\mathsf{a}\text{-val-high}^{(2)}, \mathsf{b}\text{-val-high}^{(2)}, \mathsf{c}\text{-val-high}^{(2)}) \in \mathsf{lookup}_{\mathsf{or}}) = 0
      (is-or) \cdot ((a-val-low^{(3)}, b-val-low^{(3)}, c-val-low^{(3)}) \in lookup_{or}) = 0
      (\texttt{is-or}) \cdot ((\texttt{a-val-high}^{(3)}, \texttt{b-val-high}^{(3)}, \texttt{c-val-high}^{(3)}) \in \texttt{lookup}_{\texttt{or}}) = 0
      (\mathsf{is}\text{-}\mathsf{or}) \cdot ((\mathsf{a}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(4)}, \mathsf{b}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(4)}, \mathsf{c}\text{-}\mathsf{val}\text{-}\mathsf{low}^{(\bar{4})}) \in \mathsf{lookup}_{\mathsf{or}}) = 0
      (is-or) \cdot ((a-val-high^{(4)}, b-val-high^{(4)}, c-val-high^{(4)}) \in lookup_{or}) = 0
/\!\!/ Range check a-val-low^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by or lookup query
/\!\!/ Range check a-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by or lookup query
// Range check b-val-low ^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by or lookup query
/\!\!/ Range check b-val-high^{(j)} \in \left[0,2^4-1
ight] for j=1,2,3,4 - Implied by or lookup query
```

```
\label{eq:Range_check_c_val_low} \begin{subarray}{ll} $\#$ Range_check_c_val_low^{(j)} \in \left[0,2^4-1\right]$ for $j=1,2,3,4$ - Implied by or lookup query $\#$ Range_check_c_val_high^{(j)} \in \left[0,2^4-1\right]$ for $j=1,2,3,4$ - Implied by or lookup query $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range_check_c_val^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4
```

# 8.6 Basic Instruction Set: Branch Instructions

### 8.6.1 BEQ Instruction

The parameters and functionality for the BEQ instruction are as follows:

```
opcode: BEQ
Parameters: (a-val, b-val, c-val)
Instruction selector: is-beq = 1
Functionality:

pc-next ← pc + c-val if a-val = b-val
pc-next ← pc + 4 if a-val ≠ b-val
```

The mapping from the beq instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| beq        | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 1-12 of the immediate value, and bit 0 of the immediate value is equal to 0.

### Constraints assuming large fields

```
// Comparing a-val, b-val
/\!\!/ h-neq-flag = 0 indicates a-val = b-val
/\!\!/ h-neq-flag = 1 indicates a-val \neq b-val
 (\texttt{is-beq}) \cdot ((\texttt{a-val} - \texttt{b-val}) \cdot \texttt{h-neq-flag-aux} - \texttt{h-neq-flag}) = 0
  (\mathtt{is\text{-}beq}) \cdot (\mathtt{h\text{-}neq\text{-}flag}) \cdot (1-\mathtt{h\text{-}neq\text{-}flag}) = 0
/\!\!/ Enforcing h-neq-flag-aux \neq 0
\bullet \quad (\mathtt{is}\mathtt{-beq}) \cdot (\mathtt{h}\mathtt{-neq}\mathtt{-flag}\mathtt{-aux} \cdot \mathtt{h}\mathtt{-neq}\mathtt{-flag}\mathtt{-aux}\mathtt{-inv} - 1) = 0
/\!\!/ Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-neq-flag = 0
/\!\!/ pc-next = pc + 4 if h-neq-flag = 1
• (is-beq) \cdot ((1 - h-neq-flag) \cdot c-val + (h-neq-flag) \cdot 4 + pc - pc-next - h-carry \cdot 2<sup>32</sup>) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-beq) \cdot (h-carry) \cdot (1 - h-carry) = 0
// Range check \mathrm{pc} \in \left[0, 2^{32} - 1\right] - Guaranteed by the program memory checking
// Range check pc	ext{-next} \in \left[0, 2^{32} - 1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check a-val \in \left\lceil 0, 2^{32} - 1 
ight
ceil - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-beq \in \{0,1\} - Performed in the CPU component
```

In the description above, the goal of the first three constraints is to ensure that h-neq-flag = 0 whenever a-val = b-val and h-neq-flag = 1 if  $a-val \neq b-val$ . To see why this is the case, first notice

that the second and third constraints guarantee that h-neq-flag  $\in \{0,1\}$  and h-neq-flag-aux  $\neq 0$ . Now there two cases to consider:

- If a-val = b-val, then the first constraint can only be satisfied if the term (a-val b-val) · h-neq-flag-aux h-neq-flag = 0, which implies h-neq-flag = 0. Note that h-neq-flag-aux can be set to any non-zero value in this case.
- If a-val  $\neq$  b-val, then the term (a-val b-val) · h-neq-flag-aux h-neq-flag in the first constraint can only be satisfied if h-neq-flag-aux = 1/(a-val b-val) and h-neq-flag = 1.

The remaining constraints then simply enforce the correct increment to the program counter pc when computing pc-next by taking into account the value of the flag h-neq-flag. The value h-carry is simply introduce to help handle carries during the addition operation.

#### Constraints assuming small fields

```
// Comparing a-val, b-val two limbs at a time
/\!\!/ \text{ h-neq12-flag} = 0 \text{ indicates } (\mathbf{a}\text{-val}^{(1)}, \mathbf{a}\text{-val}^{(2)}) = (\mathbf{b}\text{-val}^{(1)}, \mathbf{b}\text{-val}^{(2)})
/\!\!/ h-neq12-flag = 1 indicates (a-val<sup>(1)</sup>, a-val<sup>(2)</sup>) \neq (b-val<sup>(1)</sup>, b-val<sup>(2)</sup>)
/\!\!/ h-neq34-flag = 0 indicates (a-val<sup>(3)</sup>, a-val<sup>(4)</sup>) = (b-val<sup>(3)</sup>, b-val<sup>(4)</sup>)
/\!\!/ h-neg34-flag = 1 indicates (a-val<sup>(3)</sup>, a-val<sup>(4)</sup>) \neq (b-val<sup>(3)</sup>, b-val<sup>(4)</sup>)
     (\mathsf{is}\mathsf{-beq}) \cdot ((\mathsf{a}\mathsf{-val}^{(1)} + 2^8 \cdot \mathsf{a}\mathsf{-val}^{(2)} - \mathsf{b}\mathsf{-val}^{(1)} - 2^8 \cdot \mathsf{b}\mathsf{-val}^{(2)}) \cdot \mathsf{h}\mathsf{-neq12}\mathsf{-flag}\mathsf{-aux} - \mathsf{h}\mathsf{-neq12}\mathsf{-flag}) = 0
     (is-beq) \cdot ((a-val^{(3)} + 2^8 \cdot a-val^{(4)} - b-val^{(3)} - 2^8 \cdot b-val^{(4)}) \cdot h-neq34-flag-aux - h-neq34-flag) = 0
      (\texttt{is-beq}) \cdot (\texttt{h-neq12-flag}) \cdot (1 - \texttt{h-neq12-flag}) = 0
     (\texttt{is-beq}) \cdot (\texttt{h-neq34-flag}) \cdot (1-\texttt{h-neq34-flag}) = 0
// Enforcing h-neq12-flag-aux \neq 0, h-neq34-flag-aux \neq 0
     (\mathtt{is\text{-}beq}) \cdot (\mathtt{h\text{-}neq12\text{-}flag\text{-}aux} \cdot \mathtt{h\text{-}neq12\text{-}flag\text{-}aux\text{-}inv} - 1) = 0
     (is-beq) \cdot (h-neq34-flag-aux \cdot h-neq34-flag-aux-inv - 1) = 0
/\!\!/ h-neq-flag = 0 indicates a-val = b-val
/\!\!/ h-neq-flag = 1 indicates a-val \neq b-val
• (is-beq) \cdot ((1 - h-neq12-flag) \cdot (1 - h-neq34-flag) - (1 - h-neq-flag)) = 0
\slash\hspace{-0.4em}\slash\hspace{-0.4em}\slash\hspace{-0.4em}\slash\hspace{-0.4em}\slash\hspace{-0.4em}\slash\hspace{-0.4em} Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-neq-flag = 0
/\!\!/ pc-next = pc + 4 if h-neq-flag = 1
/\!\!/ h-carry^{(j)} \in \{0,1\} for j=1,2,3,4 used for carry handling
   (\mathtt{is-beq}) \cdot ((1-\mathtt{h-neq-flag}) \cdot \mathtt{c-val}^{(1)} + (\mathtt{h-neq-flag}) \cdot 4 + \mathrm{pc}^{(1)} - \mathrm{pc-next}^{(1)} - \mathtt{h-carry}^{(1)} \cdot 2^8) = 0
      \begin{array}{l} (\text{is-beq}) \cdot ((1-\text{h-neq-flag}) \cdot \text{c-val}^{(2)} + \text{pc}^{(2)} + \text{h-carry}^{(1)} - \text{pc-next}^{(2)} - \text{h-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-beq}) \cdot ((1-\text{h-neq-flag}) \cdot \text{c-val}^{(3)} + \text{pc}^{(3)} + \text{h-carry}^{(2)} - \text{pc-next}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \\ (\text{is-beq}) \cdot ((1-\text{h-neq-flag}) \cdot \text{c-val}^{(4)} + \text{pc}^{(4)} + \text{h-carry}^{(3)} - \text{pc-next}^{(4)} - \text{h-carry}^{(4)} \cdot 2^8) = 0 \\ (\text{is-beq}) \cdot ((1-\text{h-neq-flag}) \cdot \text{c-val}^{(4)} + \text{pc}^{(4)} + \text{h-carry}^{(3)} - \text{pc-next}^{(4)} - \text{h-carry}^{(4)} \cdot 2^8) = 0 \\ \end{array} 
# Enforcing h-carry^{(j)} \in \{0,1\} for j = 1,2,3,4
\bullet \quad (\mathtt{is\text{-}beq}) \cdot (\mathtt{h\text{-}carry}^{(1)}) \cdot (1-\mathtt{h\text{-}carry}^{(1)}) = 0
     (\texttt{is-beq}) \cdot (\texttt{h-carry}^{(2)}) \cdot (1 - \texttt{h-carry}^{(2)}) = 0
     (\mathtt{is\text{-}beq}) \cdot (\mathtt{h\text{-}carry}^{(3)}) \cdot (1 - \mathtt{h\text{-}carry}^{(3)}) = 0
     (\texttt{is-beq}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
// Range check \operatorname{pc}^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 - Guaranteed by the program memory checking
/\!\!/ Range check \mathrm{pc	ext{-}next}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check c-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-beq \in \{0,1\} - Performed in the CPU component
/\!\!/ Range check h-neq-flag \in \{0,1\} - Implied by h-neq12-flag and h-neq34-flag range checks
```

The set of constraints is similar to the ones used in large field case in Section 8.6.1, but introduces additional variables to help perform the comparison of the limbs of a-val and b-val. More precisely, while the flag h-neq12-flag will contain the result of the comparison of the first two limbs

of a-val and b-val, h-neq34-flag will contain the result of the comparison of the last two limbs. The flag h-neq-flag can then easily be derived from h-neq12-flag and h-neq34-flag via the term  $((1 - h-neq12-flag) \cdot (1 - h-neq34-flag) - (1 - h-neq-flag))$ , which guarantees that h-neq-flag = 0 only if h-neq12-flag = h-neq34-flag = 0 as desired.

### 8.6.2 BNE Instruction

The parameters and functionality for the BNE instruction are as follows:

```
opcode: BNE
Parameters: (a-val, b-val, c-val)
Instruction selector: is-bne = 1
Functionality:

pc-next ← pc + c-val if a-val ≠ b-val
pc-next ← pc + 4 if a-val = b-val
```

The mapping from the bne instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| bne        | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 1-12 of the immediate value, and bit 0 of the immediate value is equal to 0.

### Constraints assuming large fields

```
// Comparing a-val, b-val
/\!\!/ h-neq-flag = 0 indicates a-val = b-val
/\!\!/ \ \mathtt{h-neq-flag} = 1 \ \mathtt{indicates} \ \mathtt{a-val} \neq \mathtt{b-val}
    (\texttt{is-bne}) \cdot ((\texttt{a-val} - \texttt{b-val}) \cdot \texttt{h-neq-flag-aux} - \texttt{h-neq-flag}) = 0
    (\texttt{is-bne}) \cdot (\texttt{h-neq-flag}) \cdot (1 - \texttt{h-neq-flag}) = 0
/\!\!/ Enforcing h-neq-flag-aux \neq 0
• (is-bne) \cdot (h-neq-flag-aux \cdot h-neq-flag-aux-inv -1) = 0
/\!\!/ Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-neq-flag = 1
/\!\!/ pc\text{-next} = pc + 4 \text{ if h-neq-flag} = 0
  (\mathtt{is\text{-}bne}) \cdot ((\mathtt{h\text{-}neq\text{-}flag}) \cdot \mathtt{c\text{-}val} + (1-\mathtt{h\text{-}neq\text{-}flag}) \cdot 4 + \mathtt{pc} - \mathtt{pc\text{-}next} - \mathtt{h\text{-}carry} \cdot 2^{32}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-bne) \cdot (h-carry) \cdot (1 - h-carry) = 0
// Range check \mathrm{pc} \in \left[0, 2^{32} - 1\right] - Guaranteed by the program memory checking
/\!\!/ Range check pc-next \in \left[0,2^{32}-1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-bne \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to one for the beq instruction in Section 8.6.1, except that pc-next is computed slightly differently.

```
/\!\!/ Comparing a-val, b-val two limbs at a time /\!\!/ h-neq12-flag =0 indicates (a-val^{(1)},a-val^{(2)}) = (b-val^{(1)},b-val^{(2)}) /\!\!/ h-neq12-flag =1 indicates (a-val^{(1)},a-val^{(2)}) \neq (b-val^{(1)},b-val^{(2)})
```

```
/\!\!/ h-neq34-flag = 0 indicates (a-val<sup>(3)</sup>, a-val<sup>(4)</sup>) = (b-val<sup>(3)</sup>, b-val<sup>(4)</sup>)
/\!\!/ h-neq34-flag = 1 indicates (a-val<sup>(3)</sup>, a-val<sup>(4)</sup>) \neq (b-val<sup>(3)</sup>, b-val<sup>(4)</sup>)
     (\texttt{is-bne}) \cdot ((\texttt{a-val}^{(1)} + 2^8 \cdot \texttt{a-val}^{(2)} - \texttt{b-val}^{(1)} - 2^8 \cdot \texttt{b-val}^{(2)}) \cdot \texttt{h-neq12-flag-aux} - \texttt{h-neq12-flag}) = 0
     (is-bne) \cdot ((a-val^{(3)} + 2^8 \cdot a-val^{(4)} - b-val^{(3)} - 2^8 \cdot b-val^{(4)}) \cdot h-neq34-flag-aux - h-neq34-flag) = 0
     (\texttt{is-bne}) \cdot (\texttt{h-neq12-flag}) \cdot (1-\texttt{h-neq12-flag}) = 0
     (\texttt{is-bne}) \cdot (\texttt{h-neq34-flag}) \cdot (1 - \texttt{h-neq34-flag}) = 0
/\!\!/ Enforcing h-neq12-flag-aux \neq 0, h-neq34-flag-aux \neq 0
     (is-bne) \cdot (h-neq12-flag-aux \cdot h-neq12-flag-aux-inv - 1) = 0
     (is-bne) \cdot (h-neq34-flag-aux \cdot h-neq34-flag-aux-inv - 1) = 0
/\!\!/ h-neq-flag = 0 indicates a-val = b-val
/\!\!/ h-neq-flag = 1 indicates a-val \neq b-val
• (is-bne) \cdot (1 - h-neq12-flag) \cdot (1 - h-neq34-flag) - (1 - h-neq-flag) = 0
/\!\!/ Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-neq-flag = 1
/\!\!/ pc\text{-next} = pc + 4 \text{ if h-neq-flag} = 0
/\!\!/ h-carry^{(j)} \in \{0,1\} for j=1,2,3,4 used for carry handling
     \begin{array}{l} (\mathtt{is\text{-}bne}) \cdot ((\mathtt{h\text{-}neq\text{-}flag}) \cdot \mathtt{c\text{-}val}^{(1)} + (1 - \mathtt{h\text{-}neq\text{-}flag}) \cdot 4 + \mathtt{pc}^{(1)} - \mathtt{pc\text{-}next}^{(1)} - \mathtt{h\text{-}carry}^{(1)} \cdot 2^8) = 0 \\ (\mathtt{is\text{-}bne}) \cdot ((\mathtt{h\text{-}neq\text{-}flag}) \cdot \mathtt{c\text{-}val}^{(2)} + \mathtt{pc}^{(2)} + \mathtt{h\text{-}carry}^{(1)} - \mathtt{pc\text{-}next}^{(2)} - \mathtt{h\text{-}carry}^{(2)} \cdot 2^8) = 0 \\ (\mathtt{is\text{-}bne}) \cdot ((\mathtt{h\text{-}neq\text{-}flag}) \cdot \mathtt{c\text{-}val}^{(3)} + \mathtt{pc}^{(3)} + \mathtt{h\text{-}carry}^{(2)} - \mathtt{pc\text{-}next}^{(3)} - \mathtt{h\text{-}carry}^{(3)} \cdot 2^8) = 0 \\ \end{array} 
     (\texttt{is-bne}) \cdot ((\texttt{h-neq-flag}) \cdot \texttt{c-val}^{(4)} + \texttt{pc}^{(4)} + \texttt{h-carry}^{(3)} - \texttt{pc-next}^{(4)} - \texttt{h-carry}^{(4)} \cdot 2^8) = 0
/\!\!/ Enforcing h-carry^{(j)} \in \{0,1\} for j=1,2,3,4
     (\mathtt{is\text{-}bne}) \cdot (\mathtt{h\text{-}carry}^{(1)}) \cdot (1-\mathtt{h\text{-}carry}^{(1)}) = 0
     (\text{is-bne}) \cdot (\text{h-carry}^{(2)}) \cdot (1 - \text{h-carry}^{(2)}) = 0
     (\text{is-bne}) \cdot (\text{h-carry}^{(3)}) \cdot (1 - \text{h-carry}^{(3)}) = 0
   (\texttt{is-bne}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
/\!\!/ Range check \mathrm{pc}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check \mathrm{pc	ext{-}next}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
# Range check c-val^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 - Performed in the CPU component
/\!\!/ Range check is-bne \in \{0,1\} - Performed in the CPU component
/\!\!/ Range check h-neq-flag \in \{0,1\} - Implied by h-neq12-flag and h-neq34-flag range checks
```

The set of constraints above is similar to one for the beq instruction in Section 8.6.1, except that pc-next is computed slightly differently.

### 8.6.3 BLTU Instruction

The parameters and functionality for the BLTU instruction are as follows:

```
opcode: BLTU
Parameters: (a-val, b-val, c-val)
Instruction selector: is-bltu = 1
Functionality:

pc-next ← pc + c-val if a-val < b-val (<u>unsigned</u> comparison)
pc-next ← pc + 4 if a-val ≥ b-val (<u>unsigned</u> comparison)
```

The mapping from the bltu instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| bltu       | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 1-12 of the immediate value, and bit 0 of the immediate value is equal to 0.

# Constraints assuming large fields

```
# Performing unsigned comparison between a-val and b-val using SUB borrow bit
• (is-bltu) \cdot (a-val - b-val - h-rem + h-ltu-flag \cdot \cdot \cdot \cdot 2<sup>32</sup>) = 0
/\!\!/ Enforcing h-ltu-flag \in \{0,1\}
  (\texttt{is-bltu}) \cdot (\texttt{h-ltu-flag}) \cdot (1 - \texttt{h-ltu-flag}) = 0
# Enforcing h-rem \in [0, 2^{32} - 1]
  (\mathtt{is\text{-}bltu})\cdot(\mathtt{h\text{-}rem}\in\left\lceil 0,2^{32}-1\right\rceil)=0
// Setting pc-next based on comparison result, unless the next row is the first row in the STARK trace
/\!\!/ pc-next = pc + c-val if h-ltu-flag = 1
/\!\!/ pc-next = pc + 4 if h-ltu-flag = 0
• (is-bltu) \cdot ((h-ltu-flag) \cdot c-val + (1 - \text{h-ltu-flag}) \cdot 4 + \text{pc} - \text{pc-next} - \text{h-carry} \cdot 2^{32}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-bltu) \cdot (h-carry) \cdot (1 - h-carry) = 0
// Range check \mathrm{pc} \in \left[0, 2^{32} - 1\right] - Guaranteed by the program memory checking
/\!\!/ Range check \mathrm{pc}	ext{-next} \in \left[0, 2^{32} - 1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
# Range check b-val \in [0, 2^{32} - 1] - Performed in the CPU component
// Range check c-val \in \left[ 0,2^{32}-1 \right] - Performed in the CPU component
/\!\!/ Range check is-bltu \in \{0,1\} - Performed in the CPU component
```

The set of constraints above first performs an unsigned comparison between a-val and b-val by computing the subtraction a-val - b-val and setting the comparison flag h-ltu-flag to the borrow bit. To see why this works, please notice that the borrow bit h-ltu-flag will always be equal to 1 when a-val  $\leq$  b-val and equal to 0 if a-val > b-val, as desired.

The remaining constraints then simply enforce the correct increment to the program counter pc when computing pc-next by taking into account the value of the flag h-ltu-flag. The value h-carry is simply introduce to help handle carries during the addition operation.

```
// Borrow handling
   (\mathtt{is\text{-}bltu}) \cdot (\mathtt{b\text{-}val}^{(1)} + \mathtt{h\text{-}borrow}^{(1)} \cdot 2^8 - \mathtt{c\text{-}val}^{(1)} - \mathtt{h\text{-}rem}^{(1)}) = 0
     (\texttt{is-bltu}) \cdot (\texttt{b-val}^{(2)} + \texttt{h-borrow}^{(2)} \cdot 2^8 - \texttt{c-val}^{(2)} - \texttt{h-rem}^{(2)} - \texttt{h-borrow}^{(1)}) = 0
   (\text{is-bltu}) \cdot (\text{b-val}^{(3)} + \text{h-borrow}^{(3)} \cdot 2^8 - \text{c-val}^{(3)} - \text{h-rem}^{(3)} - \text{h-borrow}^{(2)}) = 0
   (is-bltu) \cdot (b-val^{(4)} + h-borrow^{(4)} \cdot 2^8 - c-val^{(4)} - h-rem^{(4)} - h-borrow^{(3)}) = 0
// Enforcing h-borrow (j) \in \{0,1\} for j=1,2,3,4
   (\texttt{is-bltu}) \cdot (\texttt{h-borrow}^{(1)}) \cdot (1 - \texttt{h-borrow}^{(1)}) = 0
     (\mathtt{is\text{-}bltu}) \cdot (\mathtt{h\text{-}borrow}^{(2)}) \cdot (1 - \mathtt{h\text{-}borrow}^{(2)}) = 0
   (\texttt{is-bltu}) \cdot (\texttt{h-borrow}^{(3)}) \cdot (1 - \texttt{h-borrow}^{(3)}) = 0
   (\mathtt{is\text{-}bltu}) \cdot (\mathtt{h\text{-}borrow}^{(4)}) \cdot (1-\mathtt{h\text{-}borrow}^{(4)}) = 0
// Enforcing h-rem^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4
   (\mathtt{is\text{-bltu}}) \cdot (\mathtt{h\text{-rem}}^{(1)} \in \left[0, 2^8 - 1\right]) = 0
   (\texttt{is-bltu}) \cdot (\texttt{h-rem}^{(2)} \in \left[0, 2^8 - 1\right]) = 0
   (\texttt{is-bltu}) \cdot (\texttt{h-rem}^{(3)} \in \left[0, 2^8 - 1\right]) = 0
   (\texttt{is-bltu}) \cdot (\texttt{h-rem}^{(4)} \in \left[0, 2^8 - 1\right]) = 0
/\!\!/ Setting h-ltu-flag = h-borrow<sup>(4)</sup>
• (is-bltu) · (h-borrow<sup>(4)</sup> - h-ltu-flag) = 0
/\!\!/ Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-ltu-flag = 1
/\!\!/ pc-next = pc + 4 if h-ltu-flag = 0
```

The set of constraints above is similar to the large field case in Section 8.6.3, but introduces limbs for the carry and borrow values to be able to properly perform the subtractions and additions.

#### 8.6.4 BLT Instruction

The parameters and functionality for the BLT instruction are as follows:

```
opcode: BLT
Parameters: (a-val, b-val, c-val)
Instruction selector: is-blt = 1
Functionality:

pc-next ← pc + c-val if a-val < b-val (signed comparison)</li>
pc-next ← pc + 4 if a-val ≥ b-val (signed comparison)
```

The mapping from the blt instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| blt        | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 1-12 of the immediate value, and bit 0 of the immediate value is equal to 0.

# Constraints assuming large fields

```
(\mathtt{is\text{-}blt}) \cdot (\mathtt{h\text{-}rem\text{-}b} \in \left[0, 2^{31} - 1\right]) = 0
    (is-blt) \cdot (h-sgn-a) \cdot (1-h-sgn-a) = 0
    (is-blt) \cdot (h-sgn-a) \cdot (1 - h-sgn-b) = 0
// Computing h-lt-flag from h-ltu-flag and sign bits h-sgn-a and h-sgn-b
\bullet \quad (\mathsf{is\text{-}blt}) \cdot ((\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b}) + \mathsf{h\text{-}ltu\text{-}flag}((\mathsf{h\text{-}sgn\text{-}a})(\mathsf{h\text{-}sgn\text{-}b}) + (1-\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b})) - \mathsf{h\text{-}lt\text{-}flag}) = 0
// Setting pc-next based on comparison result, unless the next row is the first row in the STARK trace
/\!\!/ pc-next = pc + c-val if h-lt-flag = 1
/\!\!/ pc-next = pc + 4 if h-lt-flag = 0
• (is-blt) \cdot ((h-lt-flag) \cdot c-val + (1 - h-lt-flag) \cdot 4 + pc - pc-next - h-carry \cdot 2^{32}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-blt) \cdot (h-carry) \cdot (1 - h-carry) = 0
// Range check \mathrm{pc} \in \left[0, 2^{32} - 1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check \mathrm{pc}	ext{-next} \in \left[0, 2^{32} - 1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check b-val \in reve{\left[0,2^{32}-1
ight]} - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-blt \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to ones for the bltu instruction in Section 8.6.3, except that it takes the signs of a-val and b-val into account when enforcing the way pc-next is computed..

```
// Computing unsigned comparison flag h-ltu-flag using SUB borrow bit
   (\text{is-blt}) \cdot (\text{b-val}^{(1)} + \text{h-borrow}^{(1)} \cdot 2^8 - \text{c-val}^{(1)} - \text{h-rem}^{(1)}) = 0
     (\text{is-blt}) \cdot (\text{b-val}^{(2)} + \text{h-borrow}^{(2)} \cdot 2^8 - \text{c-val}^{(2)} - \text{h-rem}^{(2)} - \text{h-borrow}^{(1)}) = 0
    (\text{is-blt}) \cdot (\text{b-val}^{(3)} + \text{h-borrow}^{(3)} \cdot 2^8 - \text{c-val}^{(3)} - \text{h-rem}^{(3)} - \text{h-borrow}^{(2)}) = 0
     (is-blt) \cdot (b-val^{(4)} + h-borrow^{(4)} \cdot 2^8 - c-val^{(4)} - h-rem^{(4)} - h-borrow^{(3)}) = 0
# Enforcing h-borrow<sup>(j)</sup> \in \{0,1\} for j=1,2,3,4
• (is-blt) \cdot (h-borrow<sup>(1)</sup>) \cdot (1 - h-borrow<sup>(1)</sup>) = 0
• (is-blt) \cdot (h-borrow<sup>(2)</sup>) \cdot (1 - h-borrow<sup>(2)</sup>) = 0
   (\texttt{is-blt}) \cdot (\texttt{h-borrow}^{(3)}) \cdot (1 - \texttt{h-borrow}^{(3)}) = 0
   (\texttt{is-blt}) \cdot (\texttt{h-borrow}^{(4)}) \cdot (1 - \texttt{h-borrow}^{(4)}) = 0
/\!\!/ Enforcing h-rem^{(j)} \in \left[0,2^8-1\right] for j=1,2,3,4
   (\mathtt{is\text{-}blt}) \cdot (\mathtt{h\text{-}rem}^{(1)} \in \left\lceil 0, 2^8 - 1 \right\rceil) = 0
    (\mathtt{is\text{-}blt}) \cdot (\mathtt{h\text{-}rem}^{(2)} \in \left[0, 2^8 - 1\right]) = 0
   (\text{is-blt}) \cdot (\text{h-rem}^{(3)} \in [0, 2^8 - 1]) = 0
   (\text{is-blt}) \cdot (\text{h-rem}^{(4)} \in [0, 2^8 - 1]) = 0
/\!\!/ \; \mathtt{Setting} \  \, \mathtt{h\text{-}ltu\text{-}flag} = \mathtt{h\text{-}borrow}^{(4)}
• (is-blt) · (h-borrow<sup>(4)</sup> - h-ltu-flag) = 0
// Extracting sign bits from b-val and c-val
   (\texttt{is-blt}) \cdot (\texttt{h-rem-b} + \texttt{h-sgn-b} \cdot 2^7 - \texttt{b-val}^{(4)}) = 0
   (\texttt{is-blt}) \cdot (\texttt{h-rem-c} + \texttt{h-sgn-c} \cdot 2^7 - \texttt{c-val}^{(4)}) = 0
    (\mathtt{is	ext{-}blt})\cdot(\mathtt{h	ext{-}rem	ext{-}b}\in\left\lceil 0,2^7-1
ight
ceil)=0
   (\text{is-blt}) \cdot (\text{h-rem-c} \in \left[0, 2^7 - 1\right]) = 0
   (\texttt{is-blt}) \cdot (\texttt{h-sgn-b}) \cdot (1 - \texttt{h-sgn-b}) = 0
   (\texttt{is-blt}) \cdot (\texttt{h-sgn-c}) \cdot (1 - \texttt{h-sgn-c}) = 0
// Computing h-lt-flag from h-ltu-flag and sign bits h-sgn-a and h-sgn-b
\bullet \quad (\mathsf{is\text{-}blt}) \cdot ((\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b}) + \mathsf{h\text{-}ltu\text{-}flag}((\mathsf{h\text{-}sgn\text{-}a})(\mathsf{h\text{-}sgn\text{-}b}) + (1-\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b})) - \mathsf{h\text{-}lt\text{-}flag}) = 0
```

```
/\!\!/ Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-lt-flag = 1
/\!\!/ pc-next = pc + 4 if h-lt-flag = 0
/\!\!/ h-carry^{(j)} \in \{0,1\} for j=1,2,3,4 used for carry handling
    (\mathtt{is\text{-}blt}) \cdot ((\mathtt{h}\mathtt{-}\mathtt{lt\text{-}flag}) \cdot \mathtt{c\text{-}val}^{(1)} + (1-\mathtt{h}\mathtt{-}\mathtt{lt\text{-}flag}) \cdot 4 + \mathtt{pc}^{(1)} - \mathtt{pc\text{-}next}^{(1)} - \mathtt{h\text{-}carry}^{(1)} \cdot 2^8) = 0
    (\texttt{is-blt}) \cdot ((\texttt{h-lt-flag}) \cdot \texttt{c-val}^{(2)} + \texttt{pc}^{(2)} + \texttt{h-carry}^{(1)} - \texttt{pc-next}^{(2)} - \texttt{h-carry}^{(2)} \cdot 2^8) = 0
    (\text{is-blt}) \cdot ((\text{h-lt-flag}) \cdot \text{c-val}^{(3)} + \text{pc}^{(3)} + \text{h-carry}^{(2)} - \text{pc-next}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0
    (\texttt{is-blt}) \cdot ((\texttt{h-lt-flag}) \cdot \texttt{c-val}^{(4)} + \texttt{pc}^{(4)} + \texttt{h-carry}^{(3)} - \texttt{pc-next}^{(4)} - \texttt{h-carry}^{(4)} \cdot 2^8) = 0
// Enforcing h-carry (j) \in \{0,1\} for j=1,2,3,4
    (\mathtt{is\text{-}blt})^{-}(\mathtt{h\text{-}carry}^{(1)})\cdot(1-\mathtt{h\text{-}carry}^{(1)})=0
    (\text{is-blt}) \cdot (\text{h-carry}^{(2)}) \cdot (1 - \text{h-carry}^{(2)}) = 0
    (\text{is-blt}) \cdot (\text{h-carry}^{(3)}) \cdot (1 - \text{h-carry}^{(3)}) = 0
   (\texttt{is-blt}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
/\!\!/ Range check \mathrm{pc}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check 	ext{pc-next}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
# Range check b-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
# Range check c-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-blt \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to ones for the bltu instruction in Section 8.6.3, except that it takes the signs of a-val and b-val into account when enforcing the way pc-next is computed..

#### 8.6.5 BGEU Instruction

The parameters and functionality for the BGEU instruction are as follows:

```
opcode: BGEU
Parameters: (a-val, b-val, c-val)
Instruction selector: is-bgeu = 1
Functionality:

pc-next ← pc + c-val if a-val ≥ b-val (unsigned comparison)
pc-next ← pc + 4 if a-val < b-val (unsigned comparison)</li>
```

The mapping from the bgeu instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| bgeu       | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 1-12 of the immediate value, and bit 0 of the immediate value is equal to 0.

```
• (is-bgeu) \cdot ((1-h-ltu-flag) \cdot c-val + (h-ltu-flag) \cdot 4 + pc - pc-next - h-carry \cdot 2<sup>32</sup>) =0 \# Enforcing h-carry \in {0,1} • (is-bgeu) \cdot (h-carry) \cdot (1-h-carry) =0 \# Range check pc \in [0,2<sup>32</sup> - 1] - Guaranteed by the program memory checking \# Range check pc-next \in [0,2<sup>32</sup> - 1] - Guaranteed by the program memory checking \# Range check a-val \in [0,2<sup>32</sup> - 1] - Performed in the CPU component \# Range check b-val \in [0,2<sup>32</sup> - 1] - Performed in the CPU component \# Range check is-bgeu \in {0,1} - Performed in the CPU component
```

The set of constraints above is similar to one for the bltu instruction in Section 8.6.3, except that pc-next is computed slightly differently based on the results of the h-ltu-flag flag.

```
// Borrow handling
     (is-bgeu) \cdot (b-val^{(1)} + h-borrow^{(1)} \cdot 2^8 - c-val^{(1)} - h-rem^{(1)}) = 0
     (\text{is-bgeu}) \cdot (\text{b-val}^{(2)} + \text{h-borrow}^{(2)} \cdot 2^8 - \text{c-val}^{(2)} - \text{h-rem}^{(2)} - \text{h-borrow}^{(1)}) = 0
     (\text{is-bgeu}) \cdot (\text{b-val}^{(3)} + \text{h-borrow}^{(3)} \cdot 2^8 - \text{c-val}^{(3)} - \text{h-rem}^{(3)} - \text{h-borrow}^{(2)}) = 0
     (\text{is-bgeu}) \cdot (\text{b-val}^{(4)} + \text{h-borrow}^{(4)} \cdot 2^8 - \text{c-val}^{(4)} - \text{h-rem}^{(4)} - \text{h-borrow}^{(3)}) = 0
/\!\!/ Enforcing h-borrow<sup>(j)</sup> \in \{0,1\} for j=1,2,3,4
• (is-bgeu) \cdot (h-borrow<sup>(1)</sup>) \cdot (1 - h-borrow<sup>(1)</sup>) = 0
     (\texttt{is-bgeu}) \cdot (\texttt{h-borrow}^{(2)}) \cdot (1 - \texttt{h-borrow}^{(2)}) = 0
• (is-bgeu) \cdot (h-borrow<sup>(3)</sup>) \cdot (1 - h-borrow<sup>(3)</sup>) = 0
     (\texttt{is-bgeu}) \cdot (\texttt{h-borrow}^{(4)}) \cdot (1 - \texttt{h-borrow}^{(4)}) = 0
/\!\!/ Enforcing h-rem^{(j)} \in \left[0,2^8-1\right] for j=1,2,3,4
   (\texttt{is-bgeu}) \cdot (\texttt{h-rem}^{(1)} \in \left[0, 2^8 - 1\right]) = 0
   (\text{is-bgeu}) \cdot (\text{h-rem}^{(2)} \in [0, 2^8 - 1]) = 0
   (\text{is-bgeu}) \cdot (\text{h-rem}^{(3)} \in [0, 2^8 - 1]) = 0
     (\texttt{is-bgeu}) \cdot (\texttt{h-rem}^{(4)} \in \left\lceil 0, 2^8 - 1 \right\rceil) = 0
/\!\!/ Setting h-ltu-flag = h-borrow<sup>(4)</sup>
• (is-bgeu) \cdot (h-borrow<sup>(4)</sup> - h-ltu-flag) = 0
/\!\!/ Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-ltu-flag = 0
/\!\!/ pc-next = pc + 4 if h-ltu-flag = 1
// h-carry ^{(j)} \in \{0,1\} for j=1,2,3,4 used for carry handling
   \begin{array}{l} (\text{is-bgeu}) \cdot ((1-\text{h-ltu-flag}) \cdot \text{c-val}^{(1)} + (\text{h-ltu-flag}) \cdot 4 + \text{pc}^{(1)} - \text{pc-next}^{(1)} - \text{h-carry}^{(1)} \cdot 2^8) = 0 \\ (\text{is-bgeu}) \cdot ((1-\text{h-ltu-flag}) \cdot \text{c-val}^{(2)} + \text{pc}^{(2)} + \text{h-carry}^{(1)} - \text{pc-next}^{(2)} - \text{h-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-bgeu}) \cdot ((1-\text{h-ltu-flag}) \cdot \text{c-val}^{(3)} + \text{pc}^{(3)} + \text{h-carry}^{(2)} - \text{pc-next}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \\ (\text{is-bgeu}) \cdot ((1-\text{h-ltu-flag}) \cdot \text{c-val}^{(4)} + \text{pc}^{(4)} + \text{h-carry}^{(3)} - \text{pc-next}^{(4)} - \text{h-carry}^{(4)} \cdot 2^8) = 0 \\ \end{array} 
# Enforcing h-carry^{(j)} \in \{0,1\} for j = 1,2,3,4
   (\texttt{is-bgeu}) \cdot (\texttt{h-carry}^{(1)}) \cdot (1 - \texttt{h-carry}^{(1)}) = 0
     (is-bgeu) \cdot (h-carry<sup>(2)</sup>) \cdot (1 - h-carry<sup>(2)</sup>) = 0 (is-bgeu) \cdot (h-carry<sup>(3)</sup>) \cdot (1 - h-carry<sup>(3)</sup>) = 0
     (\texttt{is-bgeu}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
/\!\!/ Range check \mathrm{pc}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check \operatorname{pc-next}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)}\in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check c-val^{(j)}\in igl[0,2^8-1igr] for j=1,2,3,4 - Performed in the CPU component
```

```
/\!\!/ Range check is-bgeu \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to one for the bltu instruction in Section 8.6.3, except that pc-next is computed slightly differently based on the results of the h-ltu-flag flag.

#### 8.6.6 BGE Instruction

The parameters and functionality for the BGE instruction are as follows:

```
opcode: BGE
Parameters: (a-val, b-val, c-val)
Instruction selector: is-bge = 1
Functionality:

pc-next ← pc + c-val if a-val ≥ b-val (<u>signed</u> comparison)
pc-next ← pc + 4 if a-val < b-val (<u>signed</u> comparison)
```

The mapping from the bge instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| bge        | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 1-12 of the immediate value, and bit 0 of the immediate value is equal to 0.

```
# Performing unsigned comparison between a-val and b-val using SUB borrow bit
• (is-bge) \cdot (a-val - b-val - h-rem + h-borrow \cdot 2^{32}) = 0
/\!\!/ Enforcing h-borrow \in \{0,1\}
• (is-bge) \cdot (h-borrow) \cdot (1 - h-borrow) = 0
/\!\!/ Enforcing h-rem \in \left[0,2^{32}-1\right]
  (\texttt{is-bge}) \cdot (\texttt{h-rem} \in \left[0, 2^{32} - 1\right]) = 0
// Setting h-ltu-flag = h-borrow
  (is-bge) \cdot (h-borrow - h-ltu-flag) = 0
// Extracting sign bits from a-val and b-val
   (\texttt{is-bge}) \cdot (\texttt{h-rem-a} + \texttt{h-sgn-a} \cdot 2^{31} - \texttt{a-val}) = 0
    (\texttt{is-bge}) \cdot (\texttt{h-rem-b} + \texttt{h-sgn-b} \cdot 2^{31} - \texttt{b-val}) = 0
   (\texttt{is-bge}) \cdot (\texttt{h-rem-a} \in \left\lceil 0, 2^{31} - 1 \right\rceil) = 0
    (is-bge) \cdot (h-rem-b \in [0, 2^{31} - 1]) = 0
    (is-bge) \cdot (h-sgn-a) \cdot (1-h-sgn-a) = 0
    (\texttt{is-bge}) \cdot (\texttt{h-sgn-a}) \cdot (1-\texttt{h-sgn-b}) = 0
// Computing h-lt-flag from h-ltu-flag and sign bits h-sgn-a and h-sgn-b
    (\mathsf{is\text{-}bge}) \cdot ((\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b}) + \mathsf{h\text{-}ltu\text{-}flag}((\mathsf{h\text{-}sgn\text{-}a})(\mathsf{h\text{-}sgn\text{-}b}) + (1-\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b})) - \mathsf{h\text{-}lt\text{-}flag}) = 0
// Setting pc-next based on comparison result, unless the next row is the first row in the STARK trace
/\!\!/ pc-next = pc + c-val if h-lt-flag = 0
/\!\!/ pc-next = pc + 4 if h-lt-flag = 1
 • (is-bge) \cdot ((1 - h-lt-flag) \cdot c-val + (h-lt-flag) \cdot 4 + pc - pc-next - h-carry \cdot 2<sup>32</sup>) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-blt) \cdot (h-carry) \cdot (1 - h-carry) = 0
// Range check \mathrm{pc} \in \left[0, 2^{32} - 1\right] - Guaranteed by the program memory checking
// Range check pc	ext{-next} \in \left[0, 2^{32} - 1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
```

```
/\!\!/ Range check c-val \in \left[0,2^{32}-1\right] - Performed in the CPU component /\!\!/ Range check is-bge \in \{0,1\} - Performed in the CPU component
```

The set of constraints above is similar to ones for the bgeu instruction in Section 8.6.5, except that it takes the signs of a-val and b-val into account when enforcing the way pc-next is computed.

```
// Computing unsigned comparison flag h-ltu-flag using SUB borrow bit
    (is-bge) \cdot (b-val^{(1)} + h-borrow^{(1)} \cdot 2^8 - c-val^{(1)} - h-rem^{(1)}) = 0
     (\mathsf{is}\mathsf{-bge}) \cdot (\mathsf{b}\mathsf{-val}^{(2)} + \mathsf{h}\mathsf{-borrow}^{(2)} \cdot 2^8 - \mathsf{c}\mathsf{-val}^{(2)} - \mathsf{h}\mathsf{-rem}^{(2)} - \mathsf{h}\mathsf{-borrow}^{(1)}) = 0
     (\text{is-bge}) \cdot (\text{b-val}^{(3)} + \text{h-borrow}^{(3)} \cdot 2^8 - \text{c-val}^{(3)} - \text{h-rem}^{(3)} - \text{h-borrow}^{(2)}) = 0
     (\text{is-bge}) \cdot (\text{b-val}^{(4)} + \text{h-borrow}^{(4)} \cdot 2^8 - \text{c-val}^{(4)} - \text{h-rem}^{(4)} - \text{h-borrow}^{(3)}) = 0
/\!\!/ Enforcing h-borrow^{(j)}\in\{0,1\} for j=1,2,3,4
   (\mathsf{is}\mathsf{-bge}) \cdot (\mathsf{h}\mathsf{-borrow}^{(1)}) \cdot (1-\mathsf{h}\mathsf{-borrow}^{(1)}) = 0
   (\texttt{is-bge}) \cdot (\texttt{h-borrow}^{(2)}) \cdot (1 - \texttt{h-borrow}^{(2)}) = 0
   (\mathtt{is\text{-}bge}) \cdot (\mathtt{h\text{-}borrow}^{(3)}) \cdot (1-\mathtt{h\text{-}borrow}^{(3)}) = 0
     (\texttt{is-bge}) \cdot (\texttt{h-borrow}^{(4)}) \cdot (1 - \texttt{h-borrow}^{(4)}) = 0
// Enforcing h-rem (j) \in \left[0, 2^8 - 1\right] for j = 1, 2, 3, 4
     (\text{is-bge}) \cdot (\text{h-rem}^{(1)} \in [0, 2^8 - 1]) = 0
   (\texttt{is-bge}) \cdot (\texttt{h-rem}^{(2)} \in \left[0, 2^8 - 1\right]) = 0
     (\text{is-bge}) \cdot (\text{h-rem}^{(3)} \in [0, 2^8 - 1]) = 0
     (\text{is-bge}) \cdot (\text{h-rem}^{(4)} \in [0, 2^8 - 1]) = 0
/\!\!/ Setting h-ltu-flag = h-borrow<sup>(4)</sup>
     (is-bge) \cdot (h-borrow^{(4)} - h-ltu-flag) = 0
// Extracting sign bits from b-val and c-val
     (\texttt{is-bge}) \cdot (\texttt{h-rem-b} + \texttt{h-sgn-b} \cdot 2^7 - \texttt{b-val}^{(4)}) = 0
     (\texttt{is-bge}) \cdot (\texttt{h-rem-c} + \texttt{h-sgn-c} \cdot 2^7 - \texttt{c-val}^{(4)}) = 0
     (\texttt{is-bge}) \cdot (\texttt{h-rem-b} \in \left[0, 2^7 - 1\right]) = 0
   (\texttt{is-bge}) \cdot (\texttt{h-rem-c} \in \left\lceil 0, 2^7 - 1 \right\rceil) = 0
     (\texttt{is-bge}) \cdot (\texttt{h-sgn-b}) \cdot (1 - \texttt{h-sgn-b}) = 0
     (\texttt{is-bge}) \cdot (\texttt{h-sgn-c}) \cdot (1 - \texttt{h-sgn-c}) = 0
// Computing h-lt-flag from h-ltu-flag and sign bits h-sgn-a and h-sgn-b
    (\mathsf{is\text{-}bge}) \cdot ((\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b}) + \mathsf{h\text{-}lt\text{-}flag}((\mathsf{h\text{-}sgn\text{-}a})(\mathsf{h\text{-}sgn\text{-}b}) + (1-\mathsf{h\text{-}sgn\text{-}a})(1-\mathsf{h\text{-}sgn\text{-}b})) - \mathsf{h\text{-}lt\text{-}flag}) = 0
/\!\!/ Setting pc-next based on comparison result, unless the next row is the first row
/\!\!/ pc-next = pc + c-val if h-lt-flag = 0
/\!\!/ pc-next = pc + 4 if h-lt-flag = 1
# h-carry^{(j)} \in \{0,1\} for j=1,2,3,4 used for carry handling
     (\texttt{is-bge}) \cdot ((1-\texttt{h-lt-flag}) \cdot \texttt{c-val}^{(1)} + (\texttt{h-lt-flag}) \cdot 4 + pc^{(1)} - pc - \texttt{next}^{(1)} - \texttt{h-carry}^{(1)} \cdot 2^8) = 0
      \begin{array}{l} (\text{is-bge}) \cdot ((1-\text{h-lt-flag}) \cdot \text{c-val}^{(2)} + \text{pc}^{(2)} + \text{h-carry}^{(1)} - \text{pc-next}^{(2)} - \text{h-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-bge}) \cdot ((1-\text{h-lt-flag}) \cdot \text{c-val}^{(3)} + \text{pc}^{(3)} + \text{h-carry}^{(2)} - \text{pc-next}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \\ \end{array} 
     (\texttt{is-bge}) \cdot ((1-\texttt{h-lt-flag}) \cdot \texttt{c-val}^{(4)} + \texttt{pc}^{(4)} + \texttt{h-carry}^{(3)} - \texttt{pc-next}^{(4)} - \texttt{h-carry}^{(4)} \cdot 2^8) = 0
/\!\!/ Enforcing h-carry^{(j)}\in\{0,1\} for j=1,2,3,4
     (\mathtt{is\text{-}bge}) \cdot (\mathtt{h\text{-}carry}^{(1)}) \cdot (1 - \mathtt{h\text{-}carry}^{(1)}) = 0
     (\texttt{is-bge}) \cdot (\texttt{h-carry}^{(2)}) \cdot (1 - \texttt{h-carry}^{(2)}) = 0
     (is-bge) · (h-carry<sup>(3)</sup>) · (1 - h-carry<sup>(3)</sup>) = 0
     (\texttt{is-bge}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
/\!\!/ Range check \mathrm{pc}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
// Range check pc-next^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
```

```
\label{eq:Range_check_b-val} \begin{subarray}{ll} $\#$ Range check ${\rm c-val}^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $$\#$ Range check ${\rm c-val}^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $$\#$ Range check is-bge $\in \{0,1\}$ - Performed in the CPU component $$\#$ represents the co
```

The set of constraints above is similar to ones for the bgeu instruction in Section 8.6.5, except that it takes the signs of a-val and b-val into account when enforcing the way pc-next is computed.

#### 8.7 Basic Instruction Set: Load Instructions

#### 8.7.1 LB Instruction

The parameters and functionality for the LB instruction are as follows:

```
    opcode: LB
    Parameters: (a-val, b-val, c-val)
    Instruction selector: is-lb = 1
    Functionality:

            base-addr := b-val + c-val mod 2<sup>32</sup>
            a-val ← sext(M[base-addr])
```

The mapping from the 1b instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a             | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------------------|------|------|-------|-------|--------|---------|
| 1b         | $^{\mathrm{rd}}$ | rs1  | i    | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

# Constraints assuming large fields

```
/\!\!/ Computing memory read address b-val + c-val
\bullet \quad (\mathtt{is\text{-}lb}) \cdot (\mathtt{b\text{-}val} + \mathtt{c\text{-}val} - \mathtt{h\text{-}ram\text{-}base\text{-}addr} - \mathtt{h\text{-}carry} \cdot 2^{32}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-lb) \cdot (h-carry) \cdot (1 - h-carry) = 0
# Reading byte from memory address h-ram-base-addr
   (\mathtt{is-lb}) \cdot (\mathrm{Read}_{\mathrm{RAM}}(\mathtt{clk},\mathtt{h-ram-base-addr},0) - \mathtt{ram-val1}) = 0
// Extracting sign bit from ram-val1
   (\mathtt{is\text{-}lb}) \cdot (\mathtt{h\text{-}ram\text{-}val\text{-}rem} + \mathtt{h\text{-}ram\text{-}val\text{-}sgn} \cdot 2^7 - \mathtt{ram\text{-}val1}) = 0
    (is-lb) \cdot (h-ram-val-rem \in [0, 2^7 - 1]) = 0
   (\mathtt{is	ext{-}lb}) \cdot (\mathtt{h	ext{-}ram	ext{-}val	ext{-}sgn}) \cdot (ar{1} - \mathtt{h	ext{-}ram	ext{-}val	ext{-}sgn}) = 0
// Performing sign extension of ram-val1
   (\mathtt{is\text{-}lb}) \cdot (\mathtt{ram\text{-}val1} + \mathtt{h\text{-}ram\text{-}val\text{-}sgn} \cdot (2^{24} - 1) \cdot (2^8) - \mathtt{h\text{-}ram\text{-}sext\text{-}val}) = 0
# Setting output to h-ram-sext-val
 \bullet \quad (\mathtt{is\text{-}lb}) \cdot (\mathtt{a\text{-}val} - \mathtt{h\text{-}ram\text{-}sext\text{-}val}) = 0 
/\!\!/ Range check h-ram-base-addr \in \left[0,2^{32}-1
ight] - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-val1 \in \left[0,2^8-1
ight] - Performed in the RAM component
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-lb \in \{0,1\} - Performed in the CPU component
```

```
/\!\!/ Computing memory read address b-val + c-val
/\!\!/ h-carry ^{(j)} for j=1,2,3,4 used for carry handling
    (\mathtt{is\text{-}lb}) \cdot (\mathtt{b\text{-}val}^{(1)} + \mathtt{c\text{-}val}^{(1)} - \mathtt{h\text{-}ram\text{-}base\text{-}addr}^{(1)} - \mathtt{h\text{-}carry}^{(1)} \cdot 2^8) = 0
    \begin{array}{l} (\text{is-lb}) \cdot (\text{b-val}^{(2)} + \text{c-val}^{(2)} + \text{h-carry}^{(1)} - \text{h-ram-base-addr}^{(2)} - \text{h-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-lb}) \cdot (\text{b-val}^{(3)} + \text{c-val}^{(3)} + \text{h-carry}^{(2)} - \text{h-ram-base-addr}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \end{array}
     (\texttt{is-lb}) \cdot (\texttt{b-val}^{(4)} + \texttt{c-val}^{(4)} + \texttt{h-carry}^{(3)} - \texttt{h-ram-base-addr}^{(4)} - \texttt{h-carry}^{(4)} \cdot 2^8) = 0
   Enforcing h-carry(j) \in \{0,1\} for j=1,2,3,4
     (\texttt{is-lb}) \cdot (\texttt{h-carry}^{(1)}) \cdot (1 - \texttt{h-carry}^{(1)}) = 0
     (\text{is-lb}) \cdot (\text{h-carry}^{(2)}) \cdot (1 - \text{h-carry}^{(2)}) = 0
     (\texttt{is-lb}) \cdot (\texttt{h-carry}^{(3)}) \cdot (1 - \texttt{h-carry}^{(3)}) = 0
    (\texttt{is-lb}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
/\!\!/ h-ram-base-addr = (h-ram-base-addr(1), h-ram-base-addr(3), h-ram-base-addr(3), h-ram-base-addr(4))
# Reading byte ram-val1 from memory address h-ram-base-addr
    (is-lb) \cdot (Read_{RAM}(clk, h-ram-base-addr, 0) - ram-val1) = 0
# Extracting sign bit from ram-val1
    (\mathtt{is\text{-}lb}) \cdot (\mathtt{h\text{-}ram\text{-}val\text{-}rem} + \mathtt{h\text{-}ram\text{-}val\text{-}sgn} \cdot 2^7 - \mathtt{ram\text{-}val1}) = 0
    (is-lb) \cdot (h-ram-val-rem \in [0, 2^7 - 1]) = 0
    (is-lb) \cdot (h-ram-val-sgn) \cdot (1 - h-ram-val-sgn) = 0
// Performing sign extension of ram-val1
    (\texttt{is-lb}) \cdot (\texttt{ram-val1} - \texttt{h-ram-sext-val}^{(1)}) = 0
      \begin{array}{l} \text{(is-lb)} \cdot (\text{h-ram-val-sgn} \cdot (2^8-1) - \text{h-ram-sext-val}^{(2)}) = 0 \\ \text{(is-lb)} \cdot (\text{h-ram-val-sgn} \cdot (2^8-1) - \text{h-ram-sext-val}^{(3)}) = 0 \end{array} 
     (\text{is-lb}) \cdot (\text{h-ram-val-sgn} \cdot (2^8 - 1) - \text{h-ram-sext-val}^{(4)}) = 0
// Setting output to h-ram-sext-val
    (\mathtt{is\text{-}lb}) \cdot (\mathtt{a\text{-}val}^{(1)} - \mathtt{h\text{-}ram\text{-}sext\text{-}val}^{(1)}) = 0
    (\texttt{is-lb}) \cdot (\texttt{a-val}^{(2)} - \texttt{h-ram-sext-val}^{(2)}) = 0
    (\text{is-lb}) \cdot (\text{a-val}^{(3)} - \text{h-ram-sext-val}^{(3)}) = 0
    (is-lb) \cdot (a-val^{(4)} - h-ram-sext-val^{(4)}) = 0
/\!\!/ Range check h-ram-base-addr^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-val1 \in \left[0,2^8-1
ight] - Performed in the RAM component
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)}\in igl[0,2^8-1igr] for j=1,2,3,4 - Performed in the CPU component
# Range check c-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-lb \in \{0,1\} - Performed in the CPU component
```

# 8.7.2 LH Instruction

The parameters and functionality for the LH instruction are as follows:

```
opcode: LH
Parameters: (a-val, b-val, c-val)
Instruction selector: is-lh = 1
Functionality:
```

- base-addr := b-val + c-val mod  $2^{32}$ - a-val  $\leftarrow \text{sext}(M[\text{base-addr} + 1] \parallel M[\text{base-addr}])$ 

• Requirements: base-addr needs to be a multiple of 2 for memory alignment

The mapping from the 1h instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a             | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------------------|------|------|-------|-------|--------|---------|
| lh         | $^{\mathrm{rd}}$ | rs1  | i    | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

## Constraints assuming large fields

```
/\!\!/ Computing memory read address b-val + c-val \equiv h-ram-base-addr \cdot 2
// h-ram-base-addr used to enforce memory alignment
\bullet \quad (\mathtt{is-lh}) \cdot (\mathtt{b-val} + \mathtt{c-val} - 2 \cdot \mathtt{h-ram-base-addr-aux} - \mathtt{h-carry} \cdot 2^{32}) = 0
   (\mathtt{is-lh}) \cdot (2 \cdot \mathtt{h-ram-base-addr-aux} - \mathtt{h-ram-base-addr}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-lh) \cdot (h-carry) \cdot (1 - h-carry) = 0
/\!\!/ Enforcing h-ram-base-addr-aux \in \left[0,2^{31}-1\right]
  (\texttt{is-lh}) \cdot (\texttt{h-ram-base-addr-aux} \in \left\lceil 0, 2^{31} - 1 \right\rceil) = 0
# Reading byte ram-val1 from memory address h-ram-base-addr
/\!\!/ Reading byte ram-val2 from memory address h-ram-base-addr + 1
    (is-lh) \cdot (Read_{RAM}(clk, h-ram-base-addr, 0) - ram-val1) = 0
    (\mathtt{is\text{-}lh}) \cdot (\mathrm{Read}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{h\text{-}ram\text{-}base\text{-}addr}, 1) - \mathtt{ram\text{-}val2}) = 0
# Extracting sign bit from ram-val2
• (is-lh) · (h-ram-val-rem + h-ram-val-sgn · 2^7 - ram-val2) = 0
  (is-lh) \cdot (h-ram-val-rem \in [0, 2^7 - 1]) = 0
• (is-lh) · (h-ram-val-sgn) · (1 - h-ram-val-sgn) = 0
// Performing sign extension of (ram-val1, ram-val2)
\bullet \quad (\mathtt{is-lh}) \cdot (\mathtt{ram-val1} + \mathtt{ram-val2} \cdot 2^8 + \mathtt{h-ram-val-sgn} \cdot (2^{16} - 1) \cdot (2^{16}) - \mathtt{h-ram-sext-val}) = 0
// Setting output to h-ram-sext-val
• (is-lh) \cdot (a-val - h-ram-sext-val) = 0
/\!\!/ Range check h-ram-base-addr \in \left[0,2^{32}-1
ight] - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-vali \in \left[0, 2^8 - 1
ight] for i = 1, 2 - Performed in the RAM component
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-lh \in \{0,1\} - Performed in the CPU component
```

```
/\!\!/ Computing memory read address b-val + c-val \equiv h-ram-base-addr \cdot 2
# h-ram-base-addr used to enforce memory alignment
// h-carry(j) for j = 1, 2, 3, 4 used for carry handling
    (\mathtt{is-lh}) \cdot (\mathtt{b-val}^{(1)} + \mathtt{c-val}^{(1)} - 2 \cdot \mathtt{h-ram-base-addr-aux} - \mathtt{h-carry}^{(1)} \cdot 2^8) = 0
      (is-lh) \cdot (2 \cdot h-ram-base-addr-aux — h-ram-base-addr^{(1)}) = 0
   \begin{array}{l} (\text{is-lh}) \cdot (2 \cdot \text{h-lam-base-addr-adx} - \text{h-lam-base-addr}^{(2)}) = 0 \\ (\text{is-lh}) \cdot (\text{b-val}^{(2)} + \text{c-val}^{(2)} + \text{h-carry}^{(1)} - \text{h-ram-base-addr}^{(2)} - \text{h-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-lh}) \cdot (\text{b-val}^{(3)} + \text{c-val}^{(3)} + \text{h-carry}^{(2)} - \text{h-ram-base-addr}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \\ (\text{is-lh}) \cdot (\text{b-val}^{(4)} + \text{c-val}^{(4)} + \text{h-carry}^{(3)} - \text{h-ram-base-addr}^{(4)} - \text{h-carry}^{(4)} \cdot 2^8) = 0 \end{array}
// Enforcing h-carry (j) \in \{0,1\} for j=1,2,3,4
 \bullet \quad (\texttt{is-lh}) \cdot (\texttt{h-carry}^{(1)}) \cdot (1 - \texttt{h-carry}^{(1)}) = 0 
      (\mathtt{is\text{-}lh}) \cdot (\mathtt{h\text{-}carry}^{(2)}) \cdot (1-\mathtt{h\text{-}carry}^{(2)}) = 0
      \begin{array}{l} \text{(is-lh)} \cdot (\text{h-carry}^{(3)}) \cdot (1 - \text{h-carry}^{(3)}) = 0 \\ \text{(is-lh)} \cdot (\text{h-carry}^{(4)}) \cdot (1 - \text{h-carry}^{(4)}) = 0 \end{array}
/\!\!/ Enforcing h-ram-base-addr-aux \in \left[0,2^7-1
ight]
    (\mathtt{is-lh}) \cdot (\mathtt{h-ram-base-addr-aux} \in \left[0,2^7-1\right]) = 0
 \textit{\#} \ \texttt{h-ram-base-addr} = (\texttt{h-ram-base-addr}^{(1)}, \texttt{h-ram-base-addr}^{(3)}, \texttt{h-ram-base-addr}^{(3)}, \texttt{h-ram-base-addr}^{(4)}) 
# Reading byte ram-val1 from memory address h-ram-base-addr
/\!\!/ Reading byte ram-val2 from memory address h-ram-base-addr + 1
• (is-lh) \cdot (\text{Read}_{RAM}(\text{clk}, \text{h-ram-base-addr}, 0) - \text{ram-val1}) = 0
```

```
\bullet \quad (\mathtt{is\text{-}lh}) \cdot (\mathrm{Read}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{h\text{-}ram\text{-}base\text{-}addr}, 1) - \mathtt{ram\text{-}val2}) = 0
// Extracting sign bit from ram-val2
    (\mathtt{is\text{-}lh}) \cdot (\mathtt{h\text{-}ram\text{-}val\text{-}rem} + \mathtt{h\text{-}ram\text{-}val\text{-}sgn} \cdot 2^7 - \mathtt{ram\text{-}val2}) = 0
    (is-lh) \cdot (h-ram-val-rem \in [0, 2^7 - 1]) = 0
    (is-lh) \cdot (h-ram-val-sgn) \cdot (1 - h-ram-val-sgn) = 0
// Performing sign extension of (ram-val1,ram-val2)
    (\mathtt{is\text{-}lh}) \cdot (\mathtt{ram\text{-}val1} - \mathtt{h\text{-}ram\text{-}sext\text{-}val}^{(1)}) = 0
    (\mathtt{is\text{-}lh}) \cdot (\mathtt{ram\text{-}val2} - \mathtt{h\text{-}ram\text{-}sext\text{-}val}^{(2)}) = 0
    // Setting output to h-ram-sext-val
    (\mathtt{is\text{-}lh}) \cdot (\mathtt{a\text{-}val}^{(1)} - \mathtt{h\text{-}ram\text{-}sext\text{-}val}^{(1)}) = 0
    (\texttt{is-lh}) \cdot (\texttt{a-val}^{(2)} - \texttt{h-ram-sext-val}^{(2)}) = 0
    (is-lh) \cdot (a-val^{(3)} - h-ram-sext-val^{(3)}) = 0
    (\texttt{is-lh}) \cdot (\texttt{a-val}^{(4)} - \texttt{h-ram-sext-val}^{(4)}) = 0
/\!\!/ Range check h-ram-base-addr^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-valj\in \left[0,2^8-1
ight] for j=1,2 - Performed in the RAM component
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)}\in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
# Range check c-val^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 - Performed in the CPU component
/\!\!/ Range check is-lh \in \{0,1\} - Performed in the CPU component
```

#### 8.7.3 LW Instruction

The parameters and functionality for the LW instruction are as follows:

```
opcode: LW
Parameters: (a-val, b-val, c-val)
Instruction selector: is-lw = 1
```

• Functionality:

```
- base-addr := b-val + c-val mod 2^{32}

- a-val ← M[\text{base-addr} + 3] \parallel M[\text{base-addr} + 2] \parallel M[\text{base-addr} + 1] \parallel M[\text{base-addr}]
```

• Requirements: base-addr needs to be a multiple of 4 for memory alignment

The mapping from the lw instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------|------|------|-------|-------|--------|---------|
| lw         | rd   | rs1  | i    | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

```
/\!\!/ Reading byte ram-val2 from memory address h-ram-base-addr +\,1
/\!\!/ Reading byte ram-val3 from memory address h-ram-base-addr +\;2
/\!\!/ Reading byte ram-val4 from memory address h-ram-base-addr +3
• (is-lw) \cdot (\operatorname{Read}_{\operatorname{RAM}}(\operatorname{clk},\operatorname{h-ram-base-addr},0) - \operatorname{ram-val1}) = 0
   (is-lw) \cdot (Read_{RAM}(clk, h-ram-base-addr, 1) - ram-val2) = 0
\bullet \quad (\mathtt{is\text{-lw}}) \cdot (\mathrm{Read}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{h\text{-ram-base-addr}}, 2) - \mathtt{ram\text{-val3}}) = 0
  (\mathtt{is-lw}) \cdot (\mathrm{Read}_{\mathrm{RAM}}(\mathtt{clk},\mathtt{h-ram-base-addr},3) - \mathtt{ram-val4}) = 0
/\!\!/ Computing h-ram-sext-val from ram-vali
• (is-lw) \cdot (ram-val1 + ram-val2 \cdot 2^8 + ram-val3 \cdot 2^{16} + ram-val4 \cdot 2^{24} - h-ram-sext-val) = 0
// Setting output to h-ram-sext-val
• (is-lw) \cdot (a-val - h-ram-sext-val) = 0
/\!\!/ Range check h-ram-base-addr \in \left[0,2^{32}-1
ight] - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-valj \in \left[0, 2^8-1
ight] for j=1,2,3,4 - Performed in the RAM component
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check c-val \in \left[ ar{0}, 2^{32} - 1 
ight] - Performed in the CPU component
/\!\!/ Range check is-lw \in \{0,1\} - Performed in the CPU component
```

```
/\!\!/ Computing memory read address b-val + c-val \equiv h-ram-base-addr \cdot 4
# h-ram-base-addr used to enforce memory alignment
# h-carry for j = 1, 2, 3, 4 used for carry handling
• (is-lw) \cdot (b-val^{(1)} + c-val^{(1)} - 4 \cdot h-ram-base-addr-aux - h-carry^{(1)} \cdot 2^8) = 0
   (\mathtt{is-lw}) \cdot (4 \cdot \mathtt{h-ram-base-addr-aux} - \mathtt{h-ram-base-addr}^{(1)}) = 0
   \begin{array}{l} (\text{is-lw}) \cdot (\text{b-val}^{(2)} + \text{c-val}^{(2)} + \text{h-carry}^{(1)} - \text{h-ram-base-addr}^{(2)} - \text{h-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-lw}) \cdot (\text{b-val}^{(3)} + \text{c-val}^{(3)} + \text{h-carry}^{(2)} - \text{h-ram-base-addr}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \\ (\text{is-lw}) \cdot (\text{b-val}^{(4)} + \text{c-val}^{(4)} + \text{h-carry}^{(3)} - \text{h-ram-base-addr}^{(4)} - \text{h-carry}^{(4)} \cdot 2^8) = 0 \end{array}
/\!\!/ Enforcing h-ram-base-addr-aux \in [0, 2^6 - 1]
• (is-lw) \cdot (h-ram-base-addr-aux \in [0, 2^6 - 1]) = 0
# Enforcing h-carry(j) \in \{0,1\} for j = 1,2,3,4
    (\texttt{is-lw}) \cdot (\texttt{h-carry}^{(1)}) \cdot (1 - \texttt{h-carry}^{(1)}) = 0
    (is-lw) \cdot (h-carry^{(2)}) \cdot (1-h-carry^{(2)}) = 0
    (\text{is-lw}) \cdot (\text{h-carry}^{(3)}) \cdot (1 - \text{h-carry}^{(3)}) = 0
    (\text{is-lw}) \cdot (\text{h-carry}^{(4)}) \cdot (1 - \text{h-carry}^{(4)}) = 0
\# h-ram-base-addr = (h-ram-base-addr ^{(1)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(4)})
# Reading byte ram-val1 from memory address h-ram-base-addr
\# Reading byte ram-val2 from memory address h-ram-base-addr + 1
\# Reading byte ram-val3 from memory address h-ram-base-addr +2
\# Reading byte ram-val4 from memory address h-ram-base-addr +3
• (is-lw) \cdot (\text{Read}_{RAM}(\text{clk}, \text{h-ram-base-addr}, 0) - \text{ram-val1}) = 0
    (is-lw) \cdot (Read_{RAM}(clk, h-ram-base-addr, 1) - ram-val2) = 0
    (\texttt{is-lw}) \cdot (\texttt{Read}_{RAM}(\texttt{clk}, \texttt{h-ram-base-addr}, 2) - \texttt{ram-val3}) = 0
    (is-lw) \cdot (Read_{RAM}(clk, h-ram-base-addr, 3) - ram-val4) = 0
/\!\!/ Setting a-val^{(j)} to ram-valj for j=1,2,3,4
   (\texttt{is-lw}) \cdot (\texttt{a-val}^{(1)} - \texttt{ram-val1}) = 0
   (\mathsf{is}\text{-lw}) \cdot (\mathsf{a}\text{-val}^{(2)} - \mathsf{ram}\text{-val2}) = 0
   (\mathsf{is}\text{-lw}) \cdot (\mathsf{a}\text{-val}^{(3)} - \mathsf{ram}\text{-val3}) = 0
  (\mathtt{is-lw}) \cdot (\mathtt{a-val}^{(4)} - \mathtt{ram-val4}) = 0
/\!\!/ Range check h-ram-base-addr^{(j)}\in [0,2^8-1] for j=1,2,3,4 - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-valj \in \left[0, 2^8 - 1
ight] for j = 1, 2, 3, 4 - Performed in the RAM component
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
// Range check b-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
```

```
/\!\!/ Range check c-val ^{(j)}\in\left[0,2^8-1\right] for j=1,2,3,4 - Performed in the CPU component /\!\!/ Range check is-lw \in\{0,1\} - Performed in the CPU component
```

#### 8.7.4 LBU Instruction

The parameters and functionality for the LBU instruction are as follows:

```
opcode: LBUParameters: (a-val, b-val, c-val)
```

Parameters: (a-vai, b-vai, c-vai
 Instruction selector: is-lbu = 1

• Functionality:

- base-addr  $\coloneqq$  b-val + c-val mod  $2^{32}$ 

 $- \text{ a-val} \leftarrow \text{zext}(M[\text{base-addr}])$ 

where zext is the zero extension function

The mapping from the 1bu instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a             | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------------------|------|------|-------|-------|--------|---------|
| lbu        | $^{\mathrm{rd}}$ | rs1  | i    | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

#### Constraints assuming large fields

```
/\!\!/ Computing memory read address b-val + c-val
\bullet \quad (\mathtt{is\text{-}lbu}) \cdot (\mathtt{b\text{-}val} + \mathtt{c\text{-}val} - \mathtt{h\text{-}ram\text{-}base\text{-}addr} - \mathtt{h\text{-}carry} \cdot 2^{32}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-lbu) \cdot (h-carry) \cdot (1 - h-carry) = 0
# Reading byte from memory address h-ram-base-addr
• (is-lbu) \cdot (Read_{RAM}(clk, h-ram-base-addr, 0) - ram-val1) = 0
# Performing zero extension of ram-val1
• (is-lbu) \cdot (ram-val1 - h-ram-zext-val) = 0
# Setting a-val to h-ram-zext-val
\bullet \quad (\mathtt{is\text{-}lbu}) \cdot (\mathtt{a\text{-}val} - \mathtt{h\text{-}ram\text{-}zext\text{-}val}) = 0
/\!\!/ Range check h-ram-base-addr \in \left[0,2^{32}-1
ight] - Guaranteed by the RAM memory checking
// Range check ram-val1 \in \left\lceil 0, 2^8 - 1 \right
ceil - Performed in the RAM component
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
# Range check b-val \in [0,2^{32}-1] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check is-lbu \in \{0,1\} - Performed in the CPU component
```

```
(\mathtt{is\text{-}lbu}) \cdot (\mathtt{h\text{-}carry}^{(1)}) \cdot (1-\mathtt{h\text{-}carry}^{(1)}) = 0
    (\texttt{is-lbu}) \cdot (\texttt{h-carry}^{(2)}) \cdot (1 - \texttt{h-carry}^{(2)}) = 0
    (\texttt{is-lbu}) \cdot (\texttt{h-carry}^{(3)}) \cdot (1 - \texttt{h-carry}^{(3)}) = 0
    (\texttt{is-lbu}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
/\!\!/ h-ram-base-addr = (h-ram-base-addr ^{(1)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(4)})
# Reading byte ram-val1 from memory address h-ram-base-addr
  (\mathtt{is}\mathtt{-lbu}) \cdot (\mathrm{Read}_{\mathrm{RAM}}(\mathtt{clk},\mathtt{h}\mathtt{-ram}\mathtt{-base}\mathtt{-addr},0) - \mathtt{ram}\mathtt{-val1}) = 0
# Performing zero extension of ram-val1
    (\texttt{is-lbu}) \cdot (\texttt{ram-val1} - \texttt{h-ram-zext-val}^{(1)}) = 0
    (is-lbu) \cdot (h-ram-zext-val^{(2)}) = 0
    (is-lbu) \cdot (h-ram-zext-val^{(3)}) = 0
    (is-lbu) \cdot (h-ram-zext-val^{(4)}) = 0
// Setting output to h-ram-zext-val
    (\mathtt{is\text{-}lbu}) \cdot (\mathtt{a\text{-}val}^{(1)} - \mathtt{h\text{-}ram\text{-}zext\text{-}val}^{(1)}) = 0
    \begin{array}{l} (\text{is-lbu}) \cdot (\text{a-val}^{(2)} - \text{h-ram-zext-val}^{(2)}) = 0 \\ (\text{is-lbu}) \cdot (\text{a-val}^{(3)} - \text{h-ram-zext-val}^{(3)}) = 0 \\ (\text{is-lbu}) \cdot (\text{a-val}^{(4)} - \text{h-ram-zext-val}^{(4)}) = 0 \end{array}
/\!\!/ Range check h-ram-base-addr^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-val1 \in \left[0,2^8-1
ight] - Performed in the RAM component
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
// Range check c-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-lbu \in \{0,1\} - Performed in the CPU component
```

#### 8.7.5 LHU Instruction

The parameters and functionality for the LHU instruction are as follows:

```
• opcode: LHU
• Parameters: (a-val, b-val, c-val)
• Instruction selector: is-lhu = 1
• Functionality:

- base-addr := b-val + c-val mod 2^{32}
- a-val \leftarrow M[\text{base-addr} + 1] \parallel M[\text{base-addr}]
```

where zext is the zero extension function

• Requirements: base-addr needs to be a multiple of 2 for memory alignment

The mapping from the 1hu instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------|------|------|-------|-------|--------|---------|
| lhu        | rd   | rs1  | i    | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

```
\label{eq:computing memory read address b-val} $$ + c-val \equiv h-ram-base-addr \cdot 2 $$ // h-ram-base-addr used to enforce memory alignment $$ (is-lhu) \cdot (b-val + c-val - 2 \cdot h-ram-base-addr-aux - h-carry \cdot 2^{32}) = 0 $$ (is-lhu) \cdot (2 \cdot h-ram-base-addr-aux - h-ram-base-addr) = 0 $$ // Enforcing h-carry \in \{0,1\} $$ (is-lhu) \cdot (h-carry) \cdot (1-h-carry) = 0 $$
```

```
/\!\!/ Enforcing h-ram-base-addr-aux \in \left[0,2^{31}-1
ight]
• (is-lhu) · (h-ram-base-addr-aux \in [0, 2^{31} - 1]) = 0
# Reading byte ram-val1 from memory address h-ram-base-addr
/\!\!/ Reading byte ram-val2 from memory address h-ram-base-addr + 1
   (is-lhu) \cdot (Read_{RAM}(clk, h-ram-base-addr, 0) - ram-val1) = 0
   (\mathtt{is\text{-}lhu}) \cdot (\mathrm{Read}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{h\text{-}ram\text{-}base\text{-}addr}, 1) - \mathtt{ram\text{-}val2}) = 0
// Performing zero extension of (ram-val1,ram-val2)
• (is-lhu) \cdot (ram-val1 + ram-val2 \cdot 2<sup>8</sup> - h-ram-zext-val) = 0
// Setting output to h-ram-zext-val
• (is-lhu) \cdot (a-val - h-ram-zext-val) = 0
/\!\!/ Range check h-ram-base-addr \in \left[0,2^{32}-1
ight] - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-vali\in \left[0,2^8-1
ight] for i=1,2 - Performed in the RAM component
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
# Range check c-val \in [0,2^{32}-1] - Performed in the CPU component
// Range check is-lhu \in \{0,1\} - Performed in the CPU component
```

```
/\!\!/ Computing memory read address b-val + c-val \equiv h-ram-base-addr \cdot 2
# h-ram-base-addr used to enforce memory alignment
# h-carry for j = 1, 2, 3, 4 used for carry handling
• (is-lhu) \cdot (b-val<sup>(1)</sup> + c-val<sup>(1)</sup> - 2 \cdot h-ram-base-addr-aux - h-carry<sup>(1)</sup> \cdot 2<sup>8</sup>) = 0
   (is-lhu) \cdot (2 \cdot h-ram-base-addr-aux - h-ram-base-addr^{(1)}) = 0
  (\texttt{is-lhu}) \cdot (\texttt{b-val}^{(2)} + \texttt{c-val}^{(2)} + \texttt{h-carry}^{(1)} - \texttt{h-ram-base-addr}^{(2)} - \texttt{h-carry}^{(2)} \cdot 2^8) = 0
   (\text{is-lhu}) \cdot (\text{b-val}^{(3)} + \text{c-val}^{(3)} + \text{h-carry}^{(2)} - \text{h-ram-base-addr}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0
   (\text{is-lhu}) \cdot (\text{b-val}^{(4)} + \text{c-val}^{(4)} + \text{h-carry}^{(3)} - \text{h-ram-base-addr}^{(4)} - \text{h-carry}^{(4)} \cdot 2^8) = 0
# Enforcing h-carry(j) \in \{0,1\} for j = 1,2,3,4
   (\mathtt{is-lhu}) \cdot (\mathtt{h-carry}^{(1)}) \cdot (1-\mathtt{h-carry}^{(1)}) = 0
   (\text{is-lhu}) \cdot (\text{h-carry}^{(2)}) \cdot (1 - \text{h-carry}^{(2)}) = 0
    (is-lhu) \cdot (h-carry^{(3)}) \cdot (1-h-carry^{(3)}) = 0
   (\texttt{is-lhu}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
# Enforcing h-ram-base-addr-aux \in [0, 2^7 - 1]
   (is-lhu) \cdot (h-ram-base-addr-aux \in [0, 2^7 - 1]) = 0
/\!\!/ h-ram-base-addr = (h-ram-base-addr ^{(1)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(4)})
# Reading byte ram-val1 from memory address h-ram-base-addr
/\!\!/ Reading byte ram-val2 from memory address h-ram-base-addr +1
    (is-lhu) \cdot (Read_{RAM}(clk, h-ram-base-addr, 0) - ram-val1) = 0
   (\texttt{is-lhu}) \cdot (\texttt{Read}_{RAM}(\texttt{clk}, \texttt{h-ram-base-addr}, 1) - \texttt{ram-val2}) = 0
// Performing zero extension of (ram-val1, ram-val2)
• (is-lhu) \cdot (ram-val1 - h-ram-zext-val<sup>(1)</sup>) = 0
    (is-lhu) \cdot (ram-val2 - h-ram-zext-val^{(2)}) = 0
   (is-lhu) \cdot (h-ram-sext-val^{(3)}) = 0
   (is-lhu) \cdot (h-ram-sext-val^{(4)}) = 0
// Setting output to h-ram-zext-val
  (\texttt{is-lhu}) \cdot (\texttt{a-val}^{(1)} - \texttt{h-ram-zext-val}^{(1)}) = 0
   (is-lhu) \cdot (a-val^{(2)} - h-ram-zext-val^{(2)}) = 0
  (is-lhu) \cdot (a-val^{(3)} - h-ram-zext-val^{(3)}) = 0
  (\text{is-lhu}) \cdot (\text{a-val}^{(4)} - \text{h-ram-zext-val}^{(4)}) = 0
/\!\!/ Range check h-ram-base-addr^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-valj \in \left[0, 2^8 - 1
ight] for j = 1, 2 - Performed in the RAM component
/\!\!/ Range check a-val^{(j)}\in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
```

```
\label{eq:Range_check_b-val} \begin{subarray}{l} $\#$ Range check ${\tt b-val}^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range check ${\tt c-val}^{(j)} \in \left[0,2^8-1\right]$ for $j=1,2,3,4$ - Performed in the CPU component $\#$ Range check is-lhu $\in \{0,1\}$ - Performed in the CPU component
```

## 8.8 Basic Instruction Set: Store Instructions

#### 8.8.1 SB Instruction

The parameters and functionality for the SB instruction are as follows:

```
opcode: SB
Parameters: (a-val, b-val, c-val)
Instruction selector: is-sb = 1
```

- Functionality:
  - base-addr  $\coloneqq$  b-val + c-val mod  $2^{32}$
  - $-M[\text{base-addr}] \leftarrow \text{a-val \& 0x000000FF}$

The mapping from the sb instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| sb         | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

#### Constraints assuming large fields

```
/\!\!/ Computing memory write address b-val + c-val \equiv h-ram-base-addr
  (is-sb) \cdot (b-val + c-val - h-ram-base-addr - h-carry \cdot 2^{32}) = 0
  Enforcing h-carry \in \{0,1\}
   (\texttt{is-sb}) \cdot (\texttt{h-carry}) \cdot (1 - \texttt{h-carry}) = 0
# Extracting ram-val1 from bits 0-7 of a-val
   (\texttt{is-sb}) \cdot (\texttt{a-val} - \texttt{h-ram-val-rem} \cdot 2^8 - \texttt{ram-val1}) = 0
   (\texttt{is-sb}) \cdot (\texttt{ram-val1} \in \left[0, 2^8 - 1\right]) = 0
   (\mathtt{is\text{-}sb}) \cdot (\mathtt{h\text{-}ram\text{-}val\text{-}rem} \in \left\lceil 0, 2^{\overset{\checkmark}{2}4} - 1 \right\rceil) = 0
# Writing byte ram-val1 at memory address h-ram-base-addr
  (is-sb) \cdot (Write_{RAM}(clk, ram-val1, h-ram-base-addr, 0) - ram-val1) = 0
/\!\!/ Range check h-ram-base-addr \in \left[0,2^{32}-1
ight] - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-val1 \in \left[0,2^8-1
ight] - Also performed in the RAM component
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[ 0,2^{32}-1 
ight] - Performed in the CPU component
/\!\!/ Range check is-sb \in \{0,1\} - Performed in the CPU component
```

```
// Enforcing h-carry (j) \in \{0,1\} for j=1,2,3,4
   (\texttt{is-sb}) \cdot (\texttt{h-carry}^{(1)}) \cdot (1 - \texttt{h-carry}^{(1)}) = 0
    (\texttt{is-sb}) \cdot (\texttt{h-carry}^{(2)}) \cdot (1 - \texttt{h-carry}^{(2)}) = 0
   (is-sb) \cdot (h-carry^{(3)}) \cdot (1-h-carry^{(3)}) = 0
   (\mathsf{is}\mathsf{-sb}) \cdot (\mathsf{h}\mathsf{-carry}^{(4)}) \cdot (1 - \mathsf{h}\mathsf{-carry}^{(4)}) = 0
/\!\!/ Extracting ram-val1 from bits 0-7 of a-val
• (is-sb) \cdot (a-val^{(1)} - ram-val1) = 0
/\!\!/ h-ram-base-addr = (h-ram-base-addr ^{(1)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(4)})
// Writing byte ram-val1 at memory address h-ram-base-addr
• (is-sb) \cdot (Write_{RAM}(clk, ram-val1, h-ram-base-addr, 0) - ram-val1) = 0
/\!\!/ Range check h-ram-base-addr^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-val1 \in \left[0,2^8-1
ight] - Follows from a-val^{(1)} constraint
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)}\in igl[0,2^8-1igr] for j=1,2,3,4 - Performed in the CPU component
// Range check c-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-sb \in \{0,1\} - Performed in the CPU component
```

#### 8.8.2 SH Instruction

The parameters and functionality for the SH instruction are as follows:

```
opcode: SH
Parameters: (a-val, b-val, c-val)
Instruction selector: is-sh = 1
```

- Functionality:
  - base-addr := b-val + c-val mod  $2^{32}$
  - $-M[\text{base-addr}] \leftarrow \text{a-val \& 0x000000FF}$
  - $-M[\text{base-addr}+1] \leftarrow \text{a-val \& 0x0000FF00}$
- Requirements: base-addr needs to be a multiple of 2 for memory alignment

The mapping from the sh instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| sh         | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

```
• (is-sh) · (h-ram-val-rem \in \left[0,2^{16}-1\right]) = 0 

# h-ram-base-addr = (h-ram-base-addr^{(1)}, h-ram-base-addr^{(3)}, h-ram-base-addr^{(3)}, h-ram-base-addr^{(4)}) 

# Writing byte ram-val1 at memory address h-ram-base-addr + 1 

• (is-sh) · (Write_{RAM}(clk, ram-val1, h-ram-base-addr, 0) - ram-val1) = 0 

• (is-sh) · (Write_{RAM}(clk, ram-val2, h-ram-base-addr, 1) - ram-val2) = 0 

# Range check h-ram-base-addr \in \left[0,2^{32}-1\right] - Guaranteed by the RAM memory checking 

# Range check ram-vali \in \left[0,2^{8}-1\right] for i=1,2 - Also performed in the RAM component 

# Range check a-val \in \left[0,2^{32}-1\right] - Performed in the CPU component 

# Range check c-val \in \left[0,2^{32}-1\right] - Performed in the CPU component 

# Range check is-sh \in \left\{0,1\right\} - Performed in the CPU component
```

#### Constraints assuming small fields

```
/\!\!/ Computing memory write address b-val + c-val \equiv h-ram-base-addr \cdot 2
# h-ram-base-addr used to enforce memory alignment
# h-carry (j) for j=1,2,3,4 used for carry handling
   (\text{is-sh}) \cdot (\text{b-val}^{(1)} + \text{c-val}^{(1)} - 2 \cdot \text{h-ram-base-addr-aux} - \text{h-carry}^{(1)} \cdot 2^8) = 0
     (\mathtt{is}\mathtt{-sh}) \cdot (2 \cdot \mathtt{h}\mathtt{-ram}\mathtt{-base}\mathtt{-addr}\mathtt{-aux} - \mathtt{h}\mathtt{-ram}\mathtt{-base}\mathtt{-addr}^{(1)}) = 0
    \begin{array}{l} (\text{is-sh}) \cdot (\text{b-val}^{(2)} + \text{c-val}^{(2)} + \text{h-carry}^{(1)} - \text{h-ram-base-addr}^{(2)} - \text{h-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-sh}) \cdot (\text{b-val}^{(3)} + \text{c-val}^{(3)} + \text{h-carry}^{(2)} - \text{h-ram-base-addr}^{(3)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \\ (\text{is-sh}) \cdot (\text{b-val}^{(4)} + \text{c-val}^{(4)} + \text{h-carry}^{(3)} - \text{h-ram-base-addr}^{(4)} - \text{h-carry}^{(3)} \cdot 2^8) = 0 \end{array}
// Enforcing h-carry (j) \in \{0,1\} for j=1,2,3,4
     (\mathtt{is\text{-}sh}) \cdot (\mathtt{h\text{-}carry}^{(1)}) \cdot (1-\mathtt{h\text{-}carry}^{(1)}) = 0
   \begin{array}{l} (\texttt{is-sh}) \cdot (\texttt{h-carry}^{(2)}) \cdot (\texttt{1-h-carry}^{(2)}) = 0 \\ (\texttt{is-sh}) \cdot (\texttt{h-carry}^{(2)}) \cdot (\texttt{1-h-carry}^{(2)}) = 0 \\ (\texttt{is-sh}) \cdot (\texttt{h-carry}^{(3)}) \cdot (\texttt{1-h-carry}^{(3)}) = 0 \\ (\texttt{is-sh}) \cdot (\texttt{h-carry}^{(4)}) \cdot (\texttt{1-h-carry}^{(4)}) = 0 \end{array}
/\!\!/ Enforcing h-ram-base-addr-aux \in \left[0,2^7-1
ight]
     (\mathtt{is}\mathtt{-sh})\cdot(\mathtt{h}\mathtt{-ram}\mathtt{-base}\mathtt{-addr}\mathtt{-aux}\in\left[0,2^7-1
ight])=0
// Extracting (ram-val1, ram-val2) from bits 0-15 of a-val
     (\mathtt{is}\mathtt{-sh})\cdot(\mathtt{a}\mathtt{-val}^{(1)}-\mathtt{ram}\mathtt{-val1})=0
     (\mathsf{is}\mathsf{-sh}) \cdot (\mathsf{a}\mathsf{-val}^{(2)} - \mathsf{ram}\mathsf{-val2}) = 0
\# h-ram-base-addr = (h-ram-base-addr ^{(1)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(4)})
# Writing byte ram-val1 at memory address h-ram-base-addr
/\!\!/ Writing byte ram-val2 at memory address h-ram-base-addr +\,1
\bullet \quad (\mathtt{is}\mathtt{-sh}) \cdot (\mathrm{Write}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{ram}\mathtt{-val1}, \mathtt{h}\mathtt{-ram}\mathtt{-base}\mathtt{-addr}, 0) - \mathtt{ram}\mathtt{-val1}) = 0
     (is-sh) \cdot (Write_{RAM}(clk, ram-val2, h-ram-base-addr, 1) - ram-val2) = 0
/\!\!/ Range check h-ram-base-addr^{(j)}\in[0,2^8-1] for j=1,2,3,4 - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-valj \in \left[0, 2^8 - 1
ight] for j = 1, 2 - Performed in the RAM component
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)}\in igl[0,2^8-1igr] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check c-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
// Range check is-sh \in \{0,1\} - Performed in the CPU component
```

#### 8.8.3 SW Instruction

The parameters and functionality for the SW instruction are as follows:

```
opcode: SWParameters: (a-val, b-val, c-val)
```

```
• Instruction selector: is-sw = 1
```

• Functionality:

```
- base-addr = b-val + c-val mod 2^{32}
-M[\text{base-addr}] \leftarrow \text{a-val } \& \text{ 0x000000FF}
```

- $-M[\text{base-addr}+1] \leftarrow \text{a-val } \& \text{ 0x0000FF00}$
- $-M[\text{base-addr}+2] \leftarrow \text{a-val \& 0x00FF0000}$
- $-M[\text{base-addr}+3] \leftarrow \text{a-val } \& \text{ 0xFF0000000}$

• Requirements: base-addr needs to be a multiple of 4 for memory alignment

The mapping from the sw instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val  | b-val  | c-val   |
|------------|------|------|------|-------|--------|--------|---------|
| sw         | rs1  | rs2  | i    | 1     | R[rs1] | R[rs2] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

#### Constraints assuming large fields

```
/\!\!/ Computing memory write address b-val + c-val \equiv h-ram-base-addr \cdot 4
// h-ram-base-addr used to enforce memory alignment
    (\mathtt{is\text{-}sw}) \cdot (\mathtt{b\text{-}val} + \mathtt{c\text{-}val} - 4 \cdot \mathtt{h\text{-}ram\text{-}base\text{-}addr\text{-}aux} - \mathtt{h\text{-}carry} \cdot 2^{32}) = 0
    (\mathtt{is\text{-}sw}) \cdot (4 \cdot \mathtt{h\text{-}ram\text{-}base\text{-}addr\text{-}aux} - \mathtt{h\text{-}ram\text{-}base\text{-}addr}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
  (\texttt{is-sw}) \cdot (\texttt{h-carry}) \cdot (1 - \texttt{h-carry}) = 0
/\!\!/ Enforcing h-ram-base-addr-aux \in \left[0,2^{30}-1\right]
    (is-sw) \cdot (h-ram-base-addr-aux \in [0, 2^{30}-1]) = 0
// Extracting (ram-val1, ram-val2, ram-val3, ram-val4) from a-val
    (\texttt{is-sw}) \cdot (\texttt{a-val} - \texttt{ram-val4} \cdot 2^{24} - \texttt{ram-val3} \cdot 2^{16} - \texttt{ram-val2} \cdot 2^8 - \texttt{ram-val1}) = 0
    (\texttt{is-sw}) \cdot (\texttt{ram-val1} \in \left[0, 2^8 - 1\right]) = 0
    (is-sw) \cdot (ram-val2 \in [0, 2^8 - 1]) = 0
    (is-sw) \cdot (ram-val3 \in [0, 2^8 - 1]) = 0
    (is-sw) \cdot (ram-val4 \in [0, 2^8 - 1]) = 0
/\!\!/ h-ram-base-addr = (h-ram-base-addr ^{(1)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(4)})
// Writing byte ram-val1 at memory address h-ram-base-addr
/\!\!/ Writing byte ram-val2 at memory address h-ram-base-addr +\,1
/\!\!/ Writing byte ram-val3 at memory address h-ram-base-addr +\,2
/\!\!/ Writing byte ram-val4 at memory address h-ram-base-addr +\,3
    (is-sw) \cdot (Write_{RAM}(clk, ram-val1, h-ram-base-addr, 0) - ram-val1) = 0
    (is-sw) \cdot (Write_{RAM}(clk, ram-val2, h-ram-base-addr, 1) - ram-val2) = 0
    (is-sw) \cdot (Write_{RAM}(clk, ram-val3, h-ram-base-addr, 2) - ram-val3) = 0
    (\mathtt{is\text{-}sw}) \cdot (\mathrm{Write}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{ram\text{-}val4}, \mathtt{h\text{-}ram\text{-}base\text{-}addr}, 3) - \mathtt{ram\text{-}val4}) = 0
/\!\!/ Range check h-ram-base-addr \in \left[0,2^{32}-1
ight] - Guaranteed by the RAM memory checking
/\!\!/ Range check ram-valj \in \left[0, 2^8-1
ight] for j=1,2,3,4 - Also performed in the RAM component
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[ 0,2^{32}-1 
ight] - Performed in the CPU component
/\!\!/ Range check is-sw \in \{0,1\} - Performed in the CPU component
```

```
/\!\!/ Computing memory write address b-val + c-val \equiv h-ram-base-addr \cdot 4
# h-ram-base-addr used to enforce memory alignment
// h-carry (j) for j=1,2,3,4 used for carry handling
    (\texttt{is-sw}) \cdot (\texttt{b-val}^{(1)} + \texttt{c-val}^{(1)} - 4 \cdot \texttt{h-ram-base-addr-aux} - \texttt{h-carry}^{(1)} \cdot 2^8) = 0
    (is-sw) \cdot (4 \cdot h-ram-base-addr-aux - h-ram-base-addr^{(1)}) = 0
    (\mathtt{is\text{-}sw}) \cdot (\mathtt{b\text{-}val}^{(2)} + \mathtt{c\text{-}val}^{(2)} + \mathtt{h\text{-}carry}^{(1)} - \mathtt{h\text{-}ram\text{-}base\text{-}addr}^{(2)} - \mathtt{h\text{-}carry}^{(2)} \cdot 2^8) = 0
    (\texttt{is-sw}) \cdot (\texttt{b-val}^{(3)} + \texttt{c-val}^{(3)} + \texttt{h-carry}^{(2)} - \texttt{h-ram-base-addr}^{(3)} - \texttt{h-carry}^{(3)} \cdot 2^8) = 0
    (is-sw) \cdot (b-val^{(4)} + c-val^{(4)} + h-carry^{(3)} - h-ram-base-addr^{(4)} - h-carry^{(3)} \cdot 2^8) = 0
# Enforcing h-carry^{(j)} \in \{0,1\} for j = 1,2,3,4
    (\mathtt{is\text{-}sw}) \cdot (\mathtt{h\text{-}carry}^{(1)}) \cdot (1-\mathtt{h\text{-}carry}^{(1)}) = 0
    (is-sw) \cdot (h-carry^{(2)}) \cdot (1-h-carry^{(2)}) = 0
    (\texttt{is-sw}) \cdot (\texttt{h-carry}^{(3)}) \cdot (1 - \texttt{h-carry}^{(3)}) = 0
    (\mathtt{is\text{-}sw}) \cdot (\mathtt{h\text{-}carry}^{(4)}) \cdot (1-\mathtt{h\text{-}carry}^{(4)}) = 0
# Enforcing h-ram-base-addr-aux \in [0, 2^6 - 1]
    (is-sw) \cdot (h-ram-base-addr-aux \in [0, 2^6 - 1]) = 0
// Extracting (ram-val1, ram-val2, ram-val3, ram-val4) from a-val
    (\texttt{is-sw}) \cdot (\texttt{a-val}^{(1)} - \texttt{ram-val1}) = 0
    (is-sw) \cdot (a-val^{(2)} - ram-val2) = 0
    (\texttt{is-sw}) \cdot (\texttt{a-val}^{(3)} - \texttt{ram-val3}) = 0
    (\texttt{is-sw}) \cdot (\texttt{a-val}^{(4)} - \texttt{ram-val4}) = 0
\# h-ram-base-addr = (h-ram-base-addr ^{(1)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(3)}, h-ram-base-addr ^{(4)})
// Writing byte ram-val1 at memory address h-ram-base-addr
/\!\!/ Writing byte ram-val2 at memory address h-ram-base-addr +\,1
/\!\!/ Writing byte ram-val3 at memory address h-ram-base-addr +\,2
/\!\!/ Writing byte ram-val4 at memory address h-ram-base-addr +\,3
    (\mathtt{is\text{-}sw}) \cdot (\mathrm{Write}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{ram\text{-}val1}, \mathtt{h\text{-}ram\text{-}base\text{-}addr}, 0) - \mathtt{ram\text{-}val1}) = 0
    (is-sw) \cdot (Write_{RAM}(clk, ram-val2, h-ram-base-addr, 1) - ram-val2) = 0
    (is-sw) \cdot (Write_{RAM}(clk, ram-val3, h-ram-base-addr, 2) - ram-val3) = 0
    (\mathtt{is\text{-}sw}) \cdot (\mathrm{Write}_{\mathrm{RAM}}(\mathtt{clk}, \mathtt{ram\text{-}val4}, \mathtt{h\text{-}ram\text{-}base\text{-}addr}, 3) - \mathtt{ram\text{-}val4}) = 0
/\!\!/ Range check h-ram-base-addr^{(j)}\in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the RAM memory checking
# Range check ram-valj \in \left[0, 2^8 - 1\right] for j = 1, 2, 3, 4 - Performed in the RAM component
/\!\!/ Range check a-val^{(j)}\in \left[0,2^8-1\right] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
// Range check c-val^{(j)}\in ar{[}0,2^8-1ar{]} for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-sw \in \{0,1\} - Performed in the CPU component
```

# 8.9 Basic Instruction Set: Jump Instructions

#### 8.9.1 JAL Instruction

The parameters and functionality for the JAL instruction are as follows:

```
    opcode: JAL
    Parameters: (a-val, b-val, c-val)
    Instruction selector: is-jal = 1
    Functionality:

            pc-next ← pc + c-val mod 2<sup>32</sup>
            a-val ← pc + 4 mod 2<sup>32</sup>
```

The mapping from the jal instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val | c-val   |
|------------|------|------|------|-------|-------|-------|---------|
| jal        | rd   | 0    | i    | 1     | R[rd] | 0     | sext(i) |

where sext is the sign extension function, i is a 20-bit value specifying bits 1-20 of the immediate value, and bit 0 of the immediate value is equal to 0.

## Constraints assuming large fields

```
/\!\!/ Setting a-val = pc + 4
# h-carry(j) for j = 1, 2, 3, 4 used for carry handling
    \begin{array}{lll} (\mathtt{is-jal}) \cdot (\mathtt{pc^{(1)}} + 4 & -\mathtt{a-val^{(1)}} - \mathtt{h-carry^{(1)}} \cdot 2^8) = 0 \\ (\mathtt{is-jal}) \cdot (\mathtt{pc^{(2)}} + \mathtt{h-carry^{(1)}} - \mathtt{a-val^{(2)}} - \mathtt{h-carry^{(2)}} \cdot 2^8) = 0 \end{array} 
    (is-jal) \cdot (pc^{(3)} + h\text{-carry}^{(2)} - a\text{-val}^{(3)} - h\text{-carry}^{(3)} \cdot 2^8) = 0
    (is-jal) \cdot (pc^{(4)} + h\text{-carry}^{(3)} - a\text{-val}^{(4)} - h\text{-carry}^{(4)} \cdot 2^8) = 0
// Enforcing h-carry^{(j)} \in \{0,1\} for j=1,2,3,4
   (\texttt{is-jal}) \cdot (\texttt{h-carry}^{(1)}) \cdot (1 - \texttt{h-carry}^{(1)}) = 0
     (\texttt{is-jal}) \cdot (\texttt{h-carry}^{(2)}) \cdot (1 - \texttt{h-carry}^{(2)}) = 0
     (is-jal) \cdot (h-carry^{(3)}) \cdot (1-h-carry^{(3)}) = 0
     (\text{is-jal}) \cdot (\text{h-carry}^{(4)}) \cdot (1 - \text{h-carry}^{(4)}) = 0
/\!\!/ Setting pc-next = pc + c-val
/\!\!/ pc-carry^{(j)} for j=1,2,3,4 used for carry handling
     (\mathtt{is}\mathtt{-jal})\cdot(\mathtt{pc}^{(1)}+\mathtt{c}\mathtt{-val}^{(1)}-\mathtt{pc}\mathtt{-next}^{(1)}-\mathtt{pc}\mathtt{-carry}^{(1)}\cdot2^8)=0
     (\texttt{is-jal}) \cdot (\texttt{pc}^{(2)} + \texttt{c-val}^{(2)} + \texttt{pc-carry}^{(1)} - \texttt{pc-next}^{(2)} - \texttt{pc-carry}^{(2)} \cdot 2^8) = 0
      \begin{array}{l} (\text{is-jal}) \cdot (\text{pc}^{(3)} + \text{c-val}^{(3)} + \text{pc-carry}^{(2)} - \text{pc-next}^{(3)} - \text{pc-carry}^{(3)} \cdot 2^8) = 0 \\ (\text{is-jal}) \cdot (\text{pc}^{(4)} + \text{c-val}^{(4)} + \text{pc-carry}^{(3)} - \text{pc-next}^{(4)} - \text{pc-carry}^{(4)} \cdot 2^8) = 0 \end{array} 
# Enforcing pc-carry^{(j)} \in \{0,1\} for j = 1,2,3,4
   (\texttt{is-jal}) \cdot (\texttt{pc-carry}^{(1)}) \cdot (1 - \texttt{pc-carry}^{(1)}) = 0
     (\texttt{is-jal}) \cdot (\texttt{pc-carry}^{(2)}) \cdot (1 - \texttt{pc-carry}^{(2)}) = 0
     (\texttt{is-jal}) \cdot (\texttt{pc-carry}^{(3)}) \cdot (1 - \texttt{pc-carry}^{(3)}) = 0
     (is-jal) \cdot (pc-carry^{(4)}) \cdot (1 - pc-carry^{(4)}) = 0
/\!\!/ Range check \operatorname{pc}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check 	ext{pc-next}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
// Range check c-val^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 - Performed in the CPU component
/\!\!/ Range check is-jal \in \{0,1\} - Performed in the CPU component
```

#### 8.9.2 JALR Instruction

The parameters and functionality for the JALR instruction are as follows:

```
    opcode: JALR
    Parameters: (a-val, b-val, c-val)
    Instruction selector: is-jalr = 1
    Functionality:

            pc-next ← (b-val + c-val mod 2<sup>32</sup>) & 0xffffffffe
            a-val ← pc + 4 mod 2<sup>32</sup>
```

The mapping from the jalr instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val  | c-val   |
|------------|------|------|------|-------|-------|--------|---------|
| jalr       | rd   | rs1  | i    | 1     | R[rd] | R[rs1] | sext(i) |

where sext is the sign extension function, i is a 12-bit value specifying bits 0-11 of the immediate value.

#### Constraints assuming large fields

```
/\!\!/ Setting a-val = pc + 4
• (is-jalr) \cdot (4 + pc - a - val - h - carry \cdot 2^{32}) = 0
/\!\!/ Enforcing h-carry \in \{0,1\}
• (is-jalr) \cdot (h-carry) \cdot (1 - h-carry) = 0
// Setting pc-next-aux
/\!\!/ pc-next-aux = b-val + c-val
  (\texttt{is-jalr}) \cdot (\texttt{b-val} + \texttt{c-val} - \texttt{pc-next-} \texttt{aux} - \texttt{pc-carry} \cdot 2^{32}) = 0
/\!\!/ Enforcing pc-carry \in \{0,1\}
• (is-jalr) \cdot (pc-carry) \cdot (1 - pc-carry) = 0
// Setting pc-next = pc-next-aux & 0xFFFFFFE
/\!\!/ pc-rem-aux and pc-qt-aux used to set bit 0 of pc-next-aux to 0
    (is-jalr) \cdot (pc-next-aux - pc-rem-aux - pc-qt-aux \cdot 2) = 0
    (is-jalr) \cdot (pc-qt-aux \cdot 2 - pc-next) = 0
/\!\!/ Enforcing pc-rem-aux \in \{0,1\}
• (is-jalr) \cdot (pc-rem-aux) \cdot (1 - pc-rem-aux) = 0
/\!\!/ Range check for pc-next-aux and pc-qt-aux
\bullet \quad (\mathtt{is\text{-}jalr}) \cdot (\mathtt{pc\text{-}next\text{-}aux} \in \left[0, 2^{32} - 1\right]) = 0
  (\text{is-jalr}) \cdot (\text{pc-qt-aux} \in \left[0, 2^{31} - 1\right]) = 0
// Range check \mathrm{pc} \in \left[0, 2^{32} - 1\right] - Guaranteed by the program memory checking
// Range check pc\text{-next} \in \left[0, 2^{32} - 1\right] - Guaranteed by the program memory checking
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
# Range check b-val \in [0, 2^{32} - 1] - Performed in the CPU component
/\!\!/ Range check c-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/\!\!/ Range check is-jalr \in \{0,1\} - Performed in the CPU component
```

```
• (is-jalr) \cdot (pc^{(4)} + h\text{-carry}^{(3)} - a\text{-val}^{(4)} - h\text{-carry}^{(4)} \cdot 2^8) = 0
# Enforcing h-carry(j) \in \{0,1\} for j = 1,2,3,4
    (is-jalr) \cdot (h-carry^{(1)}) \cdot (1-h-carry^{(1)}) = 0
    (\text{is-jalr}) \cdot (\text{h-carry}^{(2)}) \cdot (1 - \text{h-carry}^{(2)}) = 0
    (\text{is-jalr}) \cdot (\text{h-carry}^{(3)}) \cdot (1 - \text{h-carry}^{(3)}) = 0
    (\text{is-jalr}) \cdot (\text{h-carry}^{(4)}) \cdot (1 - \text{h-carry}^{(4)}) = 0
/\!\!/ Setting pc-next-aux = b-val + c-val
  pc-carry<sup>(j)</sup> for j = 1, 2, 3, 4 used for carry handling
    (is-jalr) \cdot (b-val^{(1)} + c-val^{(1)} - pc-next-aux^{(1)} - pc-carry^{(1)} \cdot 2^8) = 0
    \begin{array}{l} (\text{is-jalr}) \cdot (\text{b-val}^{(2)} + \text{c-val}^{(2)} + \text{pc-next-aux}^{(1)} - \text{pc-next-aux}^{(2)} - \text{pc-carry}^{(2)} \cdot 2^8) = 0 \\ (\text{is-jalr}) \cdot (\text{b-val}^{(3)} + \text{c-val}^{(3)} + \text{pc-carry}^{(2)} - \text{pc-next-aux}^{(3)} - \text{pc-carry}^{(3)} \cdot 2^8) = 0 \\ (\text{is-jalr}) \cdot (\text{b-val}^{(4)} + \text{c-val}^{(4)} + \text{pc-carry}^{(3)} - \text{pc-next-aux}^{(4)} - \text{pc-carry}^{(4)} \cdot 2^8) = 0 \end{array}
# Enforcing pc-carry(j) \in \{0,1\} for j = 1,2,3,4
    (is-jalr) \cdot (pc-carry^{(1)}) \cdot (1 - pc-carry^{(1)}) = 0
    (\texttt{is-jalr}) \cdot (\texttt{pc-carry}^{(2)}) \cdot (1 - \texttt{pc-carry}^{(2)}) = 0
    (is-jalr) \cdot (pc-carry^{(3)}) \cdot (1-pc-carry^{(3)}) = 0
    (is-jalr) \cdot (pc-carry^{(4)}) \cdot (1-pc-carry^{(4)}) = 0
/\!\!/ Setting pc-next = pc-next-aux & 0xFFFFFFE
/\!\!/ pc-rem-aux and pc-qt-aux used to set bit 0 of pc-next-aux^{(1)} to 0
    (is-jalr) \cdot (pc-next-aux^{(1)} - pc-rem-aux - pc-qt-aux \cdot 2) = 0
    (is-jalr) \cdot (pc-qt-aux \cdot 2 - pc-next^{(1)}) = 0
    (is-jalr) · (pc-next-aux<sup>(2)</sup> - pc-next<sup>(2)</sup>) = 0
    (is-jalr) \cdot (pc-next-aux^{(3)} - pc-next^{(3)}) = 0
    (is-jalr) \cdot (pc-next-aux^{(4)} - pc-next^{(4)}) = 0
/\!\!/ Enforcing pc-rem-aux \in \{0,1\}
   (is-jalr) \cdot (pc-rem-aux) \cdot (1 - pc-rem-aux) = 0
/\!\!/ Range check for pc-next-aux and pc-qt-aux
    (\texttt{is-jalr}) \cdot (\texttt{pc-next-aux}^{(j)} \in \left[0, 2^8 - 1\right]) = 0 \text{ for } j = 1, 2, 3, 4
    (is-jalr) \cdot (pc-qt-aux \in [0, 2^8-1]) = 0
/\!\!/ Range check \operatorname{pc}^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Guaranteed by the program memory checking
// Range check \operatorname{pc-next}^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
# Range check b-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check c-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-jalr \in \{0,1\} - Performed in the CPU component
```

## 8.10 Basic Instruction Set: LUI and AUIPC Instructions

# 8.10.1 LUI Instruction

The parameters and functionality for the LUI instruction are as follows:

```
    opcode: LUI
    Parameters: (a-val, b-val, c-val)
    Instruction selector: is-lui = 1
    Functionality:

            pc-next ← pc + 4 mod 2<sup>32</sup>
            a-val ← c-val
```

The mapping from the lui instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a | op-b | op-c | imm-c | a-val | b-val | c-val   |
|------------|------|------|------|-------|-------|-------|---------|
| lui        | rd   | 0    | i    | 1     | R[rd] | 0     | sext(i) |

where sext is the sign extension function, i is a 20-bit value specifying bits 12-31 of the immediate value, and bits 0-11 of the immediate value are equal to 0.

# Constraints assuming large fields

#### Constraints assuming small fields

```
  \begin{tabular}{ll} \# & Setting a-val = c-val \\  \begin{tabular}{ll} (is-lui) \cdot (c-val^{(1)}-a-val^{(1)}) = 0 \\  \begin{tabular}{ll} (is-lui) \cdot (c-val^{(2)}-a-val^{(2)}) = 0 \\  \begin{tabular}{ll} (is-lui) \cdot (c-val^{(3)}-a-val^{(3)}) = 0 \\  \begin{tabular}{ll} (is-lui) \cdot (c-val^{(4)}-a-val^{(4)}) = 0 \\  \begin{tabular}{ll} \# & Setting pc-next = pc+4 is performed in Section 8.4.2 under common constraints \\ \# & Range check pc^{(j)} \in \left[0,2^8-1\right] \text{ for } j=1,2,3,4 \text{ - Guaranteed by the program memory checking } \\ \# & Range check pc-next^{(j)} \in \left[0,2^8-1\right] \text{ for } j=1,2,3,4 \text{ - Performed in the CPU component } \\ \# & Range check a-val^{(j)} \in \left[0,2^8-1\right] \text{ for } j=1,2,3,4 \text{ - Performed in the CPU component } \\ \# & Range check c-val^{(j)} \in \left[0,2^8-1\right] \text{ for } j=1,2,3,4 \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui \in \{0,1\} \text{ - Performed in the CPU component } \\ \# & Range check is-lui + Range check is-lui + Range check is-lui + Range check is-l
```

#### 8.10.2 AUIPC Instruction

The parameters and functionality for the AUIPC instruction are as follows:

```
    opcode: AUIPC
    Parameters: (a-val, b-val, c-val)
    Instruction selector: is-auipc = 1
    Functionality:

            pc-next ← pc + 4 mod 2<sup>32</sup>
            a-val ← pc + c-val mod 2<sup>32</sup>
```

The mapping from the auipc instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a             | op-b | op-c | imm-c | a-val | b-val | c-val   |
|------------|------------------|------|------|-------|-------|-------|---------|
| auipc      | $^{\mathrm{rd}}$ | 0    | i    | 1     | R[rd] | 0     | sext(i) |

where sext is the sign extension function, i is a 20-bit value specifying bits 12-31 of the immediate value, and bits 0-11 of the immediate value are equal to 0.

```
\label{eq:setting a-val} \left\{ \begin{array}{l} \text{Setting a-val} = \operatorname{pc} + \operatorname{c-val} \\ \text{(is-auipc)} \cdot (\operatorname{c-val} + \operatorname{pc} - \operatorname{a-val} - \operatorname{h-carry} \cdot 2^{32}) = 0 \end{array} \right. \left\{ \begin{array}{l} \text{Enforcing h-carry} \in \{0,1\} \\ \text{(is-auipc)} \cdot (\operatorname{h-carry}) \cdot (1 - \operatorname{h-carry}) = 0 \end{array} \right. \left\{ \begin{array}{l} \text{Setting pc-next} = \operatorname{pc} + 4 \text{ is performed in Section 8.4.1 under common constraints} \right. \\ \left\{ \begin{array}{l} \text{Range check pc} \in \left[0,2^{32}-1\right] - \operatorname{Guaranteed} \text{ by the program memory checking} \\ \left. \begin{array}{l} \text{Range check pc-next} \in \left[0,2^{32}-1\right] - \operatorname{Guaranteed} \text{ by the program memory checking} \\ \left. \begin{array}{l} \text{Range check a-val} \in \left[0,2^{32}-1\right] - \operatorname{Performed} \text{ in the CPU component} \\ \left. \begin{array}{l} \text{Range check b-val} \in \left[0,2^{32}-1\right] - \operatorname{Performed} \text{ in the CPU component} \\ \left. \begin{array}{l} \text{Range check c-val} \in \left[0,2^{32}-1\right] - \operatorname{Performed} \text{ in the CPU component} \\ \end{array} \right. \\ \left. \left. \begin{array}{l} \text{Range check is-auipc} \in \{0,1\} - \operatorname{Performed} \text{ in the CPU component} \\ \end{array} \right. \end{array} \right.
```

# Constraints assuming small fields

```
/\!\!/ Setting a-val = pc + c-val
/\!\!/ h-carry ^{(j)} for j=1,2,3,4 used for carry handling
   (\texttt{is-auipc}) \cdot (\texttt{pc}^{(1)} + \texttt{c-val}^{(1)} - \texttt{a-val}^{(1)} - \texttt{h-carry}^{(1)} \cdot 2^8) = 0
    (\texttt{is-auipc}) \cdot (\texttt{pc}^{(2)} + \texttt{c-val}^{(2)} + \texttt{h-carry}^{(1)} - \texttt{a-val}^{(2)} - \texttt{h-carry}^{(2)} \cdot 2^8) = 0
    (\mathtt{is-auipc}) \cdot (\mathtt{pc^{(3)}} + \mathtt{c-val^{(3)}} + \mathtt{h-carry^{(2)}} - \mathtt{a-val^{(3)}} - \mathtt{h-carry^{(3)}} \cdot 2^8) = 0
    (is-auipc) \cdot (pc^{(4)} + c\text{-val}^{(4)} + h\text{-carry}^{(3)} - a\text{-val}^{(4)} - h\text{-carry}^{(4)} \cdot 2^8) = 0
# Enforcing h-carry^{(j)} \in \{0,1\} for j = 1,2,3,4
    (\texttt{is-auipc}) \cdot (\texttt{h-carry}^{(1)}) \cdot (1 - \texttt{h-carry}^{(1)}) = 0
    (\text{is-auipc}) \cdot (\text{h-carry}^{(2)}) \cdot (1 - \text{h-carry}^{(2)}) = 0
    (\text{is-auipc}) \cdot (\text{h-carry}^{(3)}) \cdot (1 - \text{h-carry}^{(3)}) = 0
    (\texttt{is-auipc}) \cdot (\texttt{h-carry}^{(4)}) \cdot (1 - \texttt{h-carry}^{(4)}) = 0
// Setting \mathrm{pc}	ext{-next} = \mathrm{pc} + 4 is performed in Section 8.4.2 under common constraints
/\!\!/ Range check \operatorname{pc}^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check pc\text{-next}^{(j)} \in \left[0,2^{8}-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
// Range check c-val^{(j)} \in [0,2^8-1] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check is-auipc \in \{0,1\} - Performed in the CPU component
```

# 8.11 Basic Instruction Set: System Instructions

## 8.11.1 ECALL Instruction

The parameters and functionality for the ECALL instruction are as follows:

- opcode: ECALL
- Parameters: (a-val, b-val, c-val)
- Instruction selector: is-ecall = 1
- Functionality: see Table 3
- Remark: The ECALL behavior depends on the value of R[x17].

The mapping from the ecall instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a         | op-b | op-c | imm-c | a-val   | b-val  | c-val |
|------------|--------------|------|------|-------|---------|--------|-------|
| ecall      | 0   x2   x10 | x17  | 0    | 1     | R[op-a] | R[x17] | 0     |

```
\# Setting flags depending on the value of b-val
   (is-type-sys) \cdot (is-sys-debug)(b-val - 0x200) = 0
   (is-type-sys) \cdot (is-sys-halt)(b-val - 0x201) = 0
   (is-type-sys) \cdot (is-sys-priv-input)(b-val - 0x400) = 0
   (\texttt{is-type-sys}) \cdot (\texttt{is-sys-cycle-count}) (\texttt{b-val} - \texttt{0x401}) = 0
   (is-type-sys) \cdot (is-sys-stack-reset)(b-val - 0x402) = 0
   (is-type-sys) \cdot (is-sys-heap-reset)(b-val - 0x403) = 0
# Enforcing flags in \{0,1\}
  (\texttt{is-type-sys}) \cdot (\texttt{is-sys-debug}) \cdot (1 - \texttt{is-sys-debug}) = 0
   (is-type-sys) \cdot (is-sys-halt) \cdot (1 - is-sys-halt) = 0
   (\texttt{is-type-sys}) \cdot (\texttt{is-sys-priv-input}) \cdot (1-\texttt{is-sys-priv-input}) = 0
    (is-type-sys) \cdot (is-sys-cycle-count) \cdot (1 - is-sys-cycle-count) = 0
   (\texttt{is-type-sys}) \cdot (\texttt{is-sys-stack-reset}) \cdot (1-\texttt{is-sys-stack-reset}) = 0
   (is-type-sys) \cdot (is-sys-heap-reset) \cdot (1 - is-sys-heap-reset) = 0
# Enforcing that only one of these flags is set
  (\texttt{is-type-sys}) \cdot (\texttt{is-sys-debug} + \texttt{is-sys-halt} + \texttt{is-sys-priv-input} + \\
                       \verb|is-sys-cycle-count+is-sys-stack-reset+is-sys-heap-reset-1| = 0
# Enforcing values for op-a
  (is-type-sys) \cdot (is-sys-debug + is-sys-halt + is-sys-cycle-count) \cdot (op-a) = 0
    (\mathtt{is-type-sys}) \cdot (\mathtt{is-sys-priv-input} + \mathtt{is-sys-heap-reset}) \cdot (10 - \mathtt{op-a}) = 0
   (\texttt{is-type-sys}) \cdot (\texttt{is-sys-stack-reset}) \cdot (2 - op-a) = 0
# Enforcing ranges for a-val
  (is-type-sys) \cdot (is-sys-debug + is-sys-halt + is-sys-cycle-count) \cdot (a-val) = 0
# The following range checks are being performed by the register memory component
 \# \ (\texttt{is-type-sys}) \cdot (\texttt{is-sys-priv-input} + \texttt{is-sys-heap-reset} + \texttt{is-sys-stack-reset}) \cdot (\texttt{a-val} \in \left\lceil 0, 2^{32} - 1 \right\rceil) = 0 
/\!\!/ Setting pc-next = pc when is-sys-halt = 1
• (is-type-sys) \cdot (is-sys-halt) \cdot (pc - pc-next) = 0
/\!\!/ Range check \mathrm{pc} \in \left[0, 2^{32} - 1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check \mathrm{pc}	ext{-next} \in \left[0, 2^{32} - 1
ight] - Guaranteed by the program memory checking
/\!\!/ Range check a-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
/\!\!/ Range check b-val \in \left[0,2^{32}-1
ight] - Performed in the CPU component
// Range check c-val \in \left\lceil 0, 2^{32} - 1 \right\rceil - Performed in the CPU component
// Range check is-type-sys \in \{0,1\} - Performed in the CPU component
```

```
\slash\hspace{-0.4em}/\hspace{-0.4em}/\hspace{-0.4em} Setting flags depending on the value of b-val
• (is-type-sys) · (is-sys-debug)(b-val<sup>(1)</sup> - 0x00) = 0
                                                                                                                \triangleright b-val = 0x200
     (\texttt{is-type-sys}) \cdot (\texttt{is-sys-debug})(\texttt{b-val}^{(2)} - \texttt{0x02}) = 0
                                                                                                                \triangleright b-val = 0x200
     (\mathtt{is\text{-}type\text{-}sys}) \cdot (\mathtt{is\text{-}sys\text{-}halt}) (\mathtt{b\text{-}val}^{(1)} - \mathtt{0x01}) = 0
                                                                                                                 \triangleright b-val = 0x201
     (\mathtt{is\text{-}type\text{-}sys}) \cdot (\mathtt{is\text{-}sys\text{-}halt}) (\mathtt{b\text{-}val}^{(2)} - \mathtt{0x02}) = 0
                                                                                                                 \triangleright b-val = 0x201
     (is-type-sys) \cdot (is-sys-priv-input)(b-val^{(1)} - 0x00) = 0
                                                                                                                 \triangleright b-val = 0x400
     (\mathtt{is\text{-}type\text{-}sys}) \cdot (\mathtt{is\text{-}sys\text{-}priv\text{-}input}) (\mathtt{b\text{-}val}^{(2)} - \mathtt{0x04}) = 0
                                                                                                                 \triangleright b-val = 0x400
     (\texttt{is-type-sys}) \cdot (\texttt{is-sys-cycle-count}) (\texttt{b-val}^{(1)} - \texttt{0x01}) = 0
                                                                                                                \triangleright b-val = 0x401
     (\texttt{is-type-sys}) \cdot (\texttt{is-sys-cycle-count}) (\texttt{b-val}^{(2)} - \texttt{0x04}) = 0
                                                                                                                \triangleright b-val = 0x401
     (\texttt{is-type-sys}) \cdot (\texttt{is-sys-stack-reset}) (\texttt{b-val}^{(1)} - \texttt{0x02}) = 0
                                                                                                               \triangleright b-val = 0x402
     (\texttt{is-type-sys}) \cdot (\texttt{is-sys-stack-reset}) (\texttt{b-val}^{(2)} - \texttt{0x04}) = 0
                                                                                                              \triangleright b-val = 0x402
     (is-type-sys) \cdot (is-sys-heap-reset)(b-val^{(1)} - 0x03) = 0
                                                                                                              \triangleright b-val = 0x403
     (is-type-sys) \cdot (is-sys-heap-reset)(b-val^{(2)} - 0x04) = 0
                                                                                                               \triangleright b-val = 0x403
    (\mathtt{is\text{-}type\text{-}sys})\cdot(\mathtt{b\text{-}val}^{(3)})=0
    (\texttt{is-type-sys}) \cdot (\texttt{b-val}^{(4)}) = 0
/\!\!/ Enforcing flags in \{0,1\}
     (is-type-sys) \cdot (is-sys-debug) \cdot (1 - is-sys-debug) = 0
     (is-type-sys) \cdot (is-sys-halt) \cdot (1 - is-sys-halt) = 0
     (\texttt{is-type-sys}) \cdot (\texttt{is-sys-priv-input}) \cdot (1-\texttt{is-sys-priv-input}) = 0
    (\texttt{is-type-sys}) \cdot (\texttt{is-sys-cycle-count}) \cdot (1-\texttt{is-sys-cycle-count}) = 0
```

```
(\texttt{is-type-sys}) \cdot (\texttt{is-sys-stack-reset}) \cdot (1 - \texttt{is-sys-stack-reset}) = 0
    (is-type-sys) \cdot (is-sys-heap-reset) \cdot (1 - is-sys-heap-reset) = 0
# Enforcing that only one of these flags is set
  (is-type-sys) \cdot (is-sys-debug + is-sys-halt + is-sys-priv-input +
                        is-sys-cycle-count + is-sys-stack-reset + is-sys-heap-reset - 1) = 0
# Enforcing values for op-a
  (\text{is-type-sys}) \cdot (\text{is-sys-debug} + \text{is-sys-halt} + \text{is-sys-cycle-count}) \cdot (\text{op-a}) = 0
    (is-type-sys) \cdot (is-sys-priv-input + is-sys-heap-reset) \cdot (10 - op-a) = 0
    (is-type-sys) \cdot (is-sys-stack-reset) \cdot (2 - op-a) = 0
# Enforcing ranges for a-val
// Two limbs at a time
   (\text{is-type-sys}) \cdot (\text{is-sys-debug} + \text{is-sys-halt} + \text{is-sys-cycle-count}) \cdot (\text{a-val}^{(1)} + \text{a-val}^{(2)} \cdot 2^8) = 0
   (\texttt{is-type-sys}) \cdot (\texttt{is-sys-debug} + \texttt{is-sys-halt} + \texttt{is-sys-cycle-count}) \cdot (\texttt{a-val}^{(3)} + \texttt{a-val}^{(4)} \cdot 2^8) = 0
// The following range checks are being performed by the register memory component
\# (is-type-sys) \cdot (is-sys-priv-input + is-sys-heap-reset + is-sys-stack-reset) \cdot (a-val^{(1)} \in [0, 2^8 - 1]) = 0
/\!\!/ (is-type-sys) \cdot (is-sys-priv-input + is-sys-heap-reset + is-sys-stack-reset) \cdot (a-val^{(2)} \in [0, 2^8 - 1]) = 0
 /\!\!/ \text{ (is-type-sys)} \cdot \text{(is-sys-priv-input} + \text{is-sys-heap-reset} + \text{is-sys-stack-reset}) \cdot \text{(a-val}^{(3)} \in \left[0, 2^8 - 1\right]) = 0 
 /\!\!/ (\texttt{is-type-sys}) \cdot (\texttt{is-sys-priv-input} + \texttt{is-sys-heap-reset} + \texttt{is-sys-stack-reset}) \cdot (\texttt{a-val}^{(4)} \in \left[0, 2^8 - 1\right]) = 0 
/\!\!/ Setting pc-next = pc when is-sys-halt = 1
// Two limbs at a time
• (is-type-sys) \cdot (is-sys-halt) \cdot (pc<sup>(1)</sup> + pc<sup>(2)</sup> \cdot 2<sup>8</sup> - pc-next<sup>(1)</sup> - pc-next<sup>(2)</sup> \cdot 2<sup>8</sup>) = 0
    (\texttt{is-type-sys}) \cdot (\texttt{is-sys-halt}) \cdot (pc^{(3)} + pc^{(4)} \cdot 2^8 - pc \cdot \mathsf{next}^{(3)} - pc \cdot \mathsf{next}^{(4)} \cdot 2^8) = 0
/\!\!/ Range check \mathrm{pc}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check \mathrm{pc	ext{-}next}^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Guaranteed by the program memory checking
/\!\!/ Range check a-val^{(j)} \in \left[0,2^8-1
ight] for j=1,2,3,4 - Performed in the CPU component
/\!\!/ Range check b-val^{(j)}\in ar{[}0,2^8-1ar{]} for j=1,2,3,4 - Performed in the CPU component
# Range check c-val^{(j)} \in [0, 2^8 - 1] for j = 1, 2, 3, 4 - Performed in the CPU component
/\!\!/ Range check is-type-sys \in \{0,1\} - Performed in the CPU component
```

#### 8.11.2 EBREAK Instruction

The parameters and functionality for the EBREAK instruction are as follows:

- opcode: EBREAK
- Parameters: (a-val, b-val, c-val)
- Instruction selector: is-ebreak = 1
- Functionality: see Table 3
- Remark: The EBREAK behavior depends on the value of R[x17].

The mapping from the ebreak instruction in the Nexus Virtual Machine Instruction Set Table (see Table 1) is as follows:

| NVM opcode | op-a         | op-b | op-c | imm-c | a-val   | b-val             | c-val |
|------------|--------------|------|------|-------|---------|-------------------|-------|
| ebreak     | 0   x2   x10 | x17  | 0    | 1     | R[op-a] | $R[\mathtt{x17}]$ | 0     |

Remark 8.4 Since the current version of the Nexus Virtual Machine behaves similarly for EBREAK and ECALL instructions, we describe these constraints together in the ECALL section (Section 8.11.1) using the flag is-type-sys instead of is-ecall or is-ebreak.

# 8.12 Interactions with other components

As mentioned in Section 4.2.4, in the case of system instructions, the execution component may end up writing to the register memory component depending on the contents of register R[x17]. More precisely, as stated in Table 3,

- When R[x17] = 0x400, the system call reads from the private input and loads a 32-bit value onto R[x10];
- When R[x17] = 0x400, the system call overwrites the stack pointer and loads a 32-bit value onto R[x2]; and
- When R[x17] = 0x400, the system call overwrites the heap pointer and loads a 32-bit value onto R[x10].

For this reason, we specify the interaction with the register memory component explicitly here.

#### Register memory interaction

```
\label{eq:writing} \begin{tabular}{ll} \# \mbox{Writing to register op-$a$} = x2 \mbox{ for TYPE SYS instructions when is-sys-stack-reset} = 1 \\ \# \mbox{Writing to register op-$a$} = x10 \mbox{ for TYPE SYS instructions when is-sys-heap-reset} = 1 \\ \# \mbox{Writing to register op-$a$} = x10 \mbox{ for TYPE SYS instructions when is-sys-heap-reset} = 1 \\ \bullet \mbox{ (is-sys-priv-input + is-sys-heap-reset + is-sys-stack-reset)} \cdot \mbox{ (Write}_{Reg}\mbox{(op-$a$}, a-val, clk, 1) - a-val) = 0 \\ \end{tabular}
```

# 9 Experimental evaluation

We implemented the Nexus zkVM as source-available software written in Rust.<sup>2</sup> It relies on an open-source implementation of Stwo due to StarkWare which includes an optimized SIMD acceleration backend.<sup>3</sup> The implementation contains all the components of the zkVM design discussed herein: the end-to-end zkVM including proving and verification, the zkVM<sub>MA</sub> component as a standalone virtual machine, and the nexus-rt runtime. The software is under continued development, and this evaluation corresponds to its state and its performance at the time of its v0.3.0 release.

In this section, we provide an empirical evaluation of the performance of our zkVM design. Our evaluation covers two benchmarks. First, we consider a microbenchmark of just the proving component: trace-filling (i.e., witness generation) and proving of a program composed of a sequence of n add instructions for  $n \in \{2^8, 2^{10}, \ldots, 2^{18}\}$ . Our goal in this section is to evaluate the efficacy of the constraints and underlying Stwo prover at proving zkVM executions, without any of the overhead involved in the two-pass execution model.

Next, we evaluate the full end-to-end zkVM over two test programs. The first step computes  $\operatorname{fibonnaci}(n)$ , while the second computes n iterations of the keccak hash function — in both cases n is provided as a public input to the program. We profile this evaluation with respect to the main stages of the zkVM execution process (execution/tracing, proving, and then verification) for time and host memory usage, and determine an observed overhead with respect to native execution of the same program on the test machine compiled and executed as a standard Rust binary. Our goal in this section is to establish reasonable benchmarks for real-world performance in deployment as would be observed by the user. Our evaluation addresses performance statistics gathered by running the zkVM implementation on a commodity laptop: an Apple MacBook Pro with an M3 Pro processor and 18GB of RAM. Note that these benchmarks do not involve any GPU acceleration of the proving, which could be a source of significant future performance improvements.

# 9.1 Proving microbenchmarks

As described, our prover microbenchmarks are gathered by evaluating both the witness generation and proving over a pre-generated execution trace composed solely of add instructions. This provides data on the efficiency of the described constraints and the capability of Stwo to quickly prove them, as well as a lower bound on the end-to-end empirical performance of the zkVM as a whole. For ease

<sup>&</sup>lt;sup>2</sup>https://github.com/nexus-xyz/nexus-zkvm/tree/releases/0.3.0

<sup>3</sup>https://github.com/starkware-libs/stwo/commit/a194fad63ea75d93e5fe5a4ef50029dccadc51c1

| Program                               | avg prove (s) | min prove rate (kHz) | avg prove rate (kHz) | max prove rate (kHz) |
|---------------------------------------|---------------|----------------------|----------------------|----------------------|
| $\exp_2(8)$ add $\operatorname{ops}$  | 0.022         | 11.778               | 11.891               | 12.021               |
| $\exp_2(10)$ add ops                  | 0.064         | 15.629               | 16.107               | 16.418               |
| $\exp_2(12)$ add $\operatorname{ops}$ | 0.264         | 15.125               | 15.499               | 15.769               |
| $\exp_2(14)$ add $\operatorname{ops}$ | 1.330         | 11.731               | 12.316               | 14.896               |
| $\exp_2(16)$ add $\operatorname{ops}$ | 4.621         | 13.978               | 14.183               | 14.396               |
| $\exp_2(18)$ add ops                  | 20.074        | 12.869               | 13.059               | 13.203               |

Table 6: Microbenchmark results for the Stwo prover integration. Averages are taken over a minimum of twenty iterations.

of reproduction, the infrastructure for these microbenchmarks is available as a standalone Rust crate within the zkVM code repository.<sup>4</sup> Concretely, we aim to answer the following evaluation questions:

- EQ #1) What raw performance does the prover reach for a simple program, in 'execution steps proven per second' as measured in kHz?
- **EQ #2)** What shape is the performance curve of this measurement as the length of the execution trace grows over the sequence  $2^8, 2^{10}, \dots, 2^{18}$ ?

Our raw data is provided in Table 6, and graphed in Fig. 2. In the graph, we also independently chart the time taken for the main stages of trace-filling: filling the main, interaction, and preprocessed columns. These stages do not completely define the preliminary work done by the prover integration before the STARK itself is invoked, but we nonetheless draw them out as they are particularly amenable to optimization through parallelization, and so are a route to further reducing overall costs. We draw the following conclusions.

EQ #1. The prover shows a consistent average performance of  $\sim$ 12-16 kHz. At  $2^8$  add operations the performance is at its lowest measured, likely due to the overhead of trace filling and other setup that cannot be amortized as effectively given the very short proving time. In addition, there is a small and inconsistent — but nonetheless notable — decrease in the prover performance as the program becomes larger in size, up to the final benchmark of  $2^{18}$  add operations. We speculate this may be due to reliance on less-efficient memory operations and/or microarchitectural limitations within the host machine as the computation becomes more expensive, and may be specific to the details of the employed Stwo prover implementation or our integration with it.

EQ #2. As captured by the proving rate, we see that the marginal cost of proving an additional step of the program is roughly flat over time, with as noted a small decrease likely due to resource and engineering constraints on the performance of the STARK implementation or our integration with it. This is a highly desirable property of our prover, as it means (given sufficient computing and memory resources) the zkVM should be able to prove arbitrarily large programs with the claimed performance.

# 9.2 End-to-end benchmarks

Our end-to-end benchmarks are gathered by invoking the entire zkVM on a compiled binary, and then verifying the resulting proof. This provides data on the behavior that would be observed by the end-user of the zkVM when running on a machine with similar compute resources. As with the microbenchmarks, the infrastructure for these full benchmarks is available as a standalone crate within the zkVM code repository. The two chosen programs are common benchmarks for zkVMs: a program for computing the nth fibonacci number and a program that computes n repeated invocations of the keccak hash function. The latter in particular is a relevant evaluation given that many

 $<sup>^{4} \</sup>texttt{https://github.com/nexus-xyz/nexus-zkvm/tree/releases/0.3.0/prover-benches}$ 

<sup>&</sup>lt;sup>5</sup>https://github.com/nexus-xyz/nexus-zkvm/tree/releases/0.3.0/benchmarks



Figure 2: A graph of the performance of the Stwo prover integration. The timing figures (with shaded ranges of performance) are provided for the full proving as well as for the total time taken on the main stages of trace filling. The proving rate, in kilohertz, is also charted.

zkVM applications require some form of within-the-guest-program hashing. Note that the fibonacci implementation is a full Rust program, using iterative computation of the function over a 768-bit multi-limb bignum representation. Further, the keccak program is evaluated without the use of a precompile: The results presented below are computed with the keccak program being compiled into raw RISC-V32i instructions. Concretely, we aim to answer the following evaluation questions:

- EQ #3) What raw performance does the zkVM reach for a realistic program, in 'execution steps proven per second' as measured in kHz?
- EQ #4) What shape is the performance curve of this measurement as the length of the execution trace grows?
- EQ #5) What is the overhead with respect to native execution of a comparable Rust program compiled directly for the host machine?
- EQ #6) How do volatile memory resources affect the apparent performance of the zkVM?
- EQ #7) What are the relative costs of program emulation and tracing to proving? And what about of proving to verification?

Our raw data is provided in Table 7, covering the top line results, and Table 8 covering the component-wise performance and host memory usage. Key statistics are graphed as well in Fig. 3. We draw the following conclusions:

 $_{\rm S}$ 

EQ #3. The zkVM shows an average performance of ~10-12 kHz until the available volatile memory is saturated, at which point the performance drops off. There is some inconsistency in these results

| Program               | steps   | zkVM (s) | native (s) | overhead  | zkVM rate (kHz) |
|-----------------------|---------|----------|------------|-----------|-----------------|
| fibonacci(6)          | 6582    | 0.567    | 0.144      | 3.929x    | 11.603          |
| fibonacci(16)         | 12841   | 1.097    | 0.143      | 7.663x    | 11.706          |
| fibonacci(37)         | 27160   | 2.241    | 0.143      | 15.623x   | 12.120          |
| fibonacci(77)         | 73692   | 9.630    | 0.143      | 67.375x   | 7.652           |
| fibonacci(121)        | 108695  | 9.691    | 0.145      | 66.995x   | 11.216          |
| fibonacci(262)        | 250278  | 19.771   | 0.145      | 136.453x  | 12.659          |
| fibonacci(489)        | 464763  | 62.526   | 0.145      | 432.668x  | 7.433           |
| fibonacci(999)        | 808370  | 244.025  | 0.144      | 1697.127x | 3.313           |
| keccak(0)             | 39605   | 4.601    | 0.144      | 31.909x   | 8.607           |
| keccak(1)             | 105288  | 9.757    | 0.145      | 67.174x   | 10.791          |
| keccak(3)             | 236597  | 20.034   | 0.145      | 138.195x  | 11.810          |
| keccak(7)             | 499262  | 63.156   | 0.147      | 430.948x  | 7.905           |
| $\mathtt{keccak}(15)$ | 1024588 | 244.422  | 0.144      | 1693.490x | 4.192           |

Table 7: End-to-end benchmark results for the zkVM. All results are averages of wall-clock time taken over twenty iterations, and the memory consumption figures correspond to volatile memory. The inputs to each function were chosen to roughly correspond to the same growth pattern in terms of execution steps.

| Program               | emulate (ms) | prove (s) | verify (ms) | prove peak mem (gb) | verify peak mem (gb) |
|-----------------------|--------------|-----------|-------------|---------------------|----------------------|
| fibonacci(6)          | 3.121        | 0.564     | 7.941       | 0.786               | 0.963                |
| fibonacci(16)         | 5.797        | 1.091     | 14.163      | 1.345               | 1.345                |
| fibonacci(37)         | 11.380       | 2.230     | 27.706      | 1.549               | 1.566                |
| fibonacci(77)         | 29.262       | 9.601     | 115.554     | 5.152               | 5.180                |
| fibonacci(121)        | 44.005       | 9.647     | 116.320     | 5.224               | 5.234                |
| fibonacci(262)        | 99.893       | 19.671    | 236.411     | 9.475               | 9.545                |
| fibonacci(489)        | 180.146      | 62.346    | 470.523     | 13.241              | 13.309               |
| fibonacci(999)        | 329.761      | 243.695   | 933.453     | 13.309              | 13.309               |
| $\mathtt{keccak}(0)$  | 22.478       | 4.579     | 54.924      | 2.579               | 2.674                |
| $\mathtt{keccak}(1)$  | 63.353       | 9.693     | 116.358     | 5.054               | 5.059                |
| keccak(3)             | 143.348      | 19.890    | 237.839     | 9.498               | 9.614                |
| keccak(7)             | 307.031      | 62.849    | 475.329     | 13.208              | 13.313               |
| $\mathtt{keccak}(15)$ | 625.018      | 243.797   | 955.330     | 13.313              | 13.313               |

Table 8: End-to-end benchmark results capturing component-wise timings and memory usage for the zkVM. All results are averages of wall-clock time taken over twenty iterations, and the memory consumption figures correspond to volatile memory. The inputs to each function were chosen to roughly correspond to the same growth pattern in terms of execution steps.



Figure 3: A graph of the performance of the zkVM. The timing figures (with shaded ranges of performance) are provided for the full execution, tracing, and proving. The proving rate, in kilohertz, and the peak memory usage, in gigabytes, are also charted.

(in particular the fibonacci(77) benchmark) of unclear origin, perhaps due to noise or some microarchitectural details of the specific machine used. Generally speaking, we find these results are mostly consistent with what we would expect from the microbenchmarks: the more involved end-to-end tests are slower due to the time and memory resources spent parsing and loading the binary, emulating and tracing the program, and then handling the input and output (including, e.g., committing to them) within both the emulator and prover integration. However, the performance remains at the same order of magnitude as the microbenchmarks, showing that the proving time ultimately dominates.

EQ #4. Captured by the proving rate, we again see that the marginal cost of proving an additional step of the program is roughly flat, at least until volatile memory resource limits are hit at which point there is a substantial degradation in performance as more expensive memory management is required of the host machine. This behavior is particularly visible in Fig. 3, where the performance drops off exactly as the memory usage curve flattens out. This observation also further supports the supposition made in the microbenchmarks analysis that the (in that case far more slight) decrease in performance was due to memory resource limitations.

EQ #5. The chosen benchmarks are all small enough such that the native execution time for all is effectively constant, likely made up almost entirely of various operating system and architectural overhead to both commence and teardown the execution. As such, the overheads computed by comparison to the zkVM performance are effectively arbitrary. As the performance of the zkVM improves (in particular with respect to memory usage) it should become possible to derive more useful overhead figures: reasonable benchmarking programs will become of sufficient complexity to generate a distinctive performance curve even on the host machine.

**EQ** #6. As highlighted in our preceding analysis, volatile memory resource limitations are a significant factor in our results, as the performance of the zkVM drops off significantly as the peak memory usage of the host is capped by the underlying operating system.<sup>6</sup> As such, more memory efficiency from the Stwo implementation, more efficient constraints, or more available memory resources (either locally or through distribution of the proving load) are all important potential routes to increasing the scalability of the zkVM — and likely just its baseline performance as well.

 $\mathbf{EQ}$  #7. We find that both emulation and tracing of the zkVM and verifying proofs take negligible time in comparison to proving. Even for the most expensive proofs, taking over four minutes to compute, emulation time takes between a third to two-thirds of a second, while verification takes just under a second. The emulation timing shows the additional overhead of the two-pass machine architecture is effectively negligible, justifying its use to improve the user experience of developing for the zkVM. The verification timing is consistent with expectations inherent in the use of a STARK, which trade-off additional prover cost for succinctness (in proof size as well as in verification).

<sup>&</sup>lt;sup>6</sup>We say operating system, and not machine, as it is provisioned with 18GB of RAM and so the physical resources available are not being made fully available to the zkVM. Although this is tunable, we chose to run the benchmarks on a commodity laptop with default resource settings, in order to best replicate the behavior the average end-user might observe when using the software.

# References

- [AST24] Arasu Arun, Srinath T. V. Setty, and Justin Thaler. Jolt: SNARKs for virtual machines via lookups. In Marc Joye and Gregor Leander, editors, *EUROCRYPT 2024, Part VI*, volume 14656 of *LNCS*, pages 3–33. Springer, Cham, May 2024.
- [BBHR18] Eli Ben-Sasson, Iddo Bentov, Yinon Horesh, and Michael Riabzev. Scalable, transparent, and post-quantum secure computational integrity. Cryptology ePrint Archive, Report 2018/046, 2018.
- [BEG<sup>+</sup>94] Manuel Blum, William S. Evans, Peter Gemmell, Sampath Kannan, and Moni Naor. Checking the correctness of memories. Algorithmica, 12(2/3):225-244, 1994.
- [EKRN24] Liam Eagen, Sanket Kanjalkar, Tim Ruffing, and Jonas Nick. Bulletproofs++: Next generation confidential transactions via reciprocal set membership arguments. In Marc Joye and Gregor Leander, editors, EUROCRYPT 2024, Part V, volume 14655 of LNCS, pages 249–279. Springer, Cham, May 2024.
- [GPR21] Lior Goldberg, Shahar Papini, and Michael Riabzev. Cairo a Turing-complete STARK-friendly CPU architecture. Cryptology ePrint Archive, Report 2021/1063, 2021.
- [Hab22] Ulrich Haböck. Multivariate lookups based on logarithmic derivatives. Cryptology ePrint Archive, Report 2022/1530, 2022.
- [HLP24] Ulrich Haböck, David Levit, and Shahar Papini. Circle STARKs. Cryptology ePrint Archive, Report 2024/278, 2024.
- [KS22] Abhiram Kothapalli and Srinath Setty. SuperNova: Proving universal machine executions without universal circuits. Cryptology ePrint Archive, Report 2022/1758, 2022.
- [KS23] Abhiram Kothapalli and Srinath Setty. CycleFold: Folding-scheme-based recursive arguments over a cycle of elliptic curves. Cryptology ePrint Archive, Report 2023/1192, 2023.
- [KS24] Abhiram Kothapalli and Srinath T. V. Setty. HyperNova: Recursive arguments for customizable constraint systems. In Leonid Reyzin and Douglas Stebila, editors, CRYPTO 2024, Part X, volume 14929 of LNCS, pages 345–379. Springer, Cham, August 2024.
- [KST22] Abhiram Kothapalli, Srinath Setty, and Ioanna Tzialla. Nova: Recursive zero-knowledge arguments from folding schemes. In Yevgeniy Dodis and Thomas Shrimpton, editors, CRYPTO 2022, Part IV, volume 13510 of LNCS, pages 359–388. Springer, Cham, August 2022.
- [Mid] Polygon Miden VM. https://oxpolygonmiden.github.io/miden-vm/design/range.html.
- [Nex24] Nexus Laboratories, Inc. Nexus 1.0: Enabling Verifiable Computation, January 2024. Authors D. Marin, M. Abdalla, P. Govereau, J. Groth, S. Judson, K. Sosnin, G.-V. Policharla, and Y. Zhang. Available at https://whitepaper.nexus.xyz.
- [RIS19] RISC-V Foundation. The RISC-V Instruction Set Manual Volume I: Unprivileged ISA, Document Version 20191213, December 2019. Editors Andrew Waterman and Krste Asanovic. Available at https://drive.google.com/file/d/1s01ZxUZaa7eV\_00\_WsZzaurFLLww7ou5/view.

- [SIE21] Sieve: Securing information for encrypted verification and evaluation. DARPA Program. https://www.darpa.mil/research/programs/securing-information-for-encrypted-verification-and-evaluation, 2021.
- [STW24a] Srinath T. V. Setty, Justin Thaler, and Riad S. Wahby. Unlocking the lookup singularity with Lasso. In Marc Joye and Gregor Leander, editors, *EUROCRYPT 2024, Part VI*, volume 14656 of *LNCS*, pages 180–209. Springer, Cham, May 2024.
- [STW24b] Stwo Prover: The next-gen of STARK scaling is here. https://starkware.co/blog/stwo-prover-the-next-gen-of-stark-scaling-is-here/, February 2024.
- [Val08] Paul Valiant. Incrementally verifiable computation or proofs of knowledge imply time/space efficiency. In Ran Canetti, editor, *TCC 2008*, volume 4948 of *LNCS*, pages 1–18. Springer, Berlin, Heidelberg, March 2008.

# Trace Variables

| a, 18–22                                           | h-neq12-flag-aux, $98$ , $100$                   |
|----------------------------------------------------|--------------------------------------------------|
| a-val, 23, 25, 28, 31-33, 35-45, 75, 77-84,        | h-neq12-flag-aux-inv, $98$ , $100$               |
| 86–128                                             | h-neq34-flag, 98-100                             |
| a-val-effective, 28, 32, 35, 36, 39-43, 45, 48     | 1 0,                                             |
|                                                    | h-neq34-flag-aux, 98, 100                        |
| a-val-effective-flag, $28$ , $36$                  | h-neq34-flag-aux-inv, $98$ , $100$               |
| a-val-effective-flag-aux, $28, 36$                 | h-ram-base-addr, $108-120$                       |
| a-val-effective-flag-aux-inv, $28,36$              | h-ram-base-addr-aux, 110-112, 114, 115,          |
| a-val-high, $92$ - $96$                            | 117–120                                          |
| a-val-low, 92-96                                   | h-ram-sext-val, 108-112, 115                     |
| a0, 19-22                                          |                                                  |
|                                                    | h-ram-val-rem, 108-111, 116-118                  |
| a1, 19, 20, 22                                     | h-ram-val-sgn, $108-111$                         |
| a2, 22                                             | h-ram-zext-val, $113$ – $115$                    |
| a3, 19–22                                          | h-rem, 80-85, 87, 89, 90, 101-107                |
| $\mathtt{ai},19,20$                                | h-rem-a, $102$ , $106$                           |
|                                                    | h-rem-b, 82, 83, 90, 102, 103, 106, 107          |
| b, 18-22                                           |                                                  |
| b-val, 23, 25, 29, 31-33, 36-45, 48, 75, 77-127    | h-rem-c, 82, 83, 103, 107                        |
| b-val-high, 92-96                                  | h-sgn-a, 102, 103, 106, 107                      |
| b-val-low, 92-96                                   | h-sgn-b, 82, 83, 90-92, 102, 103, 106, 107       |
|                                                    | h-sgn-c, 82, 83, 103, 107                        |
| b0, 19–22                                          | h1, 75                                           |
| b1, 19, 20, 22                                     | h1-aux-borrow, 52                                |
| b2, 22                                             | h1-borrow, 51                                    |
| b3, 19-22                                          |                                                  |
| bi, 19, 20                                         | h1-carry, 51                                     |
|                                                    | h2-aux-borrow, 52                                |
| c, 18-22                                           | h2-borrow, 51                                    |
| c-val, 23-25, 28-33, 36-45, 48, 75, 77-127         | h2-carry, 51                                     |
| c-val-high, 93-97                                  | h3-aux-borrow, 52                                |
|                                                    | h3-borrow, 51                                    |
| c-val-low, 93-97                                   |                                                  |
| clk, 23, 25, 27, 28, 32, 35, 45, 47-49, 51,        | h3-carry, 51                                     |
| 64-72, 75, 108-120, 128                            | hi-borrow, 51                                    |
| $\mathtt{clk-carry}, 35$                           | hi-carry, $51$                                   |
|                                                    | hn, 75                                           |
| $\mathtt{digest}_{\mathtt{and}},\ 19,\ 20$         |                                                  |
| $\mathtt{digest}_{\mathtt{op}},\ 20-22$            | imm a 92 95 97 90 99 94 96 90 49                 |
| $digest_{or}$ , $19$ , $20$                        | imm-c, 23–25, 27–30, 32–34, 36, 39–42            |
| digest <sub>xor</sub> , 19, 20                     | instr-val, 23, 25, 29-32, 37-45, 55-61           |
| digestixor, 15, 20                                 | is-add, 24, 26-28, 30, 33, 34, 36, 40, 42, 76-79 |
| exp3, 85, 86, 88-92                                | is-alu, 27, 34, 77, 78                           |
| <del>-</del>                                       | is-alu-imm-no-shift, $27, 30, 34, 40$            |
| exp3-aux, 90-92                                    | is-alu-imm-shift, 27, 30, 34, 35, 41             |
| exp5, 84, 85, 87, 90                               | is-and, 24, 26-28, 30, 33, 34, 36, 40, 42,       |
| exp5-aux, 90                                       | 75–77, 94, 95                                    |
| h-borrow, 79-83, 101-103, 105-107                  | is-auipc, 23, 25-29, 33, 34, 36, 37, 76, 77,     |
| h-carry, 78, 79, 97-123, 125                       | 124, 125                                         |
| h-lt-flag, 103, 104, 106, 107                      | is-beq, 23, 25-28, 31, 33, 34, 36, 43, 76, 77,   |
| h-ltu-flag, 82, 83, 101–107                        | 97, 98                                           |
| <del>-</del> ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' |                                                  |
| h-neq-flag, 97-100                                 | is-bge, 24-28, 31, 33, 34, 36, 43, 76, 77,       |
| h-neq-flag-aux, 97-99                              | 106–108                                          |
| h-neq-flag-aux-inv, $97$ , $99$                    | is-bgeu, 24-28, 31, 33, 34, 36, 43, 76, 77,      |
| $\verb h-neq12-flag , 98-100$                      | 104–106                                          |
|                                                    |                                                  |

```
is-blt, 24-28, 31, 33, 34, 36, 43, 76, 77,
                                                                   119, 120
         102-104, 106
                                                          is-sys-cycle-count, 126, 127
\mathtt{is\text{-}bltu},\, 24\text{--}28,\, 31,\, 33,\, 34,\, 36,\, 43,\, 76,\, 77,\\
                                                          is-sys-debug, 126, 127
                                                          is-sys-halt, 77, 78, 126, 127
         100 - 102
is-bne, 23, 25-28, 31, 33, 34, 36, 43, 76, 77,
                                                          is-sys-heap-reset, 48, 126-128
         99, 100
                                                          is-sys-priv-input, 48, 126-128
                                                          is-sys-stack-reset, 48, 126-128
is-ebreak, 23, 25-28, 31, 33, 34, 36, 44, 76,
                                                         is-type-b, 27, 31, 32, 34, 43, 45, 48, 77
         77, 127
                                                          is-type-i, 27, 32, 34, 45, 48
is-ecall, 23, 25-28, 31, 33, 34, 36, 44, 76, 77,
                                                         is-type-i-no-shift, 27, 34, 39, 40
         125, 127
                                                         is-type-j, 27, 29, 32, 34, 38, 45, 48, 77
is-fence, 23, 25, 26, 33, 34, 76
                                                         is-type-r, 23, 27, 32, 34, 35, 42, 45, 48
is-first, 28, 35
                                                          is-type-s, 23, 27, 29, 32, 34, 35, 39, 45, 48,
is-jal, 23, 25-29, 33, 34, 36, 76, 77, 120, 121
                                                                   65, 77, 78
is-jalr, 23, 25-28, 30, 33, 34, 36, 40, 76, 122,
                                                          is-type-sys, 27, 31, 34, 44, 48, 77, 78,
                                                                   126 - 128
is-last, 61, 62, 73, 74
                                                         is-type-u, 27-29, 32, 34-37, 45, 48, 77, 78
is-lb, 24-28, 30, 33, 34, 36, 40, 76, 77, 108,
                                                          is-unimp, 23, 25, 26, 31, 33, 34, 44, 45, 76
                                                         is-xor, 24, 26-28, 30, 33, 34, 36, 40, 42, 76,
is-lbu, 24-28, 30, 33, 34, 36, 40, 76, 77, 113,
                                                                   77, 92-94
is-lh, 24-28, 30, 33, 34, 36, 40, 65, 76, 77,
                                                         lookup_{and}, 94, 95
         109 - 111
                                                         lookupor, 96
is-lhu, 24-28, 30, 33, 34, 36, 40, 65, 76, 77,
                                                         lookup_{xor}, 93
         114 - 116
is-load, 27, 30, 34, 40, 65, 77, 78
                                                         m, 17, 18
is-lui, 23, 25-29, 33, 34, 36-38, 75-77, 123,
                                                         m<sub>and</sub>, 19, 20
                                                         m_{op}, 20-22
is-lw, 24-28, 30, 33, 34, 36, 40, 65, 76, 77,
                                                         m_{or}, 19, 20
         111-113
                                                         m_{xor}, 19, 20
is-or, 24, 26-28, 30, 33, 34, 36, 40, 42, 76, 77,
         95 - 97
                                                         op-a, 23, 25, 28-32, 35-45, 128
is-pad, 24, 26, 28, 33, 35, 37, 48, 57, 76
                                                         op-a0, 37-43
is-pc-inc-std, 77, 78
                                                         op-a1, 37-39, 41-43
is-reg-addr, 54
                                                         op-a1-4, 37-43
is-sb, 24, 26-28, 33, 34, 36, 39, 76, 77, 116,
                                                         op-b, 23, 25, 28-32, 36-45
                                                         op-b-flag, 23, 25, 28, 32, 33, 36, 45, 48
is-sh, 24, 26-28, 33, 34, 36, 39, 65, 76, 77,
                                                         op-b0, 40-42
         117, 118
                                                         op-b0-3, 39, 43
is-sll, 24, 26-28, 30, 33, 34, 36, 41, 42, 76,
                                                         op-b1, 39-42
         77, 84–86
                                                         op-b1-4, 40-42
is-slt, 24, 26-28, 30, 33, 34, 36, 40, 42, 76,
                                                         op-b4, 39, 43
         77, 82, 83
                                                         op-c, 23-25, 28-33, 36-45
is-sltu, 24, 26-28, 30, 33, 34, 36, 40, 42, 76,
                                                         op-c0, 39
         77, 80, 81
                                                         op-c0-10, 30
is-sra, 24, 26-28, 30, 33, 34, 36, 41, 42, 76,
                                                          op-c0-3, 39-42
         77, 89–92
                                                         op-c0-4, 29, 30
is-srl, 24, 26-28, 30, 33, 34, 36, 41, 42, 76,
                                                         op-c1-10, 24, 25, 29
         77, 87, 88
                                                         op-c1-3, 38
is-sub, 24, 26-28, 30, 33, 34, 36, 42, 76, 77,
                                                         op-c1-4, 31, 39, 43
                                                         op-c11, 24, 25, 29-31, 38-40, 43
         79, 80
is-sw, 24, 26-28, 33, 34, 36, 39, 65, 76, 77,
                                                         op-c12, 31, 43
```

```
op-c12-15, 36-38
                                                      qt4, 85, 86, 88, 89, 91, 92
op-c12-19, 24, 25, 29
                                                      qtj, 85, 86, 88, 89, 91, 92
op-c12-31, 28, 29
                                                      ram-addr, 63-65, 67, 70, 71
op-c16-19, 38
                                                      ram-addr-borrow, 73, 74
op-c16-23, 36, 37
                                                      ram-addr-cur, 73, 74
op-c20, 24, 25, 29, 38
                                                      ram-addr-diff, 73, 74
op-c24-31, 36, 37
                                                      ram-addr-next, 73, 74
op-c4, 41, 42
                                                      ram-base-addr, 64-68, 70-72
op-c4-7, 38-40
                                                      ram-init, 72
op-c5-10, 29, 31
                                                      rami-ts-prev, 68-70
op-c5-7, 39, 43
                                                      ram-final, 72
op-c8-10, 38-40, 43
                                                      ram-init-final-addr, 72-74
opcode, 23, 26, 32-34, 45
                                                      ram-init-final-flag, 72-74
                                                      ram-init-final-flag-next, 73, 74
pc, 23, 25, 27, 28, 32, 35, 45, 55–61, 75, 77, 78
pc-aux, 28, 35
                                                      ramj-ts-prev, 66, 67, 71
                                                      ram-read-set, 63
pc-carry, 77, 78, 121-123
pc-next-aux, 122, 123
                                                      ram-read-digest, 64, 67, 68, 70, 71
                                                      ram-read-final-digest, 67, 68, 70, 71, 74
pc-qt-aux, 122, 123
                                                      ram-ts, 63, 64, 67, 70
pc-rem-aux, 122, 123
                                                      ram-ts-cur, 63, 64
pc-next, 23, 25, 28, 32, 35, 45, 75, 77, 78
                                                      ram-ts-final, 67, 70-72, 74
prog-accessed, 57
prog-addr-aux, 62
                                                      ram-ts-prev, 63, 64
prog-addr-borrow, 62
                                                      ram-val, 63, 64, 67, 70, 71
                                                      ram-val-cur, 63, 64
prog-addr-cur, 62
                                                      ram-val-final, 72-74
prog-addr-diff, 62
prog-addr-next, 62
                                                      ram-vali, 110, 112, 115, 118
prog-ctr, 55, 56, 58, 60
                                                      ram-val-init, 72-74
                                                      ram-valj, 111, 112, 115, 118-120
prog-ctr-carry, 58, 59
                                                      ram-val-prev, 63, 64
prog-ctr-cur, 55, 56, 58-61
                                                      ram-val1, 66, 68, 108-120
prog-ctr-final, 56, 58, 60-63
                                                      ram-val2, 66, 68, 110-112, 115, 117-120
prog-ctr-prev, 55-61
prog-init-base-addr, 61-63
                                                      ram-val3, 66, 68, 112, 119, 120
                                                      ram-val4, 66, 68, 112, 119, 120
prog-init-flag, 61-63
                                                      ram-write-set, 63, 64
prog-init-flag-next, 62
                                                      {\tt ram\text{-}write\text{-}digest, \, 64, \, 67, \, 68, \, 70\text{--}72, \, 74}
prog-read-digest, 55, 58-61
                                                      ram-write-init-digest, 65, 67, 68, 70, 71, 74
prog-read-final-digest, 58-63
prog-read-set, 56
                                                      ram1-accessed, 64, 65, 67, 68, 70-72
                                                      ram1-ts-prev, 64, 66-71
prog-val-init, 61-63
{\tt prog-write-digest},\, 56,\, 59\text{--}61,\, 63
                                                      ram1-ts-prev-aux, 66, 68, 69
                                                      ram1-ts-prev-borrow, 69
prog-write-init-digest, 56, 58-63
                                                      ram1-val-cur, 64, 67, 68, 70-72
prog-write-set, 56
                                                      ram1-val-prev, 64, 66-68, 70, 71
pub-in-addr, 65, 67, 70
                                                      ram2-accessed, 64, 65, 67, 68, 70-72
pub-in-flag, 72-74
                                                      ram2-ts-prev, 66-71
pub-in-val, 65, 72-74
pub-io-addr, 72-74
                                                      ram2-ts-prev-aux, 66, 69
                                                      ram2-ts-prev-borrow, 69
pub-out-flag, 72-74
                                                      ram2-val-cur, 67, 68, 70-72
pub-out-val, 72-74
                                                      ram2-val-prev, 66-68, 70, 71
at1.84-92
                                                      ram3-accessed, 64, 65, 67, 68, 70-72
qt2, 85, 86, 88, 89, 91, 92
                                                      ram3-ts-prev, 66-71
qt3, 85, 86, 88, 89, 91, 92
                                                      ram3-ts-prev-aux, 66, 69
```

| ram3-ts-prev-borrow, 69                                | reg2-ts-prev-aux, 50, 52             |
|--------------------------------------------------------|--------------------------------------|
| ram3-val-cur, 67, 68, 70-72                            | reg2-val-cur, 47-50, 53              |
| ram3-val-prev, 66-68, 70, 71                           | reg2-val-prev, 47-50, 52, 53         |
| ram4-accessed, $64$ , $65$ , $67$ , $68$ , $70$ - $72$ | reg3, 48, 49                         |
| $\verb ram4-ts-prev , 64, 66-71 $                      | reg3-accessed, $47-50$ , $52$ , $53$ |
| ram4-ts-prev-aux, 66, 69                               | reg3-addr, 47-50, 52, 53             |
| ram4-ts-prev-borrow, 69                                | reg3-ts-cur, 47-53                   |
| ram4-val-cur, 64, 67, 68, 70-72                        | reg3-ts-prev, $47-50$ , $52$ , $53$  |
| ram4-val-prev, 64, 66-68, 70, 71                       | reg3-ts-prev-aux, $50$ , $52$        |
| rami-ts-prev-aux, 69                                   | reg3-val-cur, $47-50$ , $53$         |
| rami-ts-prev-borrow, 69                                | reg3-val-prev, $47-50$ , $52$ , $53$ |
| ramj-accessed, $65$ , $66$ , $70$                      | regi-ts-cur, $48$ , $50$             |
| ramj-ts-prev, 70                                       | regi-ts-prev, $48$ , $50$ , $51$     |
| ramj-ts-prev-aux, 66                                   | regi-ts-prev-aux, $50, 51$           |
| ramj-val-cur, 66, 70                                   | regj-accessed,48,49,53               |
| ramj-val-prev, 66, 67, 70, 71                          | ${\sf regj-addr}, 47, 49, 52, 53$    |
| reg-addr, 46, 47, 49, 52, 53                           | regj-ts-cur, $47, 49, 52, 53$        |
| reg-init-final-addr, 54                                | regj-ts-prev, 47, 49, 52, 53         |
| reg-read-set, 46                                       | regj-val-cur, 47, 49, 52, 53         |
| reg-read-digest, 47, 49, 50, 52, 53                    | regj-val-prev, 47, 49, 52, 53        |
| reg-read-final-digest, 49, 50, 53, 54                  | rem1, 84-91                          |
| reg-read-write-digest, 50                              | rem1-aux, 84, 87, 88, 90, 91         |
| reg-ts, 46, 47, 52                                     | rem2, 85, 86, 88, 89, 91             |
| reg-ts-cur, 46, 47                                     | rem2-aux, 88, 91                     |
| reg-ts-final, 49, 53, 54                               | rem3, 85, 86, 88, 89, 91             |
| reg-ts-prev, 46, 47                                    | rem3-aux, $88, 91$                   |
| reg-val, 46, 47, 52                                    | rem4, 85, 86, 88, 89, 91             |
| reg-val-cur, 46, 47                                    | rem4-aux, 88, 91                     |
| reg-val-final, 49, 53, 54                              | remj, 85, 86, 88, 89, 91             |
| reg-val-prev, 46, 47                                   | remj-aux, 86, 89                     |
| reg-write-set, 46                                      | row-index, $54$                      |
| reg-write-digest, 47, 49, 50, 53, 54                   |                                      |
| reg-write-init-digest, 47, 49, 50, 53, 54              | sh1, 84–92                           |
| reg1, 49                                               | sh2, 84-92                           |
| reg1-accessed, 47-50, 52, 53                           | sh3, 84-92                           |
| reg1-addr, 47-50, 52, 53                               | sh4, 84-92                           |
| reg1-ts-cur, 47-53                                     | sh5, 84-92                           |
| reg1-ts-prev, 47-50, 52, 53                            | sra-mask, 91                         |
| reg1-ts-prev-aux, 50, 52                               | srl1, 91, 92                         |
| reg1-val-cur, 47-50, 53                                | srl2, 91, 92                         |
| reg1-val-prev, 47-50, 52, 53                           | srl3, 91, 92                         |
| reg2, 49                                               | srl4, 91, 92                         |
| reg2-accessed, 47-50, 52, 53                           | 17 10                                |
| reg2-addr, 47-50, 52, 53                               | v, 17, 18                            |
| reg2-ts-cur, 47-53                                     | x, 17, 18                            |
| reg2-ts-prev, 47-50, 52, 53                            | x <sub>range</sub> , 17, 18          |
| 1082 00 PICV, 11 00, 02, 00                            | range, 11, 10                        |